In this post we will set up a Hadoop 2.0 distributed environment spanning one physical host and several virtual machines.

Cluster environment:

1. NameNode (physical host):

Linux yan-Server 3.4.36-gentoo #3 SMP Mon Apr 1 14:09:12 CST 2013 x86_64 AMD Athlon(tm) X4 750K Quad Core Processor AuthenticAMD GNU/Linux

2. DataNode1 (virtual machine):

Linux node1 3.5.0-23-generic #35~precise1-Ubuntu SMP Fri Jan 25 17:13:26 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

3. DataNode2 (virtual machine):

Linux node2 3.5.0-23-generic #35~precise1-Ubuntu SMP Fri Jan 25 17:13:26 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

4. DataNode3 (virtual machine):

Linux node3 3.5.0-23-generic #35~precise1-Ubuntu SMP Fri Jan 25 17:13:26 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

1. Install VirtualBox

On Gentoo you can build and install it directly from Portage, or download a binary package from the official site and install that:

emerge -av virtualbox

2. Install Ubuntu 12.04 LTS in the virtual machines

Install one VM from the Ubuntu image, then clone it to create the other two virtual hosts. (The clones boot with the same hostname and MAC address as the original, which causes conflicts on the LAN.)

To change the hostname, edit:

/etc/hostname

To change the MAC address, first delete the file

/etc/udev/rules.d/70-persistent-net.rules

and then assign a new MAC address to the VM in VirtualBox before booting it.

(Figure 1: setting the VM's MAC address in the VirtualBox network settings)

After booting, the deleted file is regenerated automatically, recording the NIC's new MAC address.
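The per-clone fixes above can be sketched as a few commands, run inside each clone ("node2" is an example; substitute whichever name the clone should get):

```shell
# 1. Give the clone a unique hostname.
echo "node2" | sudo tee /etc/hostname

# 2. Delete udev's cached MAC-to-NIC mapping; it is regenerated on the
#    next boot from whatever MAC address VirtualBox presents.
sudo rm /etc/udev/rules.d/70-persistent-net.rules

# 3. Power off, assign a new MAC in the VM's VirtualBox network
#    settings, then start the VM again.
sudo poweroff
```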

To share files between hosts more conveniently, you can run an NFS server on yan-Server and add the mount command to /etc/rc.local on each client, so that the NFS directory is mounted automatically at boot.
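As a sketch (the shared path /home/share and the mount point /mnt/share are assumptions; adjust to taste):

```shell
# On yan-Server: export a directory to the cluster subnet.
echo "/home/share 192.168.137.0/24(rw,sync,no_subtree_check)" | sudo tee -a /etc/exports
sudo exportfs -ra

# On each node: append this mount command to /etc/rc.local
# (above its final "exit 0") so the share is mounted at boot.
mount -t nfs yan-Server:/home/share /mnt/share
```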

Remove NetworkManager from each VM and configure a static IP address by hand. For example, /etc/network/interfaces on node2 looks like this:

auto lo
iface lo inet loopback
auto eth0
iface eth0 inet static
address 192.168.137.202
gateway 192.168.137.1
netmask 255.255.255.0
network 192.168.137.0
broadcast 192.168.137.255
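The network and broadcast lines must agree with the address and netmask. A quick sanity check in shell arithmetic (the helper functions here are just for illustration, not part of the setup):

```shell
# Derive network/broadcast from address+netmask and compare them with
# the values written in /etc/network/interfaces.
to_int() { local IFS=.; set -- $1; echo $(( ($1<<24) | ($2<<16) | ($3<<8) | $4 )); }
to_ip()  { echo "$(( ($1>>24)&255 )).$(( ($1>>16)&255 )).$(( ($1>>8)&255 )).$(( $1&255 ))"; }

a=$(to_int 192.168.137.202)   # address
m=$(to_int 255.255.255.0)     # netmask
echo "network:   $(to_ip $(( a & m )))"
echo "broadcast: $(to_ip $(( (a & m) | (~m & 0xFFFFFFFF) )))"
```

This prints 192.168.137.0 and 192.168.137.255, matching the file above.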

The basic host setup is now complete. The hosts and their IP addresses are:

Type       Hostname     IP
NameNode   yan-Server   192.168.137.100
DataNode   node1        192.168.137.201
DataNode   node2        192.168.137.202
DataNode   node3        192.168.137.203

To save resources, you can make the VMs boot to a text console by default and log in remotely over SSH from a terminal on the physical host. (The SSH service is already installed and allows remote logins; its setup is not covered here.)

To do this, edit /etc/default/grub and uncomment the line

GRUB_TERMINAL=console

then run update-grub.
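The change can be scripted; this assumes the line is present but commented out, as in a stock Ubuntu 12.04 /etc/default/grub:

```shell
# Uncomment GRUB_TERMINAL=console so grub stays in text mode,
# then regenerate the boot configuration.
sudo sed -i 's/^#\s*GRUB_TERMINAL=console/GRUB_TERMINAL=console/' /etc/default/grub
sudo update-grub
```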

3. Configure the Hadoop environment

3.1 Configure the JDK (done earlier; not covered again here)

3.2 Download Hadoop from the official site and extract it under /opt/ (hadoop-2.0.4-alpha is used here)

Then go into /opt/hadoop-2.0.4-alpha/etc/hadoop and edit the Hadoop configuration files.
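For example (the archive URL is an assumption; any Apache mirror carrying the 2.0.4-alpha release works):

```shell
wget http://archive.apache.org/dist/hadoop/common/hadoop-2.0.4-alpha/hadoop-2.0.4-alpha.tar.gz
sudo tar -xzf hadoop-2.0.4-alpha.tar.gz -C /opt/
cd /opt/hadoop-2.0.4-alpha/etc/hadoop
```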

Edit hadoop-env.sh:

export HADOOP_PREFIX=/opt/hadoop-2.0.4-alpha
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export PATH=$PATH:$HADOOP_PREFIX/bin
export PATH=$PATH:$HADOOP_PREFIX/sbin
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}
export HADOOP_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
export JAVA_HOME=/opt/jdk1.7.0_21

Edit hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/opt/hadoop-2.0.4-alpha/workspace/name</value>
    <description>Determines where on the local filesystem the DFS name node
    should store the name table. If this is a comma-delimited list of
    directories, then the name table is replicated in all of the
    directories, for redundancy.</description>
    <final>true</final>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/opt/hadoop-2.0.4-alpha/workspace/data</value>
    <description>Determines where on the local filesystem a DFS data node
    should store its blocks. If this is a comma-delimited list of
    directories, then data will be stored in all named directories,
    typically on different devices. Directories that do not exist are
    ignored.</description>
    <final>true</final>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
</configuration>
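Note that dfs.namenode.name.dir and dfs.datanode.data.dir above point at directories that must exist before HDFS is formatted; creating them is a one-liner on each host:

```shell
# On yan-Server (NameNode) and on each DataNode, respectively:
mkdir -p /opt/hadoop-2.0.4-alpha/workspace/name   # NameNode metadata
mkdir -p /opt/hadoop-2.0.4-alpha/workspace/data   # DataNode blocks
```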

Edit mapred-site.xml:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.job.tracker</name>
    <value>hdfs://yan-Server:9001</value>
    <final>true</final>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>1536</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1024M</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>3072</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx2560M</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>512</value>
  </property>
  <property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>100</value>
  </property>
  <property>
    <name>mapreduce.reduce.shuffle.parallelcopies</name>
    <value>50</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>file:/opt/hadoop-2.0.4-alpha/workspace/systemdir</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>file:/opt/hadoop-2.0.4-alpha/workspace/localdir</value>
    <final>true</final>
  </property>
</configuration>
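This walkthrough does not show core-site.xml, but HDFS clients need a default filesystem URI pointing at the NameNode. A minimal sketch (the port 9000 is an assumption; pick any free port) would be:

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://yan-Server:9000</value>
  </property>
</configuration>
```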

Edit yarn-env.sh:

export HADOOP_PREFIX=/opt/hadoop-2.0.4-alpha
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export PATH=$PATH:$HADOOP_PREFIX/bin