Hadoop Cluster Configuration
Process layout across the three nodes:

hadoop-01               | hadoop-02               | hadoop-03
------------------------|-------------------------|------------------------
ResourceManager         | ResourceManager         |
NodeManager             | NodeManager             | NodeManager
NameNode                | NameNode                |
DataNode                | DataNode                | DataNode
DFSZKFailoverController | DFSZKFailoverController |
JournalNode             | JournalNode             | JournalNode
QuorumPeerMain          | QuorumPeerMain          | QuorumPeerMain
1. Configure hostnames and name resolution (all 3 nodes)
Create three virtual machines: hadoop1, hadoop2, and hadoop3.
Configure the hosts mapping, replacing ip with each machine's actual IP address:
vi /etc/hosts
ip hadoop1
ip hadoop2
ip hadoop3
Once configured, verify the mapping with ping <hostname>. Note that whatever hostnames you choose here must be used consistently in the configuration files below (the XML examples in this article refer to the nodes as master, slave1, and slave2).
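For example, a filled-in /etc/hosts might look like this (the 192.168.1.x addresses are illustrative placeholders; substitute your machines' real IPs):
192.168.1.101 hadoop1
192.168.1.102 hadoop2
192.168.1.103 hadoop3
Then, from any node:
ping -c 3 hadoop2    # should resolve the name and receive replies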
2. Set up passwordless SSH
Generate a key pair (stored under /root/.ssh/ by default):
ssh-keygen
Copy the public key to each node, including the local one:
ssh-copy-id hadoop-01
ssh-copy-id hadoop-02
ssh-copy-id hadoop-03
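A quick check that passwordless login works; run it on each node (hostnames as used above):
for h in hadoop-01 hadoop-02 hadoop-03; do
  ssh $h hostname    # should print each hostname with no password prompt
done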
3. Install Hadoop
3.1 Edit the configuration files
(1) Extract the Hadoop archive, rename the resulting directory to hadoop, and copy it to /export/software/ (a sketch follows below).
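A minimal sketch of step (1), assuming the archive is named hadoop-2.4.1.tar.gz and sits in the current directory:
mkdir -p /export/software
tar -zxvf hadoop-2.4.1.tar.gz -C /export/software/           # extract the archive (name assumed)
mv /export/software/hadoop-2.4.1 /export/software/hadoop     # rename the directory to hadoop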
(2) Edit the configuration files; they all live under /export/software/hadoop/etc/hadoop.
Edit core-site.xml, with the following content:
<configuration>
<!-- with HA enabled, fs.defaultFS points at the logical nameservice, with no port -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://cluster</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/export/software/hadoop-2.4.1/tmp</value>
</property>
<property>
<name>hadoop.native.lib</name>
<value>false</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>master:2181,slave1:2181,slave2:2181</value>
</property>
</configuration>
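The cluster nameservice in fs.defaultFS must match dfs.nameservices in hdfs-site.xml below, and ha.zookeeper.quorum must list the ZooKeeper hosts. The hadoop.tmp.dir directory should also exist before the first format; a minimal sketch (run on every node):
mkdir -p /export/software/hadoop-2.4.1/tmp    # path taken from hadoop.tmp.dir above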
Edit hdfs-site.xml, with the following content:
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>cluster</value>
</property>
<property>
<name>dfs.ha.namenodes.cluster</name>
<value>nn01,nn02</value>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster.nn01</name>
<value>master:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster.nn01</name>
<value>master:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster.nn02</name>
<value>slave1:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster.nn02</name>
<value>slave1:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://master:8485;slave1:8485;slave2:8485/cluster</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/export/data/hadoop/journaldata</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.cluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence
shell(/bin/true)
</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/export/software/hadoop-2.4.1/tmp/dfs/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/export/software/hadoop-2.4.1/tmp/dfs/data</value>
</property>
</configuration>
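The directory named by dfs.journalnode.edits.dir should exist on all three nodes before the JournalNodes start; a minimal sketch:
mkdir -p /export/data/hadoop/journaldata    # path taken from dfs.journalnode.edits.dir above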
Edit yarn-site.xml, with the following content:
<configuration>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yrc</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>master</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>slave1</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>master:2181,slave1:2181,slave2:2181</value>
</property>
</configuration>
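Once the cluster is up (section 3.3), the ResourceManager HA state can be checked with yarn rmadmin, using the rm1/rm2 ids defined above:
yarn rmadmin -getServiceState rm1    # expect: active
yarn rmadmin -getServiceState rm2    # expect: standby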
Edit mapred-site.xml (this file does not exist by default; create it with cp mapred-site.xml.template mapred-site.xml), with the following content:
<configuration>
<!-- use YARN as the MapReduce resource scheduling framework -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Edit the slaves file, with the following content:
hadoop-01
hadoop-02
hadoop-03
Edit hadoop-env.sh, pointing JAVA_HOME at the JDK installation:
# The java implementation to use.
export JAVA_HOME=/export/software/jdk1.8.0_131
Configure the Hadoop environment variables; open /etc/profile and append the two export lines below:
vi /etc/profile
export HADOOP_HOME=/export/software/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
:wq
source /etc/profile
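A quick sanity check that the new environment is picked up:
hadoop version    # first line should read Hadoop 2.4.1
which hadoop      # should resolve to /export/software/hadoop/bin/hadoop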
3.2 Copy everything to the other machines
Copy the configured Hadoop directory to hadoop2 and hadoop3:
scp -r /export/software/hadoop hadoop2:/export/software
scp -r /export/software/hadoop hadoop3:/export/software
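The /etc/profile changes (and the JDK referenced in hadoop-env.sh) must also be present on hadoop2 and hadoop3; a hedged sketch, assuming the JDK is already installed at the same path on those nodes:
scp /etc/profile hadoop2:/etc/profile
scp /etc/profile hadoop3:/etc/profile
ssh hadoop2 'source /etc/profile; hadoop version'    # verify the remote node sees hadoop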
3.3 Start Hadoop
ZooKeeper and the JournalNodes must be running on all three nodes before the first format (formatting the NameNode writes to the qjournal shared edits dir, and formatting ZKFC writes to ZooKeeper):
zkServer.sh start                       # start ZooKeeper (on every node)
hadoop-daemon.sh start journalnode      # start the JournalNode (on every node; not started automatically)
Then run the format commands on hadoop1:
hadoop namenode -format
hdfs zkfc -formatZK
After formatting, copy hadoop1's tmp directory to hadoop2 so the standby NameNode starts from the same metadata:
scp -r /export/software/hadoop/tmp hadoop2:/export/software/hadoop/
Start the cluster:
start-all.sh
Individual daemons can also be started by hand, which is useful when one fails to come up (note: start-all.sh starts a ResourceManager only on the node where it is run, so the standby ResourceManager on the second node must be started manually this way):
hadoop-daemon.sh start datanode
yarn-daemon.sh start resourcemanager
3.4 After starting Hadoop, check the running processes with jps
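Based on the role table at the top of this article, jps output should look roughly like this (each line is prefixed by a pid, which will differ per machine):
# on hadoop-01 and hadoop-02:
<pid> NameNode
<pid> DataNode
<pid> ResourceManager
<pid> NodeManager
<pid> DFSZKFailoverController
<pid> JournalNode
<pid> QuorumPeerMain
<pid> Jps
# on hadoop-03:
<pid> DataNode
<pid> NodeManager
<pid> JournalNode
<pid> QuorumPeerMain
<pid> Jps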
3.5 Test NameNode high availability
(1) Kill the NameNode process on hadoop-01, then check hadoop-02 in the browser: its state changes to active, confirming automatic failover works.
(2) Restart the NameNode on hadoop-01 (start-dfs.sh), then browse to hadoop-01: its state is now standby.
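The same test from the command line, a minimal sketch using hdfs haadmin with the nn01/nn02 ids from hdfs-site.xml (here the single NameNode is restarted with hadoop-daemon.sh rather than the start-dfs.sh used above; either works):
jps | grep NameNode                  # on hadoop-01: find the NameNode pid
kill -9 <pid>                        # simulate a NameNode crash
hdfs haadmin -getServiceState nn02   # expect: active, i.e. failover succeeded
hadoop-daemon.sh start namenode      # bring hadoop-01's NameNode back
hdfs haadmin -getServiceState nn01   # expect: standby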