1. ZooKeeper cluster installation
Prerequisites: JDK installed and configured, passwordless SSH (including to the local host), and the virtual machines tuned (see http://t.csdn.cn/spiKA).
Extract: [root@master01 download]# tar -zxf apache-zookeeper-3.6.3-bin.tar.gz -C /opt/software/
Rename: [root@master01 software]# mv apache-zookeeper-3.6.3-bin zookeeper-3.6.3
Create the data directory: [root@master01 zookeeper-3.6.3]# mkdir data
Rename zoo_sample.cfg: [root@master01 conf]# mv zoo_sample.cfg zoo.cfg

Edit zoo.cfg:
        Change dataDir to the data directory created above.
        Below clientPort, add:
server.1=master01:2888:3888
server.2=master02:2888:3888
server.3=worker01:2888:3888
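For reference, a minimal zoo.cfg after these edits might look like the sketch below (the dataDir path assumes ZooKeeper was unpacked to /opt/software/zookeeper-3.6.3 as above; the remaining values are the zoo_sample.cfg defaults):
# zoo.cfg - minimal sketch for this 3-node ensemble
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/software/zookeeper-3.6.3/data
clientPort=2181
server.1=master01:2888:3888
server.2=master02:2888:3888
server.3=worker01:2888:3888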

Create the myid file inside the data directory:
[root@master01 data]# vim myid
(the file contains a single number: 1 on master01, 2 on master02, 3 on worker01)
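Equivalently, a sketch using echo (assuming the same data directory on every host; run each line on the host it names):
[root@master01 data]# echo 1 > /opt/software/zookeeper-3.6.3/data/myid
[root@master02 data]# echo 2 > /opt/software/zookeeper-3.6.3/data/myid
[root@worker01 data]# echo 3 > /opt/software/zookeeper-3.6.3/data/myid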

Configure and load the environment variables:
[root@master01 data]# vim /etc/profile.d/my.sh
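A minimal sketch of the ZooKeeper entries in my.sh, assuming the install path used above:
# zookeeper
export ZOOKEEPER_HOME=/opt/software/zookeeper-3.6.3
export PATH=$PATH:$ZOOKEEPER_HOME/bin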

source /etc/profile
Distribute the zookeeper-3.6.3 directory to master02 and worker01.
After distribution, remember to adjust the myid file on each host.
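A sketch of one way to do the distribution (scp is assumed; rsync works just as well):
[root@master01 software]# scp -r /opt/software/zookeeper-3.6.3 root@master02:/opt/software/
[root@master01 software]# scp -r /opt/software/zookeeper-3.6.3 root@worker01:/opt/software/
[root@master01 software]# scp /etc/profile.d/my.sh root@master02:/etc/profile.d/
[root@master01 software]# scp /etc/profile.d/my.sh root@worker01:/etc/profile.d/
# then fix myid on master02 (2) and worker01 (3) as shown above, and source /etc/profile on each host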
Start the ZooKeeper cluster: zkServer.sh start (run this command on all three machines).

Check the processes with jps; if QuorumPeerMain is present on all three machines, the cluster is up.

The ZooKeeper cluster has been set up successfully!
2. Hadoop cluster setup
2.1. Extract
Under /opt/ create a download directory to hold the archives and a software directory for the extracted files.
Extract hadoop-3.1.3.tar.gz from /opt/download/ into /opt/software/:
tar -zxvf /opt/download/hadoop-3.1.3.tar.gz -C /opt/software
2.2. Configure and load the environment variables
vim /etc/profile.d/my.sh
# hadoop
export HADOOP_HOME=/opt/software/hadoop-3.1.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/lib
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export HDFS_JOURNALNODE_USER=root
export HDFS_ZKFC_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_LIBEXEC_DIR=$HADOOP_HOME/libexec
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JAVA_LIBRARY_PATH
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
Load it: source /etc/profile
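A quick sanity check that the variables took effect (a sketch; hadoop version should report 3.1.3):
[root@master01 ~]# hadoop version
[root@master01 ~]# echo $HADOOP_CONF_DIR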
2.3. Create the Hadoop data directory
[root@master01 ~]# cd /opt/software/hadoop-3.1.3/
[root@master01 hadoop-3.1.3]# mkdir data

2.4. Hadoop configuration files
cd /opt/software/hadoop-3.1.3/etc/hadoop
1. hadoop-env.sh (Java dependency)
vim hadoop-env.sh
Find the export JAVA_HOME line, uncomment it, and point it at the locally installed JDK:
export JAVA_HOME=/opt/software/jdk1.8.0_171
2. Create workers
vim workers
master01
master02
worker01
(the hostnames must be mapped to IPs on every node)
[root@master01 ~]# vim /etc/hosts
192.168.xxx.xxx master01
192.168.xxx.xxx master02
192.168.xxx.xxx worker01
3. Configure core-site.xml (the properties in this and the following files go inside the <configuration> element)
<!-- vim core-site.xml -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
<description>Logical name; must match the dfs.nameservices value in hdfs-site.xml</description>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop/mycluster</value>
<description>Local Hadoop temporary directory on the NameNode</description>
</property>
<property>
<name>hadoop.http.staticuser.user</name>
<value>root</value>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>1048576</value>
<description>Read/write buffer size for SequenceFiles (1 MB)</description>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>master01:2181,master02:2181,worker01:2181</value>
</property>
<property>
<name>hadoop.zk.address</name>
<value>master01:2181,master02:2181,worker01:2181</value>
</property>
<property>
<name>ha.zookeeper.session-timeout.ms</name>
<value>10000</value>
<description>Timeout for Hadoop's ZooKeeper connections, in ms</description>
</property>
4. Configure hdfs-site.xml
<!-- vim hdfs-site.xml -->
<property>
<name>dfs.replication</name>
<value>2</value>
<description>Number of replicas kept for each HDFS block</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/opt/software/hadoop-3.1.3/data/dfs/name</value>
<description>Where the NameNode stores the HDFS namespace metadata</description>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/opt/software/hadoop-3.1.3/data/dfs/data</value>
<description>Physical storage location of data blocks on the DataNode</description>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master01:9869</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
<description>HDFS nameservice; must be consistent with core-site.xml</description>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
<description>mycluster is the logical cluster name, mapped to the two NameNode logical names</description>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>master01:8020</value>
<description>RPC address of master01</description>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>master01:9870</value>
<description>HTTP address of master01</description>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>master02:8020</value>
<description>RPC address of master02</description>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>master02:9870</value>
<description>HTTP address of master02</description>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://master01:8485;master02:8485;worker01:8485/mycluster</value>
<description>Shared storage location for the NameNode edits (the JournalNode list)</description>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/tmp/hadoop/journaldata</value>
<description>Local disk path where each JournalNode stores its data</description>
</property>
<!-- Failover -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
<description>Enable automatic NameNode failover</description>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
<description>Failover proxy provider implementation; the property suffix must be the nameservice (mycluster)</description>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
<description>Fencing method, used to prevent split-brain</description>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
<description>sshfence needs passwordless SSH; path to the private key</description>
</property>
<!-- Permissions: disable checks to avoid failures caused by permission errors -->
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
<description>Disable permission checking</description>
</property>
<!-- Throttling: leave more memory and bandwidth to jobs -->
<property>
<name>dfs.image.transfer.bandwidthPerSec</name>
<value>1048576</value>
</property>
<property>
<name>dfs.block.scanner.volume.bytes.per.second</name>
<value>1048576</value>
</property>
<property>
<name>dfs.datanode.balance.bandwidthPerSec</name>
<value>20m</value>
</property>
5. Configure mapred-site.xml
<!-- vim mapred-site.xml -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>Job execution framework: local, classic, or yarn</description>
<final>true</final>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>/opt/software/hadoop-3.1.3/etc/hadoop:/opt/software/hadoop-3.1.3/share/hadoop/common/lib/*:/opt/software/hadoop-3.1.3/share/hadoop/common/*:/opt/software/hadoop-3.1.3/share/hadoop/hdfs:/opt/software/hadoop-3.1.3/share/hadoop/hdfs/lib/*:/opt/software/hadoop-3.1.3/share/hadoop/hdfs/*:/opt/software/hadoop-3.1.3/share/hadoop/mapreduce/lib/*:/opt/software/hadoop-3.1.3/share/hadoop/mapreduce/*:/opt/software/hadoop-3.1.3/share/hadoop/yarn:/opt/software/hadoop-3.1.3/share/hadoop/yarn/lib/*:/opt/software/hadoop-3.1.3/share/hadoop/yarn/*</value>
</property>
<!-- Job history: configuring it on a single node is enough -->
<property>
<name>mapreduce.jobhistory.address</name>
<value>master01:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master01:19888</value>
</property>
<!-- Container memory ceilings, read and enforced by the NodeManager; a container that exceeds its limit is killed by the NodeManager (often surfacing as "Connection reset by peer") -->
<property>
<name>mapreduce.map.memory.mb</name>
<value>1024</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>2048</value>
</property>
Note: the value for mapreduce.application.classpath can be obtained by running hadoop classpath.
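For example (a sketch; the printed list depends on your install path and should match the value above):
[root@master01 ~]# hadoop classpath
# copy the colon-separated output into the <value> element of mapreduce.application.classpath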

6. Configure yarn-site.xml
Note: the Node Manager Configs block must point at the host it runs on (master01, master02, or worker01 respectively).
Remember to adjust it on each machine after distribution (a sketch of one way to do this is shown in the distribution step below)!
<!-- Fault tolerance -->
<property>
<name>yarn.resourcemanager.connect.retry-interval.ms</name>
<value>10000</value>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- ResourceManager restart resilience -->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
<description>Jobs that are already running are not affected while the RM restarts</description>
</property>
<!-- Application state store: ZooKeeper -->
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
<description>How application state is persisted; HA only supports ZKRMStateStore</description>
</property>
<!-- YARN cluster configuration -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>mycluster</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
<name>yarn.resourcemanager.work-preserving-recovery.enabled</name>
<value>true</value>
</property>
<!-- rm1 configs -->
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>master01</value>
</property>
<property>
<name>yarn.resourcemanager.address.rm1</name>
<value>master01:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.rm1</name>
<value>master01:8030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.https.address.rm1</name>
<value>master01:8090</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>master01:8088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm1</name>
<value>master01:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.rm1</name>
<value>master01:8033</value>
</property>
<!-- rm2 configs -->
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>master02</value>
</property>
<property>
<name>yarn.resourcemanager.address.rm2</name>
<value>master02:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.rm2</name>
<value>master02:8030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.https.address.rm2</name>
<value>master02:8090</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>master02:8088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm2</name>
<value>master02:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.rm2</name>
<value>master02:8033</value>
</property>
<!-- Node Manager Configs: set on every node, pointing at that node's own hostname -->
<property>
<description>Address where the localizer IPC is</description>
<name>yarn.nodemanager.localizer.address</name>
<value>master01:8040</value>
</property>
<property>
<description>Address of the container manager in the NodeManager</description>
<name>yarn.nodemanager.address</name>
<value>master01:8050</value>
</property>
<property>
<description>NodeManager webapp address</description>
<name>yarn.nodemanager.webapp.address</name>
<value>master01:8042</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/tmp/hadoop/yarn/local</value>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>/tmp/hadoop/yarn/log</value>
</property>
<!-- Resource tuning -->
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>2</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>2048</value>
</property>
<!-- Log aggregation -->
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>86400</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.application.classpath</name>
<value>/opt/software/hadoop-3.1.3/etc/hadoop:/opt/software/hadoop-3.1.3/share/hadoop/common/lib/*:/opt/software/hadoop-3.1.3/share/hadoop/common/*:/opt/software/hadoop-3.1.3/share/hadoop/hdfs:/opt/software/hadoop-3.1.3/share/hadoop/hdfs/lib/*:/opt/software/hadoop-3.1.3/share/hadoop/hdfs/*:/opt/software/hadoop-3.1.3/share/hadoop/mapreduce/lib/*:/opt/software/hadoop-3.1.3/share/hadoop/mapreduce/*:/opt/software/hadoop-3.1.3/share/hadoop/yarn:/opt/software/hadoop-3.1.3/share/hadoop/yarn/lib/*:/opt/software/hadoop-3.1.3/share/hadoop/yarn/*</value>
</property>
7. Distribute
Distribute the configured hadoop-3.1.3 directory to the other two machines in the cluster, and remember to adjust yarn-site.xml on each of them.
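A sketch of one way to do the distribution and the per-host fix (scp and sed are assumed; edit the file by hand if you prefer):
[root@master01 software]# scp -r /opt/software/hadoop-3.1.3 root@master02:/opt/software/
[root@master01 software]# scp -r /opt/software/hadoop-3.1.3 root@worker01:/opt/software/
# copy /etc/profile.d/my.sh again as in the ZooKeeper step and source /etc/profile on every host
# then point the NodeManager addresses at the local host, e.g.:
[root@master02 ~]# sed -i 's/master01:8040/master02:8040/; s/master01:8050/master02:8050/; s/master01:8042/master02:8042/' /opt/software/hadoop-3.1.3/etc/hadoop/yarn-site.xml
[root@worker01 ~]# sed -i 's/master01:8040/worker01:8040/; s/master01:8050/worker01:8050/; s/master01:8042/worker01:8042/' /opt/software/hadoop-3.1.3/etc/hadoop/yarn-site.xml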
2.5. Initialize and start the cluster
1. Start the ZooKeeper cluster (Hadoop HA is coordinated through ZooKeeper):
zkServer.sh start    (all three hosts)
2. Check the ZooKeeper state:    # expect 1 leader + 2 followers
zkServer.sh status
3. Start the JournalNode cluster (all three hosts):
hdfs --daemon start journalnode    # x3

jps should now show a JournalNode process on all three machines.
4. Format ZKFC (master01 only):
hdfs zkfc -formatZK
A "Successfully" line in the output indicates it worked.
5. Format the primary NameNode (run on master01 only):
hdfs namenode -format


After formatting you could start the cluster with start-all.sh, but at this point master02 has not yet synchronized with master01, i.e. master02 has no NameNode service; the NameNode on master02 still needs to be synchronized and started.
You can run start-all.sh now and check the services with jps.
6. Format and start the standby NameNode (on master02; only needed once, at first start-up):
hdfs namenode -bootstrapStandby
hdfs --daemon start namenode

7. Start the cluster:
start-all.sh    (i.e. start-dfs.sh plus start-yarn.sh)
8. Check the services: run jps on every node.
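With the configuration above, the processes to expect are roughly (a rough guide, not an exact jps transcript):
# master01: NameNode, DataNode, JournalNode, DFSZKFailoverController, ResourceManager, NodeManager, QuorumPeerMain
# master02: NameNode, DataNode, JournalNode, DFSZKFailoverController, ResourceManager, NodeManager, QuorumPeerMain
# worker01: DataNode, JournalNode, NodeManager, QuorumPeerMain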

9. Check the web UI
The NameNode web UI listens on port 9870.
Enter 192.168.xxx.xxx:9870 in a browser to open it.
To use the hostname instead of the IP, add an IP mapping on your local machine.

Add the hostname-to-IP mappings at the end of the local hosts file (administrator privileges are required to edit it).
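On Windows that file is C:\Windows\System32\drivers\etc\hosts (on Linux/macOS it is /etc/hosts); the entries mirror the cluster's /etc/hosts, e.g.:
192.168.xxx.xxx master01
192.168.xxx.xxx master02
192.168.xxx.xxx worker01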


10. Verify high availability
Stop the NameNode on the primary node (master01) and check whether the NameNode on master02 becomes active (standby -> active):
hdfs --daemon stop namenode
Restart the NameNode on master01; its state should now be standby:
hdfs --daemon start namenode
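The state of each NameNode can also be queried from the command line (a sketch using the nn1/nn2 IDs defined in hdfs-site.xml):
[root@master01 ~]# hdfs haadmin -getServiceState nn1
[root@master02 ~]# hdfs haadmin -getServiceState nn2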
End