1. Infrastructure (four servers, with addresses ending in 111, 112, 113, and 114) ① Set the IP address and hostname
vi /etc/sysconfig/network-scripts/ifcfg-ens33
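For reference, a typical static-IP ifcfg-ens33 for node01 looks roughly like the following; IPADDR comes from the address plan above, while GATEWAY and DNS1 are placeholders to adjust for the actual network:
TYPE=Ethernet
BOOTPROTO=static
NAME=ens33
DEVICE=ens33
ONBOOT=yes
IPADDR=192.168.0.111
NETMASK=255.255.255.0
GATEWAY=192.168.0.1
DNS1=192.168.0.1
Apply with service network restart, and repeat on each node with its own IPADDR.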
# On node01 (on CentOS 7 the equivalent is: hostnamectl set-hostname node01):
vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=node01
# On node02:
vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=node02
# On node03:
vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=node03
# On node04:
vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=node04
Map each IP address to its hostname (on every node):
vi /etc/hosts
192.168.0.111 node01
192.168.0.112 node02
192.168.0.113 node03
192.168.0.114 node04
② Disable the firewall and SELinux
# CentOS 6:
service iptables stop
chkconfig iptables off
# CentOS 7:
systemctl stop firewalld.service
systemctl disable firewalld.service
vi /etc/selinux/config
SELINUX=disabled
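The SELINUX=disabled setting only takes effect after a reboot; to put SELinux into permissive mode for the current session as well:
setenforce 0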
③ Configure time synchronization
yum install ntp -y
vi /etc/ntp.conf
server ntp1.aliyun.com
service ntpd start
chkconfig ntpd on
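To confirm that ntpd is syncing against the configured server, check the peer list (run the same NTP steps on all four nodes so their clocks stay consistent):
ntpq -p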
④ Install the JDK. Download: https://www.oracle.com/java/technologies/javase/javase-jdk8-downloads.html
rpm -i jdk-8u181-linux-x64.rpm
vi /etc/profile
export JAVA_HOME=/usr/java/default
export CLASSPATH=.:${JAVA_HOME}/jre/lib/rt.jar:${JAVA_HOME}/lib/dt.jar:${JAVA_HOME}/lib/tools.jar
export PATH=$PATH:${JAVA_HOME}/bin
source /etc/profile
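The Oracle rpm normally installs under /usr/java and maintains the /usr/java/default symlink used above; a quick check that the JDK and the environment are in place:
java -version
echo $JAVA_HOME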
⑤ Configure passwordless SSH
# ssh once so that ~/.ssh exists, then generate a key pair and authorize it locally
# (newer OpenSSH releases disable DSA keys by default; -t rsa works the same way)
ssh localhost
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
# On node01: copy the public key to each of the other nodes
scp /root/.ssh/id_dsa.pub node02:/root/.ssh/node01.pub
# On node02: append it to authorized_keys
cd ~/.ssh
cat node01.pub >> authorized_keys
# Back on node01:
scp /root/.ssh/id_dsa.pub node03:/root/.ssh/node01.pub
# On node03:
cd ~/.ssh
cat node01.pub >> authorized_keys
# Back on node01:
scp /root/.ssh/id_dsa.pub node04:/root/.ssh/node01.pub
# On node04:
cd ~/.ssh
cat node01.pub >> authorized_keys
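Passwordless login can then be verified from node01; each command should return without asking for a password:
ssh node02 hostname
ssh node03 hostname
ssh node04 hostname
If a password is still requested, check that ~/.ssh is mode 700 and authorized_keys is mode 600 on the target node.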
2. Deployment and configuration ① Install Hadoop. Download: https://downloads.apache.org/hadoop/common/hadoop-3.2.2/
mkdir /opt/bigdata
cd /opt/bigdata
rz -be    # upload hadoop-3.2.2.tar.gz (rz comes from the lrzsz package); any file-transfer tool works
tar -zxvf hadoop-3.2.2.tar.gz
vi /etc/profile
export JAVA_HOME=/usr/java/default
# must come before the PATH export below
export HADOOP_HOME=/opt/bigdata/hadoop-3.2.2
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source /etc/profile
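A quick check that HADOOP_HOME and PATH are picked up in the current shell:
hadoop version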
② Configure the Hadoop roles
cd $HADOOP_HOME/etc/hadoop
vi hadoop-env.sh
export JAVA_HOME=/usr/java/default
vi core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://node01:9000</value>
  </property>
</configuration>
vi hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/var/bigdata/hadoop/full/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/var/bigdata/hadoop/full/dfs/data</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>node02:50090</value>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>/var/bigdata/hadoop/full/dfs/secondary</value>
  </property>
</configuration>
vi workers    # Hadoop 3.x renamed the slaves file to workers
node02
node03
node04
③ Distribute the Hadoop installation to the other nodes
cd /opt
scp -r ./bigdata/ node02:`pwd`
scp -r ./bigdata/ node03:`pwd`
scp -r ./bigdata/ node04:`pwd`
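If you also want to run hadoop/hdfs commands directly on the other nodes, the /etc/profile changes (and the JDK from section 1, if it was not installed on every server) are needed there as well; one way, assuming identical paths on all nodes:
scp /etc/profile node02:/etc/profile
scp /etc/profile node03:/etc/profile
scp /etc/profile node04:/etc/profile
followed by source /etc/profile on each node.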
3. Initialization and startup ① Formatting creates the NameNode storage directory, initializes an empty fsImage, and generates a cluster ID (clusterId). It does not create the DataNodes; their directories are created automatically when they start. Format only once: formatting again generates a new clusterId.
hdfs namenode -format
② Start HDFS. This initializes the DataNode and Secondary NameNode roles and creates their data directories.
start-dfs.sh
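Once start-dfs.sh completes, jps (shipped with the JDK) lists the Java daemons on each node; with the configuration above the expected layout is roughly NameNode on node01, DataNode on node02-node04, and SecondaryNameNode on node02:
jps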
③ Configure the hosts mapping on Windows, in the hosts file under:
C:\Windows\System32\drivers\etc
192.168.0.111 node01
192.168.0.112 node02
192.168.0.113 node03
192.168.0.114 node04
④ Open the NameNode web UI
http://node01:9870/
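A small smoke test from node01 (the paths here are arbitrary examples):
hdfs dfs -mkdir -p /user/root
hdfs dfs -put /etc/profile /user/root
hdfs dfs -ls /user/root
The uploaded file should also appear under Utilities → Browse the file system in the node01:9870 UI.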
Problem: the following errors appear when starting HDFS
Starting namenodes on [node01]
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
Starting datanodes
ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
Starting secondary namenodes [node01]
ERROR: Attempting to operate on hdfs secondarynamenode as root
ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.
Cause: the HDFS user variables are not defined. Fix: ① In the hadoop/sbin directory, add the following at the top of start-dfs.sh and stop-dfs.sh:
#!/usr/bin/env bash
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
② Add the following at the top of start-yarn.sh and stop-yarn.sh:
#!/usr/bin/env bash
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root