All component installation packages used in this article are located in /opt/software.
The installation directory is /opt/module.
1. Component Versions
Name | Version |
---|---|
CentOS | 7.9 |
Hadoop | 2.7.7 |
Spark | 2.1.1 |
Flink | 1.10.2 |
Flume | 1.7.0 |
Hive | 2.3.4 |
Zookeeper | 3.4.10 |
Sqoop | 1.4.7 |
2. JDK
1. Extract
[root@master software]
[root@master software]
[root@master module]
2. Configure environment variables in /etc/profile
export JAVA_HOME=/opt/module/jdk
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
Activate the environment variables
[root@master module]
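The actual commands were stripped from the prompts above; a minimal sketch of what they likely were, assuming the JDK tarball sits in /opt/software (the exact archive name is not given in the original, so the wildcards below are placeholders):

```bash
# Hypothetical reconstruction of the extract/activate step -- adjust the archive name to your download.
tar -zxvf /opt/software/jdk-*-linux-x64.tar.gz -C /opt/module/
mv /opt/module/jdk1.8.0_* /opt/module/jdk   # rename so that JAVA_HOME=/opt/module/jdk matches
source /etc/profile                         # activate the variables added above
```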
3. Distribute to the other two nodes
[root@master module]
[root@master module]
[root@master module]
[root@master module]
[root@slave1 module]
[root@slave2 module]
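The distribution commands are likewise missing; a plausible sketch using scp with the slave1/slave2 hostnames used throughout this article:

```bash
# Copy the JDK directory and /etc/profile to the other two nodes, then re-source on each of them.
scp -r /opt/module/jdk slave1:/opt/module/
scp -r /opt/module/jdk slave2:/opt/module/
scp /etc/profile slave1:/etc/profile
scp /etc/profile slave2:/etc/profile
# on slave1 and slave2:
source /etc/profile
```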
3. Hadoop Fully Distributed Deployment
1. Extract
[root@master software]
[root@master software]
[root@master module]
2. Configure environment variables in /etc/profile
export HADOOP_HOME=/opt/module/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Activate the environment variables
[root@master module]
3. Configuration files
[root@master module]
[root@master hadoop]
hadoop-env.sh
slaves
hdfs-site.xml
core-site.xml
mapred-site.xml
yarn-site.xml
hadoop-env.sh
Change this to your own JDK installation path
export JAVA_HOME=/opt/module/jdk
slaves
master
slave1
slave2
hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>slave1:50090</value>
</property>
core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/module/hadoop/tmp</value>
</property>
[root@master hadoop]
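In Hadoop 2.7.7 only mapred-site.xml.template ships by default, so the prompt above most likely created the real file from the template first; a sketch:

```bash
# create mapred-site.xml from the bundled template before editing it
cp /opt/module/hadoop/etc/hadoop/mapred-site.xml.template /opt/module/hadoop/etc/hadoop/mapred-site.xml
```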
mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
4. Distribute to the other two nodes
[root@master module]
[root@master module]
[root@master module]
[root@master module]
[root@slave1 module]
[root@slave2 module]
5. Start the Hadoop cluster
Format the NameNode
[root@master module]
Seeing the words "successfully formatted" means the format succeeded.
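The format command itself was lost from the prompt above; with the standard Hadoop 2.x tooling it would be:

```bash
hdfs namenode -format   # run once, on master only; look for "successfully formatted" in the output
```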
Start HDFS and YARN
[root@master module]
[root@master module]
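The start commands were also stripped; a sketch using the stock sbin scripts already on PATH:

```bash
start-dfs.sh    # NameNode + DataNodes, plus the SecondaryNameNode on slave1 per hdfs-site.xml
start-yarn.sh   # ResourceManager on master + NodeManagers on all nodes
```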
Run jps on each of the three nodes to check the processes
[root@master hadoop]
5618 NameNode
5715 DataNode
6389 Jps
6221 NodeManager
5998 ResourceManager
[root@slave1 hadoop]
2727 Jps
2457 DataNode
2540 SecondaryNameNode
2623 NodeManager
[root@slave2 hadoop]
2241 DataNode
2449 Jps
2345 NodeManager
4. Hive
1. Extract
[root@master software]
[root@master software]
[root@master module]
2. Configure environment variables in /etc/profile
export HIVE_HOME=/opt/module/hive
export PATH=$PATH:$HIVE_HOME/bin
Activate the environment variables
[root@master module]
3. Configuration files
[root@master hive]
[root@master conf]
hive-site.xml
Add the following content:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
<property>
<name>hive.cli.print.header</name>
<value>true</value>
</property>
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
</property>
</configuration>
[root@master conf]
hive-env.sh
Modify the following content:
# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/opt/module/hadoop
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/opt/module/hive/conf
hive-log4j2.properties
[root@master conf]
# list of properties
property.hive.log.level = INFO
property.hive.root.logger = DRFA
property.hive.log.dir = /opt/module/hive/logs
property.hive.log.file = hive.log
property.hive.perflogger.log.level = INFO
4. Copy the MySQL driver into hive/lib
[root@master software]
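A sketch of the copy; the connector jar name is an assumption, so use whichever mysql-connector-java version you actually downloaded:

```bash
cp /opt/software/mysql-connector-java-*.jar /opt/module/hive/lib/
```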
5. Initialize the metastore database
[root@master hive]
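Hive 2.x will not start until the metastore schema exists; the command behind this prompt was presumably schematool:

```bash
schematool -dbType mysql -initSchema   # uses the JDBC settings from hive-site.xml
```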
6. Start Hive
[root@master hive]
hive (default)> show databases;
OK
database_name
default
Time taken: 3.57 seconds, Fetched: 1 row(s)
hive (default)>
5. ZooKeeper
1. Extract
[root@master software]
[root@master software]
[root@master module]
2. Configure environment variables in /etc/profile
export ZK_HOME=/opt/module/zookeeper
export PATH=$PATH:$ZK_HOME/bin
Activate the environment variables
[root@master module]
3. Configuration files
[root@master conf]
zoo.cfg
dataDir=/opt/module/zookeeper/zkData
# Append these three lines at the end
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888
[root@master zookeeper]
[root@master zkData]
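The two prompts above most likely created the data directory and the myid file referenced by zoo.cfg; a sketch:

```bash
mkdir /opt/module/zookeeper/zkData
echo 1 > /opt/module/zookeeper/zkData/myid   # id 1 corresponds to server.1=master in zoo.cfg
```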
4. Distribute to the other two nodes
[root@master module]
[root@master module]
[root@master module]
[root@master module]
[root@slave1 module]
[root@slave2 module]
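After copying, each node needs its own id to match zoo.cfg, presumably something like:

```bash
# on slave1:
echo 2 > /opt/module/zookeeper/zkData/myid
# on slave2:
echo 3 > /opt/module/zookeeper/zkData/myid
```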
5. Start ZooKeeper
[root@master module]
[root@slave1 module]
[root@slave2 module]
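The start and status commands were stripped; with ZK_HOME/bin on PATH they would be:

```bash
zkServer.sh start    # run on master, slave1 and slave2
zkServer.sh status   # produces the leader/follower output shown below
```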
[root@master module]
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper/bin/../conf/zoo.cfg
Mode: follower
[root@slave1 module]
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper/bin/../conf/zoo.cfg
Mode: leader
[root@slave2 module]
ZooKeeper JMX enabled by default
Using config: /opt/module/zookeeper/bin/../conf/zoo.cfg
Mode: follower
6. Kafka
1. Extract
[root@master software]
[root@master software]
[root@master module]
2. Configure environment variables in /etc/profile
export KAFKA_HOME=/opt/module/kafka
export PATH=$PATH:$KAFKA_HOME/bin
Activate the environment variables
[root@master module]
3. Configuration files
server.properties
delete.topic.enable=true
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=0   # this value must be unique on each node
zookeeper.connect=master:2181,slave1:2181,slave2:2181
# A comma separated list of directories under which to store log files
log.dirs=/opt/module/kafka/logs
4. Distribute to the other two nodes
[root@master module]
[root@master module]
[root@master module]
[root@master module]
[root@slave1 module]
[root@slave2 module]
5. Start Kafka
[root@master kafka]
[root@slave1 kafka]
[root@slave2 kafka]
[root@master ~]
Created topic "xiaojia".
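The start and topic-creation commands are missing; a sketch for a ZooKeeper-based Kafka install (the replication factor and partition count below are assumptions; only the topic name comes from the output above):

```bash
# on each of the three nodes:
kafka-server-start.sh -daemon /opt/module/kafka/config/server.properties
# create a test topic:
kafka-topics.sh --create --zookeeper master:2181,slave1:2181,slave2:2181 \
  --replication-factor 3 --partitions 3 --topic xiaojia
```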
7. Flume
1. Extract
[root@master software]
[root@master software]
[root@master module]
2. Configure environment variables in /etc/profile
export FLUME_HOME=/opt/module/flume
export PATH=$PATH:$FLUME_HOME/bin
Activate the environment variables
[root@master module]
3. Configuration files
flume-env.sh
# Enviroment variables can be set here.
export JAVA_HOME=/opt/module/jdk
4. Write a Flume config and try it out
[root@master conf]
a1.sources=r1
a1.channels=c1
a1.sinks=k1
a1.sources.r1.type=netcat
a1.sources.r1.bind=localhost
a1.sources.r1.port=44444
a1.channels.c1.type=memory
a1.channels.c1.capacity=1000
a1.channels.c1.transactionCapacity=100
a1.sinks.k1.type=logger
a1.sources.r1.channels=c1
a1.sinks.k1.channel=c1
[root@master flume]
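The launch command was stripped; a typical invocation, assuming the configuration above was saved as conf/netcat-logger.conf (the file name is an assumption):

```bash
flume-ng agent --name a1 --conf /opt/module/flume/conf \
  --conf-file /opt/module/flume/conf/netcat-logger.conf \
  -Dflume.root.logger=INFO,console
# in a second terminal, send a test line:
telnet localhost 44444
```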
[root@master ~]
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
haha
OK
2021-10-22 18:03:05,450 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.source.NetcatSource.start(NetcatSource.java:169)] Created serverSocket:sun.nio.ch.ServerSocketChannelImpl[/127.0.0.1:44444]
2021-10-22 18:03:22,300 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:95)] Event: { headers:{} body: 68 61 68 61 0D haha. }
8. Sqoop
1. Extract
[root@master software]
[root@master software]
[root@master module]
2. Configure environment variables in /etc/profile
export SQOOP_HOME=/opt/module/sqoop
export PATH=$PATH:$SQOOP_HOME/bin
Activate the environment variables
[root@master module]
3. Configuration files
[root@master conf]
Copy the MySQL driver jar into lib
[root@master module]
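The two prompts above presumably created sqoop-env.sh from its template and copied the JDBC driver; a sketch (the jar name is an assumption):

```bash
cp /opt/module/sqoop/conf/sqoop-env-template.sh /opt/module/sqoop/conf/sqoop-env.sh
cp /opt/software/mysql-connector-java-*.jar /opt/module/sqoop/lib/
```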
sqoop-env.sh
#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/opt/module/hadoop
#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/opt/module/hadoop
#set the path to where bin/hbase is available
#export HBASE_HOME=
#Set the path to where bin/hive is available
export HIVE_HOME=/opt/module/hive
#Set the path for where zookeper config dir is
export ZOOCFGDIR=/opt/module/zookeeper
4. Test it
[root@master module]
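A common smoke test is listing the MySQL databases, reusing the same credentials as hive-site.xml; a hedged sketch:

```bash
sqoop list-databases \
  --connect jdbc:mysql://localhost:3306/ \
  --username root --password 123456
```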
9. Spark
1. Extract
[root@master software]
[root@master software]
[root@master module]
2. Configure environment variables in /etc/profile
export SPARK_HOME=/opt/module/spark
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
Activate the environment variables
[root@master module]
3. Configuration files
[root@master conf]
[root@master conf]
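Spark 2.1.1 ships these two files only as templates, so the prompts above most likely copied them first; a sketch:

```bash
cp /opt/module/spark/conf/slaves.template       /opt/module/spark/conf/slaves
cp /opt/module/spark/conf/spark-env.sh.template /opt/module/spark/conf/spark-env.sh
```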
slaves
slave1
slave2
spark-env.sh
export HADOOP_CONF_DIR=/opt/module/hadoop/etc/hadoop
export HADOOP_HOME=/opt/module/hadoop
export JAVA_HOME=/opt/module/jdk
export SPARK_MASTER_HOST=master
4. Distribute to the other two nodes
[root@master module]
[root@master module]
[root@master module]
[root@master module]
[root@slave1 module]
[root@slave2 module]
5. Start Spark
[root@master module]
[root@master module]
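The start command is missing; using the full path avoids the name clash with Hadoop's start-all.sh, which is also on PATH here:

```bash
/opt/module/spark/sbin/start-all.sh   # Master on master, Workers on slave1 and slave2
```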
[root@master module]
5618 NameNode
5715 DataNode
8598 Master
8695 Jps
7356 QuorumPeerMain
6221 NodeManager
7437 Kafka
5998 ResourceManager
[root@slave1 kafka]
3488 Worker
3539 Jps
2904 QuorumPeerMain
2457 DataNode
2540 SecondaryNameNode
2623 NodeManager
[root@slave2 kafka]
2241 DataNode
3239 Worker
2345 NodeManager
3290 Jps
2685 QuorumPeerMain
10. Flink
1. Extract
[root@master software]
[root@master software]
[root@master module]
2. Configure environment variables in /etc/profile
export FLINK_HOME=/opt/module/flink
export PATH=$PATH:$FLINK_HOME/bin
Activate the environment variables
[root@master module]
3. Configuration files
[root@master conf]
[root@master conf]
slaves
slave1
slave2
flink-conf.yaml
jobmanager.rpc.address: master
parallelism.default: 4
4. Distribute to the other two nodes
[root@master module]
[root@master module]
[root@master module]
[root@master module]
[root@slave1 module]
[root@slave2 module]
5. Start Flink
[root@master flink]
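The start command was stripped; for a Flink standalone cluster it is presumably:

```bash
/opt/module/flink/bin/start-cluster.sh   # StandaloneSessionClusterEntrypoint on master, TaskManagers on the slaves
```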
[root@master module]
5618 NameNode
5715 DataNode
9908 StandaloneSessionClusterEntrypoint
8598 Master
9992 Jps
7356 QuorumPeerMain
6221 NodeManager
7437 Kafka
5998 ResourceManager
[root@slave1 kafka]
3488 Worker
3922 TaskManagerRunner
2904 QuorumPeerMain
2457 DataNode
3993 Jps
2540 SecondaryNameNode
2623 NodeManager
[root@slave2 kafka]
2241 DataNode
3766 Jps
3239 Worker
3703 TaskManagerRunner
2345 NodeManager
2685 QuorumPeerMain
Powered by 小贾