Cluster size: three machines
Hadoop 3.1.2
Spark uses the latest version: spark-3.1.2-bin-hadoop3.2
|           | henghe-051      | henghe-052      | henghe-053      |
| --------- | --------------- | --------------- | --------------- |
| hadoop    | namenode        | namenode        |                 |
|           | datanode        | datanode        | datanode        |
|           | resourceManager | resourceManager | resourceManager |
|           | nodemanager     | nodemanager     | nodemanager     |
| spark     | master          | master          |                 |
|           | worker          | worker          | worker          |
| zookeeper |                 |                 | zookeeper       |
Rename spark-env.sh.template to spark-env.sh, and spark-defaults.conf.template to spark-defaults.conf:
mv spark-defaults.conf.template spark-defaults.conf
mv spark-env.sh.template spark-env.sh
vim spark-env.sh
export JAVA_HOME=/opt/jdk1.8
export SPARK_MASTER_WEBUI_PORT=8090
export SPARK_WORKER_WEBUI_PORT=8091
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER
-Dspark.deploy.zookeeper.url=henghe-053:2181
-Dspark.deploy.zookeeper.dir=/spark"
export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=18080 -Dspark.history.retainedApplications=30 -Dspark.history.fs.logDirectory=hdfs://henghe-052:8020/spark-log"
export HADOOP_HOME=/opt/moudle/hadoop-3.1.3
export HADOOP_CONF_DIR=/opt/moudle/hadoop-3.1.3/etc/hadoop
vim spark-defaults.conf
Add the following:
spark.eventLog.enabled true
spark.eventLog.dir hdfs://henghe-052:8020/spark-log
spark.yarn.historyServer.address henghe-052:18080
spark.history.ui.port 18080
Rename sbin/start-all.sh, since it has the same name as Hadoop's start-all.sh and startup may otherwise pick the wrong script:
mv start-all.sh start-spark-all.sh
Start the cluster on henghe-052:
sbin/start-spark-all.sh
Start the master separately on henghe-051.
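The standalone scripts include start-master.sh for this; a minimal sketch, assuming the same Spark installation layout on henghe-051:

```shell
# On henghe-051: launch a second master. With
# spark.deploy.recoveryMode=ZOOKEEPER set in spark-env.sh, it registers
# with ZooKeeper as a standby and takes over if the active master fails.
sbin/start-master.sh
```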
Check the Spark web UI (henghe-052:8090, per SPARK_MASTER_WEBUI_PORT above).
Start the history server; this requires that the event-log directory already exists in HDFS.
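If the directory is missing, create it first; a sketch assuming the path configured in spark.eventLog.dir above:

```shell
# Create the event-log directory on HDFS so the history server and
# spark.eventLog.dir have somewhere to read/write.
hdfs dfs -mkdir -p /spark-log
```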
sbin/start-history-server.sh
View the UI at henghe-052:18080.
Run an example to verify the cluster:
bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://henghe-052:7077 ./examples/jars/spark-examples_2.12-3.1.2.jar 10
The job runs successfully.
Command to start a worker separately:
sbin/start-worker.sh spark://henghe-052:7077
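To confirm which daemons ended up running on each node, jps is a quick check; the Master/Worker process names below are what the standalone scripts normally register:

```shell
# List running JVMs on this node; expect Master on henghe-051/052
# and Worker on every node after the steps above.
jps
```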