Preparation
Install Hadoop
See the Hadoop cluster installation procedure.
Install Hive
Refer to http://dblab.xmu.edu.cn/blog/2440-2/
Install the ZooKeeper Cluster
- Download the installation package. Official site: https://zookeeper.apache.org/releases.html
- Extract the package:
mkdir /opt/zookeeper
cd /opt/zookeeper
tar -zxvf apache-zookeeper-3.8.0-bin.tar.gz
mv apache-zookeeper-3.8.0-bin zookeeper
- Enter the zookeeper directory, create a zkData directory, enter it, and create the myid file:
cd zookeeper
mkdir zkData
cd zkData
vi myid
Write the server ID 0 into the myid file.
- Enter the conf folder under the zookeeper directory and rename zoo_sample.cfg to zoo.cfg:
cd conf
mv zoo_sample.cfg zoo.cfg
- Open zoo.cfg and edit the configuration (adjust the paths and nodes to match your own environment). For the local node, use the IP 0.0.0.0 in place of its hostname.
* Configuration for version 3.5 and later (the client port is appended with a semicolon): server.1=172.36.97.152:2888:3888;2181
* Configuration for version 3.4 and earlier: server.1=172.36.97.152:2888:3888
The actual configuration used here (version 3.8.0):
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/zookeeper/zookeeper/zkData
clientPort=2181
server.0=0.0.0.0:2888:3888;2181
server.1=172.36.97.152:2888:3888;2181
server.2=172.36.97.153:2888:3888;2181
- Copy the zookeeper directory under /opt to the other two nodes:
scp -r /opt/zookeeper/zookeeper hadoop2:/opt/zookeeper/zookeeper
scp -r /opt/zookeeper/zookeeper hadoop3:/opt/zookeeper/zookeeper
- Note: a few changes are needed on each target node: update the ID in the myid file, and change the local node's entry in zoo.cfg to use 0.0.0.0 (a scripted sketch follows below).
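The per-node changes can be scripted. A minimal sketch, assuming the layout above: the first node (172.36.97.151) is server.0, hadoop2 (172.36.97.152) is server.1, hadoop3 (172.36.97.153) is server.2, with passwordless SSH as root; also make sure /opt/zookeeper already exists on the targets before the scp step above. Adapt the IDs, hosts, and paths to your own cluster.
# hadoop2: set its ID and swap which server entry uses 0.0.0.0
ssh root@hadoop2 'echo 1 > /opt/zookeeper/zookeeper/zkData/myid'
ssh root@hadoop2 'sed -i "s#^server.0=.*#server.0=172.36.97.151:2888:3888;2181#" /opt/zookeeper/zookeeper/conf/zoo.cfg'
ssh root@hadoop2 'sed -i "s#^server.1=.*#server.1=0.0.0.0:2888:3888;2181#" /opt/zookeeper/zookeeper/conf/zoo.cfg'
# hadoop3: same idea with ID 2
ssh root@hadoop3 'echo 2 > /opt/zookeeper/zookeeper/zkData/myid'
ssh root@hadoop3 'sed -i "s#^server.0=.*#server.0=172.36.97.151:2888:3888;2181#" /opt/zookeeper/zookeeper/conf/zoo.cfg'
ssh root@hadoop3 'sed -i "s#^server.2=.*#server.2=0.0.0.0:2888:3888;2181#" /opt/zookeeper/zookeeper/conf/zoo.cfg'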
- Start ZooKeeper on each of the three nodes. In the zookeeper directory, run the start command bin/zkServer.sh start:
[root@dc6-80-273 zookeeper]# bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@dc6-80-275 zookeeper]# bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@dc6-80-273 zookeeper]# bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
- After starting, check the status. Run jps; you should see the QuorumPeerMain process (PID 23051 here):
[root@dc6-80-273 zookeeper]# jps
23120 Jps
21602 NodeManager
21462 DataNode
23051 QuorumPeerMain
Run bin/zkServer.sh status to see each node's role. Follower node:
[root@dc6-80-273 zookeeper]# bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
Leader node:
[root@dc6-80-275 zookeeper]# bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader
Follower node:
[root@dc6-80-273 zookeeper]# bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper/zookeeper/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
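As an extra sanity check (not part of the original steps), you can connect with the bundled CLI and list the root znode; a minimal sketch assuming the cluster addresses used above:
bin/zkCli.sh -server 172.36.97.151:2181,172.36.97.152:2181,172.36.97.153:2181
# inside the zkCli shell:
#   ls /
#   quit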
Install the Kafka Cluster
- Download the installation package. Official site: https://kafka.apache.org/downloads
- Extract the package:
mkdir /opt/kafka
tar -zxvf kafka_2.12-3.1.0.tgz -C /opt/kafka
cd /opt/kafka
mv kafka_2.12-3.1.0 kafka
- Create a logs directory under /opt/kafka/kafka:
mkdir logs
- Edit the configuration file under /opt/kafka/kafka/config:
vim server.properties
# The broker's globally unique ID; must not be duplicated across brokers
broker.id=0
# Directory where Kafka stores its log (message) data
log.dirs=/opt/kafka/kafka/logs
# ZooKeeper cluster connection addresses
zookeeper.connect=172.36.97.151:2181,172.36.97.152:2181,172.36.97.153:2181
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://172.36.97.151:9092
- Distribute the installation directory to the other two nodes (then make the per-node adjustments sketched below):
scp -r kafka hadoop2:/opt/kafka/kafka
scp -r kafka hadoop3:/opt/kafka/kafka
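After distribution, broker.id must be unique on every node (as the config comment above requires) and advertised.listeners should point to each node's own IP. A minimal sketch of the per-node adjustment, assuming hadoop2 is 172.36.97.152 with broker.id=1 and hadoop3 is 172.36.97.153 with broker.id=2:
ssh root@hadoop2 'sed -i "s/^broker.id=.*/broker.id=1/" /opt/kafka/kafka/config/server.properties'
ssh root@hadoop2 'sed -i "s#^advertised.listeners=.*#advertised.listeners=PLAINTEXT://172.36.97.152:9092#" /opt/kafka/kafka/config/server.properties'
ssh root@hadoop3 'sed -i "s/^broker.id=.*/broker.id=2/" /opt/kafka/kafka/config/server.properties'
ssh root@hadoop3 'sed -i "s#^advertised.listeners=.*#advertised.listeners=PLAINTEXT://172.36.97.153:9092#" /opt/kafka/kafka/config/server.properties'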
- Start the Kafka cluster. Start the ZooKeeper cluster first, then start Kafka on each of the three nodes:
bin/kafka-server-start.sh -daemon config/server.properties
bin/kafka-server-start.sh -daemon config/server.properties
bin/kafka-server-start.sh -daemon config/server.properties
- Check for the Kafka process:
[root@dc6-80-283 kafka]# jps
15232 NameNode
15584 SecondaryNameNode
25056 Kafka
25205 Jps
21483 QuorumPeerMain
15934 ResourceManager
- Test that the cluster is usable (create a topic):
bin/kafka-topics.sh --bootstrap-server 172.36.97.151:9092 --create --partitions 3 --replication-factor 3 --topic TestTopic
[root@dc6-80-283 kafka]# bin/kafka-topics.sh --bootstrap-server 172.36.97.151:9092 --create --partitions 3 --replication-factor 3 --topic TestTopic
Created topic TestTopic.
- View existing topics:
# List all topics
kafka-topics.sh --list --bootstrap-server 172.36.97.151:9092
# Describe a specific topic
kafka-topics.sh --describe --bootstrap-server 172.36.97.151:9092 --topic TestTopic
Output:
[root@dc6-80-283 kafka]# bin/kafka-topics.sh --list --bootstrap-server 172.36.97.151:9092
TestTopic
[root@dc6-80-283 kafka]# bin/kafka-topics.sh --describe --bootstrap-server 172.36.97.151:9092 --topic TestTopic
Topic: TestTopic TopicId: v_fLeI0yRGWftMAK-WQDXg PartitionCount: 3 ReplicationFactor: 3 Configs: segment.bytes=1073741824
Topic: TestTopic Partition: 0 Leader: 0 Replicas: 0,1,2 Isr: 0,1,2
Topic: TestTopic Partition: 1 Leader: 2 Replicas: 2,0,1 Isr: 2,0,1
Topic: TestTopic Partition: 2 Leader: 1 Replicas: 1,2,0 Isr: 1,2,0
Explanation of the output:
Topic: TestTopic PartitionCount: 3 ReplicationFactor: 3 means TestTopic has 3 partitions, each with 3 replicas. Topic is the topic name; Leader is the broker.id of the partition's leader replica; Replicas lists the broker.ids that hold replicas of the partition (leader and follower replicas, whether or not they are alive); Isr lists the broker.ids of the replicas that are alive and in sync with the leader. A quick produce/consume round trip is sketched below.
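For a quick end-to-end check (not in the original text), the bundled console tools can produce and consume a few messages on the TestTopic created above:
# Terminal 1: consume from the beginning of the topic
bin/kafka-console-consumer.sh --bootstrap-server 172.36.97.151:9092 --topic TestTopic --from-beginning
# Terminal 2: produce; lines typed here should appear in terminal 1
bin/kafka-console-producer.sh --bootstrap-server 172.36.97.151:9092 --topic TestTopic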
- Appendix: to shut down the Kafka cluster, run the stop script on each node (it takes no arguments):
bin/kafka-server-stop.sh
bin/kafka-server-stop.sh
bin/kafka-server-stop.sh
Install HBase
HBase installation steps
- Download: hbase-2.3.3-bin.tar.gz
- Extract:
# Target directory
mkdir /opt/hbase
# Extract
tar -zxvf hbase-2.3.3-bin.tar.gz -C /opt/hbase
# Rename the folder
cd /opt/hbase
mv hbase-2.3.3 hbase
cd hbase
- Configure the environment variables:
vim /etc/profile
Add the following to /etc/profile:
# HBase
HBASE_HOME=/opt/hbase/hbase
export PATH=${HBASE_HOME}/bin:${PATH}
Save, exit, and make it take effect:
source /etc/profile
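To confirm the new PATH entry is picked up, a quick optional check:
# Should print the HBase 2.3.3 version banner if the environment variable took effect
hbase version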
- Modify the hbase-env.sh configuration file. First check JAVA_HOME:
[root@dc6-80-283 atlas]# echo $JAVA_HOME
/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.332.b09-1.el7_9.aarch6
[root@dc6-80-283 atlas]# vim conf/hbase-env.sh
Add the following to hbase-env.sh, using the JAVA_HOME path echoed above:
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.332.b09-1.el7_9.aarch6
Also uncomment the line export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true".
- Modify the hbase-site.xml configuration file:
vim conf/hbase-site.xml
The contents are as follows:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop1:9000/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>/opt/hbase/data</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>hadoop1,hadoop2,hadoop3</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.unsafe.stream.capability.enforce</name>
<value>false</value>
</property>
</configuration>
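One thing worth double-checking here (the pitfall section below does the same): hbase.rootdir must point at the same NameNode address and port as fs.defaultFS in Hadoop's core-site.xml. A minimal comparison, assuming a $HADOOP_HOME-style layout (adjust the path to your Hadoop install):
grep -A1 fs.defaultFS $HADOOP_HOME/etc/hadoop/core-site.xml
grep -A1 hbase.rootdir conf/hbase-site.xml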
- Configure the slave (RegionServer) nodes:
vim conf/regionservers
Add the following:
hadoop1
hadoop2
hadoop3
- Distribute HBase to the other nodes:
scp -r hbase root@hadoop2:/opt/hbase/
scp -r hbase root@hadoop3:/opt/hbase/
- Start HBase:
bin/start-hbase.sh
Check whether both HBase services (HMaster and HRegionServer) started:
[root@dc6-80-283 hbase]# jps
2722 SecondaryNameNode
3062 ResourceManager
5544 Jps
5275 HRegionServer
4941 HMaster
2399 NameNode
If either service failed to start, check the corresponding log (see also the quick shell check below):
ll logs
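As an additional check (not in the original text), the HBase shell can report cluster health once HMaster and the RegionServers are up; a minimal sketch:
# Should report roughly: 1 active master, 0 backup masters, 3 servers, 0 dead
echo "status" | bin/hbase shell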
HBase pitfalls
- Startup error: check the log output under hbase/logs/:
2022-06-28 17:18:14,001 WARN [RS-EventLoopGroup-1-2] concurrent.DefaultPromise: An exception was thrown by org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$4.operationComplete()
java.lang.IllegalArgumentException: object is not an instance of declaring class
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hbase.io.asyncfs.ProtobufDecoder.<init>(ProtobufDecoder.java:69)
After some searching: some attribute this to a compatibility problem with Hadoop 3.3.x being too new, others to HDFS being in safe mode. In practice, editing hbase/conf/hbase-env.sh to uncomment export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true" and retrying made the startup succeed. If the cause is safe mode:
hdfs dfsadmin -safemode leave
Reference: https://blog.csdn.net/u011946741/article/details/122477894
- Error after restarting HBase: the HMaster service goes down at startup and the log throws the exception below. The HDFS paths in Hadoop's core-site.xml and HBase's hbase-site.xml were checked and found to be correct.
2022-06-28 18:55:38,367 WARN [master/hadoop1:16000:becomeActiveMaster] regionserver.HRegion: Failed initialize of region= master:store,,1.1595e783b53d99cd5eef43b6debb2682., starting to roll back memstore
java.io.EOFException: Cannot seek after EOF
at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1648)
at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:66)
at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initInternal(ProtobufLogReader.java:211)
at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initReader(ProtobufLogReader.java:173)
at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:323)
at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:305)
at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:293)
at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:429)
at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4863)
at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4769)
at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1013)
at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:955)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7500)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7458)
at org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:269)
at org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:309)
at org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:104)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:948)
at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2239)
at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:621)
at java.lang.Thread.run(Thread.java:750)
2022-06-28 18:55:38,383 INFO [master/hadoop1:16000:becomeActiveMaster] regionserver.HRegion: Drop memstore for Store proc in region master:store,,1.1595e783b53d99cd5eef43b6debb2682. , dropped memstoresize: [dataSize=0, getHeapSize=256, getOffHeapSize=0, getCellsCount=0 }
2022-06-28 18:55:38,384 INFO [master/hadoop1:16000:becomeActiveMaster] regionserver.HRegion: Closing region master:store,,1.1595e783b53d99cd5eef43b6debb2682.
2022-06-28 18:55:38,385 INFO [master/hadoop1:16000:becomeActiveMaster] regionserver.HRegion: Closed master:store,,1.1595e783b53d99cd5eef43b6debb2682.
2022-06-28 18:55:38,388 ERROR [master/hadoop1:16000:becomeActiveMaster] master.HMaster: Failed to become active master
java.io.EOFException: Cannot seek after EOF
at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1648)
at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:66)
at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initInternal(ProtobufLogReader.java:211)
at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initReader(ProtobufLogReader.java:173)
at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:323)
at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:305)
at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:293)
at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:429)
at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4863)
at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4769)
at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1013)
at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:955)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7500)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7458)
at org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:269)
at org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:309)
at org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:104)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:948)
at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2239)
at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:621)
at java.lang.Thread.run(Thread.java:750)
2022-06-28 18:55:38,388 ERROR [master/hadoop1:16000:becomeActiveMaster] master.HMaster: ***** ABORTING master hadoop1,16000,1656413730951: Unhandled exception. Starting shutdown. *****
java.io.EOFException: Cannot seek after EOF
at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1648)
at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:66)
at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initInternal(ProtobufLogReader.java:211)
at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initReader(ProtobufLogReader.java:173)
at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:323)
at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:305)
at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:293)
at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:429)
at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4863)
at org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4769)
at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:1013)
at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:955)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7500)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegionFromTableDir(HRegion.java:7458)
at org.apache.hadoop.hbase.master.region.MasterRegion.open(MasterRegion.java:269)
at org.apache.hadoop.hbase.master.region.MasterRegion.create(MasterRegion.java:309)
at org.apache.hadoop.hbase.master.region.MasterRegionFactory.create(MasterRegionFactory.java:104)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:948)
at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2239)
at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:621)
at java.lang.Thread.run(Thread.java:750)
2022-06-28 18:55:38,388 INFO [master/hadoop1:16000:becomeActiveMaster] regionserver.HRegionServer: ***** STOPPING region server 'hadoop1,16000,1656413730951' *****
2022-06-28 18:55:38,388 INFO [master/hadoop1:16000:becomeActiveMaster] regionserver.HRegionServer: STOPPED: Stopped by master/hadoop1:16000:becomeActiveMaster
2022-06-28 18:55:38,781 INFO [hadoop1:16000.splitLogManager..Chore.1] hbase.ScheduledChore: Chore: SplitLogManager Timeout Monitor was stopped
2022-06-28 18:55:39,645 INFO [master/hadoop1:16000] ipc.NettyRpcServer: Stopping server on /10.208.156.159:16000
2022-06-28 18:55:39,650 INFO [master/hadoop1:16000] regionserver.HRegionServer: Stopping infoServer
2022-06-28 18:55:39,655 INFO [master/hadoop1:16000] handler.ContextHandler: Stopped o.e.j.w.WebAppContext@312b34e3{/,null,UNAVAILABLE}{file:/opt/hbase/hbase/hbase-webapps/master}
2022-06-28 18:55:39,658 INFO [master/hadoop1:16000] server.AbstractConnector: Stopped ServerConnector@30e9ca13{HTTP/1.1,[http/1.1]}{0.0.0.0:16010}
2022-06-28 18:55:39,659 INFO [master/hadoop1:16000] handler.ContextHandler: Stopped o.e.j.s.ServletContextHandler@5a2bd7c8{/static,file:///opt/hbase/hbase/hbase-webapps/static/,UNAVAILABLE}
2022-06-28 18:55:39,659 INFO [master/hadoop1:16000] handler.ContextHandler: Stopped o.e.j.s.ServletContextHandler@7efe7b87{/logs,file:///opt/hbase/hbase/logs/,UNAVAILABLE}
2022-06-28 18:55:39,660 INFO [master/hadoop1:16000] regionserver.HRegionServer: aborting server hadoop1,16000,1656413730951
2022-06-28 18:55:39,660 INFO [master/hadoop1:16000] regionserver.HRegionServer: stopping server hadoop1,16000,1656413730951; all regions closed.
2022-06-28 18:55:39,660 INFO [master/hadoop1:16000] hbase.ChoreService: Chore service for: master/hadoop1:16000 had [] on shutdown
2022-06-28 18:55:39,662 WARN [master/hadoop1:16000] master.ActiveMasterManager: Failed get of master address: java.io.IOException: Can't get master address from ZooKeeper; znode data == null
2022-06-28 18:55:39,662 INFO [master/hadoop1:16000] hbase.ChoreService: Chore service for: hadoop1:16000.splitLogManager. had [] on shutdown
2022-06-28 18:55:39,766 INFO [ReadOnlyZKClient-hadoop1:2181,hadoop2:2181,hadoop3:2181@0x487a6ea5] zookeeper.ZooKeeper: Session: 0x2006d2017f20013 closed
2022-06-28 18:55:39,766 INFO [ReadOnlyZKClient-hadoop1:2181,hadoop2:2181,hadoop3:2181@0x487a6ea5-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x2006d2017f20013
2022-06-28 18:55:39,866 INFO [master/hadoop1:16000] zookeeper.ZooKeeper: Session: 0x6d204c010009 closed
2022-06-28 18:55:39,866 INFO [main-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x6d204c010009
2022-06-28 18:55:39,866 INFO [master/hadoop1:16000] regionserver.HRegionServer: Exiting; stopping=hadoop1,16000,1656413730951; zookeeper connection closed.
2022-06-28 18:55:39,866 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: HMaster Aborted
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:244)
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:140)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:149)
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:3071)
Solution: delete the hbase directory on HDFS (this wipes HBase's data, which is acceptable on a fresh install; see also the ZooKeeper cleanup note below):
hdfs dfs -rm -r /hbase
Then start the HBase service again:
start-hbase.sh
The web UI is then reachable at http://<node-hostname>:16010/master-status
Reference: https://blog.csdn.net/m0_46565121/article/details/125247369
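If deleting the HDFS directory alone does not resolve it, stale HBase state may also be left in ZooKeeper. A hedged extra step (not from the original text), assuming the default /hbase znode and ZooKeeper 3.5+ where deleteall is available:
bin/zkCli.sh -server 172.36.97.151:2181
# inside the zkCli shell:
#   deleteall /hbase
#   quit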
Install Solr
- Download: solr-8.6.3.tgz
- Extract:
# Target directory
mkdir /opt/solr
# Extract
tar -zxvf solr-8.6.3.tgz -C /opt/solr
# Rename the folder
cd /opt/solr
mv solr-8.6.3 solr
cd solr
- Modify the Solr configuration. Edit the following properties in /opt/solr/solr/bin/solr.in.sh:
cd /opt/solr/solr/bin
sudo vim solr.in.sh
Find the following parameters, remove the comments, and set them as follows:
SOLR_HEAP="1024m"
ZK_HOST="172.36.97.151:2181,172.36.97.152:2181,172.36.97.153:2181"
SOLR_HOST="172.36.97.151"
SOLR_JAVA_STACK_SIZE="-Xss768k"
SOLR_TIMEZONE="UTC+8"
ENABLE_REMOTE_JMX_OPTS="false"
- Distribute solr to the other nodes (each node then needs its own SOLR_HOST; see the sketch below):
scp -r solr root@hadoop2:/opt/solr/
scp -r solr root@hadoop3:/opt/solr/
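SOLR_HOST must be each node's own address after the copy. A minimal sketch of the per-node adjustment, assuming hadoop2 is 172.36.97.152 and hadoop3 is 172.36.97.153 as in the earlier sections:
ssh root@hadoop2 'sed -i "s/^SOLR_HOST=.*/SOLR_HOST=\"172.36.97.152\"/" /opt/solr/solr/bin/solr.in.sh'
ssh root@hadoop3 'sed -i "s/^SOLR_HOST=.*/SOLR_HOST=\"172.36.97.153\"/" /opt/solr/solr/bin/solr.in.sh'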
- Start Solr on each of the three nodes:
bin/solr start -p 8983 -force
[root@dc6-80-283 solr]# bin/solr start -p 8983 -force
*** [WARN] *** Your open file limit is currently 1024.
It should be set to 65000 to avoid operational disruption.
If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh
Waiting up to 180 seconds to see Solr running on port 8983 [|]
Started Solr server on port 8983 (pid=24125). Happy searching!
- Check the status:
[root@dc6-80-283 solr]# bin/solr status
Found 1 Solr nodes:
Solr process 27875 running on port 8983
{
"solr_home":"/opt/solr/solr/server/solr",
"version":"8.6.3 e001c2221812a0ba9e9378855040ce72f93eced4 - jasongerlowski - 2020-10-03 18:12:03",
"startTime":"2022-06-28T09:04:08.066Z",
"uptime":"0 days, 0 hours, 0 minutes, 44 seconds",
"memory":"122.9 MB (%12) of 1 GB",
"cloud":{
"ZooKeeper":"172.36.97.151:2181,172.36.97.152:2181,172.36.97.153:2181",
"liveNodes":"2",
"collections":"0"}}
Compile Atlas
Start the build:
tar xvfz apache-atlas-2.2.0-sources.tar.gz
cd apache-atlas-sources-2.2.0/
export MAVEN_OPTS="-Xms2g -Xmx2g"
mvn clean -DskipTests install
Create an Apache Atlas package for deployment in an environment that has working Apache HBase and Apache Solr instances:
mvn clean -DskipTests package -Pdist
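Before (or while debugging) a long build, it helps to confirm the tool versions; Atlas 2.2.0 is built with JDK 8 and Maven 3 (treat the exact minimum versions as an assumption and check the Atlas build documentation):
java -version
mvn -version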
Build error
Problem: compiling Atlas 2.2.0 fails with: org.apache.atlas:atlas-buildtools:jar:1.0 was not found
Fix: in the source pom.xml, change the atlas-buildtools version to 0.8.1.
Successful build:
[INFO] Apache Atlas Server Build Tools .................... SUCCESS [ 0.655 s]
[INFO] apache-atlas ....................................... SUCCESS [ 2.450 s]
[INFO] Apache Atlas Integration ........................... SUCCESS [ 5.506 s]
[INFO] Apache Atlas Test Utility Tools .................... SUCCESS [ 2.451 s]
[INFO] Apache Atlas Common ................................ SUCCESS [ 1.755 s]
[INFO] Apache Atlas Client ................................ SUCCESS [ 0.106 s]
[INFO] atlas-client-common ................................ SUCCESS [ 0.791 s]
[INFO] atlas-client-v1 .................................... SUCCESS [ 1.245 s]
[INFO] Apache Atlas Server API ............................ SUCCESS [ 1.223 s]
[INFO] Apache Atlas Notification .......................... SUCCESS [ 2.603 s]
[INFO] atlas-client-v2 .................................... SUCCESS [ 0.843 s]
[INFO] Apache Atlas Graph Database Projects ............... SUCCESS [ 0.062 s]
[INFO] Apache Atlas Graph Database API .................... SUCCESS [ 0.910 s]
[INFO] Graph Database Common Code ......................... SUCCESS [ 0.843 s]
[INFO] Apache Atlas JanusGraph-HBase2 Module .............. SUCCESS [ 0.742 s]
[INFO] Apache Atlas JanusGraph DB Impl .................... SUCCESS [ 3.963 s]
[INFO] Apache Atlas Graph DB Dependencies ................. SUCCESS [ 1.324 s]
[INFO] Apache Atlas Authorization ......................... SUCCESS [ 1.301 s]
[INFO] Apache Atlas Repository ............................ SUCCESS [ 7.453 s]
[INFO] Apache Atlas UI .................................... SUCCESS [03:21 min]
[INFO] Apache Atlas New UI ................................ SUCCESS [02:21 min]
[INFO] Apache Atlas Web Application ....................... SUCCESS [01:00 min]
[INFO] Apache Atlas Documentation ......................... SUCCESS [ 0.650 s]
[INFO] Apache Atlas FileSystem Model ...................... SUCCESS [ 1.584 s]
[INFO] Apache Atlas Plugin Classloader .................... SUCCESS [ 0.573 s]
[INFO] Apache Atlas Hive Bridge Shim ...................... SUCCESS [ 2.217 s]
[INFO] Apache Atlas Hive Bridge ........................... SUCCESS [ 7.605 s]
[INFO] Apache Atlas Falcon Bridge Shim .................... SUCCESS [ 0.847 s]
[INFO] Apache Atlas Falcon Bridge ......................... SUCCESS [ 2.198 s]
[INFO] Apache Atlas Sqoop Bridge Shim ..................... SUCCESS [ 0.092 s]
[INFO] Apache Atlas Sqoop Bridge .......................... SUCCESS [ 5.026 s]
[INFO] Apache Atlas Storm Bridge Shim ..................... SUCCESS [ 0.654 s]
[INFO] Apache Atlas Storm Bridge .......................... SUCCESS [ 4.729 s]
[INFO] Apache Atlas Hbase Bridge Shim ..................... SUCCESS [ 1.683 s]
[INFO] Apache Atlas Hbase Bridge .......................... SUCCESS [ 5.131 s]
[INFO] Apache HBase - Testing Util ........................ SUCCESS [ 3.263 s]
[INFO] Apache Atlas Kafka Bridge .......................... SUCCESS [ 2.093 s]
[INFO] Apache Atlas classification updater ................ SUCCESS [ 1.141 s]
[INFO] Apache Atlas index repair tool ..................... SUCCESS [ 1.550 s]
[INFO] Apache Atlas Impala Hook API ....................... SUCCESS [ 0.085 s]
[INFO] Apache Atlas Impala Bridge Shim .................... SUCCESS [ 0.100 s]
[INFO] Apache Atlas Impala Bridge ......................... SUCCESS [ 4.011 s]
[INFO] Apache Atlas Distribution .......................... SUCCESS [01:01 min]
[INFO] atlas-examples ..................................... SUCCESS [ 0.058 s]
[INFO] sample-app ......................................... SUCCESS [ 0.965 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 09:07 min
[INFO] Finished at: 2022-06-07T14:55:22+08:00
[INFO] ------------------------------------------------------------------------
The generated files are under the apache-atlas-sources/distro/target directory:
[root@dc6-80-283 atlas]# ll distro/target/
total 932640
-rw-r--r--. 1 root root 28056 Jun 7 14:55 apache-atlas-2.2.0-atlas-index-repair.zip
-rw-r--r--. 1 root root 462446761 Jun 7 14:55 apache-atlas-2.2.0-bin.tar.gz
-rw-r--r--. 1 root root 29556 Jun 7 14:55 apache-atlas-2.2.0-classification-updater.zip
-rw-r--r--. 1 root root 8454073 Jun 7 14:54 apache-atlas-2.2.0-falcon-hook.tar.gz
-rw-r--r--. 1 root root 10371412 Jun 7 14:54 apache-atlas-2.2.0-hbase-hook.tar.gz
-rw-r--r--. 1 root root 10472250 Jun 7 14:54 apache-atlas-2.2.0-hive-hook.tar.gz
-rw-r--r--. 1 root root 10422677 Jun 7 14:54 apache-atlas-2.2.0-impala-hook.tar.gz
-rw-r--r--. 1 root root 4170481 Jun 7 14:54 apache-atlas-2.2.0-kafka-hook.tar.gz
-rw-r--r--. 1 root root 365827100 Jun 7 14:54 apache-atlas-2.2.0-server.tar.gz
-rw-r--r--. 1 root root 15303697 Jun 7 14:55 apache-atlas-2.2.0-sources.tar.gz
-rw-r--r--. 1 root root 8440987 Jun 7 14:54 apache-atlas-2.2.0-sqoop-hook.tar.gz
-rw-r--r--. 1 root root 58914646 Jun 7 14:54 apache-atlas-2.2.0-storm-hook.tar.gz
drwxr-xr-x. 2 root root 6 Jun 7 14:54 archive-tmp
-rw-r--r--. 1 root root 102718 Jun 7 14:54 atlas-distro-2.2.0.jar
drwxr-xr-x. 2 root root 4096 Jun 7 14:54 bin
drwxr-xr-x. 5 root root 265 Jun 7 14:54 conf
drwxr-xr-x. 2 root root 28 Jun 7 14:54 maven-archiver
drwxr-xr-x. 3 root root 22 Jun 7 14:54 maven-shared-archive-resources
drwxr-xr-x. 2 root root 55 Jun 7 14:54 META-INF
-rw-r--r--. 1 root root 3194 Jun 7 14:54 rat.txt
drwxr-xr-x. 3 root root 22 Jun 7 14:54 test-classes
Install Atlas
Extract the Atlas package
Extract the build artifact apache-atlas-2.2.0-server.tar.gz into the /opt/atlas directory:
# Target directory
mkdir /opt/atlas
# Extract
tar -zxvf apache-atlas-2.2.0-server.tar.gz -C /opt/atlas
# Rename
cd /opt/atlas
mv apache-atlas-2.2.0 atlas
Integrate Atlas with HBase
- Modify the following parameter in the /opt/atlas/atlas/conf/atlas-application.properties configuration file:
atlas.graph.storage.hostname=172.36.97.151:2181,172.36.97.152:2181,172.36.97.153:2181
- Modify the /opt/atlas/atlas/conf/atlas-env.sh configuration file and add the following:
export HBASE_CONF_DIR=/opt/hbase/hbase/conf
Integrate Atlas with Solr
- Modify the following parameters in the /opt/atlas/atlas/conf/atlas-application.properties configuration file:
atlas.graph.index.search.backend=solr
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.zookeeperurl=172.36.97.151:2181,172.36.97.152:2181,172.36.97.153:2181
- Create the Solr collections:
cd /opt/solr/solr
bin/solr create -c vertex_index -d /opt/atlas/atlas/conf/solr -shards 3 -replicationFactor 2 -force
bin/solr create -c edge_index -d /opt/atlas/atlas/conf/solr -shards 3 -replicationFactor 2 -force
bin/solr create -c fulltext_index -d /opt/atlas/atlas/conf/solr -shards 3 -replicationFactor 2 -force
[root@dc6-80-283 solr]# bin/solr create -c vertex_index -d /opt/atlas/atlas/conf/solr -shards 3 -replicationFactor 2 -force
Created collection 'vertex_index' with 3 shard(s), 2 replica(s) with config-set 'vertex_index'
[root@dc6-80-283 solr]# bin/solr create -c edge_index -d /opt/atlas/atlas/conf/solr -shards 3 -replicationFactor 2 -force
Created collection 'edge_index' with 3 shard(s), 2 replica(s) with config-set 'edge_index'
[root@dc6-80-283 solr]# bin/solr create -c fulltext_index -d /opt/atlas/atlas/conf/solr -shards 3 -replicationFactor 2 -force
Created collection 'fulltext_index' with 3 shard(s), 2 replica(s) with config-set 'fulltext_index'
The collection count is now 3 ("collections":"3"):
[root@dc6-80-283 solr]# bin/solr status
Found 1 Solr nodes:
Solr process 27875 running on port 8983
{
"solr_home":"/opt/solr/solr/server/solr",
"version":"8.6.3 e001c2221812a0ba9e9378855040ce72f93eced4 - jasongerlowski - 2020-10-03 18:12:03",
"startTime":"2022-06-28T09:04:08.066Z",
"uptime":"0 days, 0 hours, 8 minutes, 56 seconds",
"memory":"67 MB (%6.5) of 1 GB",
"cloud":{
"ZooKeeper":"172.36.97.151:2181,172.36.97.152:2181,172.36.97.153:2181",
"liveNodes":"3",
"collections":"3"}}
Integrate Atlas with Kafka
- Modify the following parameters in the atlas/conf/atlas-application.properties configuration file (see the note on notification topics below):
atlas.notification.embedded=false
atlas.kafka.data=/opt/kafka/kafka/data
atlas.kafka.zookeeper.connect=172.36.97.151:2181,172.36.97.152:2181,172.36.97.153:2181
atlas.kafka.bootstrap.servers=172.36.97.151:9092,172.36.97.152:9092,172.36.97.153:9092
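With atlas.notification.embedded=false, Atlas uses the external Kafka cluster for its notification topics. Depending on whether topic auto-creation is enabled, you may need to create them yourself; a hedged sketch using the Atlas default topic names ATLAS_HOOK and ATLAS_ENTITIES:
cd /opt/kafka/kafka
bin/kafka-topics.sh --bootstrap-server 172.36.97.151:9092 --create --partitions 3 --replication-factor 3 --topic ATLAS_HOOK
bin/kafka-topics.sh --bootstrap-server 172.36.97.151:9092 --create --partitions 3 --replication-factor 3 --topic ATLAS_ENTITIES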
Start Atlas
- Start command:
bin/atlas_start.py
- To verify that the Apache Atlas service is up and running, run a curl command as shown below:
[root@dc6-80-283 logs]# curl -u admin:admin http://localhost:21000/api/atlas/admin/version
{"Description":"Metadata Management and Data Governance Platform over Hadoop","Revision":"release","Version":"2.2.0","Name":"apache-atlas"}
- Access the web UI at http://IP:21000; the page is displayed successfully!
Other integrations, to follow later:
- Atlas integration with Hive
- Atlas integration with Sqoop