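1. What is NiFi?
- Apache NiFi is an easy-to-use, powerful, and reliable system for pulling, processing, and distributing data.
- NiFi was originally an NSA project. The code has since been open-sourced, and NiFi is now a top-level project of the Apache Software Foundation.
- NiFi is written in Java and uses Maven to manage its dependencies.
- NiFi works through a web interface: data flows are built graphically, without writing code.
2. Installing NiFi
Official download page: https://nifi.apache.org/download.html. The tar.gz file there is the Linux package; Windows users can download and install the zip package instead.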
[linfei@localhost ~]$ cd /opt/software/
[linfei@localhost software]$ tar zxvf nifi-1.13.2-bin.tar.gz -C /opt/module
[linfei@localhost software]$ cd ../module/
[linfei@localhost module]$ ls
hadoop-3.1.3 kafka nifi-1.9.2 zookeeper-3.5.7
jdk1.8.0_212 nifi-1.13.2 spark-yarn
[linfei@localhost module]$ cd nifi-1.9.2/
[linfei@localhost nifi-1.9.2]$ ls
bin docs LICENSE README
conf extensions logs run
content_repository flowfile_repository NOTICE state
database_repository lib provenance_repository work
[linfei@localhost nifi-1.9.2]$ cd conf
[linfei@localhost conf]$ ls
archive logback.xml
authorizers.xml login-identity-providers.xml
bootstrap.conf nifi.properties
bootstrap-notification-services.xml state-management.xml
flow.xml.gz zookeeper.properties
[linfei@localhost conf]$ vim nifi.properties
Only the host and port need to be changed:
nifi.web.war.directory=./lib
nifi.web.http.host=192.168.5.170
nifi.web.http.port=58080
nifi.web.http.network.interface.default=
nifi.web.https.host=
nifi.web.https.port=
nifi.web.https.network.interface.default=
nifi.web.jetty.working.directory=./work/jetty
nifi.web.jetty.threads=200
nifi.web.max.header.size=16 KB
nifi.web.proxy.context.path=
nifi.web.proxy.host=
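If the same change has to be repeated on several machines, editing the file by hand in vim gets tedious. The following is a minimal sketch, not part of the original setup, of how the two keys could be rewritten with a script. The helper name set_properties and the sample text are made up for illustration; the only things taken from the walkthrough are the key names and the values 192.168.5.170 and 58080.

```python
def set_properties(text, updates):
    """Return `text` with every `key=value` line named in `updates` rewritten.

    Keys that do not occur in the text are appended at the end.
    Commented-out lines (starting with '#') are left untouched.
    """
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_setting = "=" in line and not line.lstrip().startswith("#")
        if is_setting and key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    # Anything still in `remaining` was not found in the file.
    out.extend(f"{k}={v}" for k, v in remaining.items())
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical excerpt standing in for conf/nifi.properties.
    sample = (
        "nifi.web.war.directory=./lib\n"
        "nifi.web.http.host=\n"
        "nifi.web.http.port=8080\n"
    )
    print(set_properties(sample, {
        "nifi.web.http.host": "192.168.5.170",
        "nifi.web.http.port": "58080",
    }), end="")
```

To apply it to a real installation, one would read conf/nifi.properties, pass its contents through set_properties, and write the result back, ideally after keeping a backup copy of the file.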
Start NiFi:
[linfei@localhost nifi-1.9.2]$ ls
bin docs LICENSE README
conf extensions logs run
content_repository flowfile_repository NOTICE state
database_repository lib provenance_repository work
[linfei@localhost nifi-1.9.2]$ ./bin/nifi.sh start
Java home: /opt/module/jdk1.8.0_212
NiFi home: /opt/module/nifi-1.9.2
Bootstrap Config File: /opt/module/nifi-1.9.2/conf/bootstrap.conf
[linfei@localhost nifi-1.9.2]$ ./bin/nifi.sh status
Java home: /opt/module/jdk1.8.0_212
NiFi home: /opt/module/nifi-1.9.2
Bootstrap Config File: /opt/module/nifi-1.9.2/conf/bootstrap.conf
2021-10-08 20:10:09,380 INFO [main] org.apache.nifi.bootstrap.Command Apache NiFi is currently running, listening to Bootstrap on port 38787, PID=30149
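When the start-up is scripted, the status output above can be checked automatically instead of by eye. The sketch below pulls the Bootstrap port and PID out of the "currently running" line. The regular expression is based only on the wording seen in this transcript, so it is an assumption that other NiFi versions print the same message; parse_status is a hypothetical helper name.

```python
import re

# Pattern derived from the status line shown in the transcript above.
STATUS_RE = re.compile(
    r"Apache NiFi is currently running, "
    r"listening to Bootstrap on port (\d+), PID=(\d+)"
)


def parse_status(output):
    """Return {'bootstrap_port': int, 'pid': int}, or None if NiFi is not reported as running."""
    match = STATUS_RE.search(output)
    if match is None:
        return None
    return {"bootstrap_port": int(match.group(1)), "pid": int(match.group(2))}


if __name__ == "__main__":
    line = (
        "2021-10-08 20:10:09,380 INFO [main] org.apache.nifi.bootstrap.Command "
        "Apache NiFi is currently running, listening to Bootstrap on port 38787, PID=30149"
    )
    print(parse_status(line))
```

In practice the text passed to parse_status would be the captured output of ./bin/nifi.sh status, for example via subprocess.run with capture_output=True and text=True, checking both stdout and stderr since it is not certain which stream carries the log line.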
Open NiFi in a browser by entering the IP and port configured above (in my case http://192.168.5.170:58080/nifi/). You should see that NiFi has started normally.
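The same check can be done without a browser. This is a rough sketch using only the Python standard library; the function names nifi_ui_url and ui_reachable are invented for this example, and the host and port are the ones configured in nifi.properties above. NiFi can take a while to finish starting, so a False result right after nifi.sh start does not necessarily mean something is wrong.

```python
import urllib.request


def nifi_ui_url(host, port):
    """Build the web UI address from the configured HTTP host and port."""
    return f"http://{host}:{port}/nifi/"


def ui_reachable(url, timeout=5.0):
    """Return True if an HTTP GET on `url` succeeds, False on any network or HTTP error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


if __name__ == "__main__":
    url = nifi_ui_url("192.168.5.170", 58080)
    print(url, "reachable" if ui_reachable(url) else "not reachable")
```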
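3. Connecting NiFi to a Kafka cluster
First start the Kafka cluster (this assumes Kafka is already installed and configured), then use the jpsall command to check that it started successfully: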
[linfei@localhost nifi-1.9.2]$ jpsall
=============== spark170 ===============
30848 HistoryServer
28737 ConsoleProducer
30113 RunNiFi
28690 NameNode
28866 DataNode
30403 Jps
29444 JobHistoryServer
26437 ZooKeeperMain
27957 Kafka
30149 NiFi
29226 NodeManager
24813 QuorumPeerMain
=============== spark171 ===============
21744 Kafka
27409 ResourceManager
27186 DataNode
27685 NodeManager
23094 Jps
17871 QuorumPeerMain
=============== spark172 ===============
4514 ConsoleConsumer
23654 DataNode
23894 NodeManager
4987 Jps
23788 SecondaryNameNode
2813 QuorumPeerMain
3662 Kafka
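Reading through this listing by hand works for three nodes, but it is easy to miss a host where a process is absent. Below is a small sketch that parses output in the format shown above and reports which hosts lack a given process. It assumes the "===== host =====" header layout of this particular jpsall script, which is a site-specific wrapper around jps rather than a standard tool; parse_jpsall and hosts_missing are hypothetical names.

```python
import re

# Matches header lines such as "=============== spark170 ===============".
HEADER_RE = re.compile(r"^=+\s*(\S+)\s*=+$")


def parse_jpsall(output):
    """Map each host name to the set of JVM process names listed under it."""
    hosts = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = HEADER_RE.match(line)
        if header:
            current = header.group(1)
            hosts[current] = set()
        elif current is not None:
            parts = line.split(None, 1)  # "<pid> <process name>"
            if len(parts) == 2 and parts[0].isdigit():
                hosts[current].add(parts[1])
    return hosts


def hosts_missing(hosts, process):
    """Return the sorted list of hosts on which `process` is not running."""
    return sorted(h for h, procs in hosts.items() if process not in procs)


if __name__ == "__main__":
    sample = """\
=============== spark170 ===============
27957 Kafka
30149 NiFi
24813 QuorumPeerMain
=============== spark171 ===============
21744 Kafka
17871 QuorumPeerMain
=============== spark172 ===============
2813 QuorumPeerMain
3662 Kafka
"""
    parsed = parse_jpsall(sample)
    for name in ("Kafka", "QuorumPeerMain"):
        print(name, "missing on:", hosts_missing(parsed, name))
```

An empty list for both Kafka and QuorumPeerMain corresponds to what the listing above shows: a broker and a ZooKeeper node on each of the three hosts.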
Create a topic named first on the Kafka cluster and produce some messages:
[linfei@localhost kafka]$ ./bin/kafka-topics.sh --zookeeper spark170:2181 --list
__consumer_offsets
__transaction_state
nifi-topic
[linfei@localhost kafka]$ ./bin/kafka-topics.sh --zookeeper spark170:2181 --create --replication-factor 3 --partitions 1 --topic first
Created topic "first".
[linfei@localhost kafka]$ ./bin/kafka-topics.sh --zookeeper spark170:2181 --list
__consumer_offsets
__transaction_state
first
nifi-topic
[linfei@localhost kafka]$ ./bin/kafka-console-producer.sh --broker-list spark170:9092 --topic first
>hello
>hello world
>
Open another terminal and consume the messages:
[linfei@localhost ~]$ cd /opt/module/kafka/
[linfei@localhost kafka]$ ls
bin config libs LICENSE logs NOTICE site-docs
[linfei@localhost kafka]$ ./bin/kafka-console-consumer.sh --bootstrap-server spark170:9092 --topic first
hello
hello world
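Kafka itself works as expected. Next, use NiFi to communicate with Kafka.
The flow uses four processors (screenshot not preserved).
Right-click and start all components; the flow then shows that communication is working (screenshot not preserved).
Open a consumer in an SSH session: it correctly receives the data sent by NiFi.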