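Custom monitoring (processes, logs, MySQL)
Custom process monitoring:
Write a script and keep all scripts in one common location, then edit the zabbix_agentd.conf configuration file on the monitored host:
- UnsafeUserParameters=1
- UserParameter=<key>,<command>
Restart zabbix_agentd, then configure the item and trigger in the web interface.
Monitoring the httpd process
(On the monitored host) Check which port httpd is listening on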
[root@client ~]# ss -antl
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
LISTEN 0 128 0.0.0.0:10050 0.0.0.0:*
LISTEN 0 128 [::]:22 [::]:*
LISTEN 0 128 *:80 *:*
Edit the configuration file on the monitored host
[root@client ~]# vim /usr/local/etc/zabbix_agentd.conf
......
# Mandatory: no
# Range: 0-1
# Default:
UnsafeUserParameters=1 //allow unsafe user parameters: uncomment this line and change 0 to 1

### Option: UserParameter
# User-defined parameter to monitor. There can be several user-defined parameters.
# Format: UserParameter=<key>,<shell command> //copy this format to the end of the file and edit it
# See 'zabbix_agentd' directory for examples.
......
UserParameter=check_process[*],/scripts/check_process.sh $1 //appended at the end of the file; the key must match the item key created in the web interface, and $1 receives the parameter passed in the key's brackets
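To illustrate how the parameter in the key's brackets ends up as $1 in the command, here is a small Python sketch. It is only a simplified model for explanation, not the agent's actual implementation: the function name expand_user_parameter is made up for this example, and it ignores the quoting rules the real agent applies.

```python
import re


def expand_user_parameter(definition_key, command, requested_key):
    """Replace $1..$9 in `command` with the parameters of `requested_key`.

    Simplified model: parameters are split on commas, quoting is ignored.
    """
    base = definition_key.split("[", 1)[0]  # e.g. "check_process"
    match = re.fullmatch(re.escape(base) + r"\[(.*)\]", requested_key)
    if match is None:
        raise ValueError("key does not match " + definition_key)
    params = [p.strip() for p in match.group(1).split(",")]

    def substitute(m):
        index = int(m.group(1)) - 1
        # Parameters that were not supplied expand to an empty string.
        return params[index] if index < len(params) else ""

    return re.sub(r"\$([1-9])", substitute, command).strip()


if __name__ == "__main__":
    print(expand_user_parameter(
        "check_process[*]",
        "/scripts/check_process.sh $1",
        "check_process[httpd]",
    ))
```

With the definition above, a request for check_process[httpd] runs the script with httpd as its first argument.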
Write a script that runs on the monitored host
Create a directory to hold the scripts
[root@client ~]# mkdir /scripts
[root@client ~]# cd /scripts/
[root@client scripts]# touch check_process.sh
[root@client scripts]# chmod +x check_process.sh
[root@client scripts]# ls
check_process.sh
[root@client ~]# vim /scripts/check_process.sh
#!/bin/bash
# $1 is the process name passed in as a parameter.
# If no matching process is found, the process is not running: print 1 to raise an alert.
count=$(ps -ef | grep -Ev "grep|$0" | grep -c "$1")
if [ "$count" -eq 0 ]; then
    echo "1"
else
    echo "0"
fi
[root@client ~]# bash /scripts/check_process.sh httpd
0
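The filtering logic of check_process.sh can be sketched in Python as follows. This is a rough illustration only: the function name process_alert and the shortened sample lines are made up for the example, and plain substring checks stand in for the grep patterns.

```python
def process_alert(ps_lines, name, script_path):
    """Return 1 (alert) when no process line mentions `name`, otherwise 0."""
    # Drop the grep process and the check script itself, like grep -Ev "grep|$0".
    excluded = ("grep", script_path)
    candidates = [
        line for line in ps_lines
        if not any(word in line for word in excluded)
    ]
    # Count the remaining lines that mention the process, like grep -c "$1".
    count = sum(1 for line in candidates if name in line)
    return 1 if count == 0 else 0


if __name__ == "__main__":
    sample = [
        "root 28232 1 0 01:36 ? 00:00:00 /usr/sbin/httpd -DFOREGROUND",
        "root 84385 1 0 02:06 pts/0 00:00:00 grep --color=auto httpd",
    ]
    print(process_alert(sample, "httpd", "/scripts/check_process.sh"))
```

Excluding the grep line and the script's own command line matters: both contain the process name and would otherwise be counted as a running httpd.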
Check the httpd processes
[root@client scripts]# ps -aux |grep httpd
root 28232 0.0 0.6 282884 11552 ? Ss 01:36 0:00 /usr/sbin/httpd -DFOREGROUND
apache 28308 0.0 0.4 296760 8552 ? S 01:36 0:00 /usr/sbin/httpd -DFOREGROUND
apache 28309 0.0 0.7 1944424 14264 ? Sl 01:36 0:00 /usr/sbin/httpd -DFOREGROUND
apache 28310 0.0 0.6 1813296 12216 ? Sl 01:36 0:00 /usr/sbin/httpd -DFOREGROUND
apache 28311 0.0 0.6 1813296 12216 ? Sl 01:36 0:00 /usr/sbin/httpd -DFOREGROUND
root 84385 0.0 0.0 9208 1176 pts/0 S+ 02:06 0:00 grep --color=auto httpd
//Restart zabbix_agentd
[root@client ~]# pkill zabbix
[root@client ~]# zabbix_agentd
[root@client ~]# ss -antl
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
LISTEN 0 128 0.0.0.0:10050 0.0.0.0:*
LISTEN 0 128 [::]:22 [::]:*
LISTEN 0 128 *:80 *:*
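Web interface configuration
Create a custom item
On the server, run the command below, where -s is the IP address of the monitored host and -k is the key defined in its agent configuration file. If it returns a number, the configuration is working. If it fails, check the firewall and SELinux.
[root@server ~]# zabbix_get -s 192.168.220.10 -k check_process[httpd]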
0
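If you want to run the same check from a script on the server, a thin wrapper around zabbix_get might look like the sketch below. The function names build_zabbix_get_command and query_agent are made up for this example, and it assumes zabbix_get is on the PATH and the agent listens on port 10050, as in the ss output earlier.

```python
import subprocess


def build_zabbix_get_command(host, key, port=10050):
    """Build the zabbix_get argument list for one item key."""
    return ["zabbix_get", "-s", host, "-p", str(port), "-k", key]


def query_agent(host, key, port=10050, timeout=10):
    """Run zabbix_get and return its output as a stripped string.

    Raises subprocess.CalledProcessError if zabbix_get exits with an error.
    """
    completed = subprocess.run(
        build_zabbix_get_command(host, key, port),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return completed.stdout.strip()


if __name__ == "__main__":
    # Only prints the command; call query_agent() on a host with zabbix_get installed.
    print(build_zabbix_get_command("192.168.220.10", "check_process[httpd]"))
```

Passing the arguments as a list avoids shell quoting problems with the square brackets in the key.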
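Create a trigger
Stop the httpd process on the monitored host and check whether the web interface raises an alert. If it does not, check the firewall and SELinux.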
[root@client ~]# systemctl stop httpd
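Monitoring logs
Custom log monitoring follows the same steps:
Write a script and keep it with the other scripts, then edit the zabbix_agentd.conf configuration file on the monitored host:
- UnsafeUserParameters=1
- UserParameter=<key>,<command>
Restart zabbix_agentd, then configure the item and trigger in the web interface.
(On the monitored host) The check uses a Python script, so install python3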
yum -y install python3
Put the log.py script in the /scripts/ directory
[root@client scripts]# cd /scripts
[root@client scripts]# ls
check_process.sh log.py
[root@client scripts]# vim log.py
#!/usr/bin/env python3
import sys
import re


def prePos(seekfile):
    # Return the position saved by the previous run, or 0 if the seek file
    # is missing, unreadable or does not contain a number.
    try:
        with open(seekfile) as cf:
            return int(cf.readline().strip())
    except (IOError, ValueError):
        return 0


def lastPos(filename):
    # Return the current end-of-file position, or 0 for an empty file.
    with open(filename) as lfile:
        if lfile.readline():
            lfile.seek(0, 2)
        else:
            return 0
        return lfile.tell()


def getSeekFile():
    # Second argument: seek position file (optional).
    try:
        seekfile = sys.argv[2]
    except IndexError:
        seekfile = '/tmp/logseek'
    return seekfile


def getKey():
    # Third argument: search keyword (optional).
    try:
        tagKey = str(sys.argv[3])
    except IndexError:
        tagKey = 'Error'
    return tagKey


def getResult(filename, seekfile, tagkey):
    result = 0
    destPos = prePos(seekfile)

    try:
        curPos = lastPos(filename)
        f = open(filename)
    except IOError:
        print('Could not open file: %s' % filename)
        return result

    # The log is shorter than the saved position, so it was rotated or
    # truncated: read it again from the beginning.
    if curPos < destPos:
        destPos = 0

    with f:
        f.seek(destPos)
        # Scan only the lines added since the previous run.
        while curPos != 0 and f.tell() < curPos:
            line = f.readline().strip()
            if re.search(tagkey, line):
                result = 1
                break

    # Remember where this run stopped so the next run starts from there.
    with open(seekfile, 'w') as sf:
        sf.write(str(curPos))
    return result


if __name__ == "__main__":
    tagkey = getKey()
    seekfile = getSeekFile()
    result = getResult(sys.argv[1], seekfile, tagkey)
    print(result)
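What the script does:
log.py checks whether a log file contains a given keyword.
- First argument: the log file name (required; a relative or absolute path).
- Second argument: the path of the seek position file (optional; defaults to /tmp/logseek; a relative or absolute path).
- Third argument: the search keyword (defaults to Error).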
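The seek position file is what keeps the same log line from raising an alert on every check. The sketch below shows that idea in isolation. It is a simplified illustration and not a replacement for log.py: the function name scan_new_lines and the temporary file names are made up for the example.

```python
import os
import re
import tempfile


def scan_new_lines(log_path, seek_path, keyword):
    """Return 1 if `keyword` appears in data added since the last call, else 0."""
    try:
        with open(seek_path) as sf:
            start = int(sf.readline().strip())
    except (OSError, ValueError):
        start = 0  # no usable saved position: read from the beginning
    end = os.path.getsize(log_path)
    if end < start:
        start = 0  # the log was truncated or rotated
    with open(log_path, "rb") as log:
        log.seek(start)
        chunk = log.read(end - start)
    found = 1 if re.search(keyword.encode(), chunk) else 0
    # Save the end position so the next call only sees newer data.
    with open(seek_path, "w") as sf:
        sf.write(str(end))
    return found


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "error_log")
        seek_path = os.path.join(tmp, "logseek")
        with open(log_path, "w") as f:
            f.write("startup ok\n")
        print(scan_new_lines(log_path, seek_path, "error"))  # no match yet
        with open(log_path, "a") as f:
            f.write("error\n")
        print(scan_new_lines(log_path, seek_path, "error"))  # new matching line
        print(scan_new_lines(log_path, seek_path, "error"))  # nothing new since last call
```

Because each call only looks at data written after the saved position, an alert clears on the next check unless new matching lines keep arriving.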
Edit the configuration file /usr/local/etc/zabbix_agentd.conf on the monitored host
[root@client scripts]# vim /usr/local/etc/zabbix_agentd.conf
......

UserParameter=check_process[*],/scripts/check_process.sh $1
UserParameter=check_log[*],/scripts/log.py $1 $2 $3
//Restart zabbix_agentd
[root@client scripts]# pkill zabbix
[root@client scripts]# zabbix_agentd
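Configure the web interface
Create the log item
Create the log trigger
Grant permissions so that the zabbix user can access the log file (on the monitored host)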
[root@client scripts]# chmod 755 /var/log/httpd
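Append "error" to /var/log/httpd/error_log to test the alert (on the monitored host). The keyword match in log.py is case-sensitive, so the appended text must match the keyword passed as the third key parameter. If no alert appears, check the firewall and SELinux.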
[root@client scripts]# echo "error" >> /var/log/httpd/error_log
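Monitoring MySQL master-slave status
MySQL master-slave replication setup is covered in an earlier post, so it is omitted here.
Write the script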
[root@localhost scripts]# cat mysql_thread.sh
#!/bin/bash
count=$(mysql -e "show slave status\G;" 2>/dev/null | grep _Running: | grep -c Yes)
# If the number of "Yes" lines is not 2, print 1 to signal a problem.
if [ "$count" -ne 2 ]; then
    echo "1"
else
    echo "0"
fi
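The check above boils down to counting how many of the two replication threads report Yes. A minimal Python sketch of that parsing step follows. The function name replication_alert is made up for the example, and the sample text is a shortened form of the show slave status output.

```python
def replication_alert(status_text):
    """Return 0 when both replication threads report Yes, otherwise 1."""
    running_yes = 0
    for line in status_text.splitlines():
        name, sep, value = line.strip().partition(":")
        # Only Slave_IO_Running and Slave_SQL_Running end in "_Running".
        if sep and name.endswith("_Running") and value.strip() == "Yes":
            running_yes += 1
    return 0 if running_yes == 2 else 1


if __name__ == "__main__":
    healthy = "Slave_IO_Running: Yes\nSlave_SQL_Running: Yes\n"
    stopped = "Slave_IO_Running: No\nSlave_SQL_Running: No\n"
    print(replication_alert(healthy))
    print(replication_alert(stopped))
```

Requiring exactly two Yes values means the check also alerts when only one thread has stopped, or when the mysql command returns nothing at all.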
Edit the configuration file on the monitored host
[root@localhost etc]# vim zabbix_agentd.conf
······
# Default: SOMAXCONN (hard-coded constant, depends on system)
# ListenBacklog=
UserParameter=check_process[*],/scripts/check_process.sh $1
UserParameter=check_log[*],/scripts/log.py $1 $2 $3
UserParameter=mysql.slave[*],/scripts/mysql_thread.sh $1 $2 //added
[root@client scripts]# chmod +x /scripts/mysql_thread.sh
//Restart zabbix_agentd
[root@localhost etc]# pkill zabbix_agentd
[root@localhost etc]# zabbix_agentd
Check the slave status
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.220.17
Master_User: rhel
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql_bin.000001
Read_Master_Log_Pos: 15651689
Relay_Log_File: mysql_relay.000002
Relay_Log_Pos: 15651855
Relay_Master_Log_File: mysql_bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Test from the server
[root@sever ~]# zabbix_get -s 192.168.220.10 -k mysql.slave[io]
0
[root@sever ~]# zabbix_get -s 192.168.220.10 -k mysql.slave[io,sql]
0
Configure the web interface
Stop the slave and check whether the web interface raises an alert
mysql> stop slave;
Query OK, 0 rows affected (0.01 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 192.168.220.17
Master_User: rhel
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql_bin.000001
Read_Master_Log_Pos: 15767640
Relay_Log_File: mysql_relay.000002
Relay_Log_Pos: 15767806
Relay_Master_Log_File: mysql_bin.000001
Slave_IO_Running: No
Slave_SQL_Running: No
Success
Monitoring MySQL replication delay
(On the monitored host) Write a script that reads the delay value
[root@client ~]# cd /scripts/
[root@client scripts]# touch delay_mysql.sh
[root@client scripts]# chmod +x delay_mysql.sh
[root@client scripts]# cat delay_mysql.sh
#!/bin/bash
delay=$(mysql -uzabbix -pzabbix -e 'show slave status\G;' 2>/dev/null | grep '_Behind' | awk '{print $2}')
# Print the delay in seconds; an empty or NULL value is reported as 0.
if [ -n "$delay" ] && [ "$delay" != "NULL" ]; then
    echo "$delay"
else
    echo "0"
fi
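The value extraction in delay_mysql.sh can be sketched in Python like this. It is an illustration of the parsing step only, and the function name replication_delay is made up for the example.

```python
def replication_delay(status_text):
    """Return Seconds_Behind_Master as an integer; NULL or missing becomes 0."""
    for line in status_text.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep and name == "Seconds_Behind_Master":
            value = value.strip()
            return int(value) if value.isdigit() else 0
    return 0


if __name__ == "__main__":
    print(replication_delay("Seconds_Behind_Master: NULL"))
    print(replication_delay("Seconds_Behind_Master: 37"))
```

Like the shell script, this reports NULL as 0, so a slave whose replication threads are stopped looks the same as one with no delay. The mysql.slave status check is what catches that case.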
//Grant privileges to the zabbix database user
[root@client ~]# mysql -uroot -p123456
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 252
Server version: 5.7.34 MySQL Community Server (GPL)
Copyright (c) 2000, 2021, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> grant SUPER, REPLICATION CLIENT on *.* to 'zabbix'@'localhost' identified by 'zabbix';
//Edit the configuration file on the monitored host
# Default: SOMAXCONN (hard-coded constant, depends on system)
# ListenBacklog=
UserParameter=check_process[*],/scripts/check_process.sh $1
UserParameter=check_log[*],/scripts/log.py $1 $2 $3
UserParameter=mysql.slave[*],/scripts/mysql_thread.sh $1 $2
UserParameter=check_delay,/scripts/delay_mysql.sh //added; the key must match the item key created in the web interface
//Restart zabbix_agentd
[root@client ~]# pkill zabbix
[root@client ~]# zabbix_agentd
//Test from the server
[root@sever ~]# zabbix_get -s 192.168.220.10 -k check_delay
0
//Check the replication delay on the monitored host
[root@client scripts]# mysql -e "show slave status\G;"
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.220.17
......
Exec_Master_Log_Pos: 154
Relay_Log_Space: 3988378
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL //replication delay
......
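Web interface configuration
Create the item, then create a trigger that raises an alert when the value is not 0. My environment never shows any delay, so no alert fires, but the steps themselves are correct.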