1. Inspection checklist
2. Inspection reference
2.1 CentOS inspection
1> Identity authentication: make sure root is the only account with UID 0. Any other account with UID 0 should be deleted or assigned a new UID. Check with:
awk -F: '($3 == 0) { print $1 }' /etc/passwd | grep -v '^root$'
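The UID 0 check can be wrapped in a small function so that an inspection script can act on the result. This is a minimal sketch: the function name list_extra_uid0 and its optional file argument are assumptions made for illustration and are not part of the original checklist.

```shell
#!/bin/bash
# List every account with UID 0 other than root from a passwd-format file.
# The file argument defaults to /etc/passwd; another path is useful for testing.
list_extra_uid0() {
    local passwd_file="${1:-/etc/passwd}"
    awk -F: '$3 == 0 && $1 != "root" { print $1 }' "$passwd_file"
}

extra=$(list_extra_uid0)
if [ -n "$extra" ]; then
    echo "WARNING: non-root accounts with UID 0: $extra"
else
    echo "OK: root is the only UID 0 account"
fi
```

Comparing the user name inside awk avoids the separate grep and keeps the whole check in one process.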
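2> Identity authentication: password complexity. Check the minimum password length and whether several character classes are required.
Edit /etc/security/pwquality.conf: set minlen (minimum password length) to a value between 9 and 32, and set minclass (how many of the four character classes, that is lowercase letters, uppercase letters, digits, and special characters, a password must contain) to 3 or 4. For example:
minlen=10
minclass=3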
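The two pwquality settings can also be verified by a script. The sketch below is illustrative only: pwquality_value and in_range are hypothetical helper names, and the parser simply keeps the last uncommented assignment it finds, which is an assumption of this sketch rather than documented pwquality behaviour.

```shell
#!/bin/bash
# Print the value assigned to a key in a pwquality.conf-style file ("key = value").
# Commented-out lines are ignored; this sketch keeps the last assignment it sees.
pwquality_value() {
    local key="$1" file="${2:-/etc/security/pwquality.conf}"
    awk -F= -v k="$key" '
        /^[[:space:]]*#/ { next }
        {
            name = $1; value = $2
            gsub(/[[:space:]]/, "", name)
            gsub(/[[:space:]]/, "", value)
            if (name == k) found = value
        }
        END { print found }' "$file"
}

# Succeed only if the value is a non-negative integer inside [min, max].
in_range() {
    local value="$1" min="$2" max="$3"
    case "$value" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$value" -ge "$min" ] && [ "$value" -le "$max" ]
}

conf=/etc/security/pwquality.conf
if [ -r "$conf" ]; then
    if in_range "$(pwquality_value minlen "$conf")" 9 32; then echo "minlen OK"; else echo "minlen unset or out of range"; fi
    if in_range "$(pwquality_value minclass "$conf")" 3 4; then echo "minclass OK"; else echo "minclass unset or out of range"; fi
fi
```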
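3> Identity authentication: restrict password reuse. Preventing users from reusing recent passwords lowers the risk of password-guessing attacks.
In /etc/pam.d/password-auth and /etc/pam.d/system-auth, append a remember parameter with a value between 5 and 24 to the end of the line that starts with password sufficient pam_unix.so. Leave the rest of the line unchanged; for example, append remember=5.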
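4> Service configuration: make sure the SSH LogLevel is set to INFO so that logins and logouts are recorded.
Edit /etc/ssh/sshd_config and set the following parameter (uncomment it if needed): LogLevel INFO
5> Service configuration: make sure SSH MaxAuthTries is set between 3 and 6. A low MaxAuthTries value reduces the risk of a successful brute-force attack against the SSH server.
In /etc/ssh/sshd_config, remove the leading # from the MaxAuthTries line and set the maximum number of failed authentication attempts to a value between 3 and 6; 4 is recommended: MaxAuthTries 4
6> Service configuration: forbid SSH logins for accounts with empty passwords.
Edit /etc/ssh/sshd_config and set PermitEmptyPasswords to no: PermitEmptyPasswords no
7> Service configuration: set an SSH idle timeout, which lowers the risk of an unauthorized user taking over another user's SSH session.
Edit /etc/ssh/sshd_config, set ClientAliveInterval to a value between 300 and 900 (5 to 15 minutes), and set ClientAliveCountMax to a value between 0 and 3. Restart sshd afterwards (systemctl restart sshd) so that the changes to sshd_config take effect.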
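The SSH items 5 to 7 can be checked in one pass. The following is a rough sketch, not a complete audit: sshd_setting and check_range are hypothetical helper names, Match blocks and Include files are ignored, and a keyword that is not set explicitly is reported as FAIL even where the sshd default would be acceptable. Running sshd -T as root prints the effective configuration and is the more authoritative source.

```shell
#!/bin/bash
# Print the first active (uncommented) value of a keyword in an sshd_config-style file.
# sshd keywords are case-insensitive and the first occurrence takes effect.
sshd_setting() {
    local key="$1" file="${2:-/etc/ssh/sshd_config}"
    awk -v k="$key" 'tolower($1) == tolower(k) { print $2; exit }' "$file"
}

# Print OK or FAIL depending on whether a numeric setting lies in [min, max].
check_range() {
    local name="$1" value="$2" min="$3" max="$4"
    case "$value" in
        ''|*[!0-9]*) echo "FAIL $name is unset or not a number"; return 1 ;;
    esac
    if [ "$value" -ge "$min" ] && [ "$value" -le "$max" ]; then
        echo "OK   $name=$value"
    else
        echo "FAIL $name=$value (expected $min-$max)"
        return 1
    fi
}

conf=/etc/ssh/sshd_config
if [ -r "$conf" ]; then
    check_range MaxAuthTries        "$(sshd_setting MaxAuthTries "$conf")" 3 6
    check_range ClientAliveInterval "$(sshd_setting ClientAliveInterval "$conf")" 300 900
    check_range ClientAliveCountMax "$(sshd_setting ClientAliveCountMax "$conf")" 0 3
    [ "$(sshd_setting PermitEmptyPasswords "$conf")" = "no" ] \
        || echo "FAIL PermitEmptyPasswords is not explicitly set to no"
fi
```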
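8> Security audit: make sure the rsyslog service is enabled so that logs are recorded for auditing. Check that it is running:
ps -ef|grep -v grep|grep rsyslog
Run the following commands to enable and start the rsyslog service:
systemctl enable rsyslog
systemctl start rsyslog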
9> Review suspicious login records
Command: lastlog
Check whether the list of recent login IP addresses contains any suspicious IP:
last -f /var/log/wtmp
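10> Firewall
Show status: systemctl status firewalld
Start: service firewalld start
Restart: service firewalld restart
Stop: service firewalld stop
# Query whether a port is open
firewall-cmd --query-port=8080/tcp
# Open port 80
firewall-cmd --permanent --add-port=80/tcp
# Remove a port
firewall-cmd --permanent --remove-port=8080/tcp
# Changes made with --permanent only take effect after a reload
firewall-cmd --reload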
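11> High-risk vulnerabilities
List packages that can be updated: yum check-update
Update packages: yum upgrade
Note the difference between update and upgrade:
yum -y update updates every installed package to the latest available version, including the kernel.
yum -y upgrade does the same and additionally removes packages that have been marked obsolete; it is equivalent to yum update --obsoletes. With obsoletes=1 set in /etc/yum.conf, the two commands behave the same, and neither one skips the kernel. To keep the current kernel, exclude it explicitly, for example: yum -y upgrade --exclude=kernel*
Ubuntu: sudo apt update only refreshes the package lists from the configured sources and reports how many installed packages have updates available; it does not install anything.
sudo apt upgrade installs the available updates for the installed packages. To upgrade a single package, use: sudo apt install --only-upgrade <package name>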
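12> Check that server time is synchronized
yum install ntp
/etc/rc.d/init.d/ntpd start
ntpdate 203.117.180.36
crontab -e (an entry added this way has five time fields and no user column):
*/1 * * * * /usr/sbin/ntpdate 203.117.180.36 > /dev/null 2>&1
/etc/init.d/crond restart
Also open UDP port 123 on the firewall so that ntpdate can reach the time server.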
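13> Other notes
Never use a software name such as mysql, nginx, or php as the host name, because it produces misleading matches when you check a service with ps -ef|grep nginx.
When the network is unreliable, it is best to transfer files and data over FTP.
Before running an UPDATE statement, always SELECT the affected rows first and record the result, so that the original values can be restored if the update turns out to be wrong.
Back up a configuration file before changing it. When editing, comment out the existing option first, then add the new option and its value; check whether the file uses ; (semicolon) or # (hash) for comments.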
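The "back up before editing" rule is easy to automate. A minimal sketch follows; the function name backup_config and the .bak.<timestamp> suffix are arbitrary choices made for this example.

```shell
#!/bin/bash
# Copy a configuration file to a timestamped backup next to it and print the backup path.
backup_config() {
    local file="$1"
    local backup
    backup="${file}.bak.$(date +%Y%m%d%H%M%S)"
    # -p preserves mode, ownership and timestamps where permitted.
    cp -p "$file" "$backup" && echo "$backup"
}

# Example: back up sshd_config before editing it.
# backup_config /etc/ssh/sshd_config
```

Because the backup path is printed, a caller can store it and restore the file with a single cp if the change has to be rolled back.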
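2.2 WebLogic middleware inspection procedure
1. Default WebLogic console address, user name, and password
http://ip:8001/console (log in as weblogic with its password)
2. Check the WebLogic version: java weblogic.version
3. Log in to the administration console and review the deployed applications
4. List the patches currently applied to WebLogic:
cd /u01/Middleware/OPatch
./opatch lspatches -jre /u01/jdk1.7.0_80/jre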
3. Inspection scripts: adapt them to your environment before use
3.1 Server inspection shell script
echo "`date '+%Y-%m-%d %H:%M:%S'` Hardware inspection of the database server started..."
top -bn 6 >>top
grep -n "%id" top >> newtop
grep -n "zombie" top >> insisttop
top1=`cat newtop | awk '{print $5}' | sed -n 4p | sed 's/%//g' |sed 's/id,//g'`
top2=`cat newtop | awk '{print $5}' | sed -n 5p | sed 's/%//g' |sed 's/id,//g'`
top3=`cat newtop | awk '{print $5}' | sed -n 6p | sed 's/%//g' |sed 's/id,//g'`
top4=`cat insisttop | awk '{print $10}' | sed -n 2p | sed 's/%//g' |sed 's/id,//g'`
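if [ "${top4:-0}" -gt 0 ]
then
echo "`date '+%Y-%m-%d %H:%M:%S'` Zombie processes found on the collection server. The inspection script kills their parent processes automatically; to verify manually, run top and then ps -A -o stat,ppid,pid,cmd | grep -e '^[Zz]' to check whether the zombies are gone." >> ./newreport.txt
# A zombie cannot be killed directly; column 2 (ppid) is its parent, which is what gets killed here.
ps -A -o stat,ppid,pid,cmd | grep -e '^[Zz]' | awk '{print $2}' | xargs kill -9
else
echo "`date '+%Y-%m-%d %H:%M:%S'` No zombie processes on the collection server; it is running normally."
fi
# Drop the decimal part so that an idle value such as 100.0 becomes 100.
a=${top1%.*}
b=${top2%.*}
c=${top3%.*}
echo "top1: $a"
echo "top2: $b"
echo "top3: $c"
if [ "$a" -lt 20 ]&&[ "$b" -lt 20 ]&&[ "$c" -lt 20 ] ; then
echo "`date '+%Y-%m-%d %H:%M:%S'` CPU usage on the database server is abnormal: the idle values read from top are $top1,$top2,$top3, all below the reference value 20. Please handle it promptly!" >> ./newreport.txt
else
echo "CPU usage is normal."
fi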
rm -rf top
rm -rf newtop
rm -rf insisttop
free1=`free -g | awk '{print $4}' | sed -n 3p | sed 's/%//g' |sed 's/t//g'`
total=`free -g | awk '{print $2}' | sed -n 2p | sed 's/%//g' |sed 's/t//g'`
canshu=0.2
tempd=`echo $total $canshu |awk '{print $1*$2}'`
biaozhun=${tempd%.*}
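if [ "$free1" -le "$biaozhun" ] ; then
echo "`date '+%Y-%m-%d %H:%M:%S'` Memory usage on the database server is too high: the value read from free -g is $free1, which is less than or equal to the reference value $biaozhun. Please handle it promptly!" >> ./newreport.txt
else
echo "Memory usage is normal."
fi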
df1=`df -h | awk '{print $5}' | sed -n 2p | sed 's/%//g'`
df2=`df -h | awk '{print $5}' | sed -n 3p | sed 's/%//g'`
df3=`df -h | awk '{print $5}' | sed -n 4p | sed 's/%//g'`
df4=`df -h | awk '{print $5}' | sed -n 5p | sed 's/%//g'`
df5=`df -h | awk '{print $5}' | sed -n 6p | sed 's/%//g'`
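if [ $df1 -gt 90 ]||[ $df2 -gt 90 ]||[ $df3 -gt 90 ]||[ $df4 -gt 90 ]||[ $df5 -gt 90 ] ; then
echo "`date '+%Y-%m-%d %H:%M:%S'` Disk usage on the database server is too high! The values read from df -h are $df1,$df2,$df3,$df4,$df5 and the reference value is 90; if one or more of them exceed it, please handle it promptly!" >> ./newreport.txt
else
echo "Disk usage is normal."
fi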
iostat -x 2 5 >>iostat.txt
scvtm1=" `cat iostat.txt | awk '{print $11}' | sed -n 16p | sed 's/%//g' `"
scvtm2="` cat iostat.txt | awk '{print $11}' | sed -n 17p | sed 's/%//g'`"
scvtm3="` cat iostat.txt | awk '{print $11}' | sed -n 18p | sed 's/%//g'`"
scvtm4="` cat iostat.txt | awk '{print $11}' | sed -n 19p | sed 's/%//g'`"
scvtm13="` cat iostat.txt | awk '{print $11}' | sed -n 25p | sed 's/%//g'`"
scvtm6=" `cat iostat.txt | awk '{print $11}' | sed -n 26p | sed 's/%//g' `"
scvtm7="` cat iostat.txt | awk '{print $11}' | sed -n 27p | sed 's/%//g'`"
scvtm8="` cat iostat.txt | awk '{print $11}' | sed -n 28p | sed 's/%//g'`"
scvtm9="` cat iostat.txt | awk '{print $11}' | sed -n 34p | sed 's/%//g'`"
scvtm10="` cat iostat.txt | awk '{print $11}' | sed -n 35p | sed 's/%//g'`"
scvtm11="` cat iostat.txt | awk '{print $11}' | sed -n 36p | sed 's/%//g'`"
scvtm12="` cat iostat.txt | awk '{print $11}' | sed -n 37p | sed 's/%//g'`"
util1="`cat iostat.txt | awk '{print $12}' | sed -n 16p | sed 's/%//g'`"
util2="` cat iostat.txt | awk '{print $12}' | sed -n 17p | sed 's/%//g'`"
util3="` cat iostat.txt | awk '{print $12}' | sed -n 18p | sed 's/%//g'`"
util4="` cat iostat.txt | awk '{print $12}' | sed -n 19p | sed 's/%//g'`"
util5="` cat iostat.txt | awk '{print $12}' | sed -n 25p | sed 's/%//g'`"
util6=" `cat iostat.txt | awk '{print $12}' | sed -n 26p | sed 's/%//g' `"
util7="` cat iostat.txt | awk '{print $12}' | sed -n 27p | sed 's/%//g'`"
util8="` cat iostat.txt | awk '{print $12}' | sed -n 28p | sed 's/%//g'`"
util9="` cat iostat.txt | awk '{print $12}' | sed -n 34p | sed 's/%//g'`"
util10="` cat iostat.txt | awk '{print $12}' | sed -n 35p | sed 's/%//g'`"
util11="` cat iostat.txt | awk '{print $12}' | sed -n 36p | sed 's/%//g'`"
util12="` cat iostat.txt | awk '{print $12}' | sed -n 37p | sed 's/%//g'`"
maxa=`echo "$scvtm1 $scvtm2 $scvtm3 $scvtm4" | awk '{for(i=1;i<=NF;i++)$i>a?a=$i:a}END{print a}'`
maxb=`echo "$scvtm13 $scvtm6 $scvtm7 $scvtm8" | awk '{for(i=1;i<=NF;i++)$i>a?a=$i:a}END{print a}'`
maxc=`echo "$scvtm9 $scvtm10 $scvtm11 $scvtm12" | awk '{for(i=1;i<=NF;i++)$i>a?a=$i:a}END{print a}'`
maxd=`echo "$util1 $util2 $util3 $util4" | awk '{for(i=1;i<=NF;i++)$i>a?a=$i:a}END{print a}'`
maxe=`echo "$util5 $util6 $util7 $util8" | awk '{for(i=1;i<=NF;i++)$i>a?a=$i:a}END{print a}'`
maxf=`echo "$util9 $util10 $util11 $util12" | awk '{for(i=1;i<=NF;i++)$i>a?a=$i:a}END{print a}'`
# Keep the integer part of each maximum so that it can be compared with -ge and -lt.
m=${maxa%.*}
n=${maxb%.*}
h=${maxc%.*}
k=${maxd%.*}
l=${maxe%.*}
o=${maxf%.*}
if [ "$m" -ge 15 ]&&[ "$k" -ge 99 ]&&[ "$k" -lt 100 ]&&[ "$n" -ge 15 ]&&[ "$l" -ge 99 ]&&[ "$l" -lt 100 ]&&[ "$h" -ge 15 ]&&[ "$o" -ge 99 ]&&[ "$o" -lt 100 ]
then
echo "`date '+%Y-%m-%d %H:%M:%S'` Disk I/O on the database server is a bottleneck. Please handle it promptly!" >> ./newreport.txt
else
echo "Disk I/O is normal."
fi
rm -rf ./iostat.txt
network1=`ping -s 4096 -c 5 135.0.51.15 | awk '{print $6}' | sed -n 9p | sed 's/%//g' |sed 's/t//g'`
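if [ "${network1:-0}" -gt 0 ]
then
echo "`date '+%Y-%m-%d %H:%M:%S'` The network between the database server and the target IP is unstable: the value read from ping is $network1, above the reference value 0. The system is at risk; please handle it promptly!" >> ./newreport.txt
else
echo "Network connectivity is normal."
fi
echo "`date '+%Y-%m-%d %H:%M:%S'` Hardware inspection of the database server finished."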
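The disk check in this script reads exactly five rows of df output by position, so a host with more or fewer file systems is only partly covered. One possible alternative is to loop over every row. The sketch below assumes df -P (POSIX output format) and mount points without spaces; over_threshold is a hypothetical function name.

```shell
#!/bin/bash
# Read "df -P" output on stdin and print every file system whose usage exceeds the threshold.
over_threshold() {
    local threshold="${1:-90}"
    awk -v t="$threshold" '
        NR > 1 {
            use = $5
            sub(/%/, "", use)
            if (use + 0 > t) print $6, use "%"
        }'
}

full=$(df -P | over_threshold 90)
if [ -n "$full" ]; then
    echo "$(date '+%Y-%m-%d %H:%M:%S') File systems above 90%:"
    echo "$full"
else
    echo "Disk usage is normal."
fi
```

Reading from stdin keeps the parsing separate from the df call, so the function can be exercised with saved sample output.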
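3.2 Automated inspection script for multiple servers
How the script works:
1> All servers are on the same LAN, and every pair of servers can reach each other.
2> Pick one server with relatively good performance, or one under light load, as the inspection server.
3> Use that server to inspect all the others and record the results on the inspection server.
4> Name each server's result file after the date and the IP address so the files can be told apart, then compress all results into one archive.
5> Maintenance staff only need to fetch the archive periodically and review the results, instead of logging in to every server and typing the same commands. This script relies on the TELNET service, so TELNET must be enabled on every inspected server.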
#! /bin/bash
echo "start running"
LANG=en
set `date`
path="/home/check"
echo "start running" | tee -a $path/log/$1-$2-$3.log
if [ -d /home/check/result/$1-$2-$3 ];
then
echo ''
else
mkdir -p /home/check/result/$1-$2-$3
echo `date +"%Y/%m/%d-%H:%M:%S"` "create " "$1-$2-$3" "directory success "|tee -a $path/log/$1-$2-$3.log
fi
echo `date +"%Y/%m/%d-%H:%M:%S"` "starting reading linuxconfig.txt " |tee -a $path/log/$1-$2-$3.log
cat "$path"/config/linuxconfig.txt| while read line;
do
ip=`echo $line |cut -d '=' -f2`
echo `date +"%Y/%m/%d-%H:%M:%S"` "check LINUX " $ip " starting " |tee -a $path/log/$1-$2-$3.log
(
sleep 1
echo root
sleep 1
echo root
sleep 3
echo "free -k"
echo ""
echo "df -k"
echo ""
echo "ps -ef| grep java"
echo ""
echo "netstat -an|egrep -n '80|22|21|23|9043|9044|45331|45332|39194|19195'"
echo ""
echo "/sbin/ip ad"
echo ""
echo " tail -2000 /var/log/messages | grep -v snmp |grep -i error "
echo ""
echo "/bin/dmesg |grep -i error"
echo ""
echo "top -n1|sed -n '1,5p'"
echo ""
echo "/usr/bin/vmstat 1 3"
echo ""
echo "exit"
sleep 5
)|telnet $ip >/home/check/result/$1-$2-$3/$ip-$1-$2-$3-$4.txt
echo `date +"%Y/%m/%d-%H:%M:%S"` "check LINUX " $ip " end" |tee -a $path/log/$1-$2-$3.log
echo "" | tee -a $path/log/$1-$2-$3.log
done
echo `date +"%Y/%m/%d-%H:%M:%S"` "end reading linuxconfig.txt " |tee -a $path/log/$1-$2-$3.log
echo `date +"%Y/%m/%d-%H:%M:%S"` "starting reading AIXconfig.txt " | tee -a $path/log/$1-$2-$3.log
cat "$path"/config/AIXconfig.txt| while read line;
do
ip=`echo $line |cut -d '=' -f2`
echo `date +"%Y/%m/%d-%H:%M:%S"` "check IBM AIX " $ip " starting " |tee -a $path/log/$1-$2-$3.log
(
sleep 1
echo root
sleep 1
echo root
sleep 5
echo ""
echo "df -g"
echo ""
echo "ps -ef| grep java"
echo ""
echo "netstat -an|egrep -n '80|22|21|23|9043|9044|45331|45332|39194|19195'"
echo ""
echo "ifconfig -a"
echo ""
echo "topas"
echo "exit"
sleep 5
)|telnet $ip >/home/check/result/$1-$2-$3/$ip-$1-$2-$3-$4.txt
echo `date +"%Y/%m/%d-%H:%M:%S"` "check IBM AIX " $ip " end " |tee -a $path/log/$1-$2-$3.log
echo "" | tee -a $path/log/$1-$2-$3.log
done
echo `date +"%Y/%m/%d-%H:%M:%S"` "end reading AIXconfig.txt " | tee -a $path/log/$1-$2-$3.log
zip -r /home/check/result/$1-$2-$3/$1-$2-$3.zip /home/check/result/$1-$2-$3/*
echo "End running "
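The telnet-driven loop sends the root password in clear text. Where key-based SSH login is available, the same collection could be done over ssh instead. The following is only a sketch of that swapped-in technique, not the original method: list_ips and check_host are hypothetical names, the host list is assumed to hold one name=ip entry per line (as implied by cut -d '=' -f2 in the script), and the remote command list is shortened.

```shell
#!/bin/bash
# Print the IP address from each "name=ip" line, skipping blank lines and comments.
list_ips() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1" | cut -d '=' -f2
}

# Run a fixed set of read-only commands on one host and store the output in outdir.
check_host() {
    local ip="$1" outdir="$2"
    local stamp
    stamp=$(date +%Y-%m-%d)
    # BatchMode=yes makes ssh fail instead of prompting when key login is not set up.
    # stdin is redirected so ssh does not swallow the host list read by the caller's loop.
    ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$ip" \
        'free -k; df -k; /usr/bin/vmstat 1 3' \
        > "$outdir/$ip-$stamp.txt" 2>&1 < /dev/null
}

# Example driver (paths follow the script above):
# outdir=/home/check/result/$(date +%Y-%m-%d); mkdir -p "$outdir"
# list_ips /home/check/config/linuxconfig.txt | while read -r ip; do
#     check_host "$ip" "$outdir"
# done
```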
3.3 Linux inspection script
#!/bin/bash
IPADDR=$(ifconfig eth0 | grep '\<inet\>' | awk '{print $2}')
export PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
source /etc/profile
[ $(id -u) -gt 0 ] && echo "Please run this script as root!" && exit 1
centosVersion=$(awk '{print $(NF-1)}' /etc/redhat-release)
VERSION="2018.08.28"
# Logging
LOGPATH="$PROGPATH/var/log/HostDailyCheck"
[ -e $LOGPATH ] || mkdir -p $LOGPATH
RESULTFILE="$LOGPATH/HostDailyCheck-$IPADDR-`date +%Y%m%d`.txt"
# Global variables for the report
report_DateTime="" # date ok
report_Hostname="" # host name ok
report_OSRelease="" # OS release ok
report_Kernel="" # kernel ok
report_Language="" # language/encoding ok
report_LastReboot="" # last boot time ok
report_Uptime="" # uptime (days) ok
report_CPUs="" # number of CPUs ok
report_CPUType="" # CPU type ok
report_Arch="" # CPU architecture ok
report_MemTotal="" # total memory (MB) ok
report_MemFree="" # free memory (MB) ok
report_MemUsedPercent="" # memory usage % ok
report_DiskTotal="" # total disk capacity (GB) ok
report_DiskFree="" # free disk space (GB) ok
report_DiskUsedPercent="" # disk usage % ok
report_InodeTotal="" # total inodes ok
report_InodeFree="" # free inodes ok
report_InodeUsedPercent="" # inode usage ok
report_IP="" # IP address ok
report_MAC="" # MAC address ok
report_Gateway="" # default gateway ok
report_DNS="" # DNS ok
report_Listen="" # listening ports ok
report_Selinux="" # SELinux ok
report_Firewall="" # firewall ok
report_USERs="" # users ok
report_USEREmptyPassword="" # users with empty passwords ok
report_USERTheSameUID="" # users sharing a UID ok
report_PasswordExpiry="" # password expiry (days) ok
report_RootUser="" # root users ok
report_Sudoers="" # sudo grants ok
report_SSHAuthorized="" # SSH trusted hosts ok
report_SSHDProtocolVersion="" # SSH protocol version ok
report_SSHDPermitRootLogin="" # remote root login allowed ok
report_DefunctProsess="" # number of zombie processes ok
report_SelfInitiatedService="" # number of services enabled at boot ok
report_SelfInitiatedProgram="" # number of programs started at boot ok
report_RuningService="" # number of running services ok
report_Crontab="" # number of scheduled tasks ok
report_Syslog="" # log service ok
report_SNMP="" # SNMP ok
report_NTP="" # NTP ok
report_JDK="" # JDK version ok
function version(){
echo ""
echo ""
echo "System inspection script: Version $VERSION"
}
function getCpuStatus(){
echo ""
echo ""
echo "############################ CPU check #############################"
Physical_CPUs=$(grep "physical id" /proc/cpuinfo| sort | uniq | wc -l)
Virt_CPUs=$(grep "processor" /proc/cpuinfo | wc -l)
CPU_Kernels=$(grep "cores" /proc/cpuinfo|uniq| awk -F ': ' '{print $2}')
CPU_Type=$(grep "model name" /proc/cpuinfo | awk -F ': ' '{print $2}' | sort | uniq)
CPU_Arch=$(uname -m)
echo "Physical CPUs: $Physical_CPUs"
echo "Logical CPUs: $Virt_CPUs"
echo "Cores per CPU: $CPU_Kernels"
echo "CPU model: $CPU_Type"
echo "CPU architecture: $CPU_Arch"
# Report data
report_CPUs=$Virt_CPUs # number of CPUs
report_CPUType=$CPU_Type # CPU type
report_Arch=$CPU_Arch # CPU architecture
}
function getMemStatus(){
echo ""
echo ""
echo "############################ Memory check ############################"
if [[ $centosVersion < 7 ]];then
free -mo
else
free -h
fi
# Report data
MemTotal=$(grep MemTotal /proc/meminfo| awk '{print $2}') #KB
MemFree=$(grep MemFree /proc/meminfo| awk '{print $2}') #KB
let MemUsed=MemTotal-MemFree
MemPercent=$(awk "BEGIN {if($MemTotal==0){printf 100}else{printf \"%.2f\",$MemUsed*100/$MemTotal}}")
report_MemTotal="$((MemTotal/1024))""MB" # total memory (MB)
report_MemFree="$((MemFree/1024))""MB" # free memory (MB)
report_MemUsedPercent="$(awk "BEGIN {if($MemTotal==0){printf 100}else{printf \"%.2f\",$MemUsed*100/$MemTotal}}")""%" # memory usage %
}
function getDiskStatus(){
echo ""
echo ""
echo "############################ Disk check ############################"
df -hiP | sed 's/Mounted on/Mounted/'> /tmp/inode
df -hTP | sed 's/Mounted on/Mounted/'> /tmp/disk
join /tmp/disk /tmp/inode | awk '{print $1,$2,"|",$3,$4,$5,$6,"|",$8,$9,$10,$11,"|",$12}'| column -t
diskdata=$(df -TP | sed '1d' | awk '$2!="tmpfs"{print}')
disktotal=$(echo "$diskdata" | awk '{total+=$3}END{print total}')
diskused=$(echo "$diskdata" | awk '{total+=$4}END{print total}')
diskfree=$((disktotal-diskused))
diskusedpercent=$(echo $disktotal $diskused | awk '{if($1==0){printf 100}else{printf "%.2f",$2*100/$1}}')
inodedata=$(df -iTP | sed '1d' | awk '$2!="tmpfs"{print}')
inodetotal=$(echo "$inodedata" | awk '{total+=$3}END{print total}')
inodeused=$(echo "$inodedata" | awk '{total+=$4}END{print total}')
inodefree=$((inodetotal-inodeused))
inodeusedpercent=$(echo $inodetotal $inodeused | awk '{if($1==0){printf 100}else{printf "%.2f",$2*100/$1}}')
report_DiskTotal=$((disktotal/1024/1024))"GB"
report_DiskFree=$((diskfree/1024/1024))"GB"
report_DiskUsedPercent="$diskusedpercent""%"
report_InodeTotal=$((inodetotal/1000))"K"
report_InodeFree=$((inodefree/1000))"K"
report_InodeUsedPercent="$inodeusedpercent""%"
}
function getSystemStatus(){
echo ""
echo ""
echo "############################ System check ############################"
if [ -e /etc/sysconfig/i18n ];then
default_LANG="$(grep "LANG=" /etc/sysconfig/i18n | grep -v "^#" | awk -F '"' '{print $2}')"
else
default_LANG=$LANG
fi
export LANG="en_US.UTF-8"
Release=$(cat /etc/redhat-release 2>/dev/null)
Kernel=$(uname -r)
OS=$(uname -o)
Hostname=$(uname -n)
SELinux=$(/usr/sbin/sestatus | grep "SELinux status: " | awk '{print $3}')
LastReboot=$(who -b | awk '{print $3,$4}')
uptime=$(uptime | sed 's/.*up \([^,]*\), .*/\1/')
echo "OS: $OS"
echo "Release: $Release"
echo "Kernel: $Kernel"
echo "Host name: $Hostname"
echo "SELinux: $SELinux"
echo "Language/encoding: $default_LANG"
echo "Current time: $(date +'%F %T')"
echo "Last boot: $LastReboot"
echo "Uptime: $uptime"
# Report data
report_DateTime=$(date +"%F %T") # date
report_Hostname="$Hostname" # host name
report_OSRelease="$Release" # OS release
report_Kernel="$Kernel" # kernel
report_Language="$default_LANG" # language/encoding
report_LastReboot="$LastReboot" # last boot time
report_Uptime="$uptime" # uptime (days)
report_Selinux="$SELinux"
export LANG="$default_LANG"
}
function getServiceStatus(){
echo ""
echo ""
echo "############################ Service check ############################"
echo ""
if [[ $centosVersion > 7 ]];then
conf=$(systemctl list-unit-files --type=service --state=enabled --no-pager | grep "enabled")
process=$(systemctl list-units --type=service --state=running --no-pager | grep ".service")
# Report data
report_SelfInitiatedService="$(echo "$conf" | wc -l)" # number of services enabled at boot
report_RuningService="$(echo "$process" | wc -l)" # number of running services
else
conf=$(/sbin/chkconfig | grep -E ":on|:启用")
process=$(/sbin/service --status-all 2>/dev/null | grep -E "is running|正在运行")
# Report data
report_SelfInitiatedService="$(echo "$conf" | wc -l)" # number of services enabled at boot
report_RuningService="$(echo "$process" | wc -l)" # number of running services
fi
echo "Service configuration"
echo "--------"
echo "$conf" | column -t
echo ""
echo "Running services"
echo "--------------"
echo "$process"
}
function getAutoStartStatus(){
echo ""
echo ""
echo "############################ Autostart check ##########################"
conf=$(grep -v "^#" /etc/rc.d/rc.local| sed '/^$/d')
echo "$conf"
# Report data
report_SelfInitiatedProgram="$(echo "$conf" | wc -l)" # number of programs started at boot
}
function getLoginStatus(){
echo ""
echo ""
echo "############################ Login check ############################"
last | head
}
function getNetworkStatus(){
echo ""
echo ""
echo "############################ Network check ############################"
if [[ $centosVersion < 7 ]];then
/sbin/ifconfig -a | grep -v packets | grep -v collisions | grep -v inet6
else
#ip a
for i in $(ip link | grep BROADCAST | awk -F: '{print $2}');do ip add show $i | grep -E "BROADCAST|global"| awk '{print $2}' | tr '\n' ' ' ;echo "" ;done
fi
GATEWAY=$(ip route | grep default | awk '{print $3}')
DNS=$(grep nameserver /etc/resolv.conf| grep -v "#" | awk '{print $2}' | tr '\n' ',' | sed 's/,$//')
echo ""
echo "Gateway: $GATEWAY "
echo "    DNS: $DNS"
IP=$(ip -f inet addr | grep -v 127.0.0.1 | grep inet | awk '{print $NF,$2}' | tr '\n' ',' | sed 's/,$//')
MAC=$(ip link | grep -v "LOOPBACK\|loopback" | awk '{print $2}' | sed 'N;s/\n//' | tr '\n' ',' | sed 's/,$//')
report_IP="$IP"
report_MAC=$MAC
report_Gateway="$GATEWAY"
report_DNS="$DNS"
}
function getListenStatus(){
echo ""
echo ""
echo "############################ Listening port check ############################"
TCPListen=$(ss -ntul | column -t)
echo "$TCPListen"
report_Listen="$(echo "$TCPListen"| sed '1d' | awk '/tcp/ {print $5}' | awk -F: '{print $NF}' | sort | uniq | wc -l)"
}
function getCronStatus(){
echo ""
echo ""
echo "############################ Scheduled task check ########################"
Crontab=0
for shell in $(grep -v "/sbin/nologin" /etc/shells);do
for user in $(grep "$shell" /etc/passwd| awk -F: '{print $1}');do
crontab -l -u $user >/dev/null 2>&1
status=$?
if [ $status -eq 0 ];then
echo "$user"
echo "--------"
crontab -l -u $user
let Crontab=Crontab+$(crontab -l -u $user | wc -l)
echo ""
fi
done
done
find /etc/cron* -type f | xargs -i ls -l {} | column -t
let Crontab=Crontab+$(find /etc/cron* -type f | wc -l)
report_Crontab="$Crontab"
}
function getHowLongAgo(){
datetime="$*"
[ -z "$datetime" ] && echo "Invalid argument: getHowLongAgo() $*" && return 1
Timestamp=$(date +%s -d "$datetime")
Now_Timestamp=$(date +%s)
Difference_Timestamp=$(($Now_Timestamp-$Timestamp))
days=0;hours=0;minutes=0;
sec_in_day=$((60*60*24));
sec_in_hour=$((60*60));
sec_in_minute=60
while (( $(($Difference_Timestamp-$sec_in_day)) > 1 ))
do
let Difference_Timestamp=Difference_Timestamp-sec_in_day
let days++
done
while (( $(($Difference_Timestamp-$sec_in_hour)) > 1 ))
do
let Difference_Timestamp=Difference_Timestamp-sec_in_hour
let hours++
done
echo "$days days $hours hours ago"
}
function getUserLastLogin(){
username=$1
: ${username:="`whoami`"}
thisYear=$(date +%Y)
oldesYear=$(last | tail -n1 | awk '{print $NF}')
while(( $thisYear >= $oldesYear));do
loginBeforeToday=$(last $username | grep $username | wc -l)
loginBeforeNewYearsDayOfThisYear=$(last $username -t $thisYear"0101000000" | grep $username | wc -l)
if [ $loginBeforeToday -eq 0 ];then
echo "never logged in"
break
elif [ $loginBeforeToday -gt $loginBeforeNewYearsDayOfThisYear ];then
lastDateTime=$(last -i $username | head -n1 | awk '{for(i=4;i<(NF-2);i++)printf"%s ",$i}')" $thisYear"
lastDateTime=$(date "+%Y-%m-%d %H:%M:%S" -d "$lastDateTime")
echo "$lastDateTime"
break
else
thisYear=$((thisYear-1))
fi
done
}
function getUserStatus(){
echo ""
echo ""
echo "############################ User check ############################"
pwdfile="$(cat /etc/passwd)"
Modify=$(stat /etc/passwd | grep Modify | tr '.' ' ' | awk '{print $2,$3}')
echo "/etc/passwd last modified: $Modify ($(getHowLongAgo $Modify))"
echo ""
echo "Privileged users"
echo "--------"
RootUser=""
for user in $(echo "$pwdfile" | awk -F: '{print $1}');do
if [ $(id -u $user) -eq 0 ];then
echo "$user"
RootUser="$RootUser,$user"
fi
done
echo ""
echo "User list"
echo "--------"
USERs=0
echo "$(
echo "Username UID GID HOME SHELL LastLogin"
for shell in $(grep -v "/sbin/nologin" /etc/shells);do
for username in $(grep "$shell" /etc/passwd| awk -F: '{print $1}');do
userLastLogin="$(getUserLastLogin $username)"
echo "$pwdfile" | grep -w "$username" |grep -w "$shell"| awk -F: -v lastlogin="$(echo "$userLastLogin" | tr ' ' '_')" '{print $1,$3,$4,$6,$7,lastlogin}'
done
let USERs=USERs+$(echo "$pwdfile" | grep "$shell"| wc -l)
done
)" | column -t
echo ""
echo "Users with empty passwords"
echo "----------"
USEREmptyPassword=""
for shell in $(grep -v "/sbin/nologin" /etc/shells);do
for user in $(echo "$pwdfile" | grep "$shell" | cut -d: -f1);do
r=$(awk -F: '$2=="!!"{print $1}' /etc/shadow | grep -w $user)
if [ ! -z $r ];then
echo $r
USEREmptyPassword="$USEREmptyPassword,"$r
fi
done
done
echo ""
echo "Users sharing a UID"
echo "------------"
USERTheSameUID=""
UIDs=$(cut -d: -f3 /etc/passwd | sort | uniq -c | awk '$1>1{print $2}')
for uid in $UIDs;do
echo -n "$uid";
USERTheSameUID="$uid"
r=$(awk -F: 'ORS="";$3=='"$uid"'{print ":",$1}' /etc/passwd)
echo "$r"
echo ""
USERTheSameUID="$USERTheSameUID $r,"
done
report_USERs="$USERs"
report_USEREmptyPassword=$(echo $USEREmptyPassword | sed 's/^,//')
report_USERTheSameUID=$(echo $USERTheSameUID | sed 's/,$//')
report_RootUser=$(echo $RootUser | sed 's/^,//')
}
function getPasswordStatus {
echo ""
echo ""
echo "############################ Password check ############################"
pwdfile="$(cat /etc/passwd)"
echo ""
echo "Password expiry check"
echo "------------"
result=""
for shell in $(grep -v "/sbin/nologin" /etc/shells);do
for user in $(echo "$pwdfile" | grep "$shell" | cut -d: -f1);do
get_expiry_date=$(/usr/bin/chage -l $user | grep 'Password expires' | cut -d: -f2)
if [[ $get_expiry_date = ' never' || $get_expiry_date = 'never' ]];then
printf "%-15s never expires\n" $user
result="$result,$user:never"
else
password_expiry_date=$(date -d "$get_expiry_date" "+%s")
current_date=$(date "+%s")
diff=$(($password_expiry_date-$current_date))
let DAYS=$(($diff/(60*60*24)))
printf "%-15s expires in %s days\n" $user $DAYS
result="$result,$user:$DAYS days"
fi
done
done
report_PasswordExpiry=$(echo $result | sed 's/^,//')
echo ""
echo "Password policy check"
echo "------------"
grep -v "#" /etc/login.defs | grep -E "PASS_MAX_DAYS|PASS_MIN_DAYS|PASS_MIN_LEN|PASS_WARN_AGE"
}
function getSudoersStatus(){
echo ""
echo ""
echo "############################ Sudoers check #########################"
conf=$(grep -v "^#" /etc/sudoers| grep -v "^Defaults" | sed '/^$/d')
echo "$conf"
echo ""
report_Sudoers="$(echo "$conf" | wc -l)"
}
function getInstalledStatus(){
echo ""
echo ""
echo "############################ Software check ############################"
rpm -qa --last | head | column -t
}
function getProcessStatus(){
echo ""
echo ""
echo "############################ Process check ############################"
if [ $(ps -ef | grep defunct | grep -v grep | wc -l) -ge 1 ];then
echo ""
echo "Zombie processes";
echo "--------"
ps -ef | head -n1
ps -ef | grep defunct | grep -v grep
fi
echo ""
echo "Top 10 by memory usage"
echo "-------------"
echo -e "PID %MEM RSS COMMAND
$(ps aux | awk '{print $2, $4, $6, $11}' | sort -k3rn | head -n 10 )"| column -t
echo ""
echo "Top 10 by CPU usage"
echo "------------"
top b -n1 | head -17 | tail -11
report_DefunctProsess="$(ps -ef | grep defunct | grep -v grep|wc -l)"
}
function getJDKStatus(){
echo ""
echo ""
echo "############################ JDK check #############################"
java -version 2>/dev/null
if [ $? -eq 0 ];then
java -version 2>&1
fi
echo "JAVA_HOME=\"$JAVA_HOME\""
report_JDK="$(java -version 2>&1 | grep version | awk '{print $1,$3}' | tr -d '"')"
}
function getSyslogStatus(){
echo ""
echo ""
echo "############################ syslog check ##########################"
echo "Service status: $(getState rsyslog)"
echo ""
echo "/etc/rsyslog.conf"
echo "-----------------"
cat /etc/rsyslog.conf 2>/dev/null | grep -v "^#" | grep -v "^\\$" | sed '/^$/d' | column -t
report_Syslog="$(getState rsyslog)"
}
function getFirewallStatus(){
echo ""
echo ""
echo "############################ Firewall check ##########################"
if [[ $centosVersion < 7 ]];then
/etc/init.d/iptables status >/dev/null 2>&1
status=$?
if [ $status -eq 0 ];then
s="active"
elif [ $status -eq 3 ];then
s="inactive"
elif [ $status -eq 4 ];then
s="permission denied"
else
s="unknown"
fi
else
s="$(getState iptables)"
fi
echo "iptables: $s"
echo ""
echo "/etc/sysconfig/iptables"
echo "-----------------------"
cat /etc/sysconfig/iptables 2>/dev/null
report_Firewall="$s"
}
function getSNMPStatus(){
echo ""
echo ""
echo "############################ SNMP check ############################"
status="$(getState snmpd)"
echo "Service status: $status"
echo ""
if [ -e /etc/snmp/snmpd.conf ];then
echo "/etc/snmp/snmpd.conf"
echo "--------------------"
cat /etc/snmp/snmpd.conf 2>/dev/null | grep -v "^#" | sed '/^$/d'
fi
report_SNMP="$(getState snmpd)"
}
function getState(){
if [[ $centosVersion < 7 ]];then
if [ -e "/etc/init.d/$1" ];then
if [ `/etc/init.d/$1 status 2>/dev/null | grep -E "is running|正在运行" | wc -l` -ge 1 ];then
r="active"
else
r="inactive"
fi
else
r="unknown"
fi
else
r="$(systemctl is-active $1 2>&1)"
fi
echo "$r"
}
function getSSHStatus(){
echo ""
echo ""
echo "############################ SSH check #############################"
pwdfile="$(cat /etc/passwd)"
echo "Service status: $(getState sshd)"
Protocol_Version=$(cat /etc/ssh/sshd_config | grep Protocol | awk '{print $2}')
echo "SSH protocol version: $Protocol_Version"
echo ""
echo "Trusted hosts"
echo "--------"
authorized=0
for user in $(echo "$pwdfile" | grep /bin/bash | awk -F: '{print $1}');do
authorize_file=$(echo "$pwdfile" | grep -w $user | awk -F: '{printf $6"/.ssh/authorized_keys"}')
authorized_host=$(cat $authorize_file 2>/dev/null | awk '{print $3}' | tr '\n' ',' | sed 's/,$//')
if [ ! -z $authorized_host ];then
echo "$user authorizes \"$authorized_host\" for passwordless access"
fi
let authorized=authorized+$(cat $authorize_file 2>/dev/null | awk '{print $3}'|wc -l)
done
echo ""
echo "Is remote root login allowed"
echo "--------------------"
config=$(cat /etc/ssh/sshd_config | grep PermitRootLogin)
firstChar=${config:0:1}
if [ "$firstChar" == "#" ];then
PermitRootLogin="yes"
else
PermitRootLogin=$(echo $config | awk '{print $2}')
fi
echo "PermitRootLogin $PermitRootLogin"
echo ""
echo "/etc/ssh/sshd_config"
echo "--------------------"
cat /etc/ssh/sshd_config | grep -v "^#" | sed '/^$/d'
report_SSHAuthorized="$authorized"
report_SSHDProtocolVersion="$Protocol_Version"
report_SSHDPermitRootLogin="$PermitRootLogin"
}
function getNTPStatus(){
echo ""
echo ""
echo "############################ NTP check #############################"
if [ -e /etc/ntp.conf ];then
echo "Service status: $(getState ntpd)"
echo ""
echo "/etc/ntp.conf"
echo "-------------"
cat /etc/ntp.conf 2>/dev/null | grep -v "^#" | sed '/^$/d'
fi
report_NTP="$(getState ntpd)"
}
function uploadHostDailyCheckReport(){
json="{
\"DateTime\":\"$report_DateTime\",
\"Hostname\":\"$report_Hostname\",
\"OSRelease\":\"$report_OSRelease\",
\"Kernel\":\"$report_Kernel\",
\"Language\":\"$report_Language\",
\"LastReboot\":\"$report_LastReboot\",
\"Uptime\":\"$report_Uptime\",
\"CPUs\":\"$report_CPUs\",
\"CPUType\":\"$report_CPUType\",
\"Arch\":\"$report_Arch\",
\"MemTotal\":\"$report_MemTotal\",
\"MemFree\":\"$report_MemFree\",
\"MemUsedPercent\":\"$report_MemUsedPercent\",
\"DiskTotal\":\"$report_DiskTotal\",
\"DiskFree\":\"$report_DiskFree\",
\"DiskUsedPercent\":\"$report_DiskUsedPercent\",
\"InodeTotal\":\"$report_InodeTotal\",
\"InodeFree\":\"$report_InodeFree\",
\"InodeUsedPercent\":\"$report_InodeUsedPercent\",
\"IP\":\"$report_IP\",
\"MAC\":\"$report_MAC\",
\"Gateway\":\"$report_Gateway\",
\"DNS\":\"$report_DNS\",
\"Listen\":\"$report_Listen\",
\"Selinux\":\"$report_Selinux\",
\"Firewall\":\"$report_Firewall\",
\"USERs\":\"$report_USERs\",
\"USEREmptyPassword\":\"$report_USEREmptyPassword\",
\"USERTheSameUID\":\"$report_USERTheSameUID\",
\"PasswordExpiry\":\"$report_PasswordExpiry\",
\"RootUser\":\"$report_RootUser\",
\"Sudoers\":\"$report_Sudoers\",
\"SSHAuthorized\":\"$report_SSHAuthorized\",
\"SSHDProtocolVersion\":\"$report_SSHDProtocolVersion\",
\"SSHDPermitRootLogin\":\"$report_SSHDPermitRootLogin\",
\"DefunctProsess\":\"$report_DefunctProsess\",
\"SelfInitiatedService\":\"$report_SelfInitiatedService\",
\"SelfInitiatedProgram\":\"$report_SelfInitiatedProgram\",
\"RuningService\":\"$report_RuningService\",
\"Crontab\":\"$report_Crontab\",
\"Syslog\":\"$report_Syslog\",
\"SNMP\":\"$report_SNMP\",
\"NTP\":\"$report_NTP\",
\"JDK\":\"$report_JDK\"
}"
curl -l -H "Content-type: application/json" -X POST -d "$json" "$uploadHostDailyCheckReportApi" 2>/dev/null
}
function check(){
version
getSystemStatus
getCpuStatus
getMemStatus
getDiskStatus
getNetworkStatus
getListenStatus
getProcessStatus
getServiceStatus
getAutoStartStatus
getLoginStatus
getCronStatus
getUserStatus
getPasswordStatus
getSudoersStatus
getJDKStatus
getFirewallStatus
getSSHStatus
getSyslogStatus
getSNMPStatus
getNTPStatus
getInstalledStatus
}
check > $RESULTFILE
If the inspection server runs Windows, it can be configured as an rsync server by installing cwRsyncServer.
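The script in 3.3 decides between SysV and systemd tooling with tests such as [[ $centosVersion < 7 ]]. Inside [[ ]], the < operator compares strings, so a version such as 10 sorts before 7. A possible alternative is to extract the major version as an integer and compare it with -lt. This is a sketch under that assumption; os_major_version is a hypothetical helper and is not part of the original script.

```shell
#!/bin/bash
# Read a release string on stdin and print the first integer that follows "release".
os_major_version() {
    sed -n -E 's/.*release ([0-9]+).*/\1/p'
}

if [ -r /etc/redhat-release ]; then
    major=$(os_major_version < /etc/redhat-release)
    # -lt compares integers, so 10 is correctly treated as newer than 7.
    if [ "${major:-0}" -lt 7 ]; then
        echo "SysV init tools (service, chkconfig)"
    else
        echo "systemd tools (systemctl)"
    fi
fi
```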
3.4 WebLogic monitoring shell script
#!/bin/bash
CLASSPATH="/opt/Oracle/Middleware/wlserver_10.3/server/lib/weblogic.jar:$CLASSPATH"
PATH="/usr/java/jdk1.6.0_45/bin:$PATH"
URL="192.168.222.11:7020"
USER_NAME="weblogic"
PASS_WORD="weblogic1"
DOMAIN_NAME="MedRecDomain"
SERVER_NAME="MedRecAdmSvr"
STATE_ALL=$(java weblogic.Admin -url $URL -username $USER_NAME -password $PASS_WORD get -pretty -mbean "$DOMAIN_NAME:Location=$SERVER_NAME,Name=$SERVER_NAME,Type=ServerRuntime")
echo "$STATE_ALL" | grep -q "State: RUNNING"
if [ $? == 0 ]; then
echo "$URL $DOMAIN_NAME $SERVER_NAME running status is OK"
else
echo "$URL $DOMAIN_NAME $SERVER_NAME running status is not OK"
fi
echo "$STATE_ALL" | grep -q "State:HEALTH_OK"
if [ $? == 0 ]; then
echo "$URL $DOMAIN_NAME $SERVER_NAME health status is OK"
else
echo "$URL $DOMAIN_NAME $SERVER_NAME health status is not OK"
fi
SOCKET_MAX=200
SOCKET_NOW=$(echo "$STATE_ALL" | awk '/OpenSocketsCurrentCount/{print $2}')
if [ x$SOCKET_NOW == x ]; then
echo "$URL $DOMAIN_NAME $SERVER_NAME open sockets number is not OK: fail to get"
else
if [ $SOCKET_NOW -gt $SOCKET_MAX ]; then
echo "$URL $DOMAIN_NAME $SERVER_NAME open sockets number is not OK: $SOCKET_NOW greater than $SOCKET_MAX"
else
echo "$URL $DOMAIN_NAME $SERVER_NAME open sockets number is OK: $SOCKET_NOW not greater than $SOCKET_MAX"
fi
fi
3.5 Compress and archive WebLogic logs daily
#!/bin/bash
TODAY=`date -u +"%Y%m%d"`
/usr/bin/gzip -c /app/weblogic/Oracle/Middleware/user_projects/domains/base_domain/bin/AdminServer.log>/app/weblogic/Oracle/Middleware/user_projects/domains/base_domain/bin/AdminServer${TODAY}.out.gz
> /app/weblogic/Oracle/Middleware/user_projects/domains/base_domain/bin/AdminServer.log
TODAY=`date -u +"%Y%m%d"`
/usr/bin/gzip -c /root/Oracle/Middleware/user_projects/domains/base_domain/bin/nohup.out>/root/Oracle/Middleware/user_projects/domains/base_domain/bin/nohup${TODAY}.out.gz
> /root/Oracle/Middleware/user_projects/domains/base_domain/bin/nohup.out
3.6 WebLogic status monitoring script
echo "======================================welcome=============================================="
echo "==== ======"
echo "==== This script monitors the running state of a WebLogic domain. It covers ======"
echo "==== server, Thread, Request, JDBC state and Sockets ======"
echo "==== Change url, username and password before use ======"
echo "==== create by xxx at 2009=03=30 ======"
echo "==========================================================================================="
url=t3://***.***.***.***.***:8082
username=weblogic
password=123456
while [ true ]
do
echo "==============================WebLogic domain name========================================================="
java -cp /bea/weblogic81/server/lib/weblogic.jar weblogic.Admin -url $url -username $username -password $password GET -pretty -type Server -property Parent | awk '/^\t/' | awk 'NR==1{print $2}'
echo "=============================================================================================================="
echo "========================================Idle threads=========================================================="
java -cp /bea/weblogic81/server/lib/weblogic.jar weblogic.Admin -url $url -username $username -password $password GET -pretty -type ExecuteQueueRuntime -property ExecuteThreadCurrentIdleCount -property ExecuteThreadTotalCount -property ServicedRequestTotalCount
echo"=============================================================================================================="
echo "======================================= server_ip和port========================================================"
java -cp /bea/weblogic81/server/lib/weblogic.jar weblogic.Admin -url $url -username $username -password $password GET -pretty -type Server -property Name -property ListenAddress -property ListenPort
echo"=============================================================================================================="
echo "=======================================Server运行状态和OpenSocketsCurrentCount数量=============================="
java -cp /bea/weblogic81/server/lib/weblogic.jar weblogic.Admin -url $url -username $username -password $password GET -pretty -type ServerRuntime -property State -property Server -property OpenSocketsCurrentCount
echo"=============================================================================================================="
echo "============================JDBC连接池的初始化和最大各数以及已经发布再上面的server=============================="
java -cp /bea/weblogic81/server/lib/weblogic.jar weblogic.Admin -url $url -username $username -password $password GET -pretty -type JDBCConnectionPool -property Name -property InitialCapacity -property MaxCapacity -property Targets
echo"=============================================================================================================="
sleep 60
done
Alternatively, monitoring software such as Hyperic HQ or Jennifer can be used.
3.7、Python script for monitoring WebLogic
# WLST (Jython) script: samples thread pool and JDBC data source runtime
# statistics and appends them to CSV files under FILEPATH.
import os

username='weblogic'
password='isp902isp'
url='t3://10.200.36.210:17101'
LOOPS=3
IntervalTime=30000   # pause between samples, in milliseconds
FILEPATH="e:/logs/"
newline = "\n"


def WriteToFile(ServerName, SubModule, LogString, LSTARTTIME, FILENAME):
    # Append one CSV row; write the header line first when the file is new.
    if SubModule == "ServerCoreInfo":
        HeadLineInfo = "DateTime,ServerName,ExecuteThreadIdleCount,StandbyThreadCount,ExecuteThreadTotalCount,busythread,HoggingThreadCount"
    elif SubModule == "DataSourceInfo":
        HeadLineInfo = "DateTime,ServerName,DataSourceName,ActiveConnectionsCurrentCount,CurrCapacity,WaitingForConnectionCurrentCount,WaitingForConnectionTotal"

    if not os.path.exists(FILENAME):
        print "path not exist, create log file by self."
        f = open(FILENAME, "a+")
        f.write(HeadLineInfo + newline)
        f.write(LSTARTTIME + "," + ServerName + "," + LogString + newline)
        f.close()
        f = None
    else:
        f = open(FILENAME, "a+")
        f.write(LSTARTTIME + "," + ServerName + "," + LogString + newline)
        f.close()
        f = None

def getCurrentTime():
    s=SimpleDateFormat("yyyyMMdd HHmmss")
    currentTime=s.format(Date())
    return currentTime

def GetJdbcRuntimeInfo():
    domainRuntime()
    servers = domainRuntimeService.getServerRuntimes()
    print ' ******************DATASOURCE CONNECTION POOL RUNTIME INFORMATION*******'
    for server in servers:
        print 'SERVER: ' + server.getName()
        ServerName=server.getName()
        jdbcRuntime = server.getJDBCServiceRuntime()
        datasources = jdbcRuntime.getJDBCDataSourceRuntimeMBeans()
        for datasource in datasources:
            ds_name=datasource.getName()
            print('-Data Source: ' + datasource.getName() + ', Active Connections: ' + repr(datasource.getActiveConnectionsCurrentCount()) + ', CurrCapacity: ' + repr(datasource.getCurrCapacity())+' , WaitingForConnectionCurrentCount: '+repr(datasource.getWaitingForConnectionCurrentCount())+' , WaitingForConnectionTotal: '+str(datasource.getWaitingForConnectionTotal()))
            FILENAME=FILEPATH + ServerName +"_"+ ds_name + "_"+ "DataSourceInfo" +".csv"
            LSTARTTIME=getCurrentTime()
            dsLogString=ds_name +','+ str(datasource.getActiveConnectionsCurrentCount())+','+repr(datasource.getCurrCapacity())+','+str(datasource.getWaitingForConnectionCurrentCount())+','+str(datasource.getWaitingForConnectionTotal())
            WriteToFile(ServerName, "DataSourceInfo", dsLogString, LSTARTTIME, FILENAME)

def GetThreadRuntimeInfo():
    domainRuntime()
    servers=domainRuntimeService.getServerRuntimes()
    print ' ******************SERVER QUEUE THREAD RUNTIME INFORMATION***************'
    for server in servers:
        print 'SERVER: ' + server.getName()
        ServerName=server.getName()
        threadRuntime=server.getThreadPoolRuntime()
        hoggingThreadCount = str(threadRuntime.getHoggingThreadCount())
        idleThreadCount = str(threadRuntime.getExecuteThreadIdleCount())
        standbycount = str(threadRuntime.getStandbyThreadCount())
        threadTotalCount = str(threadRuntime.getExecuteThreadTotalCount())
        busythread=str(threadRuntime.getExecuteThreadTotalCount()-threadRuntime.getStandbyThreadCount()-threadRuntime.getExecuteThreadIdleCount()-1)
        print ('-Thread :' + 'idleThreadCount:' + idleThreadCount+' ,standbycount:'+standbycount+' , threadTotalCount: '+threadTotalCount+' , hoggingThreadCount:'+hoggingThreadCount+' ,busythread:'+busythread)
        FILENAME=FILEPATH + ServerName +"_"+ "ServerCoreInfo" +".csv"
        LSTARTTIME=getCurrentTime()
        serLogString=idleThreadCount+','+standbycount+','+threadTotalCount+','+busythread+','+hoggingThreadCount
        WriteToFile(ServerName, "ServerCoreInfo", serLogString, LSTARTTIME, FILENAME)


# WLST may run a script with __name__ set to 'main' rather than '__main__',
# so both values are accepted here.
if __name__ in ('__main__', 'main'):
    from wlstModule import *
    from java.util import Date
    from java.text import SimpleDateFormat
    from java.lang import Thread
    print 'starting the script ....'
    connect(username,password, url)
    try:
        for i in range(LOOPS):
            GetThreadRuntimeInfo()
            GetJdbcRuntimeInfo()
            Thread.sleep(IntervalTime)
    except Exception, e:
        print e
        dumpStack()
        raise
    disconnect()
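Save the script above as CollectRuntime.py in the WebLogic installation directory D:\weblogic\bea\wlserver_10.3\common\bin, and edit the WebLogic username, password, IP, port and log path in CollectRuntime.py to match your environment. Open a CMD window, change to D:\weblogic\bea\wlserver_10.3\common\bin, and run the command wlst.cmd CollectRuntime.py. CSV files are generated automatically under the log path; they contain the HoggingThread and busyThread data.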
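Once the CSV files exist, a quick summary helps spot the worst sample without opening a spreadsheet. The sketch below assumes the ServerCoreInfo column layout written by the script (DateTime, ServerName, ExecuteThreadIdleCount, StandbyThreadCount, ExecuteThreadTotalCount, busythread, HoggingThreadCount). The function name `max_hogging` is hypothetical.

```shell
#!/bin/bash
# max_hogging <ServerCoreInfo csv>
# Hypothetical helper: print "DateTime,HoggingThreadCount" for the sample
# with the highest HoggingThreadCount (column 7). The header line is skipped.
# Prints nothing when the file has no data rows.
max_hogging() {
    awk -F',' '
        NR > 1 {
            v = $7 + 0
            if (!seen || v > max) { max = v; when = $1; seen = 1 }
        }
        END { if (seen) print when "," max }
    ' "$1"
}
```

For example, `max_hogging e:/logs/AdminServer_ServerCoreInfo.csv` would report the time of the peak, assuming a server named AdminServer and a shell environment where that path is readable. When several samples share the maximum, the earliest one is reported.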
3.8、Oracle automated inspection script
#!/bin/bash
echo 'Instance Health Data'
echo '================================================'
echo "The current database is $ORACLE_SID"
echo "The current running processes for $ORACLE_SID are"
echo '================================================'
ps -ef|grep $ORACLE_SID
sqlplus -S /nolog <<EOF
connect / as sysdba
set feedback off
set heading off
select '00.instance information' from dual;
select '================================================' from dual;
set linesize 1000
set pagesize 1000
set heading on
select * from v\$instance;
set heading off
select '01:database created date and archive type' from dual;
select '================================================' from dual;
set heading on
Select Created, Log_Mode From V\$Database;
set heading off
select '1.ulimit oracle' from dual;
select '================================================' from dual;
!ulimit -a
set heading off
select '2.installed production option' from dual;
select '================================================' from dual;
set linesize 1000
set pagesize 1000
set heading on
select * from v\$option;
set heading off
select '3.used production option' from dual;
select '================================================' from dual;
set linesize 1000
set pagesize 1000
col COMP_NAME for a40
set heading on
select COMP_ID, COMP_NAME, VERSION,STATUS from dba_registry;
set heading off
select '4.spfile' from dual;
select '================================================' from dual;
show parameter spfile
set heading off
select '5.not default parameter' from dual;
select '================================================' from dual;
col name for a40
col value for a40
set heading on
select name,value from v\$parameter where isdefault='FALSE';
set heading off
select '6.control file' from dual;
select '================================================' from dual;
show parameter control_files
set heading off
select '7.backup control file' from dual;
select '================================================' from dual;
alter database backup controlfile to trace;
set heading off
select '8.log file' from dual;
select '================================================' from dual;
set linesize 1000
set pagesize 1000
set heading on
select group#,thread#,bytes/1024/1024 size_MB , members, archived,status from
v\$Log;
set heading off
select '9.log file' from dual;
col MEMBER for a40
select '================================================' from dual;
set heading on
select * From v\$logfile order by 1;
set heading off
select '10.Archive log' from dual;
select '================================================' from dual;
Archive log list
select '11.data file' from dual;
select '================================================' from dual;
set heading on
select count(*),sum(bytes)/1024/1024/1024 ||'G' max_G from v\$datafile;
SELECT trunc(sum(sum_m-sum_free_m)/1024,2)||'G' used_G
FROM (
SELECT tablespace_name,sum(bytes)/1024/1024 AS sum_m FROM dba_data_files
where tablespace_name not like 'UNDO%' GROUP BY tablespace_name) df,
(SELECT tablespace_name,
sum(bytes)/1024/1024 AS sum_free_m
FROM dba_free_space GROUP BY tablespace_name ) fs
where df.tablespace_name=fs.tablespace_name;
set heading off
select '12.data file location' from dual;
select '================================================' from dual;
set heading on
select t1.TABLESPACE_NAME,t1.FILE_ID, t1.bytes/1024/1024
SIZE_MB,t1.AUTOEXTENSIBLE AUT,t2.status,t1.FILE_NAME
from dba_data_files t1,v\$datafile t2
where t1.file_id=t2.file#;
set heading off
select '13-1.temp data file' from dual;
select '================================================' from dual;
set heading on
select FILE_NAME,FILE_ID,TABLESPACE_NAME,BYTES/1024/1024
byte_MB,status,AUTOEXTENSIBLE from sys.dba_temp_files;
set heading off
select '13-2.temp tablespace' from dual;
select '================================================' from dual;
set heading on
col file_name for a30
col byte_MB for a20
col cached_MB for a20
SELECT d.file_name, v.status, TO_CHAR((d.bytes / 1024 / 1024), '99999990.000')
byte_MB,
TO_CHAR(NVL(t.bytes_cached, 0) / 1024 / 1024, '99999990.000') cached_MB,
d.autoextensible, d.increment_by, d.maxblocks
FROM sys.dba_temp_files d, v\$temp_extent_pool t, v\$tempfile v
WHERE (t.file_id (+)= d.file_id) AND (d.tablespace_name = 'TEMP') AND (d.file_id
= v.file#);
set heading off
select '14.system tablespace' from dual;
select '================================================' from dual;
set heading on
select owner,segment_type,segment_name from dba_segments where owner not
in('SYS','SYSTEM','MDSYS','ORDSYS','OUTLN','WMSYS') and
tablespace_name='SYSTEM' order by 1;
exit
EOF
ora_version=`sqlplus -S '/ as sysdba' <<EOF
set head off
select version from v\\\$instance;
exit;
EOF`
echo $ora_version
if [ `echo $ora_version|awk -F"." '{print $1}'` -ne 8 ]
then
sqlplus -S /nolog <<EOF
conn / as sysdba
set linesize 1000
set pagesize 1000
set heading off
select '15.tablespace fragmentation and free' from dual;
select '================================================' from dual;
col TABLESPACE_NAME for a30
col FREE_PCT for a20
set heading on
SELECT df.TABLESPACE_NAME,FILES, extent_management ,sum_m as
TOTAL_SIZE,--sum(largest) as "MAXFREE_MB",
sum_free_m as "FREE_MB",to_char(100*sum_free_m/sum_m, '999.99') AS
FREE_PCT--,sum(blocks) as "FREE_EXTENTS"
FROM ( SELECT tablespace_name,count(file_id) as files ,sum(bytes)/1024/1024 AS
sum_m FROM dba_data_files GROUP BY tablespace_name) df,
(SELECT tablespace_name,--max(bytes)/1024/1024 largest,
sum(bytes)/1024/1024 AS sum_free_m --,count(blocks) as blocks
FROM dba_free_space GROUP BY tablespace_name ) fs,(select
tablespace_name,extent_management from dba_tablespaces) ts
where df.tablespace_name=fs.tablespace_name and
fs.tablespace_name=ts.tablespace_name;
exit;
EOF
else
sqlplus -S /nolog <<EOF
conn / as sysdba
set linesize 1000
set pagesize 1000
select '16.tablespace fragmentation and free (8i)' from dual;
select '================================================' from dual;
col TABLESPACE_NAME for a30
col FREE_PCT for a20
set heading on
SELECT df.TABLESPACE_NAME,FILES, sum_m as TOTAL_SIZE,--sum(largest) as
"MAXFREE_MB",
sum_free_m as "FREE_MB",to_char(100*sum_free_m/sum_m, '999.99') AS
FREE_PCT--,sum(blocks) as "FREE_EXTENTS"
FROM ( SELECT tablespace_name,count(file_id) as files ,sum(bytes)/1024/1024 AS
sum_m FROM dba_data_files GROUP BY tablespace_name) df,
(SELECT tablespace_name,--max(bytes)/1024/1024 largest,
sum(bytes)/1024/1024 AS sum_free_m --,count(blocks) as blocks
FROM dba_free_space GROUP BY tablespace_name ) fs,(select tablespace_name from
dba_tablespaces) ts
where df.tablespace_name=fs.tablespace_name and
fs.tablespace_name=ts.tablespace_name;
exit;
EOF
fi
sqlplus -S /nolog <<EOF
conn / as sysdba
set linesize 1000
set pagesize 1000
set heading off
select '17.object list' from dual;
select '================================================' from dual;
col OBJECT_TYPE for a20
set heading on
select owner,replace(object_type,' ','_') as OBJECT_TYPE,count(*) from
dba_objects where
owner not in ('SYS','SYSTEM') group by owner,object_type order by
owner,object_type;
set heading off
select '18.invalid objects' from dual;
select '================================================' from dual;
col OBJECT_NAME for a40
col OBJECT_TYPE for a20
set heading on
select OWNER,OBJECT_NAME,replace(OBJECT_TYPE,' ','_') as
OBJECT_TYPE,STATUS,TIMESTAMP from dba_objects where status='INVALID';
set heading off
select '19.dblinks' from dual;
select '================================================' from dual;
col DB_LINK for a40
col OWNER for a10
col HOST for a20
set heading on
select * from dba_db_links;
set heading off
select '20.indexes' from dual;
select '================================================' from dual;
set heading on
select * From dba_indexes where BLEVEL>4;
set heading off
select '21.dba role' from dual;
select '================================================' from dual;
set heading on
select grantee,granted_role from dba_role_privs where granted_role='DBA';
set heading off
select '22.sysdba role' from dual;
select '================================================' from dual;
set heading on
SELECT * FROM v\$pwfile_users order by username;
set head off
select '2-performance' from dual;
select '================================================================================================' from dual;
select '2-1.buffer cache hit ratio:(Higher than 80% is ok, high value does not always mean good performance)' from dual;
select '================================================' from dual;
set head on
select (1 - (sum(decode(name, 'physical reads', value, 0)) / (sum(decode(name, 'db block gets', value, 0)) +
sum(decode(name, 'consistent gets', value, 0))))) * 100 "Hit Ratio" from v\$sysstat;
set head off
select '2-2.data dictionary hit ratio:should >98%' from dual;
select '================================================' from dual;
set head on
select (1 - (sum(getmisses) / sum(gets))) * 100 "Hit Ratio" from v\$rowcache;
set head off
select '2-3.library cache hit ratio:(Should be kept over 90%, otherwise there might be too much reparse)' from dual;
select '================================================' from dual;
set head on
select sum(pins) / (sum(pins) + sum(reloads)) * 100 "Hit Ratio" from v\$librarycache;
set head off
select '2-4.memory sort ratio:should >98%' from dual;
select '================================================' from dual;
set head on
select a.value "Disk Sorts", b.value "Memory Sorts", round((100 * b.value) /
decode((a.value + b.value), 0, 1, (a.value + b.value)), 2) "Pct Memory Sorts"
from v\$sysstat a, v\$sysstat b where a.name = 'sorts (disk)' and b.name = 'sorts (memory)';
set head off
select '2-5.memory top 10 sql read ratio:should <5%' from dual;
select '================================================' from dual;
set head on
select sum(pct_bufgets)
from (select rank() over(order by buffer_gets desc) as rank_bufgets, to_char(100 * ratio_to_report(buffer_gets) over(), '999.99') pct_bufgets from v\$sqlarea)
where rank_bufgets < 11;
set heading off
select '' from dual;
select '2-6.Top 10 Wait Event (Time unit:Hundredths of a second, IO operations should be common wait event)' from dual;
select '================================================' from dual;
set heading on
column event format a30
select * from (select event,total_waits,time_waited, average_wait from v\$system_event where
event not like 'SQL*Net%' and event not like '%ipc%' order by total_waits desc) where rownum<11;
set head off
select '2-7.memory top 10 sql' from dual;
select '================================================' from dual;
set head on
set serveroutput on size 1000000
declare
top10 number;
text1 varchar2(4000);
x number;
len1 number;
cursor c1 is
select buffer_gets, substr(sql_text, 1, 4000) from v\$sqlarea
order by buffer_gets desc;
begin
dbms_output.put_line('------------' || ' ' || '-------------------');
open c1;
for i in 1 .. 10 loop
fetch c1 into top10, text1;
exit when c1%notfound;
dbms_output.put_line('------------top sql No.' ||i||'-------------------');
dbms_output.put_line(rpad(to_char(top10), 9) );
len1 := length(text1);
x := 1;
-- print the statement text in 65-character chunks
while len1 > x - 1 loop
dbms_output.put_line(' ' || substr(text1, x, 65));
x := x + 65;
end loop;
end loop;
close c1;
end;
/
set head off
select '2-8.IO information' from dual;
select '================================================' from dual;
set head on
Select phyrds,phywrts,d.name from v\$datafile d,v\$filestat f where f.file#=d.file# order by d.name;
set head off
select '2-9.full table scan' from dual;
select '================================================' from dual;
set head on
Select name,value value1 from v\$sysstat where name like '%table scan%';
set head off
select '3-1.sys and system security' from dual;
select '================================================' from dual;
select username "User(s) with Default Password!",ACCOUNT_STATUS from dba_users where password in
('E066D214D5421CCC', -- dbsnmp
'24ABAB8B06281B4C', -- ctxsys
'72979A94BAD2AF80', -- mdsys
'C252E8FA117AF049', -- odm
'A7A32CD03D3CE8D5', -- odm_mtr
'88A2B2C183431F00', -- ordplugins
'7EFA02EC7EA6B86F', -- ordsys
'4A3BA55E08595C81', -- outln
'F894844C34402B67', -- scott
'3F9FBD883D787341', -- wk_proxy
'79DF7A1BD138CF11', -- wk_sys
'7C9BA362F8314299', -- wmsys
'88D8364765FCE6AF', -- xdb
'F9DA8977092B7B81', -- tracesvr
'9300C0977D7DC75E', -- oas_public
'A97282CE3D94E29E', -- websys
'AC9700FD3F1410EB', -- lbacsys
'E7B5D92911C831E1', -- rman
'AC98877DE1297365', -- perfstat
'66F4EF5650C20355', -- exfsys
'84B8CBCA4D477FA3', -- si_informtn_schema
'D4C5016086B2DC6A', -- sys
'D4DF7931AB130E37') -- system
;
exit
EOF
cd $ORACLE_HOME/network/admin/
echo '3-2.listener configure'
echo 'listener.ora================================================'
cat listener*.ora
sleep 2;
echo 'sqlnet.ora================================================='
cat sqlnet*.ora
sleep 2;
echo 'tnsnames.ora================================================'
cat tnsnames*.ora
sleep 2;
echo '3-3.controlfile dump============================================'
ora_dump=`sqlplus -S '/ as sysdba' <<EOF
set head off
select value
from v\\\$parameter
where name='user_dump_dest';
exit;
EOF`
cd $ora_dump
ls -lt|head -n 2|tail -n 1|awk '{print $9}'|xargs cat
sleep 2;
echo '3-4.Alert Log ORA- Warning Error============================================'
ora_background_dump=`sqlplus -S '/ as sysdba' <<EOF
set head off
select value
from v\\\$parameter
where name='background_dump_dest';
exit;
EOF`
cd $ora_background_dump
tail -10000 alert_$ORACLE_SID.log|grep ORA-
sleep 2;
echo '3-5.Alert Log size============================================'
ls -l alert_$ORACLE_SID.log
echo '3-6.listener.log size============================================'
lsnrctl status|grep listener.log|awk '{print $4}'|xargs ls -l
echo '3-7.crontab info============================================'
crontab -l
echo '3-8.Alert Log tail 20000 nums============================================'
tail -20000 alert_$ORACLE_SID.log
SYSTEM=`uname -s`
export SYSTEM
echo '4.machine information============================================'
if [ "$SYSTEM" = "Linux" ]; then
echo "----------------host name----------------"
hostname
echo ""
echo "----------------id----------------"
id
echo ""
echo '--- Current uptime,users and load averages ---'
uptime
echo ""
echo "----------------CPU number----------------"
cat /proc/cpuinfo
sleep 1;
echo ""
echo "----------------memory info----------------"
cat /proc/meminfo
sleep 1;
echo ""
echo "----------------disk info----------------"
df -k
sleep 1;
echo ""
echo "----------------kernel parameter----------------"
cat /etc/sysctl.conf
sleep 1;
echo ""
echo "----------------os level----------------"
lsb_release -a
sleep 1;
echo ""
echo "----------------product type----------------"
dmidecode |grep Product
sleep 1;
echo ""
echo "----------------CPU memory usage----------------"
vmstat 5 5
sleep 1;
echo ""
echo "----------------top info----------------"
# -b runs top in batch mode so its output is usable when redirected to a file.
top -b -d 1 -n 20
sleep 5;
top -b -d 1 -n 20
sleep 5;
top -b -d 1 -n 20
sleep 5;
top -b -d 1 -n 20
sleep 5;
top -b -d 1 -n 20
elif [ "$SYSTEM" = "SunOS" ]; then
echo "----------------host name----------------"
hostname
echo ""
echo "----------------id----------------"
id
echo "----------------CPU,memory number----------------"
/usr/platform/sun4u/sbin/prtdiag -v
echo "----------------os level----------------"
cat /etc/release
echo "----------------Kernel parameter----------------"
/usr/sbin/sysdef |grep SHM
/usr/sbin/sysdef |grep SEM
cat /etc/system
echo "----------------disk info----------------"
df -k
echo "----------------IP info----------------"
ifconfig -a
sleep 1;
elif [ "$SYSTEM" = "AIX" ]; then
echo "----------------host name----------------"
hostname
echo ""
echo "----------------id----------------"
id
echo ""
echo "----------------machine plat----------------"
uname -M
echo ""
echo "----------------CPU,memory number----------------"
prtconf
sleep 2;
echo ""
echo "----------------disk info----------------"
df -k
echo ""
echo "----------------os level----------------"
oslevel -r
echo ""
echo "----------------kernel parameter----------------"
lsattr -El sys0
echo ""
echo "----------------HACMP----------------"
lslpp -l |grep cluster
echo ""
echo "----------------network parameter----------------"
no -a
echo ""
echo "----------------CPU memory usage----------------"
vmstat 5 5
sleep 1;
echo ""
echo "----------------IP info----------------"
ifconfig -a
sleep 1;
echo ""
echo "----------------view cluster----------------"
lssrc -g cluster
sleep 1;
echo ""
echo "----------------view VG----------------"
lsvg
sleep 1;
echo ""
elif [ "$SYSTEM" = "HP-UX" ]; then
echo "----------------host name----------------"
hostname
echo ""
echo "----------------id----------------"
id
echo ""
echo "----------------machine plat----------------"
model
echo ""
echo "----------------CPU,memory number----------------"
machinfo
sleep 2;
echo ""
echo "----------------disk info----------------"
bdf
echo ""
echo "----------------os level----------------"
# oslevel is an AIX command; uname -r reports the release on HP-UX.
uname -r
echo ""
echo "----------------HACMP----------------"
lslpp -l |grep cluster
echo ""
echo "----------------network parameter----------------"
no -a
echo ""
echo "----------------CPU memory usage----------------"
vmstat 5 5
sleep 1;
sar -du 5 5
echo ""
echo "----------------IP info----------------"
ifconfig -a
else
echo "Unsupported operating system: $SYSTEM"
fi
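The script prints every ORA- line from the tail of the alert log, which can be long. A per-error-code count is often easier to read first. The following is a minimal sketch under the assumption that error codes appear in the log in the usual `ORA-nnnnn` form. The function name `count_ora_errors` is hypothetical.

```shell
#!/bin/bash
# count_ora_errors <alert_log>
# Hypothetical helper: print "<ORA code> <occurrences>" per distinct code,
# most frequent first. Prints nothing when the log contains no ORA- codes.
count_ora_errors() {
    grep -o 'ORA-[0-9][0-9]*' "$1" \
        | sort \
        | uniq -c \
        | sort -rn \
        | awk '{ print $2, $1 }'
}
```

It could be called as `count_ora_errors alert_$ORACLE_SID.log` from the directory named by background_dump_dest, which is where the script above already changes to before reading the alert log.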
3.9、Automated server inspection script
#!/bin/bash
login_info=$1
gather_server_ip=$2
gather_server_password=$3
grep_ip=`ifconfig | grep "\([[:digit:]]\{1,3\}\.\)\{3\}[[:digit:]]\{1,3\}" --color=auto -o | sed -e "2,5d"`
GatherPath="/tmp/GatherLogDirectory"
CheckScriptPath="/tmp/CheckScript"
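# Assumed argument check (the original condition is truncated): require at
# least login_info and gather_server_ip, as the usage text below describes.
if [ $# -lt 2 ]; then
echo -e "Parameters are invalid!\n"
echo -e "Please use: $0 login_info gather_server_ip\n"
echo -e "For example: $0 IpAndPassword.txt $grep_ip\n"
exit;
fi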
if [ ! -x "$GatherPath" ];then
mkdir "$GatherPath"
echo -e "The log's path is: $GatherPath"
fi
cat $login_info | while read line
do
server_ip=`echo $line|awk '{print $1}'`
server_password=`echo $line|awk '{print $2}'`
login_server_command="ssh -o StrictHostKeyChecking=no root@$server_ip"
scp_gather_server_checksh="scp checksh.sh root@$server_ip:$CheckScriptPath"
/usr/bin/expect<<EOF
set timeout 20
spawn $login_server_command
expect {
"*yes/no" { send "yes\r"; exp_continue }
"*password:" { send "$server_password\r" }
}
expect "Permission denied, please try again." {exit}
expect "#" { send "mkdir $CheckScriptPath\r"}
expect eof
exit
EOF
/usr/bin/expect<<EOF
set timeout 20
spawn $scp_gather_server_checksh
expect {
"*yes/no" { send "yes\r"; exp_continue }
"*password:" { send "$server_password\r" }
}
expect "Permission denied, please try again." {exit}
expect "Connection refused" {exit}
expect "100%"
expect eof
exit
EOF
/usr/bin/expect<<EOF
set timeout 60
spawn $login_server_command
expect {
"*yes/no" { send "yes\r"; exp_continue }
"*password:" { send "$server_password\r" }
}
expect "Permission denied, please try again." {exit}
expect "#" { send "cd $CheckScriptPath;./checksh.sh $gather_server_ip $gather_server_password\r"}
expect eof
exit
EOF
done
checksh.sh:
#!/bin/bash
########################################################################################
#Function:
#This script checks the system's information,disks' information,performance,etc...of the
#server
#
#Author:
#By Jack Wang
#
#Company:
#ShaanXi Great Wall Information Co.,Ltd.
########################################################################################
########################################################################################
#
#GatherServerIpAddress is the server's IP address that gather the checking log
GatherServerIpAddress=$1
GatherServerPassword=$2
GetTheIpCommand=`ifconfig | grep "\([[:digit:]]\{1,3\}\.\)\{3\}[[:digit:]]\{1,3\}" --color=auto -o | sed -e "2,5d"`
LogName=`ifconfig|grep "\([[:digit:]]\{1,3\}\.\)\{3\}[[:digit:]]\{1,3\}" --color=auto -o|sed -e "2,5d"``echo "-"``date +%Y%m%d`
GatherServerLogPath="/tmp/GatherLogDirectory"
LocalServerLogPath="/tmp/LocalServerLogDirectory"
LinuxOsInformation(){
Hostname=`hostname`
UnameA=`uname -a`
OsVersion=`cat /etc/issue | sed "2,4d"`
Uptime=`uptime|awk '{print $3}'|awk -F "," '{print $1}'`
ServerIp=`ifconfig|grep "inet"|sed "2,4d"|awk -F ":" '{print $2}'|awk '{print $1}'`
ServerNetMask=`ifconfig|grep "inet"|sed "2,4d"|awk -F ":" '{print $4}'|awk '{print $1}'`
ServerGateWay=`netstat -r|grep "default"|awk '{print $2}'`
SigleMemoryCapacity=`dmidecode|grep -P -A5 "Memory\s+Device"|grep "Size"|grep -v "Range"|grep "[0-9]"|awk -F ":" '{print $2}'|sed "s/^[ \t]*//g"`
MaximumMemoryCapacity=`dmidecode -t 16|grep "Maximum Capacity"|awk -F ":" '{print $2}'|sed "s/^[ \t]*//g"`
NumberOfMemorySlots=`dmidecode -t 16|grep "Number Of Devices"|awk -F ":" '{print $2}'|sed "s/^[ \t]*//g"`
MemoryTotal=`cat /proc/meminfo|grep "MemTotal"|awk '{printf("MemTotal:%1.0fGB\n",$2/1024/1024)}'|awk -F ":" '{print $2}'`
PhysicalMemoryNumber=`dmidecode|grep -A16 "Memory Device"|grep "Size:"|grep -v "No Module Installed"|grep -v "Range Size:"|wc -l`
ProductName=`dmidecode|grep -A10 "System Information"|grep "Product Name"|awk -F ":" '{print $2}'|sed "s/^[ \t]*//g"`
SystemCPUInfomation=`cat /proc/cpuinfo|grep "name"|cut -d: -f2|awk '{print "*"$1,$2,$3,$4}'|uniq -c|sed "s/^[ \t]*//g"`
echo -e "Hostname|$Hostname\nUnamea|$UnameA\nOsVersion|$OsVersion\nUptime|$Uptime\nServerIp|$ServerIp\nServerNetMask|$ServerNetMask\nServerGateWay|$ServerGateWay\nSigleMemoryCapacity|$SigleMemoryCapacity\nMaximumMemoryCapacity|$MaximumMemoryCapacity\nNumberOfMemorySlots|$NumberOfMemorySlots\nMemoryTotal|$MemoryTotal\nPhysicalMemoryNumber|$PhysicalMemoryNumber\nProductName|$ProductName\nSystemCPUInformation|$SystemCPUInfomation"
}
PerformanceInfomation (){
CPUIdle=`top -d 2 -n 1 -b|grep C[Pp][Uu]|grep id|awk '{print $5}'|awk -F "%" '{print $1}'`
CPUloadAverage=`top -d 2 -n 1 -b|grep "load average:"|awk -F ":" '{print $5}'|sed "s/^[ \t]*//g"`
ProcessNumbers=`top -d 2 -n 1 -b|grep "Tasks"|awk -F "[: ,]" '{print $3}'`
ProcessRunning=`top -d 2 -n 1 -b|grep "Tasks"|awk -F "[: ,]" '{print $8}'`
ProcessSleeping=`top -d 2 -n 1 -b|grep "Tasks"|awk -F "[: ,]" '{print $11}'`
ProcessStoping=`top -d 2 -n 1 -b|grep "Tasks"|awk -F "[: ,]" '{print $16}'`
ProcessZombie=`top -d 2 -n 1 -b|grep "Tasks"|awk -F "[: ,]" '{print $21}'`
UserSpaceCPU=`top -d 2 -n 1 -b|grep "C[Pp][Uu]"|head -1|awk -F "[: ,%]" '{print $4}'`
SystemSpaceCPU=`top -d 2 -n 1 -b|grep "C[Pp][Uu]"|head -1|awk -F "[: ,%]" '{print $8}'`
ChangePriorityCPU=`top -d 2 -n 1 -b|grep "C[Pp][Uu]"|head -1|awk -F "[: ,%]" '{print $12}'`
WaitingCPU=`top -d 2 -n 1 -b|grep "C[Pp][Uu]"|head -1|awk -F "[: ,%]" '{print $19}'`
HardwareIRQCPU=`top -d 2 -n 1 -b|grep "C[Pp][Uu]"|head -1|awk -F "[: ,%]" '{print $23}'`
SoftwareIRQCPU=`top -d 2 -n 1 -b|grep "C[Pp][Uu]"|head -1|awk -F "[: ,%]" '{print $27}'`
MemUsed=`top -d 2 -n 1 -b|grep "Mem"|awk -F "[: ,]" '{print $11}'|tr -d "a-zA-Z"|awk '{printf("%dM\n",$1/1024)}'`
MemFreeP=`top -d 2 -n 1 -b|grep "Mem"|awk -F "[: ,]" '{print $16}'|tr -d "a-zA-Z"|awk '{printf("%dM\n",$1/1024)}'`
MemBuffersP=`top -d 2 -n 1 -b|grep "Mem"|awk -F "[: ,]" '{print $22}'|tr -d "a-zA-Z"|awk '{printf("%dM\n",$1/1024)}'`
CacheCachedP=`top -d 2 -n 1 -b|grep "Swap"|awk -F "[: ,]" '{print $24}'|tr -d "a-zA-Z"|awk '{printf("%dM\n",$1/1024)}'`
CacheTotal=`top -d 2 -n 1 -b|grep "Swap"|awk -F "[: ,]" '{print $4}'|tr -d "a-zA-Z"|awk '{printf("%dM\n",$1/1024)}'`
CacheUsed=`top -d 2 -n 1 -b|grep "Swap"|awk -F "[: ,]" '{print $14}'|tr -d "a-zA-Z"|awk '{printf("%dM\n",$1/1024)}'`
CacheFree=`top -d 2 -n 1 -b|grep "Swap"|awk -F "[: ,]" '{print $18}'|tr -d "a-zA-Z"|awk '{printf("%dM\n",$1/1024)}'`
echo -e "CPUIdle|$CPUIdle\nCPUloadAverage|$CPUloadAverage\nProcessNumbers|$ProcessNumbers\nProcessRunning|$ProcessRunning\nProcessSleeping|$ProcessSleeping\nProcessStoping|$ProcessStoping\nProcessZombie|$ProcessZombie\nUserSpaceCPU|$UserSpaceCPU\nSystemSpaceCPU|$SystemSpaceCPU\nChangePriorityCPU|$ChangePriorityCPU\nWaitingCPU|$WaitingCPU\nHardwareIRQCPU|$HardwareIRQCPU\nSoftwareIRQCPU|$SoftwareIRQCPU\nMemUsed|$MemUsed\nMemFreeP|$MemFreeP\nMemBuffersP|$MemBuffersP\nCacheCachedP|$CacheCachedP\nCacheTotal|$CacheTotal\nCacheUsed|$CacheUsed\nCacheFree|$CacheFree\n"
}
OprateSystemSec () {
echo "======================UserLogin======================"
w
echo "======================FileUsed======================="
df -ah
echo "======================dmesgError====================="
dmesg | grep error
echo "======================dmesgFail======================"
dmesg | grep Fail
echo "======================BootLog========================"
grep -v "OK" /var/log/boot.log | sed "1,6d"
echo "======================route -n======================="
route -n
echo "======================iptables -L===================="
iptables -L
echo "======================netstat -lntp=================="
netstat -lntp
echo "======================netstat -antp=================="
netstat -antp
echo "======================netstat -s====================="
netstat -s
echo "======================last==========================="
last
echo "======================du -sh /etc/==================="
du -sh /etc/
echo "======================du -sh /boot/=================="
du -sh /boot/
echo "======================du -sh /dev/==================="
du -sh /dev/
echo "======================df -h=========================="
df -h
echo "======================mount | column -t=============="
mount | column -t
}
TopAndVmstat(){
top -d 2 -n 1 -b
vmstat 1 10
}
CheckGatherLog(){
if [ -f "$LocalServerLogPath/$GetTheIpCommand.log" ];then
rm -rf $LocalServerLogPath/$GetTheIpCommand.log
fi
if [ ! -x "$LocalServerLogPath" ];then
mkdir "$LocalServerLogPath"
fi
if [ ! -f "$LocalServerLogPath/$GetTheIpCommand.log" ];then
touch $LocalServerLogPath/$GetTheIpCommand.log
LinuxOsInformation>>$LocalServerLogPath/$GetTheIpCommand.log
PerformanceInfomation>>$LocalServerLogPath/$GetTheIpCommand.log
OprateSystemSec>>$LocalServerLogPath/$GetTheIpCommand.log
TopAndVmstat>>$LocalServerLogPath/$GetTheIpCommand.log
fi
}
CheckGatherLog
SCP_LOG_TO_GATHER_SERVER="scp $LocalServerLogPath/$GetTheIpCommand.log root@$GatherServerIpAddress:$GatherServerLogPath"
/usr/bin/expect<<EOF
set timeout 50
spawn $SCP_LOG_TO_GATHER_SERVER
expect {
"*yes/no)?" {
send "yes\n"
exp_continue
}
"*password:" {
send "$GatherServerPassword\n"
}
}
expect "100%"
expect eof
EOF
Save the script above as shellsh.sh, then create a separate file.txt in the following format:
IP password
192.168.182.143 123456
Then run: shellsh.sh file.txt
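The host list passed to shellsh.sh can be read line by line. The following is a minimal sketch, assuming file.txt starts with an `IP password` header line followed by one whitespace-separated pair per line; the function name `each_host` is hypothetical and does not appear in the original script.

```shell
#!/bin/bash
# each_host FILE: print "ip password" for every data line of FILE.
# Assumption: the first line is the "IP password" header; blank lines are skipped.
each_host() {
    local file=$1 ip password
    # tail -n +2 drops the header line
    tail -n +2 "$file" | while read -r ip password; do
        [ -n "$ip" ] || continue
        printf '%s %s\n' "$ip" "$password"
    done
}
```

A caller could pipe the result into its own loop, for example `each_host file.txt | while read -r ip pw; do ...; done`.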
3.10、Linux server inspection script
#!/bin/bash
TIME=`date +"%Y-%m-%d-%H-%M"`
RED(){
val=$1
echo -e "\033[31m ${val} \033[0m"
}
GREEN(){
val=$1
echo -e "\033[32m ${val} \033[0m"
}
YELLOW(){
val=$1
echo -e "\033[33m ${val} \033[0m"
}
BLUE(){
val=$1
echo -e "\033[34m ${val} \033[0m"
}
PURPLE(){
val=$1
echo -e "\033[35m ${val} \033[0m"
}
DARKGREEN(){
val=$1
echo -e "\033[36m ${val} \033[0m"
}
commd(){
ssh -o "StrictHostKeyChecking no" $IP "$cmd" < /dev/null
}
eval $(/bin/grep disksize gather.conf)
eval $(/bin/grep cpusize gather.conf)
eval $(/bin/grep memsize gather.conf)
eval $(/bin/grep swapsize gather.conf)
eval $(/bin/grep dropsize gather.conf)
eval $(/bin/grep ntpsize gather.conf)
eval $(/bin/grep vss gather.conf)
eval $(/bin/grep vss_ip gather.conf)
eval $(/bin/grep ntp1_server_ip gather.conf)
eval $(/bin/grep ntp1_client gather.conf)
eval $(/bin/grep ntp2_server_ip gather.conf)
eval $(/bin/grep ntp2_client gather.conf)
eval $(/bin/grep sarip gather.conf)
eval $(/bin/grep mass gather.conf)
eval $(/bin/grep ipvsadm gather.conf)
eval $(/bin/grep losssize gather.conf)
eval $(/bin/grep bo gather.conf)
eval $(/bin/grep bo_ip gather.conf)
eval $(/bin/grep cdn gather.conf)
eval $(/bin/grep cdn_ip gather.conf)
eval $(/bin/grep VSS gather.conf)
eval $(/bin/grep VSS_IP gather.conf)
eval $(/bin/grep lvs gather.conf)
eval $(/bin/grep lvs_ip gather.conf)
eval $(/bin/grep portal gather.conf)
eval $(/bin/grep portal_ip gather.conf)
DISK(){
cmd='df -h'
disk=`commd $IP $cmd |awk '{if (NF==6){print $5","$6}else if (NF==5){print $4","$5}}'|grep -v -E "已用%|挂载点|shm|boot"`
for data in $disk
do
valnum=`echo $data|awk -F[,%] '{print $1}'`
diskname=`echo $data|awk -F[,%] '{print $3}'`
if [ $valnum -gt $disksize ] ;then
RED "($diskname),$valnum%"
else
echo "($diskname),$valnum%"
fi
done
}
CPU(){
cmd='top -b n 1'
CPUVAL=`commd $IP $cmd|grep "Cpu(s)"|awk -F, '{print $1,$2,$4}'`
cpuval=`echo $CPUVAL|awk -F[:%] -v cpusizes=$cpusize '{if($2>cpusizes){print 0}else {print 1}}'`
if [ -n "$CPUVAL" ];then
if [ $cpuval -eq 0 ];then
RED "$CPUVAL"
else
echo "$CPUVAL"
fi
fi
}
MEM(){
cmd='free -m'
mem=`commd $IP $cmd|grep "Mem"|awk '{sum=$2-$4-$6-$7}END{printf "%d\n", (sum/$2*100)}'`
if [ -n "$mem" ];then
if [ $mem -gt $memsize ];then
RED "MEM: $mem%"
else
echo "MEM: $mem%"
fi
fi
}
SWAP(){
cmd='free -m'
swap=`commd $IP $cmd|grep "Swap"|awk '{if($2>0){printf "%d\n",($3/$2*100)}else{print 0}}'`
if [ -n "$swap" ];then
if [ $swap -gt $swapsize ];then
RED "SWAP: $swap%"
else
echo "SWAP: $swap%"
fi
fi
}
NET(){
cmd="ifconfig|grep -E 'eth|bond'"
network=`commd $IP $cmd|grep -E '^eth|^bond'|awk '{print $1}'`
for net in $network
do
net=`echo $net|grep -v ":"`
if [ "$net" != "" ];then
cmd="ifconfig $net"
DROPVAL=`commd $IP $PASSWD $cmd|grep "RX packets"|awk /dropped/'{print $4}'`
dropval=`echo $DROPVAL|awk -F: '{print $2}'`
cmd="ethtool $net|grep 'Link'"
UPDOWN=`commd $IP $PASSWD $cmd|grep 'detected'|awk '{print $NF}'`
statval=`echo $UPDOWN|sed 's/\r//g'`
if [ "$statval" == "no" ] || [ $dropval -gt $dropsize ] ;then
RED "$NAME,$IP,$net,$DROPVAL,$UPDOWN"
else
echo "$NAME,$IP,$net,$DROPVAL,$UPDOWN"
fi
fi
done
}
GATH(){
for IP in $1
do
NAME=`cat iplist.txt|grep "\<$IP\>"|awk '{print $3}'`
sip=`echo $IP|awk -F. '{print $1"."$2"."$3}'`
cmd="ip add|grep $sip|awk '{print \$2}'|uniq|wc -l"
num=`commd $IP $cmd`
NAMES=`echo $NAME|grep 'CDN'`
if [ "${NAMES}" != "" ];then
if [ $num -ge 4 ]; then
for role in $2
do
cmd="ps uax|grep ngod|grep $role|grep -v 'grep'"
val=`commd $IP $cmd|grep "$role"|grep -v "grep"`
if [ "$val" == "" ];then
RED "$NAME,$IP,$role,process not found!"
else
echo "$NAME,$IP,$role,process running!"
fi
done
else
for role in $2
do
roles=`echo $role|grep -E "rti|csi|cls"`
if [ "$roles" == "" ] ;then
cmd="ps uax|grep ngod|grep $role|grep -v 'grep'"
val=`commd $IP $cmd|grep "$role"|grep -v "grep"`
if [ "$val" == "" ];then
RED "$NAME,$IP,$role,process not found!"
else
echo "$NAME,$IP,$role,process running!"
fi
fi
done
fi
else
if [ $num -ge 2 ]; then
for role in $2
do
cmd="ps uax|grep $role|grep -v 'grep'"
val=`commd $IP $cmd|grep "?"`
if [ "$val" == "" ];then
RED "$NAME,$IP,$role,process not found!"
else
echo "$NAME,$IP,$role,process running!"
fi
done
else
echo "$NAME,$IP,standby machine"
fi
fi
done
}
GATHS(){
for IP in $1
do
NAME=`cat iplist.txt|grep "\<$IP\>"|awk '{print $3}'`
for role in $2
do
cmd="ps aux|grep ngod|grep $role|grep -v 'grep'"
val=`commd $IP $cmd|grep "$role"|grep -v "grep"`
if [ "$val" == "" ];then
echo "$NAME,$IP,$role,process not found!"
else
echo "$NAME,$IP,$role,process running!"
fi
done
done
}
NTP(){
ntp1(){
if [ "$ntp1_server_ip" != "" ] && [ "$ntp1_client" != "" ] ;then
for IP in ${ntp1_client[@]}
do
NAME=`cat iplist.txt|grep "\<$IP\>"|awk '{print $3}'`
cmd="ntpdate -u ${ntp1_server_ip}"
ntpval=`commd $IP $cmd|awk /offset/'{print $10}'`
ntpnum=`echo $ntpval|awk -v ntpsizes=$ntpsize '{if ($1>ntpsizes){print 0}else {print 1}}'`
if [ $ntpnum -eq 0 ] ;then
RED "$NAME,$IP,$ntpval"
else
echo "$NAME,$IP,$ntpval"
fi
done
fi
}
ntp2(){
if [ "$ntp2_server_ip" != "" ] && [ "$ntp2_client" != "" ] ;then
for IP in ${ntp2_client[@]}
do
NAME=`cat iplist.txt|grep "\<$IP\>"|awk '{print $3}'`
cmd="ntpdate -u ${ntp2_server_ip}"
ntpval=`commd $IP $cmd|awk /offset/'{print $10}'`
ntpnum=`echo $ntpval|awk -v ntpsizes=$ntpsize '{if ($1>ntpsizes){print 0}else {print 1}}'`
if [ $ntpnum -eq 0 ] ;then
RED "$NAME,$IP,$ntpval"
else
echo "$NAME,$IP,$ntpval"
fi
done
fi
}
ntp2
ntp1
}
PING(){
network(){
cmd="ping -I $1 -c 5 $2"
loss=`commd $IP $cmd|grep " packets"|awk '{print $6}'`
lossval=`echo $loss|awk -F% '{print $1}'`
if [ $lossval -gt $losssize ] ;then
RED "$NAME,$IP,$1,$loss"
else
echo "$NAME,$IP,$1,$loss"
fi
}
cat iplist.txt|grep -v "\#" |while read file
do
IP=`echo $file|awk '{print $1}'`
NAME=`echo $file|awk '{print $3}'`
cmd='route -n|grep -E "bond1|bond0"'
val=`commd $IP $cmd|grep "\<UG\>"|awk '{print $2,$NF}'|sort|uniq`
net_num=`echo $val|awk '{print NF}'`
if [ $net_num -eq 2 ];then
gw=`echo $val|awk '{print $1}'`
netname=`echo $val|awk '{print $2}'`
network $netname $gw
elif [ $net_num -eq 4 ] ; then
gw1=`echo $val|awk '{print $1}'`
gw2=`echo $val|awk '{print $3}'`
netname1=`echo $val|awk '{print $2}'`
netname2=`echo $val|awk '{print $4}'`
network $netname1 $gw1
network $netname2 $gw2
fi
done
}
SAR(){
if [ "$sarip" != "" ];then
sar(){
NAME=`cat iplist.txt|grep "\<$IP\>"|awk '{print $3}'`
cmd="export LANG=zh_CN.UTF-8;sar -n DEV 2 2"
GREEN "*********************************${NAME},${IP}*******************************************"
commd $IP $cmd|grep -E "平均时间:|Average"
}
for IP in ${sarip[@]}
do
sar
done
fi
}
MASS(){
if [ "$mass" != "" ];then
Date=`date +%Y-%m-%d_%H-%M-%S`
ipfile='disk_mass.csv'
WGET()
{
wget -O diskstatus.xml "http://${ip}:${port}/cmd?cmdname=GetSysInfo" >/dev/null 2>&1
}
echo "AREA,IPADDR,MASSSTATUS"
for serverinfo in `cat ${ipfile}|grep "$mass"`
do
area=`echo ${serverinfo}|cut -d, -f1`
ip=`echo ${serverinfo}|cut -d, -f2`
port=`echo ${serverinfo}|cut -d, -f3`
if ( ping -c 1 $ip >/dev/null );then
WGET
if [ `grep -c ok diskstatus.xml` -eq 1 ];then
allsize=`awk -F "<SysAllSize>" '{print $2}' diskstatus.xml |awk -F "</" '{print $1}'`
usesize=`awk -F "<SysUsedSize>" '{print $2}' diskstatus.xml |awk -F "</" '{print $1}'`
stat='ok'
let per=usesize*100/allsize
else
stat='bad'
fi
else
stat='bad'
fi
if [ "$stat" = "ok" ];then
echo "${area},${ip},${per}%"
elif [ "$stat" = "bad" ] ;then
RED "${area},${ip},error"
fi
done
rm -rf diskstatus.xml
fi
}
IPVSADM(){
if [ "$ipvsadm" != "" ];then
for IP in ${ipvsadm[@]}
do
NAME=`cat iplist.txt|grep "\<$IP\>"|awk '{print $3}'`
sip=`echo $IP|awk -F. '{print $1"."$2"."$3}'`
cmd="ip add|grep -c $sip"
num=`commd $IP $cmd`
if [ $num -gt 1 ];then
cmd='ipvsadm -ln'
GREEN "*********************************${NAME},${IP}************************************"
commd $IP $cmd|grep "Route"
else
cmd='/etc/init.d/keepalived status'
BLUE "**********************************${NAME},${IP}************************************"
commd $IP $cmd|grep "keepalived"
fi
done
fi
}
main(){
RED "####IP#########|################CPU#############|##MEM#####|####SWAP###|########################DISK###############################"
cat iplist.txt|grep -v "\#"|while read file
do
IP=`echo $file|awk '{print $1}'`
NAME=`echo $file|awk '{print $3}'`
disk=`DISK|xargs`
cpu=`CPU`
mem=`MEM`
swa=`SWAP`
echo "$NAME,$IP:| $cpu | $mem | $swa | DISK:$disk"
done
}
WNETS(){
cat iplist.txt|grep -v "#"|while read file
do
IP=`echo $file|awk '{print $1}'`
NAME=`echo $file|awk '{print $3}'`
NET
done
}
ips=${vss_ip[@]};gath=${vss[@]}
bos=${bo[@]};bo_ips=${bo_ip[@]}
cdns=${cdn[@]};cdn_ips=${cdn_ip[@]}
VSSS=${VSS[@]};VSS_IPS=${VSS_IP[@]}
lvss=${lvs[@]};lvs_ips=${lvs_ip[@]}
portals=${portal[@]};portal_ips=${portal_ip[@]}
Usage(){
echo """$0
-h help.
-z cpu,mem,swap...
-n ntp
-p ping
-s sar -n DEV 2 2
-m MASS
-i IPVSADM
-w NETWORK
-j application processes
-a all of the above."""
}
if [ $# -eq 0 ];then
Usage
fi
while getopts ":hznpsmiwja" opt
do
case $opt in
"h")
Usage
exit -1
;;
"z")
main
;;
"n")
NTP
;;
"p")
PING
;;
"s")
SAR
;;
"m")
MASS
;;
"i")
IPVSADM
;;
"w")
WNETS
;;
"j")
GATH "$ips" "$gath"
GATHS "$bo_ips" "$bos"
GATH "$cdn_ips" "$cdns"
GATHS "$VSS_IPS" "$VSSS"
GATH "$lvs_ips" "$lvss"
GATHS "$portal_ips" "$portals"
;;
"a")
main
GREEN "NTP******************************************************************************"
NTP
YELLOW "PING******************************************************************************"
PING
SAR
PURPLE "MASS******************************************************************************"
MASS
IPVSADM
DARKGREEN "NETWORK******************************************************************************"
WNETS
YELLOW "Processes******************************************************************************"
GATH "$ips" "$gath"
GATHS "$bo_ips" "$bos"
GATH "$cdn_ips" "$cdns"
GATHS "$VSS_IPS" "$VSSS"
GATH "$lvs_ips" "$lvss"
GATHS "$portal_ips" "$portals"
;;
*)
echo "Please supply a valid option"
exit -1
;;
esac
done
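Each `eval $(/bin/grep key gather.conf)` line above matches substrings, so `grep bo` also picks up `bo_ip` and any other line containing "bo". The following is a minimal sketch of an exact-key lookup, assuming gather.conf holds one shell assignment per line starting in the first column; the helper name `conf_line` is hypothetical.

```shell
#!/bin/bash
# conf_line KEY FILE: print only the assignment whose name is exactly KEY.
# Assumption: every setting is written as KEY=value at the start of a line.
conf_line() {
    local key=$1 file=$2
    grep -E "^${key}=" "$file"
}
```

It could then replace the substring matches one for one, for example `eval "$(conf_line disksize gather.conf)"`.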
3.11、Linux server inspection script
#!/bin/bash
NUM_VERSION=$(uname -r)
function Check_OS(){
[[ $NUM_VERSION =~ el6 ]] && return 0||return 1
}
echo "######CPU usage######"
CPU_HARDWARE=$(cat /proc/cpuinfo | grep name |cut -f2 -d: | uniq -c)
CPU_NUMBER=$(cat /proc/cpuinfo | grep name |cut -f2 -d: | uniq -c | awk '{print $1}')
CPU_LOAD=$(uptime | awk '{for(i=6;i<=NF;i++) printf $i""FS;print ""}')
CPU_LOAD_NUMBER=$(uptime | awk -F"load average:" '{print $2}' | awk -F"," '{print $1}' | awk -F"." '{print $1}' |sed 's/^[ \t]*//g')
CPU_UTILIZ=$(top -bn 1 | grep "Cpu(s)")
if [[ $CPU_LOAD_NUMBER -lt $CPU_NUMBER ]]
then
CPU_STATUS=normal
else
CPU_STATUS=abnormal
fi
echo "$CPU_STATUS("$CPU_HARDWARE,$CPU_LOAD,$CPU_UTILIZ")"
echo -e
echo -e
echo "######Disk usage######"
IFS="
"
for i in `df -hP | sed 1d | awk '{print $(NF-1)"\t"$NF"\t"$(NF-2)}'`
do
DISK_UTILIZ=$(echo $i |awk '{print $1}')
MOUNT_DISK=$(echo $i |awk '{print $2}')
DISK_FREE=$(echo $i |awk '{print $3}')
if [[ $(echo $DISK_UTILIZ | sed s/%//g) -gt 70 ]]
then
echo "abnormal (usage of $MOUNT_DISK is high at $DISK_UTILIZ, please check)"
else
continue
fi
done
echo -e
echo "Disk usage details:"
df -hP | sed 1d | awk '{print $NF" partition: free space "$(NF-2)", usage "$(NF-1)}'
UMAIL_DIR=$(cat /usr/local/u-mail/config/custom.conf | grep "mailroot" | awk -F"=" '{print $2}' | sed 's/^[ \t]*//g')
echo "Mail data is stored in" $UMAIL_DIR
echo -e
echo -e
echo "######Memory usage######"
Check_OS
RESULT=$?
if [ ${RESULT} -eq 0 ]
then
MEM_SUM_NUM=$(free -m | grep "Mem:" | awk -F" " '{print $2}')
MEM_SURPLUS_NUM=$(free -m | grep "Mem:" | awk '{for(i=4;i<=NF;i++) print $i""FS;}' | awk '{a+=$1}END{print a}')
MEM_SUM=$(free -m | grep "Mem:" | awk -F" " '{print $2"M"}')
MEM_SURPLUS=$(free -m | grep "Mem:" | awk '{for(i=4;i<=NF;i++) print $i""FS;}' | awk '{a+=$1}END{print a"M"}')
MEM_USED=$(echo $(($MEM_SUM_NUM-$MEM_SURPLUS_NUM)))
PERCENT=$(printf "%d%%" $(($MEM_USED*100/$MEM_SUM_NUM)))
PERCENT_NUM=$(echo $PERCENT|sed s/%//g)
if [[ $PERCENT_NUM -lt 70 ]]
then
MEM_STATUS=normal
else
MEM_STATUS=abnormal
fi
echo "$MEM_STATUS (total memory $MEM_SUM, free memory $MEM_SURPLUS, memory usage $PERCENT)"
else
MEM_SUM_NUM7=$(free -m | grep "Mem:" | awk -F" " '{print $2}')
MEM_SURPLUS_NUM7=$(free -m | grep "Mem:" | awk -F" " '{print $4}')
MEM_SUM7=$(free -m | grep "Mem:" | awk -F" " '{print $2"M"}')
MEM_SURPLUS7=$(free -m | grep "Mem:" | awk -F" " '{print $4"M"}')
MEM_USED7=$(echo $(($MEM_SUM_NUM7-$MEM_SURPLUS_NUM7)))
PERCENT7=$(printf "%d%%" $(($MEM_USED7*100/$MEM_SUM_NUM7)))
PERCENT_NUM7=$(echo $PERCENT7|sed s/%//g)
if [[ $PERCENT_NUM7 -lt 70 ]]
then
MEM_STATUS=normal
else
MEM_STATUS=abnormal
fi
echo "$MEM_STATUS (total memory $MEM_SUM7, free memory $MEM_SURPLUS7, memory usage $PERCENT7)"
fi
echo -e
echo -e
echo "######OS version and mail system version######"
OS_VERSION=$(cat /etc/redhat-release)
UMAILAPP_VERSION=$(rpm -qa | grep umail_app | awk -F"." '{print $1"."$2"."$3}')
UMAILWEB_VERSION=$(rpm -qa | grep umail_webmail | awk -F"." '{print $1"."$2"."$3}')
echo $OS_VERSION,$UMAILAPP_VERSION,$UMAILWEB_VERSION
echo -e
echo -e
echo "######Basic system operation check######"
SSH_SUM=$(cat /var/log/secure | grep "authentication failure" | wc -l)
SSH_DIY=500
if [ $SSH_SUM -gt $SSH_DIY ]
then
echo "Someone may be trying to guess your root password, please check"
else
echo "normal"
fi
echo -e
echo -e
echo "######Suspicious processes or backdoors######"
echo "normal"
echo -e
echo -e
echo "######Antivirus software and firewall######"
Check_OS
RESULT=$?
if [ ${RESULT} -eq 0 ]
then
/etc/init.d/iptables status 1>/dev/null 2>&1
RESULT_IPTABLES=$?
if [ ${RESULT_IPTABLES} -eq 0 ]
then
echo "The built-in OS firewall is enabled"
else
echo "The built-in OS firewall is not enabled"
fi
else
systemctl status firewalld.service 1>/dev/null 2>&1
RESULT_FIREWALLD=$?
if [ ${RESULT_FIREWALLD} -eq 0 ]
then
echo "The built-in OS firewall is enabled"
else
echo "The built-in OS firewall is not enabled"
fi
fi
Check_OS
RESULT=$?
if [ ${RESULT} -eq 0 ]
then
ps -ef | grep umail_clamd | grep -v grep 1>/dev/null 2>&1
RESULT_CLAMD6=$?
/etc/init.d/umail_clamd status 1>/dev/null 2>&1
RESULT_CLAMDSTATUS6=$?
if [ ${RESULT_CLAMD6} -eq 0 ] && [ ${RESULT_CLAMDSTATUS6} -eq 0 ]
then
echo "CLAMD antivirus is installed"
else
echo "Antivirus is not installed or failed to start"
fi
else
ps -ef | grep umail_clamd | grep -v grep 1>/dev/null 2>&1
RESULT_CLAMD7=$?
systemctl status umail_clamd.service 1>/dev/null 2>&1
RESULT_CLAMDSTATUS7=$?
if [ ${RESULT_CLAMD7} -eq 0 ] && [ ${RESULT_CLAMDSTATUS7} -eq 0 ]
then
echo "CLAMD antivirus is installed"
else
echo "Antivirus is not installed or failed to start"
fi
fi
echo -e
echo -e
echo "######Uptime######"
LINETIME=$(uptime | awk -F"up" '{print $2}' | awk -F", load average" '{print $1}')
echo "Server uptime:"$LINETIME
echo -e
echo -e
echo "######HTTP service######"
APACHE6_STATUS=$(/etc/init.d/umail_apache status 1>/dev/null 2>&1)
NGINX6_STATUS=$(/etc/init.d/umail_nginx status 1>/dev/null 2>&1)
APACHE7_STATUS=$(systemctl status umail_apache.service 1>/dev/null 2>&1)
NGINX7_STATUS=$(systemctl status umail_nginx.service 1>/dev/null 2>&1)
APACHE_PROC=$(ps -ef | grep "/usr/local/u-mail/service/apache/bin/httpd" | grep -v grep 1>/dev/null 2>&1)
NGINX_PROC=$(ps -ef | grep "/usr/local/u-mail/service/nginx/sbin/nginx" | grep -v grep 1>/dev/null 2>&1)
Check_OS
RESULT=$?
if [ ${RESULT} -eq 0 ]
then
/etc/init.d/umail_apache status 1>/dev/null 2>&1
RESULT_APACHE6=$?
/etc/init.d/umail_nginx status 1>/dev/null 2>&1
RESULT_NGINX6=$?
ps -ef | grep "/usr/local/u-mail/service/apache/bin/httpd" | grep -v grep 1>/dev/null 2>&1
RESULT_APACHEPROC6=$?
ps -ef | grep "/usr/local/u-mail/service/nginx/sbin/nginx" | grep -v grep 1>/dev/null 2>&1
RESULT_NGINXPROC6=$?
if [ ${RESULT_APACHE6} -eq 0 ] && [ ${RESULT_NGINX6} -eq 0 ] && [ ${RESULT_APACHEPROC6} -eq 0 ] && [ ${RESULT_NGINXPROC6} -eq 0 ]
then
echo "HTTP service is running"
else
echo "HTTP service is not running"
fi
else
systemctl status umail_apache.service 1>/dev/null 2>&1
RESULT_APACHE7=$?
systemctl status umail_nginx.service 1>/dev/null 2>&1
RESULT_NGINX7=$?
ps -ef | grep "/usr/local/u-mail/service/apache/bin/httpd" | grep -v grep 1>/dev/null 2>&1
RESULT_APACHEPROC7=$?
ps -ef | grep "/usr/local/u-mail/service/nginx/sbin/nginx" | grep -v grep 1>/dev/null 2>&1
RESULT_NGINXPROC7=$?
if [ ${RESULT_APACHE7} -eq 0 ] && [ ${RESULT_NGINX7} -eq 0 ] && [ ${RESULT_APACHEPROC7} -eq 0 ] && [ ${RESULT_NGINXPROC7} -eq 0 ]
then
echo "HTTP service is running"
else
echo "HTTP service is not running"
fi
fi
echo -e
echo -e
echo "######SMTP service######"
Check_OS
RESULT=$?
if [ ${RESULT} -eq 0 ]
then
netstat -anltp | grep ":25" 1>/dev/null 2>&1
RESULT_SMTP=$?
/etc/init.d/umail_postfix status 1>/dev/null 2>&1
RESULT_POSTFIX=$?
if [ ${RESULT_SMTP} -eq 0 ] && [ ${RESULT_POSTFIX} -eq 0 ]
then
echo "SMTP service is running"
else
echo "SMTP service is not running"
fi
else
netstat -anltp | grep ":25" 1>/dev/null 2>&1
RESULT_SMTP7=$?
systemctl status umail_postfix.service 1>/dev/null 2>&1
RESULT_POSTFIX7=$?
if [ ${RESULT_SMTP7} -eq 0 ] && [ ${RESULT_POSTFIX7} -eq 0 ]
then
echo "SMTP service is running"
else
echo "SMTP service is not running"
fi
fi
echo -e
echo -e
echo "######POP service######"
Check_OS
RESULT=$?
if [ ${RESULT} -eq 0 ]
then
netstat -anltp | grep ":110" 1>/dev/null 2>&1
RESULT_POP=$?
/etc/init.d/umail_dovecot status 1>/dev/null 2>&1
RESULT_POPPROC=$?
if [ ${RESULT_POP} -eq 0 ] && [ ${RESULT_POPPROC} -eq 0 ]
then
echo "POP service is running"
else
echo "POP service is not running"
fi
else
netstat -anltp | grep ":110" 1>/dev/null 2>&1
RESULT_POP7=$?
systemctl status umail_dovecot.service 1>/dev/null 2>&1
RESULT_POPPROC7=$?
if [ ${RESULT_POP7} -eq 0 ] && [ ${RESULT_POPPROC7} -eq 0 ]
then
echo "POP service is running"
else
echo "POP service is not running"
fi
fi
echo -e
echo -e
echo "######IMAP service######"
Check_OS
RESULT=$?
if [ ${RESULT} -eq 0 ]
then
netstat -anltp | grep ":143" 1>/dev/null 2>&1
RESULT_IMAP=$?
/etc/init.d/umail_dovecot status 1>/dev/null 2>&1
RESULT_IMAPPROC=$?
if [ ${RESULT_IMAP} -eq 0 ] && [ ${RESULT_IMAPPROC} -eq 0 ]
then
echo "IMAP service is running"
else
echo "IMAP service is not running"
fi
else
netstat -anltp | grep ":143" 1>/dev/null 2>&1
RESULT_IMAP7=$?
systemctl status umail_dovecot.service 1>/dev/null 2>&1
RESULT_IMAPPROC7=$?
if [ ${RESULT_IMAP7} -eq 0 ] && [ ${RESULT_IMAPPROC7} -eq 0 ]
then
echo "IMAP service is running"
else
echo "IMAP service is not running"
fi
fi
echo -e
echo -e
echo "######Send/receive test (web and client)######"
echo "normal"
echo -e
echo -e
echo "######Admin console function test######"
echo "normal"
echo -e
echo -e
echo "######Anti-spam and anti-virus test######"
echo "normal"
echo -e
echo -e
echo "######Signs of leaked passwords causing mass spam######"
SMTP_SUM=$(cat /usr/local/u-mail/app/log/smtp.log | grep "from:" | awk -F " " '{ print $6 }' | sed 's/<//g' | sed 's/>,//g' | sort | uniq -c | sort -rn |sed 's/^[ \t]*//g' |head -n 1 | awk -F" " '{print $1}')
SMTP_USER=$(cat /usr/local/u-mail/app/log/smtp.log | grep "from:" | awk -F " " '{ print $6 }' | sed 's/<//g' | sed 's/>,//g' | sort | uniq -c | sort -rn |sed 's/^[ \t]*//g' |head -n 1 | awk -F" " '{print $2}')
SMTP_DIY=500
if [ "${SMTP_SUM:-0}" -gt $SMTP_DIY ]
then
echo "User $SMTP_USER sent the most outgoing mail today, more than $SMTP_DIY messages; please verify"
else
echo "normal"
fi
echo -e
echo -e
Sample output:
[root@localhost ~]
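normal( 2 Intel(R) Xeon(R) CPU E5606 @ 2.13GHz,1 user, load average: 0.06, 0.02, 0.00 ,Cpu(s): 2.1%us, 0.8%sy, 0.2%ni, 96.5%id, 0.3%wa, 0.0%hi, 0.2%si, 0.0%st)
Disk usage details:
/ partition: free space 38G, usage 20%
/dev/shm partition: free space 1.9G, usage 1%
/boot partition: free space 425M, usage 7%
/home partition: free space 434G, usage 38%
Mail data is stored in /home/mailbox
normal (total memory 3952M, free memory 3028M, memory usage 23%)
CentOS release 6.9 (Final),umail_app-2.2.44-2,umail_webmail-1.6.69-1
normal
normal
The built-in OS firewall is enabled
CLAMD antivirus is installed
Server uptime: 33 days, 6:29, 1 user
HTTP service is running
SMTP service is running
POP service is running
IMAP service is running
normal
normal
normal
normal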
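The script above computes memory usage in two separate branches, one for el6 and one for newer releases. The following is a minimal sketch of a single helper that handles both `free -m` layouts by looking at the header line. The function name `mem_used_percent` is hypothetical, and the el6 branch here counts free + buffers + cached as spare memory, which differs slightly from the original (the original also adds the shared column).

```shell
#!/bin/bash
# mem_used_percent: read `free -m` output on stdin and print the used-memory
# percentage as an integer.
# If the header has an "available" column, used = total - available;
# otherwise (older layout) used = total - free - buffers - cached.
mem_used_percent() {
    awk '
        NR == 1 { has_avail = ($0 ~ /available/) }
        $1 == "Mem:" {
            if (has_avail) spare = $7
            else           spare = $4 + $6 + $7
            printf "%d\n", ($2 - spare) * 100 / $2
        }'
}
```

Typical use would be `free -m | mem_used_percent`, with the result compared against the 70 percent threshold used in the script.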
3.12、Enterprise server inspection
#!/bin/bash
function system(){
echo "#########################System information#########################"
OS_TYPE=`uname`
OS_VER=`cat /etc/redhat-release`
OS_KER=`uname -a|awk '{print $3}'`
OS_TIME=`date +%F_%T`
OS_RUN_TIME=`uptime |awk '{print $3}'|awk -F, '{print $1}'`
OS_LAST_REBOOT_TIME=`who -b|awk '{print $2,$3}'`
OS_HOSTNAME=`hostname`
echo "         OS type: $OS_TYPE"
echo "      OS version: $OS_VER"
echo "          Kernel: $OS_KER"
echo "    Current time: $OS_TIME"
echo "          Uptime: $OS_RUN_TIME"
echo "Last reboot time: $OS_LAST_REBOOT_TIME"
echo "        Hostname: $OS_HOSTNAME"
}
function network(){
echo "#########################Network information#########################"
INTERNET=(`ifconfig|grep ens|awk -F: '{print $1}'`)
for((i=0;i<`echo ${#INTERNET[*]}`;i++))
do
OS_IP=`ifconfig ${INTERNET[$i]}|head -2|grep inet|awk '{print $2}'`
echo "       Local IP: ${INTERNET[$i]}: $OS_IP"
done
curl -I http://www.baidu.com &>/dev/null
if [ $? -eq 0 ]
then echo "Internet access: OK"
else echo "Internet access: failed"
fi
}
function hardware(){
echo "#########################Hardware information#########################"
CPUID=`grep "physical id" /proc/cpuinfo |sort|uniq|wc -l`
CPUCORES=`grep "cores" /proc/cpuinfo|sort|uniq|awk -F: '{print $2}'`
CPUMODE=`grep "model name" /proc/cpuinfo|sort|uniq|awk -F: '{print $2}'`
echo "     CPU sockets: $CPUID"
echo "       CPU cores:$CPUCORES"
echo "       CPU model:$CPUMODE"
MEMTOTAL=`free -m|grep Mem|awk '{print $2}'`
MEMFREE=`free -m|grep Mem|awk '{print $7}'`
echo "    Total memory: ${MEMTOTAL}MB"
echo "Available memory: ${MEMFREE}MB"
disksize=0
swapsize=`free|grep Swap|awk {'print $2'}`
partitionsize=(`df -T|sed 1d|egrep -v "tmpfs|sr0"|awk {'print $3'}`)
for ((i=0;i<`echo ${#partitionsize[*]}`;i++))
do
disksize=`expr $disksize + ${partitionsize[$i]}`
done
((disktotal=(disksize+swapsize)/1024/1024))
echo "Total disk capacity: ${disktotal}GB"
diskfree=0
swapfree=`free|grep Swap|awk '{print $4}'`
partitionfree=(`df -T|sed 1d|egrep -v "tmpfs|sr0"|awk '{print $5}'`)
for ((i=0;i<`echo ${#partitionfree[*]}`;i++))
do
diskfree=`expr $diskfree + ${partitionfree[$i]}`
done
((freetotal=(diskfree+swapfree)/1024/1024))
echo " Free disk capacity: ${freetotal}GB"
}
function secure(){
echo "#########################Security information#########################"
countuser=(`last|grep "still logged in"|awk '{print $1}'|sort|uniq`)
for ((i=0;i<`echo ${#countuser[*]}`;i++))
do echo "Logged-in user: ${countuser[$i]}"
done
md5sum -c --quiet /opt/passwd.db &>/dev/null
if [ $? -eq 0 ]
then echo "  User anomaly: no"
else echo "  User anomaly: yes"
fi
}
function chksys(){
system
network
hardware
secure
}
chksys
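The secure function relies on /opt/passwd.db, but the script does not show how that file is produced. The following is a minimal sketch, assuming the file is simply `md5sum` output recorded for account files such as /etc/passwd while the system is in a known-good state; the function names `make_baseline` and `verify_baseline` are hypothetical.

```shell
#!/bin/bash
# make_baseline DB FILE...: record md5 checksums of the given files in DB.
make_baseline() {
    local db=$1
    shift
    md5sum "$@" > "$db"
}
# verify_baseline DB: succeed only if every file recorded in DB is unchanged.
verify_baseline() {
    md5sum -c --quiet "$1" >/dev/null 2>&1
}
```

Under that assumption the baseline would be created once with `make_baseline /opt/passwd.db /etc/passwd` and refreshed after every legitimate account change, otherwise the check reports an anomaly.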
3.13、Periodically e-mailing the daily server check results
#!/bin/bash
source /home/jack/.bash_profile
list=/home/jack/shell/monitor/serverlist
ip=`awk '{print $2}' $list `
log=/home/jack/shell/monitor/logs/check_$(date +%F).log
subject="Daily server inspection results"
if [ `/usr/bin/sudo ls /var/spool/mqueue/|wc -l` -gt 0 ];then
sudo rm -rf /var/spool/mqueue/*
fi
>$log
date|sed 's@CST@@g' >>$log
for i in $ip
do
ping -c 4 $i >/dev/null 2>&1
if [ $? -eq 0 ];then
echo "$(awk -v ip="$i" '$2==ip{print $1}' $list) check OK!" >>$log
else
echo "$(awk -v ip="$i" '$2==ip{print $1}' $list) check failed!" >>$log
fi
done
/bin/mail -s "$subject" n3h3aaaaa@163.com <$log
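The parameters in /etc/mail.rc are set as follows:
set from=sender e-mail address
set smtp=SMTP server address
set smtp-auth-user=mailbox user name
set smtp-auth-password=mailbox password
set smtp-auth=login (selects the login authentication method)
The serverlist file has one server per line in the following format:
server-name server-IP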
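The heading refers to sending the results periodically, which usually means scheduling the script with cron. The following is a minimal sketch, assuming the script above is saved as /home/jack/shell/monitor/check_servers.sh (a hypothetical file name, the source does not give one) and should run daily at 08:00 (also an assumption).

```shell
#!/bin/bash
# Hypothetical path: the source does not name the script file.
script=/home/jack/shell/monitor/check_servers.sh
# Fields: minute hour day-of-month month day-of-week command.
# This entry runs the check every day at 08:00.
cron_entry="0 8 * * * /bin/bash $script"
# Print the entry instead of installing it, so it can be reviewed first.
echo "$cron_entry"
```

To install the entry, it can be appended to the current user's crontab with `(crontab -l 2>/dev/null; echo "$cron_entry") | crontab -`.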
3.14、Python script for web server inspection
from smtplib import SMTP
from email.mime.text import MIMEText
from email.header import Header
from datetime import datetime
import http.client

# (host, port, resource) tuples to probe
web_servers = [('192.168.1.254', 80, 'index.html'),
               ('www.xxx.com', 80, 'index.html'),
               ('114.114.114.114', 9000, '/main/login.html'),
               ]
smtpserver = 'smtp.163.com'
sender = 'xxxx@xxx.com'
password = 'password'
receiver = ('recipient1', 'recipient2')
subject = 'WEB server alert'
From = 'Web server'
To = 'Server administrator'
error_log = '/tmp/web_server_status.txt'


def send_mail(context):
    '''Send the alert e-mail.'''
    msg = MIMEText(context, 'plain', 'utf-8')
    msg['From'] = Header(From, 'utf-8')
    msg['To'] = Header(To, 'utf-8')
    msg['Subject'] = Header(subject, 'utf-8')
    smtp = SMTP(smtpserver)
    smtp.login(sender, password)
    smtp.sendmail(sender, receiver, msg.as_string())
    smtp.close()


def get_now_date_time():
    '''Return the current date and time as a string.'''
    return datetime.now().strftime('%Y-%m-%d %H:%M:%S')


def check_webserver(host, port, resource):
    '''Check the state of a web server.'''
    if not resource.startswith('/'):
        resource = '/' + resource
    connection = http.client.HTTPConnection(host, port)
    try:
        connection.request('GET', resource)
        response = connection.getresponse()
        status = response.status
        content_length = response.length
    except Exception:
        return False
    finally:
        connection.close()
    # Healthy means a 200 or 301 status with a non-empty body
    return status in [200, 301] and content_length != 0


if __name__ == '__main__':
    problem_server_list = []
    with open(error_log, 'a') as logfile:
        for host in web_servers:
            host_url = host[0]
            check = check_webserver(host_url, host[1], host[2])
            if not check:
                temp_string = 'The Server [%s] may appear problem at %s\n' % (host_url, get_now_date_time())
                print(temp_string, file=logfile)
                problem_server_list.append(temp_string)
    if problem_server_list:
        send_mail(''.join(problem_server_list))
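For a quick spot check of a single URL from the command line, the same health rule can be applied to curl output. The following is a minimal sketch, not part of the original script: the function names `is_healthy` and `probe` are hypothetical, the 10 second limit is an assumption, and curl reports the number of bytes actually downloaded rather than the Content-Length value the Python code inspects.

```shell
#!/bin/bash
# is_healthy STATUS SIZE: succeed for status 200 or 301 with a non-empty body.
is_healthy() {
    local status=$1 size=$2
    case $status in
        200|301) [ "$size" != "0" ] ;;
        *) return 1 ;;
    esac
}
# probe URL: fetch URL with curl and apply is_healthy to the result.
probe() {
    local result
    result=$(curl -s -o /dev/null --max-time 10 \
        -w '%{http_code} %{size_download}' "$1") || return 1
    # Word splitting is intended here: result is "CODE SIZE"
    is_healthy $result
}
```

For example, `probe http://192.168.1.254/index.html || echo "server may have a problem"`.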
3.15、Python script for automated Linux server inspection
import os
import paramiko


def login_by_pubkey(serverHost, serverPort, userName, keyFile):
    known_host = os.path.expanduser("~/.ssh/known_hosts")
    ssh = paramiko.SSHClient()
    ssh.load_system_host_keys(known_host)
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    print('Connecting to host %s......' % serverHost)
    ssh.connect(serverHost, serverPort, username=userName, key_filename=keyFile)
    print('Connected to host %s successfully' % serverHost)
    fname = os.path.expanduser('~/xunjian/result_%s' % serverHost)
    with open(fname, 'w') as f:
        # Step 1: disk usage
        stdin, stdout, stderr = ssh.exec_command('df -h')
        f.write('step1:check disk:\n')
        for line in stdout.readlines():
            if len(line) > 0:
                print(line)
                f.write(line)
        # Step 2: system load sampled every 2 seconds, 10 times
        vmstat_stdin, vmstat_stdout, vmstat_stderr = ssh.exec_command('vmstat 2 10')
        f.write('step2:check system:\n')
        for line in vmstat_stdout.readlines():
            if len(line) > 0:
                f.write(line)
        # Step 3: first 10 lines of the java process listing
        process_stdin, process_stdout, process_stderr = ssh.exec_command('ps aux | grep java | head -n 10')
        f.write('step3:check process:\n')
        for line in process_stdout.readlines():
            if len(line) > 0:
                f.write(line)
    ssh.close()
    print('say bye to host %s' % serverHost)
    print('generate image file of %s' % serverHost)
    try:
        java_cmd = '/usr/bin/env java -cp commons-io-2.1.jar:img.jar com.*.*.*.CeateCheckPic %s' % fname
        os.system(java_cmd)
    except Exception as e:
        print('error when generate image file of %s : %s' % (serverHost, e))
    finally:
        print('===generate image file of %s over===' % serverHost)


def login_by_prikey():
    pass


if __name__ == '__main__':
    # Each entry: host,port,user,path to the key file
    ips = ['#ip#,#port#,#user#,#pubkey_path#']
    for ip in ips:
        host, port, user, path = ip.split(',')
        print('==========start %s============' % host)
        login_by_pubkey(host, int(port), user, path)
        print('>>>>>>>>>>end %s<<<<<<<<<<<<<<' % host)