MHA Mysql
集群架构
MHA工作原理
1 从宕机崩溃的master保存二进制日志事件(binlog events) 2 识别含有最新更新的slave 3 应用差异的中继日志(relay log)到其他的slave 4 应用从master保存的二进制日志事件(binlog events) 5 提升一个slave为新的master 6 使其他的slave连接新的master进行复制
MHA软件由两部分组成,Manager工具包和Node工具包
Manager工具包主要包括以下几个工具:
masterha_check_ssh 检查MHA的SSH配置状况 masterha_check_repl 检查MySQL复制状况 masterha_manger 启动MHA masterha_check_status 检测当前MHA运行状态 masterha_master_monitor 检测master是否宕机 masterha_master_switch 故障转移(自动或手动) masterha_conf_host 添加或删除配置的server信息
这些工具通常由MHA Manager的脚本触发,无需人为操作)主要包括以下几个工具:
save_binary_logs 保存和复制master的二进制日志 apply_diff_relay_logs 识别差异的中继日志事件并将其差异的事件应用于其他的slave filter_mysqlbinlog 去除不必要的ROLLBACK事件(MHA已不再使用此工具) purge_relay_logs 清除中继日志(不会阻塞SQL线程)
注意:为了尽可能的减少主库硬件损坏宕机造成的数据丢失,因此在配置MHA的同时建议配置成MySQL 5.5的半同步复制
- 自定义扩展:
secondary_check_script: 通过多条网络路由检测master的可用性 master_ip_ailover_script: 更新Application使用的masterip shutdown_script: 强制关闭master节点 report_script: 发送报告 init_conf_load_script: 加载初始配置参数 master_ip_online_change_script:更新master节点ip地址 - 配置文件:
global配置,为各application提供默认配置 application配置:为每个主从复制集群
准备工作
所需软件
http://www.zzq2525.club/mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
http://www.zzq2525.club/mha4mysql-node-0.58-0.el7.centos.noarch.rpm
?
分发密钥对,使所有主机可互相ssh免密登录
用此方法只需要在30.100上操作
ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.30.100
scp /root/.ssh root@192.168.30.101
scp /root/.ssh root@192.168.30.102
scp /root/.ssh root@192.168.30.103
所有主机配置时间同步
配置 chronyd :(192.168.30.103为ntp server)
vim /etc/chrony.conf
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
server ntp.aliyun.com prefer
systemctl start chronyd
systemctl enable chronyd
其他主机
vim /etc/chrony.conf
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
server 192.168.30.103 prefer
systemctl start chronyd
systemctl enable chronyd
查看时间同步源
chronyc sources -v
chronyc sourcestats -v
给master和master备用以及slave搭建基于GTID的主从复制
Master的配置
①修改配置文件
vim /etc/my.cnf
[mysqld]
basedir=/usr/local/mysql/
datadir=/usr/local/mysql/data
socket=/usr/local/mysql/mysql.sock
port=3306
log_error=/usr/local/mysql/logs/error.log
server-id = 1
binlog_format = row
expire_logs_days = 30
max_binlog_size = 100M
gtid_mode = ON
enforce_gtid_consistency = ON
log-bin = /usr/local/mysql/logs/mysql-bin
log_bin_index = /usr/local/mysql/logs/mysql-bin.index
log-slave-updates = ON
[mysqld_safe]
log_error=/usr/local/mysql/logs/error.log
pid-file=/usr/local/mysql/logs/mysql.pid
[client]
socket=/usr/local/mysql/mysql.sock
参数简介: server_id:主从集群里每个mysql唯一标识。 gtid_mode:开启GTID模式,每个事务有唯一ID标识。 enforce-gtid-consistency:用于禁用对GTID模式不安全的事务操作。 log-bin:记录binlog的路径,注意log是目录,binlog是文件名的前缀。 binlog_format:基于行复制,生产环境一般都这样,可靠。
②重启服务
[root@sdb1 ~]
③登陆mysql创建复制用户并且授权
GRANT REPLICATION SLAVE ON . to 'slave'@'192.168.30.%' IDENTIFIED BY '123456';
表示repl用户在30网段内都有复制的权限密码为123456 如果只是做主从复制 从主机可以不用创建授权用户但是如果基于MHA做高可用为了监控主机能监控复制能力,从主机也得创建相同的授权用户
④查看master状态
SHOW MASTER STATUS;
⑤查看是否开启GTID功能
SHOW VARIABLES LIKE '%gtid%';
master备用的配置(由于这个与slave配置相同所以这里只介绍master备用的配置方法)
①修改配置文件
vim /etc/my.cnf
[mysqld]
basedir=/usr/local/mysql/
datadir=/usr/local/mysql/data
socket=/usr/local/mysql/mysql.sock
port=3306
log_error=/usr/local/mysql/logs/error.log
server-id = 2
gtid_mode = ON
enforce_gtid_consistency = ON
log-slave-updates = ON
skip-slave-start = true
expire_logs_days = 30
max_binlog_size = 100M
read_only = ON
log-bin = /usr/local/mysql/logs/mysql-bin
log_bin_index = /usr/local/mysql/logs/mysql-bin.index
relay-log = /usr/local/mysql/logs/relay-log
relay-log-index = /usr/local/mysql/logs/relay-log-index
relay-log-info-file = /usr/local/mysql/logs/relay-log.info
master-info-repository = table
relay-log-info-repository = table
[mysqld_safe]
log_error=/usr/local/mysql/logs/error.log
pid-file=/usr/local/mysql/logs/mysql.pid
[client]
socket=/usr/local/mysql/mysql.sock
参数简介:
master-info-repository:此选项使服务器将其主信息日志写入文件或表。文件名默认为 master.info; 您可以使用–master-info-file服务器选项更改文件的名称 。默认值为 FILE。如果使用 TABLE,则会将日志写入数据库中的 slave_master_info表 mysql。 relay-log-info-repository:此选项的默认值为 FILE。如果使用 TABLE,则会将日志写入数据库中的 slave_relay_log_info表 mysql。该选项可用于使复制从属对临时意外暂停具有弹性。
注意server-id = 2 这个每台主机都不能一样
②登陆mysql创建授权用户(如果仅作主从复制这步可略)
GRANT REPLICATION SLAVE ON *.* to 'repl'@'%' IDENTIFIED BY 'uioP973@';
③配置以master做主
CHANGE MASTER TO MASTER_HOST='192.168.122.200',MASTER_PORT=3306,MASTER_USER='repl',MASTER_PASSWORD='uioP973@',MASTER_AUTO_POSITION=1;
④开启slave
start slave;
⑤查看状态
show slave status \G
准备mha环境(所有节点)
master和候选master以及slave的环境准备
①安装依赖包
[root@sdb1 ~]
②安装mha4mysql-node
yum -y install http://www.zzq2525.club/mha4mysql-node-0.58-0.el7.centos.noarch.rpm
③登陆mysql创建基于监控的用户并设置权限
grant all privileges on *.* to 'mha'@'192.168.30.%' identified by '123456';
部署监控节点
给监控主机搭建阿里yum源以及epel源头 ①把所有repo文件移入新的目录
cd /etc/yum.repos.d/
mkdir bak
mv * bak/
②搭建epel源
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
yum install epel-release
yum makecache
yum repolist
③安装依赖包
yum -y install perl-Config-Tiny epel-release perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes perl-DBD-MySQL
④安装manager包(要安装客户端才能安装管理端)
yum -y install http://www.zzq2525.club/mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
yum -y install http://www.zzq2525.club/mha4mysql-node-0.58-0.el7.centos.noarch.rpm
⑤创建配置文件需要的目录
mkdir -p /etc/mha
mkdir -p /var/log/mha/app1
⑥创建配置文件
vim /etc/mha/app1.cnf
[server default]
manager_log=/var/log/mha/mha1/manager
manager_workdir=/var/log/mha/mha1
master_binlog_dir=/usr/local/mysql/data
user=mha
password=123456
ping_interval=2
repl_user=slave
repl_password=123456
ssh_user=root
[server1]
hostname=192.168.74.230
port=3306
candidate_master=1
[server2]
hostname=192.168.74.231
port=3306
candidate_master=1
[server3]
hostname=192.168.74.232
port=3306
no_master=1
特别说明: 参数:candidate_master=1 解释:设置为候选master,如果设置该参数以后,发生主从切换以后会将此从库提升为 主库,即使这个主库不是集群中事件最新的slave 参数:check_repl_delay=0 解释:默认情况下如果一个slave落后master 100M的relay logs 的话,MHA将不会选 择该slave作为一个新的master,因为对于这个slave的恢复需要花费很长时间,通过设 置check_repl_delay=0,MHA触发切换在选择一个新的master的时候将会忽略复制延 时,这个参数对于设置了candidate_master=1的主机非常有用,因为这个候选主在切换 的过程中一定是新的master
使用mha
- 做ssh健康检测
masterha_check_ssh --conf=/etc/mha/app1.cnf
- 做主从复制检测
masterha_check_repl --conf=/etc/mha/app1.cnf
- 启动MHA
masterha_manager --conf=/etc/mha/app1.cnf
nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /var/log/mha/mha1/manager.log 2>&1 &
- 手动切换master
masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host=192.168.122.200 --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=10000
恢复MHA故障
1.手动修复
1)修复挂掉的数据库
[root@db01 ~]
2)找到主从语句
[root@db03 ~]
Mon Nov 9 20:14:17 2020 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.16.1.52', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000007', MASTER_LOG_POS=120, MASTER_USER='rep', MASTER_PASSWORD='xxx';
3)修复的数据库执行change语句
mysql> CHANGE MASTER TO MASTER_HOST='172.16.1.52', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000007', MASTER_LOG_POS=120, MASTER_USER='rep', MASTER_PASSWORD='123';
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
4)查看主从状态
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 172.16.1.52
Master_User: rep
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000007
Read_Master_Log_Pos: 211
Relay_Log_File: db01-relay-bin.000002
Relay_Log_Pos: 374
Relay_Master_Log_File: mysql-bin.000007
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
5)修复MHA配置
[root@db03 ~]
[server default]
manager_log=/service/mha/manager
manager_workdir=/service/mha/app1
master_binlog_dir=/usr/local/mysql/data
password=mha
ping_interval=2
repl_password=123
repl_user=rep
ssh_user=root
user=mha
[server1]
hostname=172.16.1.51
port=3306
[server2]
hostname=172.16.1.52
port=3306
[server3]
hostname=172.16.1.53
port=3306
MHA VIP
使用自带脚本配置VIP漂移
需要修改master_ip_failover脚本。
注意修改脚本中的VIP地址,以及ifconfig命令的绝对路径!
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
my $vip = '10.0.102.110/22';
my $key = '0';
my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down";
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);
exit &main();
sub main {
print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
return 0 unless ($ssh_user);
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
注意要给脚本加上可执行权限
chmod +x master_ip_failover
然后修改配置文件:
[server default]
manager_log=/data/log/app1/manager.log
manager_workdir=/data/log/app1
master_binlog_dir=/data/mysql
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_failover
password=123456
ping_interval=3
remote_workdir=/data/log/masterha
repl_password=123456
repl_user=repl
report_script=/usr/local/bin/send_report
secondary_check_script=masterha_secondary_check -s test1 -s mgt01 --user=root --port=22 --master_host=test2 --master_port=3306
ssh_port=22
ssh_user=root
user=root
然后进行repl测试,若测试通过,在启动MHA监控!
使用自带的脚本做failover需要手动添加虚拟IP,在当前的master上添加VIP!
[root@test1 ~]# ifconfig eth0:0 10.0.102.110/22 #添加虚拟IP
|