前言
随着云服务的迅速发展,越来越多企业选择将服务托管在云服务中,在数据库领域,AWS RDS、Aliyun RDS等都是不错的选择,默认已经做了高可用,基础运维,可以为企业节省不少的运维成本。由于RDS物理数据、Root权限等其他对象对用户不开放,难免在自动化运维中有些壁垒。本文将围绕在AWS EC2结合MHA做MySQL的高可用,EC2不支持VIP,可通过绑定私有IP的方式来实现。
MHA简介
MHA是由日本MySQL专家youshimaton(现就职于Facebook公司)用Perl写的一套MySQL故障切换方案,以保障数据库的高可用性。在MySQL故障切换过程中,MHA能做到在0~30s之内实现主MySQL故障转移。
该软件由两部分组成:MHA Manager(管理节点)和MHA Node(数据节点)。MHA Manager可以单独部署在一台独立的机器上管理多个master-slave集群,也可以部署在一台slave节点上。MHA Node运行在每台MySQL服务器上,MHA Manager会定时探测集群中的master节点,当master出现故障时,它可以自动将最新数据的slave提升为新的master,然后将所有其他的slave重新指向新的master。整个故障转移过程对应用程序完全透明。
准备
操作系统:Amazon Linux
master:172.31.13.126
slave:172.31.9.182
vip:172.31.0.200
其中,使用两台机器,分别部署主库和从库,MHA默认部署在从库上,下方中的VIP通指私有IP。
系统初始化修改
修改主机名
主库修改主机名
1 | [root@master ~]# cat /etc/sysconfig/network |
从库修改主机名
1 | [root@slave ~]# cat /etc/sysconfig/network |
修改时区
Amazon Linux默认安装好后英国时间,需要在主库和从库修改时区1
2[root@master ~]# ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
[root@master ~]# /usr/sbin/ntpdate 0.centos.pool.ntp.org && /sbin/hwclock -w &>/dev/null
设置防火墙
主库设置,允许从库访问
1 | [root@master ~]# /etc/init.d/iptables status|grep 172.31.9.182 |
其中,172.31.9.182是允许从库的访问规则
从库设置,允许主库访问
1 | [root@slave ~]# /etc/init.d/iptables status|grep 172.31.13.126 |
其中,172.31.13.126是允许主库的访问规则
关闭SELINUX
由于使用的操作系统为Amazon Linux,通过查看无selinux设置。
建立SSH无密码登录
主库设置
修改服务器/etc/ssh/ssh_config文件,把参数GSSAPIAuthentication修改为no。
修改服务器/etc/ssh/sshd_config文件,把参数PasswordAuthentication修改为yes,参数PermitRootLogin修改为yes,重启ssh服务
修改服务器/etc/hosts.allow文件,允许从库连接1
2
3
4
5
6
7
8
9[root@master ~]# sed '/^#/d;/^$/d' /etc/ssh/ssh_config |grep GSSAPIAuthentication
GSSAPIAuthentication no
[root@master ~]# sed '/^#/d;/^$/d' /etc/ssh/sshd_config |grep -E 'PasswordAuthentication|PermitRootLogin'
PermitRootLogin yes
PermitRootLogin forced-commands-only
PasswordAuthentication yes
[root@master ~]# /etc/init.d/sshd restart
停止 sshd: [确定]
正在启动 sshd: [确定]
使用命令ssh-keygen生成公钥,发送到从库,同时尝试在主库无密码形式登录从库,而且也要保证本机无密码登录本机1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64[root# ssh-keygen ~]
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
43:a3:80:f1:8c:d4:5e:8d:3c:99:e5:39:39:e3:89:9a root
The key's randomart image is:
+--[ RSA 2048]----+
| o. . *. |
| . *. B..o |
| o.+. .X |
| .. = * |
| o S |
| o . |
| E |
| |
| |
+-----------------+
[root# ssh-copy-id -i ~/.ssh/id_rsa.pub root@172.31.9.182 ~]
The authenticity of host '172.31.9.182 (172.31.9.182)' can't be established.
ECDSA key fingerprint is 72:71:66:dc:6c:b0:31:e7:6c:77:4c:8d:32:69:e0:88.
Are you sure you want to continue connecting (yes/no)? yes
root password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'root@172.31.9.182'"
and check to make sure that only the key(s) you wanted were added.
[root# ssh-copy-id -i ~/.ssh/id_rsa.pub root@172.31.13.126 ~]
The authenticity of host '172.31.13.126 (172.31.13.126)' can't be established.
ECDSA key fingerprint is b6:ef:9f:f0:5e:fd:8f:49:ef:be:79:fb:44:ea:63:08.
Are you sure you want to continue connecting (yes/no)? yes
root password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'root@172.31.13.126'"
and check to make sure that only the key(s) you wanted were added.
[root# ssh 'root@172.31.9.182' ~]
Last login: Thu Jul 20 04:51:07 2017
__| __|_ )
_| ( / Amazon Linux AMI
___|\___|___|
https://aws.amazon.com/amazon-linux-ami/2017.03-release-notes/
[root# exit ~]
logout
Connection to 172.31.9.182 closed.
[root# ssh 'root@172.31.13.126' ~]
Last login: Thu Jul 20 04:37:13 2017
__| __|_ )
_| ( / Amazon Linux AMI
___|\___|___|
https://aws.amazon.com/amazon-linux-ami/2017.03-release-notes/
[root# exit ~]
logout
Connection to 172.31.13.126 closed.
从库设置
和主库的配置相同
修改服务器/etc/ssh/ssh_config文件,把参数GSSAPIAuthentication修改为no。
修改服务器/etc/ssh/sshd_config文件,把参数PasswordAuthentication修改为yes,参数PermitRootLogin修改为yes,重启ssh服务
修改服务器/etc/hosts.allow文件,允许从库连接1
2
3
4
5
6
7
8
9[root@slave ~]
GSSAPIAuthentication no
[root@slave ~]
PermitRootLogin yes
PermitRootLogin forced-commands-only
PasswordAuthentication yes
[root@slave ~]
停止 sshd: [确定]
正在启动 sshd: [确定]
使用命令ssh-keygen生成公钥,发送到主库,同时尝试在从库无密码形式登录主库,而且也要保证本机无密码登录本机1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86[root# ssh-keygen ~]
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
7d:57:15:e0:20:6f:6f:45:00:76:c4:ba:66:32:fd:b3 root
The key's randomart image is:
+--[ RSA 2048]----+
| . ++=ooo|
| + +.. .|
| o.. ..|
| .... .. |
| S o oo. |
| o *.. |
| = . |
| o |
| Eo |
+-----------------+
[root# ssh-keygen ~]
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
da:ea:e3:1c:fb:4b:f5:84:da:a3:47:ea:2c:fa:3c:a6 root
The key's randomart image is:
+--[ RSA 2048]----+
| |
| |
| |
| . |
| S o . |
| o +.o |
| o +oo . |
| o=*.... |
| EBO**o |
+-----------------+
[root# ssh-copy-id -i ~/.ssh/id_rsa.pub root@172.31.13.126 ~]
The authenticity of host '172.31.13.126 (172.31.13.126)' can't be established.
ECDSA key fingerprint is b6:ef:9f:f0:5e:fd:8f:49:ef:be:79:fb:44:ea:63:08.
Are you sure you want to continue connecting (yes/no)? yes
root password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'root@172.31.13.126'"
and check to make sure that only the key(s) you wanted were added.
[root# ssh-copy-id -i ~/.ssh/id_rsa.pub root@172.31.9.182 ~]
The authenticity of host '172.31.9.182 (172.31.9.182)' can't be established.
ECDSA key fingerprint is 72:71:66:dc:6c:b0:31:e7:6c:77:4c:8d:32:69:e0:88.
Are you sure you want to continue connecting (yes/no)? yes
root password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'root@172.31.9.182'"
and check to make sure that only the key(s) you wanted were added.
[root# ssh 'root@172.31.13.126' ~]
Last login: Thu Jul 20 05:36:57 2017 from 172.31.13.126
__| __|_ )
_| ( / Amazon Linux AMI
___|\___|___|
https://aws.amazon.com/amazon-linux-ami/2017.03-release-notes/
[root# exit ~]
logout
Connection to 172.31.13.126 closed.
[root# ssh 'root@172.31.9.182' ~]
Last login: Thu Jul 20 05:36:46 2017 from 172.31.13.126
__| __|_ )
_| ( / Amazon Linux AMI
___|\___|___|
https://aws.amazon.com/amazon-linux-ami/2017.03-release-notes/
[root# exit ~]
logout
Connection to 172.31.9.182 closed.
MySQL安装
使用的是自己打包的RPM包,分别登录主库和从库,直接rpm -ivh percona-server-5.7.18-15.x86_64.rpm 即可,也可使用其他方式安装MySQL1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16[root@master ~]# mv /home/zhouting/percona-server-5.7.18-15.x86_64.rpm /usr/src/
[root@master ~]# rpm -ivh /usr/src/percona-server-5.7.18-15.x86_64.rpm
Preparing... ################################# [100%]
Updating / installing...
1:percona-server-5.7.18-15 ################################# [100%]
error reading information on service /etc/rc.d/init.d/mysqld: No such file or directory
ERROR! MySQL (Percona Server) PID file could not be found!
Starting MySQL (Percona Server).. SUCCESS!
mysqladmin: [Warning] Using a password on the command line interface can be insecure.
Warning: Since password will be sent to server in plain text, use ssl connection to ensure password safety.
mysql: [Warning] Using a password on the command line interface can be insecure.
mysql: [Warning] Using a password on the command line interface can be insecure.
mysql: [Warning] Using a password on the command line interface can be insecure.
mysql: [Warning] Using a password on the command line interface can be insecure.
mysql: [Warning] Using a password on the command line interface can be insecure.
mysql: [Warning] Using a password on the command line interface can be insecure.
查看是否安装成功,分别登录主库和从库查看1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24[root@master ~]# mysql -uroot -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 11
Server version: 5.7.18-15-log PLD/Linux Distribution Percona Server RPM
Copyright (c) 2009-2017 Percona LLC and/or its affiliates
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MySQL [(none)] 16:57:43 > select version();
+---------------+
| version() |
+---------------+
| 5.7.18-15-log |
+---------------+
1 row in set (0.00 sec)
MySQL [(none)] 16:57:47 >
MySQL互为主备
重要参数配置
主库必须包含以下参数1
2
3
4
5
6
7
8
9
10[root@master ~]# cat /etc/my.cnf
[mysqld]
server-id = 1
autocommit = 1
auto_increment_increment = 1
auto_increment_offset = 2
log_bin = mysql-bin
gtid_mode = on
enforce_gtid_consistency = 1
log_slave_updates
从库必须包含以下参数,注意slave需要动态设置read_only=1,1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28[root@slave ~]# cat /etc/my.cnf
server-id = 2
autocommit = 1
auto_increment_increment = 2
auto_increment_offset = 2
log_bin = mysql-bin
gtid_mode = on
enforce_gtid_consistency = 1
log_slave_updates
relay_log_purge=0
[root@slave ~]# mysql -uroot -p -e "show variables like 'read_only'"
Enter password:
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| read_only | OFF |
+---------------+-------+
[root@slave ~]# mysql -uroot -p -e "set global read_only=1"
Enter password:
[root@slave ~]# mysql -uroot -p -e "show variables like 'read_only'"
Enter password:
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| read_only | ON |
+---------------+-------+
[root@slave ~]#
复制帐号建立
由于MySQL版本使用的是5.7,创建用户的方法以之前有些不同
主库创建
1 | [root@master ~]# mysql -uroot -p |
从库创建
1 | [root@slave ~]# mysql -uroot -p |
复制帐号验证
主库登录从库1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16[root@master ~]# mysql -h172.31.9.182 -uslave01 -pslave123456
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 5.7.18-15-log PLD/Linux Distribution Percona Server RPM
Copyright (c) 2009-2017 Percona LLC and/or its affiliates
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MySQL [(none)] 17:22:15 > exit
从库登录主库1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16[root@slave ~]# mysql -h172.31.13.126 -uslave01 -pslave123456
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 5.7.18-15-log PLD/Linux Distribution Percona Server RPM
Copyright (c) 2009-2017 Percona LLC and/or its affiliates
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MySQL [(none)] 17:22:57 > exit
复制关系配置
主库配置
登录主库,设置复制关系,来源于从库的复制1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87[root@master ~]# mysql -uroot -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 17
Server version: 5.7.18-15-log PLD/Linux Distribution Percona Server RPM
Copyright (c) 2009-2017 Percona LLC and/or its affiliates
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MySQL [(none)] 17:32:19 > change master to master_host='172.31.9.182',master_user='slave01',master_password='slave123456',master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.04 sec)
MySQL [(none)] 17:33:03 > start slave;
Query OK, 0 rows affected (0.03 sec)
MySQL [(none)] 17:33:47 > show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 172.31.9.182
Master_User: slave01
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 810
Relay_Log_File: master-relay-bin.000002
Relay_Log_Pos: 1023
Relay_Master_Log_File: mysql-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 810
Relay_Log_Space: 1231
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
Master_UUID: a1c1d189-6b96-11e7-9825-02a13635a5ca
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: a1c1d189-6b96-11e7-9825-02a13635a5ca:1-3
Executed_Gtid_Set: 9a7b78fc-6b96-11e7-9856-0232d9c5deea:1-17,
a1c1d189-6b96-11e7-9825-02a13635a5ca:1-3
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
ERROR:
No query specified
MySQL [(none)] 17:33:54 >
从库配置
登录从库,设置复制关系,来源于主库的复制1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87[root@slave ~]# mysql -uroot -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 10
Server version: 5.7.18-15-log PLD/Linux Distribution Percona Server RPM
Copyright (c) 2009-2017 Percona LLC and/or its affiliates
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MySQL [(none)] 17:34:41 > change master to master_host='172.31.13.126',master_user='slave01',master_password='slave123456',master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.03 sec)
MySQL [(none)] 17:35:14 > start slave;
Query OK, 0 rows affected (0.03 sec)
MySQL [(none)] 17:35:15 > show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 172.31.0.139
Master_User: slave01
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 4455
Relay_Log_File: slave-relay-bin.000002
Relay_Log_Pos: 4059
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 4455
Relay_Log_Space: 4266
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: 9a7b78fc-6b96-11e7-9856-0232d9c5deea
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: 9a7b78fc-6b96-11e7-9856-0232d9c5deea:1-17
Executed_Gtid_Set: 9a7b78fc-6b96-11e7-9856-0232d9c5deea:1-17,
a1c1d189-6b96-11e7-9825-02a13635a5ca:1-3
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
ERROR:
No query specified
MySQL [(none)] 17:35:18 >
复制测试
主库登录测试
登录主库,插入测试数据1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34[root@master ~]# mysql -uroot -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 37
Server version: 5.7.18-15-log PLD/Linux Distribution Percona Server RPM
Copyright (c) 2009-2017 Percona LLC and/or its affiliates
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MySQL [(none)] 17:35:48 > create database if not exists test11;use test11;
Query OK, 1 row affected (0.00 sec)
Database changed
MySQL [test11] 17:35:52 > create table if not exists t11(id int unsigned not null auto_increment,name varchar(20),primary key(`id`));
Query OK, 0 rows affected (0.02 sec)
MySQL [test11] 17:35:55 > insert into t11 values(null,'master11');
Query OK, 1 row affected (0.01 sec)
MySQL [test11] 17:35:58 > select * from t11;
+----+----------+
| id | name |
+----+----------+
| 1 | master11 |
+----+----------+
1 row in set (0.00 sec)
MySQL [test11] 17:36:02 >
登录从库,查看测试数据,此时数据已经复制过来1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24[root@slave ~]# mysql -uroot -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 29
Server version: 5.7.18-15-log PLD/Linux Distribution Percona Server RPM
Copyright (c) 2009-2017 Percona LLC and/or its affiliates
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MySQL [(none)] 17:36:25 > select * from test11.t11;
+----+----------+
| id | name |
+----+----------+
| 1 | master11 |
+----+----------+
1 row in set (0.00 sec)
MySQL [(none)] 17:36:30 >
从库登录测试
登录从库,插入测试数据1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34[root@slave ~]# mysql -uroot -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 32
Server version: 5.7.18-15-log PLD/Linux Distribution Percona Server RPM
Copyright (c) 2009-2017 Percona LLC and/or its affiliates
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MySQL [(none)] 17:37:59 > create database if not exists test11;use test11;
Query OK, 1 row affected, 1 warning (0.01 sec)
Database changed
MySQL [test11] 17:38:03 > create table if not exists t22(id int unsigned not null auto_increment,name varchar(20),primary key(`id`));
Query OK, 0 rows affected (0.03 sec)
MySQL [test11] 17:38:18 > insert into t22 values(null,'slave11');
Query OK, 1 row affected (0.00 sec)
MySQL [test11] 17:38:30 > select * from t22;
+----+---------+
| id | name |
+----+---------+
| 2 | slave11 |
+----+---------+
1 row in set (0.00 sec)
MySQL [test11] 17:38:35 >
登录主库,查看测试数据,此时数据已经复制过来1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24[root@master ~]# mysql -uroot -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 38
Server version: 5.7.18-15-log PLD/Linux Distribution Percona Server RPM
Copyright (c) 2009-2017 Percona LLC and/or its affiliates
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MySQL [(none)] 17:39:09 > select * from test11.t22;
+----+---------+
| id | name |
+----+---------+
| 2 | slave11 |
+----+---------+
1 row in set (0.00 sec)
MySQL [(none)] 17:39:15 >
MHA帐号建立
主库创建
登录主库,创建允许主库和从库连接的MHA用户信息,再分别尝试使用MHA用户登录(省略)1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29[root@master ~]# mysql -uroot -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 41
Server version: 5.7.18-15-log PLD/Linux Distribution Percona Server RPM
Copyright (c) 2009-2017 Percona LLC and/or its affiliates
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MySQL [(none)] 09:36:55 > create user 'mha01'@'172.31.9.182' identified by 'mha123456';
Query OK, 0 rows affected (0.00 sec)
MySQL [(none)] 09:37:14 > grant all privileges on *.* to 'mha01'@'172.31.9.182';
Query OK, 0 rows affected (0.01 sec)
MySQL [(none)] 09:38:02 > create user 'mha01'@'172.31.13.126' identified by 'mha123456';
Query OK, 0 rows affected (0.00 sec)
MySQL [(none)] 09:38:14 > grant all privileges on *.* to 'mha01'@'172.31.13.126';
Query OK, 0 rows affected (0.01 sec)
MySQL [(none)] 09:38:18 > flush privileges;
Query OK, 0 rows affected (0.00 sec)
从库创建
登录从库,由于复制已经把在主库创建的MHA用户信息复制了过来,尝试登录即可(省略)
MHA安装
安装依赖
分别在主库和从库上安装,执行命令:
yum -y install gcc gcc-c++ make openssl-devel perl perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Config-IniFiles perl-Time-HiRes perl-Module-Install.noarch mailx jwhois
主库安装mha4mysql-node
MHA管理端安装在从库上,主库上只需要mha4mysql-node即可,由于MySQL使用5.7版本,MHA也使用最新的0.57版本,采用编译安装方式。下载地址:https://mega.nz/#F!G4oRjARB!SWzFS59bUv9VrKwdAeIGVw1
2
3
4
5
6
7
8
9
10
11[root@master ~]# cd /usr/src/
[root@master src]# tar -zxf mha4mysql-node-0.57.tar.gz
[root@master mha4mysql-node-0.57]# perl Makefile.PL #编译过程省略
[root@master mha4mysql-node-0.57]# make
[root@master mha4mysql-node-0.57]# make install
[root@master mha4mysql-node-0.57]# ll /usr/local/bin/ -t #查看安装,有以下文件则mha4mysql-node安装成功
total 1760
-r-xr-xr-x 1 root root 16381 Jul 21 09:41 apply_diff_relay_logs
-r-xr-xr-x 1 root root 8261 Jul 21 09:41 purge_relay_logs
-r-xr-xr-x 1 root root 7525 Jul 21 09:41 save_binary_logs
-r-xr-xr-x 1 root root 4807 Jul 21 09:41 filter_mysqlbinlog
从库安装mha4mysql-manager和mha4mysql-node
MHA管理端安装在从库,从库需要安装manager端和node端1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29[root@slave ~]# cd /usr/src/
[root@slave src]# tar -zxf mha4mysql-manager-0.57.tar.gz
[root@slave src]# cd mha4mysql-manager-0.57
[root@slave mha4mysql-manager-0.57]# perl Makefile.PL #编译过程省略
[root@slave mha4mysql-manager-0.57]# make
[root@slave mha4mysql-manager-0.57]# make install
[root@slave mha4mysql-manager-0.57]# ll /usr/local/bin/ -t #查看安装,有以下文件则mha4mysql-manager安装成功
total 1756
-r-xr-xr-x 1 root root 1779 Jul 21 09:48 masterha_check_ssh
-r-xr-xr-x 1 root root 2517 Jul 21 09:48 masterha_manager
-r-xr-xr-x 1 root root 2373 Jul 21 09:48 masterha_master_switch
-r-xr-xr-x 1 root root 5171 Jul 21 09:48 masterha_secondary_check
-r-xr-xr-x 1 root root 1995 Jul 21 09:48 masterha_check_repl
-r-xr-xr-x 1 root root 1865 Jul 21 09:48 masterha_check_status
-r-xr-xr-x 1 root root 3201 Jul 21 09:48 masterha_conf_host
-r-xr-xr-x 1 root root 2165 Jul 21 09:48 masterha_master_monitor
-r-xr-xr-x 1 root root 1739 Jul 21 09:48 masterha_stop
[root@slave mha4mysql-manager-0.57]# cd ..
[root@slave src]# tar -zxf mha4mysql-node-0.57.tar.gz
[root@slave src]# cd mha4mysql-node-0.57
[root@slave mha4mysql-node-0.57]# perl Makefile.PL #编译过程省略
[root@slave mha4mysql-node-0.57]# make
[root@slave mha4mysql-node-0.57]# make install
[root@slave mha4mysql-node-0.57]# ll /usr/local/bin/ -t #查看安装,有以下文件则mha4mysql-node安装成功
total 1800
-r-xr-xr-x 1 root root 16381 Jul 21 09:54 apply_diff_relay_logs
-r-xr-xr-x 1 root root 8261 Jul 21 09:54 purge_relay_logs
-r-xr-xr-x 1 root root 7525 Jul 21 09:54 save_binary_logs
-r-xr-xr-x 1 root root 4807 Jul 21 09:54 filter_mysqlbinlog
MHA目录结构说明
MHAManager
mhamanager工具包主要包括以下工具1
2
3
4
5
6
7
8
9
10
11[root@slave ~]# ll /usr/local/bin/
总用量 84
-r-xr-xr-x 1 root root 1995 7月 19 11:49 masterha_check_repl #检查MySQL复制情况
-r-xr-xr-x 1 root root 1779 7月 19 11:49 masterha_check_ssh #检查MHA的SSH配置情况
-r-xr-xr-x 1 root root 1865 7月 19 11:49 masterha_check_status #检测当前MHA运行状态
-r-xr-xr-x 1 root root 3201 7月 19 11:49 masterha_conf_host #添加或删除配置的server信息
-r-xr-xr-x 1 root root 2517 7月 19 11:49 masterha_manager #启动MHA
-r-xr-xr-x 1 root root 2165 7月 19 11:49 masterha_master_monitor #检测Master是否宕机
-r-xr-xr-x 1 root root 2373 7月 19 11:49 masterha_master_switch #控制故障转移,自动或者手动
-r-xr-xr-x 1 root root 5172 7月 19 11:49 masterha_secondary_check #通过其他路由检测Master是否真的宕机
-r-xr-xr-x 1 root root 1739 7月 19 11:49 masterha_stop #停止MHA
MHANode
mhanode工具包主要包括以下工具1
2
3
4
5
6[root@master ~]# ll /usr/local/bin/
总用量 44
-r-xr-xr-x 1 root root 16371 7月 19 11:41 apply_diff_relay_logs #识别差异日志的中继日志,并将其差异事件应用于其他Slave
-r-xr-xr-x 1 root root 4807 7月 19 11:41 filter_mysqlbinlog #去除不必要的Rollback事件
-r-xr-xr-x 1 root root 8263 7月 19 11:41 purge_relay_logs #删除无用的Relay log,避免延时
-r-xr-xr-x 1 root root 7525 7月 19 11:41 save_binary_logs #保存和复制down掉的主服务器二进制日志
自定义扩展脚本说明
secondary_check_script #通过多条网络路由检测master的可用性
master_ip_failover_script #自动failover时候的切换脚本,可将vip信息写入此脚本中
shutdown_script #强制关闭master节点执行脚本
report_script #发送报告
init_conf_load_script #加载初始配置参数,如不想在配置中写明文密码
master_ip_online_change_script #手动failover时候的切换脚本
MHA扩展脚本
由于使用的AWS EC2,不支持VIP,可通过辅助IP的形式实现,这里使用脚本来辅助实现HA。
aws_vip_change.sh
1 | [root@slave masterha]# cat /usr/local/bin/aws_vip_change.sh |
master_ip_failover
1 | [root@slave masterha]# cat /usr/local/bin/master_ip_failover |
send_report
1 | [root@slave masterha]# cat /usr/local/bin/send_report |
主库手动绑定ENI和实例关系
在主库上先申请新的ENI,注意申请时先绑定的group,以满足其他EC2能够连接些ENI。
申请成功后,根据新的ENI和当前主实例的实例ID绑定,最终确认新的VIP是否绑定上。1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
[root@master ~]# ifconfig
eth0 Link encap:Ethernet HWaddr 02:E6:0B:68:5C:1C
inet addr:172.31.13.126 Bcast:172.31.15.255 Mask:255.255.240.0
inet6 addr: fe80::e6:bff:fe68:5c1c/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:9001 Metric:1
RX packets:4063 errors:0 dropped:0 overruns:0 frame:0
TX packets:3292 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:691839 (675.6 KiB) TX bytes:390499 (381.3 KiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:79 errors:0 dropped:0 overruns:0 frame:0
TX packets:79 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:16736 (16.3 KiB) TX bytes:16736 (16.3 KiB)
[root@master ~]# aws ec2 create-network-interface --subnet-id subnet-33859151 --private-ip-address 172.31.0.200 --groups sg-e50a1c87 sg-6b1f0809
{
"NetworkInterface": {
"Status": "pending",
"MacAddress": "02:12:25:83:a0:22",
"SourceDestCheck": true,
"VpcId": "vpc-b18195d3",
"Description": "",
"NetworkInterfaceId": "eni-7a5c7722", #新申请的ENI名字
"PrivateIpAddresses": [
{
"PrivateDnsName": "ip-172-31-0-200.cn-north-1.compute.internal",
"Primary": true,
"PrivateIpAddress": "172.31.0.200" #新ENI绑定的私有IP
}
],
"RequesterManaged": false,
"Groups": [
{
"GroupName": "For Mysql",
"GroupId": "sg-e50a1c87"
},
{
"GroupName": "OnlySSH-Allow",
"GroupId": "sg-6b1f0809"
}
],
"PrivateDnsName": "ip-172-31-0-200.cn-north-1.compute.internal",
"AvailabilityZone": "cn-north-1a",
"RequesterId": "AIDAPVDKK6NLF6SZ553KS",
"SubnetId": "subnet-33859151",
"OwnerId": "981100955930",
"TagSet": [],
"PrivateIpAddress": "172.31.0.200"
}
}
[root@master ~]# aws ec2 attach-network-interface --network-interface-id eni-7a5c7722 --instance-id i-0a259e4950b0b3cf9 --device-index 1 #将ENI和实例绑定
{
"AttachmentId": "eni-attach-2b2c9345"
}
}
[root@master ~]# ifconfig #查看ip:172.31.0.200已绑定成功,同时要在其他EC2确定此安全组是否能够满足需求,否则还需要重新调整。
eth0 Link encap:Ethernet HWaddr 02:E6:0B:68:5C:1C
inet addr:172.31.13.126 Bcast:172.31.15.255 Mask:255.255.240.0
inet6 addr: fe80::e6:bff:fe68:5c1c/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:9001 Metric:1
RX packets:5435 errors:0 dropped:0 overruns:0 frame:0
TX packets:4190 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:810999 (791.9 KiB) TX bytes:501986 (490.2 KiB)
eth4 Link encap:Ethernet HWaddr 02:12:25:83:A0:22
inet addr:172.31.0.200 Bcast:172.31.15.255 Mask:255.255.240.0
inet6 addr: fe80::12:25ff:fe83:a022/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:9001 Metric:1
RX packets:11 errors:0 dropped:0 overruns:0 frame:0
TX packets:26 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1218 (1.1 KiB) TX bytes:2664 (2.6 KiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:79 errors:0 dropped:0 overruns:0 frame:0
TX packets:79 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:16736 (16.3 KiB) TX bytes:16736 (16.3 KiB)
[root@master ~]#
从库配置MHAManager
创建配置目录/etc/masterha/,同时创建一个项目上的配置文件,仅配置自动FailOver部分,只需要master_ip_failover和report_script脚本,其他脚本暂时不定义。1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29[root@slave ~]# mkdir -p /etc/masterha/
[root@slave ~]# cd /etc/masterha/
[root@slave masterha]#
[root@slave masterha]# cat app1.cnf
[server default]
manager_workdir=/var/log/masterha/app1 #manager工作目录
manager_log=/var/log/masterha/app1/manager.log #manager日志
master_binlog_dir=/hwdata/data/percona #mysql数据目录
password=mha123456 #mha连接密码
user=mha01 #mha连接用户
ping_interval=1 #监控主库,发送ping包的时间间隔,默认是3秒,尝试三次没有回应的时候自动进行failover
remote_workdir=/tmp #远端mysql在发生切换时binlog的保存位置
repl_password=slave123456 #复制密码
repl_user=slave01 #复制用户
ssh_user=root #ssh用户
master_ip_failover_script=/usr/local/bin/master_ip_failover #自动failover切换脚本
#master_ip_online_change_script= /usr/local/bin/master_ip_online_change #手动failover切换脚本
report_script=/usr/local/bin/send_report #报告发送脚本(不是必须)
#shutdown_script= #故障发生后关闭主机的脚本(不是必须)
secondary_check_script = masterha_secondary_check -s 172.31.13.126 -s 172.31.9.182 --user=root hostname=172.31.13.126 --master_ip=172.31.13.126 --master_port=3306 #通过第三方机器确认目标主库是否存活,这里的ip可换成其他机器,用于检测
[server1]
hostname=172.31.13.126 #主库地址
port=3306
[server2]
hostname=172.31.9.182 #从库地址
port=3306
candidate_master=1 #候选master
测试MHAManager SSH
1 | [root@slave masterha]# masterha_check_ssh --conf=/etc/masterha/app1.cnf |
测试MHAManager 复制
1 | [root@slave masterha]# masterha_check_repl --conf=/etc/masterha/app1.cnf |
启动MHAManager
启动进程
1 | [ ] |
查看启动日志
1 | [root@slave ~]# cat /var/log/masterha/app1/manager.log |
模拟主库故障
确认当前VIP状态
在主库上执行停止服务,观察MHAManager日志,并确认VIP是否切换到从库上1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29[root@master ~]# ifconfig #确认当前VIP:172.31.0.200在主库上
eth0 Link encap:Ethernet HWaddr 02:E6:0B:68:5C:1C
inet addr:172.31.13.126 Bcast:172.31.15.255 Mask:255.255.240.0
inet6 addr: fe80::e6:bff:fe68:5c1c/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:9001 Metric:1
RX packets:9629 errors:0 dropped:0 overruns:0 frame:0
TX packets:8119 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1202679 (1.1 MiB) TX bytes:1113439 (1.0 MiB)
eth4 Link encap:Ethernet HWaddr 02:12:25:83:A0:22
inet addr:172.31.0.200 Bcast:172.31.15.255 Mask:255.255.240.0
inet6 addr: fe80::12:25ff:fe83:a022/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:9001 Metric:1
RX packets:4 errors:0 dropped:0 overruns:0 frame:0
TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:562 (562.0 b) TX bytes:600 (600.0 b)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:90 errors:0 dropped:0 overruns:0 frame:0
TX packets:90 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:17324 (16.9 KiB) TX bytes:17324 (16.9 KiB)
[root@master ~]#
确认MHAManger进程状态
1 | [root@slave masterha]# ps aux|grep master |
停止主库服务,触发主库FailOver
1 | [root@master ~]# /etc/init.d/mysqld stop #服务停止 |
观察FailOver日志
登录从库,查看MHAManager日志1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250[root@slave app1]# cat manager.log
Mon Jul 24 15:36:52 2017 - [info] MHA::MasterMonitor version 0.57.
Mon Jul 24 15:36:54 2017 - [info] Multi-master configuration is detected. Current primary(writable) master is 172.31.13.126(172.31.13.126:3306)
Mon Jul 24 15:36:54 2017 - [info] Master configurations are as below:
Master 172.31.13.126(172.31.13.126:3306), replicating from 172.31.9.182(172.31.9.182:3306)
Master 172.31.9.182(172.31.9.182:3306), replicating from 172.31.13.126(172.31.13.126:3306), read-only
Mon Jul 24 15:36:54 2017 - [info] GTID failover mode = 1
Mon Jul 24 15:36:54 2017 - [info] Dead Servers:
Mon Jul 24 15:36:54 2017 - [info] Alive Servers:
Mon Jul 24 15:36:54 2017 - [info] 172.31.13.126(172.31.13.126:3306)
Mon Jul 24 15:36:54 2017 - [info] 172.31.9.182(172.31.9.182:3306)
Mon Jul 24 15:36:54 2017 - [info] Alive Slaves:
Mon Jul 24 15:36:54 2017 - [info] 172.31.9.182(172.31.9.182:3306) Version=5.7.18-15-log (oldest major version between slaves) log-bin:enabled
Mon Jul 24 15:36:54 2017 - [info] GTID ON
Mon Jul 24 15:36:54 2017 - [info] Replicating from 172.31.13.126(172.31.13.126:3306)
Mon Jul 24 15:36:54 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Jul 24 15:36:54 2017 - [info] Current Alive Master: 172.31.13.126(172.31.13.126:3306)
Mon Jul 24 15:36:54 2017 - [info] Checking slave configurations..
Mon Jul 24 15:36:54 2017 - [info] Checking replication filtering settings..
Mon Jul 24 15:36:54 2017 - [info] binlog_do_db= , binlog_ignore_db=
Mon Jul 24 15:36:54 2017 - [info] Replication filtering check ok.
Mon Jul 24 15:36:54 2017 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Mon Jul 24 15:36:54 2017 - [info] Checking SSH publickey authentication settings on the current master..
Mon Jul 24 15:36:54 2017 - [info] HealthCheck: SSH to 172.31.13.126 is reachable.
Mon Jul 24 15:36:54 2017 - [info]
172.31.13.126(172.31.13.126:3306) (current master)
+--172.31.9.182(172.31.9.182:3306)
Mon Jul 24 15:36:54 2017 - [info] Checking master_ip_failover_script status:
Mon Jul 24 15:36:54 2017 - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=172.31.13.126 --orig_master_ip=172.31.13.126 --orig_master_port=3306
Mon Jul 24 15:36:54 2017 - [info] OK.
Mon Jul 24 15:36:54 2017 - [warning] shutdown_script is not defined.
Mon Jul 24 15:36:54 2017 - [info] Set master ping interval 1 seconds.
Mon Jul 24 15:36:54 2017 - [info] Set secondary check script: masterha_secondary_check -s 172.31.13.126 -s 172.31.9.182 --user=root hostname=172.31.13.126 --master_ip=172.31.13.126 --master_port=3306
Mon Jul 24 15:36:54 2017 - [info] Starting ping health check on 172.31.13.126(172.31.13.126:3306)..
Mon Jul 24 15:36:54 2017 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond.. #等待主库FailOver
Mon Jul 24 15:40:01 2017 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Mon Jul 24 15:40:01 2017 - [info] Executing secondary network check script: masterha_secondary_check -s 172.31.13.126 -s 172.31.9.182 --user=root hostname=172.31.13.126 --master_ip=172.31.13.126 --master_port=3306 --user=root --master_host=172.31.13.126 --master_ip=172.31.13.126 --master_port=3306 --master_user=mha01 --master_password=mha123456 --ping_type=SELECT
Mon Jul 24 15:40:01 2017 - [info] Executing SSH check script: exit 0
Mon Jul 24 15:40:01 2017 - [info] HealthCheck: SSH to 172.31.13.126 is reachable.
Monitoring server 172.31.13.126 is reachable, Master is not reachable from 172.31.13.126. OK.
Monitoring server 172.31.9.182 is reachable, Master is not reachable from 172.31.9.182. OK.
Mon Jul 24 15:40:01 2017 - [info] Master is not reachable from all other monitoring servers. Failover should start.
Mon Jul 24 15:40:02 2017 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.31.13.126' (111))
Mon Jul 24 15:40:02 2017 - [warning] Connection failed 2 time(s)..
Mon Jul 24 15:40:03 2017 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.31.13.126' (111))
Mon Jul 24 15:40:03 2017 - [warning] Connection failed 3 time(s)..
Mon Jul 24 15:40:04 2017 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.31.13.126' (111))
Mon Jul 24 15:40:04 2017 - [warning] Connection failed 4 time(s)..
Mon Jul 24 15:40:04 2017 - [warning] Master is not reachable from health checker!
Mon Jul 24 15:40:04 2017 - [warning] Master 172.31.13.126(172.31.13.126:3306) is not reachable!
Mon Jul 24 15:40:04 2017 - [warning] SSH is reachable.
Mon Jul 24 15:40:04 2017 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1.cnf again, and trying to connect to all servers to check server status..
Mon Jul 24 15:40:04 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Mon Jul 24 15:40:04 2017 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Mon Jul 24 15:40:04 2017 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Mon Jul 24 15:40:05 2017 - [info] GTID failover mode = 1
Mon Jul 24 15:40:05 2017 - [info] Dead Servers:
Mon Jul 24 15:40:05 2017 - [info] 172.31.13.126(172.31.13.126:3306)
Mon Jul 24 15:40:05 2017 - [info] Alive Servers:
Mon Jul 24 15:40:05 2017 - [info] 172.31.9.182(172.31.9.182:3306)
Mon Jul 24 15:40:05 2017 - [info] Alive Slaves:
Mon Jul 24 15:40:05 2017 - [info] 172.31.9.182(172.31.9.182:3306) Version=5.7.18-15-log (oldest major version between slaves) log-bin:enabled
Mon Jul 24 15:40:05 2017 - [info] GTID ON
Mon Jul 24 15:40:05 2017 - [info] Replicating from 172.31.13.126(172.31.13.126:3306)
Mon Jul 24 15:40:05 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Jul 24 15:40:05 2017 - [info] Checking slave configurations..
Mon Jul 24 15:40:05 2017 - [info] Checking replication filtering settings..
Mon Jul 24 15:40:05 2017 - [info] Replication filtering check ok.
Mon Jul 24 15:40:05 2017 - [info] Master is down!
Mon Jul 24 15:40:05 2017 - [info] Terminating monitoring script.
Mon Jul 24 15:40:05 2017 - [info] Got exit code 20 (Master dead).
Mon Jul 24 15:40:05 2017 - [info] MHA::MasterFailover version 0.57.
Mon Jul 24 15:40:05 2017 - [info] Starting master failover.
Mon Jul 24 15:40:05 2017 - [info]
Mon Jul 24 15:40:05 2017 - [info] * Phase 1: Configuration Check Phase..
Mon Jul 24 15:40:05 2017 - [info]
Mon Jul 24 15:40:07 2017 - [info] GTID failover mode = 1
Mon Jul 24 15:40:07 2017 - [info] Dead Servers:
Mon Jul 24 15:40:07 2017 - [info] 172.31.13.126(172.31.13.126:3306)
Mon Jul 24 15:40:07 2017 - [info] Checking master reachability via MySQL(double check)...
Mon Jul 24 15:40:07 2017 - [info] ok.
Mon Jul 24 15:40:07 2017 - [info] Alive Servers:
Mon Jul 24 15:40:07 2017 - [info] 172.31.9.182(172.31.9.182:3306)
Mon Jul 24 15:40:07 2017 - [info] Alive Slaves:
Mon Jul 24 15:40:07 2017 - [info] 172.31.9.182(172.31.9.182:3306) Version=5.7.18-15-log (oldest major version between slaves) log-bin:enabled
Mon Jul 24 15:40:07 2017 - [info] GTID ON
Mon Jul 24 15:40:07 2017 - [info] Replicating from 172.31.13.126(172.31.13.126:3306)
Mon Jul 24 15:40:07 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Jul 24 15:40:07 2017 - [info] Starting GTID based failover.
Mon Jul 24 15:40:07 2017 - [info]
Mon Jul 24 15:40:07 2017 - [info] ** Phase 1: Configuration Check Phase completed.
Mon Jul 24 15:40:07 2017 - [info]
Mon Jul 24 15:40:07 2017 - [info] * Phase 2: Dead Master Shutdown Phase..
Mon Jul 24 15:40:07 2017 - [info]
Mon Jul 24 15:40:07 2017 - [info] Forcing shutdown so that applications never connect to the current master..
Mon Jul 24 15:40:07 2017 - [info] Executing master IP deactivation script:
Mon Jul 24 15:40:07 2017 - [info] /usr/local/bin/master_ip_failover --orig_master_host=172.31.13.126 --orig_master_ip=172.31.13.126 --orig_master_port=3306 --command=stopssh --ssh_user=root
Mon Jul 24 15:40:07 2017 - [info] done.
Mon Jul 24 15:40:07 2017 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Mon Jul 24 15:40:07 2017 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Mon Jul 24 15:40:07 2017 - [info]
Mon Jul 24 15:40:07 2017 - [info] * Phase 3: Master Recovery Phase..
Mon Jul 24 15:40:07 2017 - [info]
Mon Jul 24 15:40:07 2017 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Mon Jul 24 15:40:07 2017 - [info]
Mon Jul 24 15:40:07 2017 - [info] The latest binary log file/position on all slaves is mysql-bin.000011:234
Mon Jul 24 15:40:07 2017 - [info] Latest slaves (Slaves that received relay log files to the latest):
Mon Jul 24 15:40:07 2017 - [info] 172.31.9.182(172.31.9.182:3306) Version=5.7.18-15-log (oldest major version between slaves) log-bin:enabled
Mon Jul 24 15:40:07 2017 - [info] GTID ON
Mon Jul 24 15:40:07 2017 - [info] Replicating from 172.31.13.126(172.31.13.126:3306)
Mon Jul 24 15:40:07 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Jul 24 15:40:07 2017 - [info] The oldest binary log file/position on all slaves is mysql-bin.000011:234
Mon Jul 24 15:40:07 2017 - [info] Oldest slaves:
Mon Jul 24 15:40:07 2017 - [info] 172.31.9.182(172.31.9.182:3306) Version=5.7.18-15-log (oldest major version between slaves) log-bin:enabled
Mon Jul 24 15:40:07 2017 - [info] GTID ON
Mon Jul 24 15:40:07 2017 - [info] Replicating from 172.31.13.126(172.31.13.126:3306)
Mon Jul 24 15:40:07 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Jul 24 15:40:07 2017 - [info]
Mon Jul 24 15:40:07 2017 - [info] * Phase 3.3: Determining New Master Phase..
Mon Jul 24 15:40:07 2017 - [info]
Mon Jul 24 15:40:07 2017 - [info] Searching new master from slaves..
Mon Jul 24 15:40:07 2017 - [info] Candidate masters from the configuration file:
Mon Jul 24 15:40:07 2017 - [info] 172.31.9.182(172.31.9.182:3306) Version=5.7.18-15-log (oldest major version between slaves) log-bin:enabled
Mon Jul 24 15:40:07 2017 - [info] GTID ON
Mon Jul 24 15:40:07 2017 - [info] Replicating from 172.31.13.126(172.31.13.126:3306)
Mon Jul 24 15:40:07 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Mon Jul 24 15:40:07 2017 - [info] Non-candidate masters:
Mon Jul 24 15:40:07 2017 - [info] Searching from candidate_master slaves which have received the latest relay log events..
Mon Jul 24 15:40:07 2017 - [info] New master is 172.31.9.182(172.31.9.182:3306)
Mon Jul 24 15:40:07 2017 - [info] Starting master failover..
Mon Jul 24 15:40:07 2017 - [info]
From:
172.31.13.126(172.31.13.126:3306) (current master)
+--172.31.9.182(172.31.9.182:3306)
To:
172.31.9.182(172.31.9.182:3306) (new master)
Mon Jul 24 15:40:07 2017 - [info]
Mon Jul 24 15:40:07 2017 - [info] * Phase 3.3: New Master Recovery Phase..
Mon Jul 24 15:40:07 2017 - [info]
Mon Jul 24 15:40:07 2017 - [info] Waiting all logs to be applied..
Mon Jul 24 15:40:07 2017 - [info] done.
Mon Jul 24 15:40:07 2017 - [info] Getting new master's binlog name and position..
Mon Jul 24 15:40:07 2017 - [info] mysql-bin.000008:234
Mon Jul 24 15:40:07 2017 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.31.9.182', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='slave01', MASTER_PASSWORD='xxx';
Mon Jul 24 15:40:07 2017 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysql-bin.000008, 234, 28a037e5-6d0f-11e7-b43a-02e60b685c1c:1-15,
39292a50-6d0f-11e7-999e-020efbf1e94e:1-4
Mon Jul 24 15:40:07 2017 - [info] Executing master IP activate script:
Mon Jul 24 15:40:07 2017 - [info] /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=172.31.13.126 --orig_master_ip=172.31.13.126 --orig_master_port=3306 --new_master_host=172.31.9.182 --new_master_ip=172.31.9.182 --new_master_port=3306 --new_master_user='mha01' --new_master_password=xxx
NEW_MASTER_IP=172.31.9.182,OLD_MASTER_IP=172.31.13.126
Connection to 172.31.13.126 closed.
VIP_ATTACHMENT_ID=eni-attach-5b219e35
ATTACHMENT eni-attach-5b219e35 detaching
INSTANCE_ID=i-03135a59c242e9bad
eni-attach-0f209f61
Mon Jul 24 15:40:22 2017 - [info] OK.
Mon Jul 24 15:40:22 2017 - [info] Setting read_only=0 on 172.31.9.182(172.31.9.182:3306)..
Mon Jul 24 15:40:22 2017 - [info] ok.
Mon Jul 24 15:40:22 2017 - [info] ** Finished master recovery successfully.
Mon Jul 24 15:40:22 2017 - [info] * Phase 3: Master Recovery Phase completed.
Mon Jul 24 15:40:22 2017 - [info]
Mon Jul 24 15:40:22 2017 - [info] * Phase 4: Slaves Recovery Phase..
Mon Jul 24 15:40:22 2017 - [info]
Mon Jul 24 15:40:22 2017 - [info]
Mon Jul 24 15:40:22 2017 - [info] * Phase 4.1: Starting Slaves in parallel..
Mon Jul 24 15:40:22 2017 - [info]
Mon Jul 24 15:40:22 2017 - [info] All new slave servers recovered successfully.
Mon Jul 24 15:40:22 2017 - [info]
Mon Jul 24 15:40:22 2017 - [info] * Phase 5: New master cleanup phase..
Mon Jul 24 15:40:22 2017 - [info]
Mon Jul 24 15:40:22 2017 - [info] Resetting slave info on the new master..
Mon Jul 24 15:40:22 2017 - [info] 172.31.9.182: Resetting slave info succeeded.
Mon Jul 24 15:40:22 2017 - [info] Master failover to 172.31.9.182(172.31.9.182:3306) completed successfully.
Mon Jul 24 15:40:22 2017 - [info]
----- Failover Report --
app1: MySQL Master failover 172.31.13.126(172.31.13.126:3306) to 172.31.9.182(172.31.9.182:3306) succeeded
Master 172.31.13.126(172.31.13.126:3306) is down!
Check MHA Manager logs at slave:/var/log/masterha/app1/manager.log for details.
Started automated(non-interactive) failover.
Invalidated master IP address on 172.31.13.126(172.31.13.126:3306)
Selected 172.31.9.182(172.31.9.182:3306) as a new master.
172.31.9.182(172.31.9.182:3306): OK: Applying all logs succeeded.
172.31.9.182(172.31.9.182:3306): OK: Activated master IP address.
172.31.9.182(172.31.9.182:3306): Resetting slave info succeeded.
Master failover to 172.31.9.182(172.31.9.182:3306) completed successfully.
Mon Jul 24 16:27:13 2017 - [info] Sending mail..
defined(@array) is deprecated at /usr/share/perl5/vendor_perl/Mail/Sender.pm line 318.
(Maybe you should just omit the defined()?)
defined(@array) is deprecated at /usr/share/perl5/vendor_perl/Mail/Sender.pm line 2693.
(Maybe you should just omit the defined()?)
Option new_slave_hosts requires an argument
Unknown option: conf
tail: manager.log: file truncated
>> 220 126.com Anti-spam GT for Coremail System (126com[20140526])
<< EHLO slave
>> 250-mail
>> 250-PIPELINING
>> 250-AUTH LOGIN PLAIN
>> 250-AUTH=LOGIN PLAIN
>> 250-coremail 1Uxr2xKj7kG0xkI17xGrU7I0s8FY2U3Uj8Cz28x1UUUUU7Ic2I0Y2Ur0Vl_3UCa0xDrUUUUj
>> 250-STARTTLS
>> 250 8BITMIME
<< AUTH LOGIN
>> 334 dXNlcm5hbWU6
<< ZHdqNTY4NTg4QDEyNi5jb20=
>> 334 UGFzc3dvcmQ6
<< RHdqMTIzNDU=
>> 235 Authentication successful
<< MAIL FROM:<dwj568588@126.com>
>> 250 Mail OK
<< RCPT TO:<duanwenjie@huan.tv>
>> 250 Mail OK
<< DATA
>> 354 End data with <CR><LF>.<CR><LF>
<< To: duanwenjie@huan.tv
<< From: dwj568588@126.com
<< Subject: app1: MySQL Master failover 172.31.13.126(172.31.13.126:3306) to 172.31.9.182(172.31.9.182:3306) succeeded
<< Date: Mon, 24 Jul 2017 16:27:15 +0800
<< X-Mailer: Perl script "send_report"
<< using Mail::Sender 0.8.16 by Jenda Krynicky, Czechlands
<< running on slave ()
<< under account "zhouting"
<< Message-ID: <20170724_082715_056339.dwj568588@126.com>
<< MIME-Version: 1.0
<< Content-type: text/plain; charset=utf-8
<<
<< Master 172.31.13.126(172.31.13.126:3306) is down!
<<
<< Check MHA Manager logs at slave:/var/log/masterha/app1/manager.log for details.
<<
<< Started automated(non-interactive) failover.
<< Invalidated master IP address on 172.31.13.126(172.31.13.126:3306)
<< Selected 172.31.9.182(172.31.9.182:3306) as a new master.
<< 172.31.9.182(172.31.9.182:3306): OK: Applying all logs succeeded.
<< 172.31.9.182(172.31.9.182:3306): OK: Activated master IP address.
<< 172.31.9.182(172.31.9.182:3306): Resetting slave info succeeded.
<< Master failover to 172.31.9.182(172.31.9.182:3306) completed successfully.
<<
<< .
>> 250 Mail OK queued as smtp1,C8mowAAH7yZhr3VZFR+JCA--.54250S2 1500884834
<< QUIT
>> 221 Bye
[root@slave app1]#
日志过程解析
1 | 启动前的准备工作 |
确认FailOver状态
1 | [root@slave app1]# ifconfig #vip已经添加到从库上 |
查看邮件发送状态
登录邮件,查看是否收到报警邮件。
手动FailOver
MHA的手动FailOver不在本文范围,详情理论可参考官方文档。
参数列表
参数名 | 是否必须 | 参数使用域 | 默认值 | 例子 |
---|---|---|---|---|
hostname | Yes | Local Only | - | hostname=master,hostname=slave |
ip | No | Local Only | gethostbyname($hostname) | ip=192.168.1.100 |
port | No | Local/App/Global | 3306 | port=3306 |
ssh_host | No | Local Only | same as hostname | ssh_host=master,ssh_host=slave |
ssh_ip | No | Local Only | gethostbyname($ssh_host) | ssh_ip=192.168.1.101 |
ssh_port | No | Local/App/Global | 22 | ssh_port=22 |
ssh_connection_timeout | No | Local/App/Global | 5 | ssh_connection_timeout=20 |
ssh_options | No | Local/App/Global | “”(empty string) | ssh_options=”-i /root/.ssh/id_dsa2” |
candidate_master | No | Local Only | 0 | candidate_master=1 |
no_master | No | Local Only | 0 | no_master=1 |
ignore_fail | No | Local Only | 0 | ignore_fail=1 |
skip_init_ssh_check | No | Local Only | 0 | skip_init_ssh_check=1 |
skip_reset_slave | No | Local/App/Global | 0 | skip_reset_slave=1 |
user | No | Local/App/Global | root | user=root |
password | No | Local/App/Global | “”(empty string) | password=rootpass |
repl_user | No | Local/App/Global | Master_User value from SHOW SLAVE STATUS | repl_user=repl |
repl_password | No | Local/App/Global | (current replication password) | repl_user=replpass |
disable_log_bin | No | Local/App/Global | 0 | disable_log_bin=1 |
master_pid_file | No | Local/App/Global | “”(empty string) | master_pid_file=/var/lib/mysql/master1.pid |
ssh_user | No | Local/App/Global | current OS user | ssh_user=root |
remote_workdir | No | Local/App/Global | /var/tmp | remote_workdir=/var/log/masterha/app1 |
master_binlog_dir | No | Local/App/Global | /var/lib/mysql | master_binlog_dir=/data/mysql1 |
log_level | No | App/Global | info | log_level=debug |
manager_workdir | No | App | /var/tmp | manager_workdir=/var/log/masterha |
client_bindir | No | App | - | client_bindir=/usr/mysql/bin |
client_libdir | No | App | - | client_libdir=/usr/lib/mysql |
manager_log | No | App | STDERR | manager_log=/var/log/masterha/app1.log |
check_repl_delay | No | App/Global | 1 | check_repl_delay=0 |
check_repl_filter | No | App/Global | 1 | check_repl_filter=0 |
latest_priority | No | App/Global | 1 | latest_priority=0 |
multi_tier_slave | No | App/Global | 0 | multi_tier_slave=1 |
ping_interval | No | App/Global | 3 | ping_interval=5 |
ping_type | No | App/Global | SELECT | ping_type=CONNECT |
secondary_check_script | No | App/Global | null | secondary_check_script= masterha_secondary_check -s remote_dc1 -s remote_dc2 |
master_ip_failover_script | No | App/Global | null | master_ip_failover_script=/usr/local/bin/master_ip_failover |
master_ip_online_change_script | No | App/Global | null | master_ip_online_change_script= /usr/local/bin/master_ip_online_change |
shutdown_script | No | App/Global | null | shutdown_script= /usr/local/bin/master_shutdown |
report_script | No | App/Global | null | report_script= /usr/local/bin/report |
详情解释
1 | Local Scope: 针对每个server级别有效的选项.local scope级别的参数需要在配置文件的 [server_xxx]段落配置 |
参考
MySQL Master High Available 理论篇:
https://yq.aliyun.com/articles/58004?spm=5176.100239.blogcont57855.9.jUuCt0
结束语
云服务器上MySQL高可用,也可通过云负载均衡产品+MySQL复制来实现,但在数据安全性上,没有MHA+VIP+MySQL相对安全。