High Availability Technologies

What Is High Availability

Enterprise high-availability standard: annual no-failure rate

Uptime      Downtime per year         Typical solution
99.9%       0.1%    = 525.6  min      Keepalived + dual master: manual intervention
99.99%      0.01%   = 52.56  min      MHA, ORCH: semi-automated
99.999%     0.001%  = 5.256  min      PXC, MGR, MGC
99.9999%    0.0001% = 0.5256 min      automated, cloud-based, platform-based
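The downtime column is simply the corresponding fraction of the 525,600 minutes in a year; a quick way to reproduce the figures:

awk 'BEGIN {
  year = 365*24*60                                   # 525600 minutes in a year
  n = split("0.999 0.9999 0.99999 0.999999", a, " ")
  for (i = 1; i <= n; i++)
    printf "%.4f%% uptime -> %.4f min downtime/year\n", a[i]*100, year*(1-a[i])
}'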

Database disaster-recovery levels

1. MHA High Availability

MHA script logic:

1.0 MHA Architecture Introduction and Build Process

# Plan:
# master: 10.0.0.51 (node)
# slaves: 10.0.0.52 (node), 10.0.0.53 (node + manager)


# Prepare the environment
# Omitted here: a 1-master, 2-slave GTID replication setup.

# Create symlinks for the key programs
ln -s /data/app/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog
ln -s /data/app/mysql/bin/mysql /usr/bin/mysql

# Set up mutual SSH trust (passwordless SSH between all nodes)
## db01:
rm -rf /root/.ssh
ssh-keygen
cd /root/.ssh
mv id_rsa.pub authorized_keys
scp -r /root/.ssh 10.0.0.52:/root
scp -r /root/.ssh 10.0.0.53:/root
## Verify from every node
## db01:
ssh 10.0.0.51 date
ssh 10.0.0.52 date
ssh 10.0.0.53 date
## db02:
ssh 10.0.0.51 date
ssh 10.0.0.52 date
ssh 10.0.0.53 date
## db03:
ssh 10.0.0.51 date
ssh 10.0.0.52 date
ssh 10.0.0.53 date

# Install the software
## Download MHA
## official site: https://code.google.com/archive/p/mysql-master-ha/
## GitHub downloads: https://github.com/yoshinorim/mha4mysql-manager/wiki/Downloads
## Install the Node package and its dependencies on all nodes
yum install perl-DBD-MySQL -y
rpm -ivh mha4mysql-node*.rpm
## Create the user MHA needs on the master (db01)
create user mha@'10.0.0.%' identified with mysql_native_password by 'mha';
grant all privileges on *.* to mha@'10.0.0.%';
## Install the Manager package (db03)
yum install -y perl-Config-Tiny epel-release perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes
rpm -ivh mha4mysql-manager*.rpm
## Prepare the configuration (db03)
### Create the configuration directory
mkdir -p /etc/mha
### Create the log directory
mkdir -p /var/log/mha/app1
### Edit the MHA configuration file
vim /etc/mha/app1.cnf
[server default]
manager_log=/var/log/mha/app1/manager
manager_workdir=/var/log/mha/app1
master_binlog_dir=/data/binlog/
user=mha
password=mha
ping_interval=2
repl_password=123
repl_user=repl
ssh_user=root
[server1]
hostname=10.0.0.51
port=3306
[server2]
hostname=10.0.0.52
candidate_master=1
port=3306
[server3]
hostname=10.0.0.53
port=3306

# Status checks
## SSH trust check
masterha_check_ssh --conf=/etc/mha/app1.cnf
## Replication check
masterha_check_repl --conf=/etc/mha/app1.cnf

# Start MHA (db03):
nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &

# Check MHA status
[root@db03 ~]# masterha_check_status --conf=/etc/mha/app1.cnf

1.1 MHA Software Components

manager components
masterha_manager         start MHA
masterha_check_ssh       check MHA's SSH configuration
masterha_check_repl      check MySQL replication status
masterha_master_monitor  detect whether the master is down
masterha_check_status    check the current MHA running status
masterha_master_switch   control failover (automatic or manual)
masterha_conf_host       add or remove a configured server entry

node components
save_binary_logs         save and copy the master's binary logs
apply_diff_relay_logs    identify differential relay-log events and apply them to the other slaves
purge_relay_logs         purge relay logs (without blocking the SQL thread)
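For example, purge_relay_logs is typically run from cron on each slave so relay logs do not pile up; a minimal sketch, assuming the mha account created above and that each slave runs it against its own address:

# example cron entry (04:00 daily on each slave):
# 0 4 * * * root /usr/bin/purge_relay_logs --user=mha --password=mha --host=10.0.0.52 --disable_relay_log_purge --workdir=/tmp >> /var/log/purge_relay.log 2>&1
purge_relay_logs --user=mha --password=mha --host=10.0.0.52 --disable_relay_log_purge --workdir=/tmp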

1.2 Application Transparency: VIP

vip : 10.0.0.55/24
# VIP failover script
# Upload the script files to /usr/local/bin and unpack them
[root@db03 mha_script]# cp -a /data/mha_script/* /usr/local/bin

# Make them executable
[root@db03 bin]# chmod +x /usr/local/bin/*

# Edit the script
[root@db03 bin]# cp master_ip_failover master_ip_failover.bak
my $vip = '10.0.0.55/24';
my $key = '1';
my $if = 'ens33';
my $ssh_start_vip = "/sbin/ifconfig $if:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig $if:$key down";
my $ssh_Bcast_arp= "/sbin/arping -I $if -c 3 -A 10.0.0.55";
# Update the Manager configuration file
vim /etc/mha/app1.cnf
master_ip_failover_script=/usr/local/bin/master_ip_failover

# Restart MHA
[root@db03 bin]# masterha_stop --conf=/etc/mha/app1.cnf
[root@db03 bin]# nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &
# Manually add the VIP on the current master
[root@db03 bin]# masterha_check_status --conf=/etc/mha/app1.cnf
[root@db02 ~]# ifconfig ens33:1 10.0.0.55/24

# Test the result
Use Navicat to connect through the VIP and verify MHA's VIP failover.
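Another quick check is to poll through the VIP from the shell while a failover test runs; a minimal sketch using the mha account and the VIP configured above:

# prints the hostname/server_id currently behind the VIP once per second
for i in $(seq 1 30); do
  mysql -umha -pmha -h 10.0.0.55 -N -e "select now(), @@hostname, @@server_id;" 2>/dev/null \
    || echo "$(date '+%F %T')  vip 10.0.0.55 not reachable"
  sleep 1
done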

1.3 Failure Notification

# Prepare the script
[root@db03 bin]# cp send_report send_report.bak1
my $smtp='smtp.qq.com';            # SMTP server
my $mail_from='22654481@qq.com';   # sender mailbox
my $mail_user='22654481';          # user name (QQ number)
my $mail_pass='gemghsvgkeyzcagh';  # authorization code
my $mail_to=['22654481@qq.com'];   # recipient mailbox
#my $mail_to=['to1@qq.com','to2@qq.com'];

# Update the configuration file
vim /etc/mha/app1.cnf    # add one line:
report_script=/usr/local/bin/send_report

# Restart MHA
[root@db03 bin]# masterha_stop --conf=/etc/mha/app1.cnf
[root@db03 bin]# nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &

1.3.1 How Can Failures Be Semi-Automatically Healed?

# 1. Check whether the database is up
[root@db01 app]# mysqladmin -umha -pmha -h 10.0.0.51 ping
# 2. Rebuild replication
## 2.1 Only a small part of the binlog is missing
# If no GTID transactions are missing, simply run CHANGE MASTER TO.
## 2.2 A large part of the binlog is missing
# Back up the master, restore onto the slave, then CHANGE MASTER TO.
# Or: clone plugin + CHANGE MASTER TO.
# 3. Add the repaired node back to the configuration file
masterha_conf_host --command=add --conf=/etc/mha/app1.cnf --hostname=10.0.0.51 --block=server10 --params="port=3306"
# 4. Checks
## SSH trust check
masterha_check_ssh --conf=/etc/mha/app1.cnf
## Replication check
masterha_check_repl --conf=/etc/mha/app1.cnf
# Start MHA
nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &

Detailed manual recovery steps:
1. Repair the Linux server
2. Install the MySQL software
3. Initialize the data directory and start the instance
4. Clone data to the new node and rebuild replication
On the master:
mysql -e "INSTALL PLUGIN clone SONAME 'mysql_clone.so';create user test@'%' identified by '123';grant backup_admin on *.* to test@'%';"
On the slave:
mysql -e "INSTALL PLUGIN clone SONAME 'mysql_clone.so';create user test@'%' identified by '123';grant clone_admin on *.* to test@'%';SET GLOBAL clone_valid_donor_list='10.0.0.52:3306';"
mysql -utest -p123 -h10.0.0.51 -P3306 -e "CLONE INSTANCE FROM test@'10.0.0.52':3306 IDENTIFIED BY '123';"
mysql -e "SELECT STAGE, STATE, END_TIME FROM performance_schema.clone_progress;"
mysql -e \
"CHANGE MASTER TO \
MASTER_HOST='10.0.0.52',\
MASTER_USER='repl', \
MASTER_PASSWORD='123', \
MASTER_PORT=3306, \
MASTER_AUTO_POSITION=1;"
mysql -e "start slave;"
mysql -e "show slave status \G"|grep "Running:"
5. Add the node to the configuration file
masterha_conf_host --command=add --conf=/etc/mha/app1.cnf --hostname=10.0.0.51 --block=server10 --params="port=3306"

6. Rebuild the binlog server
mysqlbinlog -R --host=10.0.0.52 --user=mha --password=mha --raw --stop-never mysql-bin.000003 &

7. Status checks
## SSH trust check
masterha_check_ssh --conf=/etc/mha/app1.cnf
## Replication check
masterha_check_repl --conf=/etc/mha/app1.cnf
8. Start MHA
[root@db03 app1]# nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &
[root@db03 app1]# masterha_check_status --conf=/etc/mha/app1.cnf


A script can be written to implement this semi-automatic healing (a rough sketch follows).
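A minimal sketch of such a script, assuming the failed node is 10.0.0.51, the surviving master is 10.0.0.52, and the accounts, block name and paths are the ones used in the steps above (not a production-ready implementation):

#!/bin/bash
# semi_heal.sh <failed_host>  -- rough sketch only
FAILED=${1:-10.0.0.51}
CONF=/etc/mha/app1.cnf

# 1. is mysqld on the failed node answering again?
mysqladmin -umha -pmha -h $FAILED ping || { echo "mysqld on $FAILED still down"; exit 1; }

# 2. re-attach it as a slave of the current master (GTID auto position)
mysql -umha -pmha -h $FAILED -e "CHANGE MASTER TO MASTER_HOST='10.0.0.52',
  MASTER_USER='repl', MASTER_PASSWORD='123', MASTER_PORT=3306,
  MASTER_AUTO_POSITION=1; START SLAVE;"

# 3. put the node back into the MHA configuration
masterha_conf_host --command=add --conf=$CONF --hostname=$FAILED --block=server10 --params="port=3306"

# 4. sanity checks, then restart the manager
masterha_check_ssh --conf=$CONF && masterha_check_repl --conf=$CONF || exit 1
nohup masterha_manager --conf=$CONF --remove_dead_master_conf --ignore_last_failover \
  < /dev/null > /var/log/mha/app1/manager.log 2>&1 &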

1.4 Redundant Log Compensation: binlog_server

# Create the required directory (db03)
mkdir -p /data/binlog_server/
chown -R mysql.mysql /data/*
cd /data/binlog_server/
[root@db03 ~]# mysql -e "show slave status \G"|grep "Master_Log"
Master_Log_File: mysql-bin.000008
Read_Master_Log_Pos: 194
Relay_Master_Log_File: mysql-bin.000008
Exec_Master_Log_Pos: 194
[root@db03 ~]#
[root@db03 binlog_server]# masterha_check_status --conf=/etc/mha/app1.cnf
app1 (pid:7057) is running(0:PING_OK), master:10.0.0.52
mysqlbinlog -R --host=10.0.0.52 --user=mha --password=mha --raw --stop-never mysql-bin.000003 &
Note:
The starting point for pulling the logs must be the binary log file the slave has currently fetched up to.

# Configuration file settings
vim /etc/mha/app1.cnf
[binlog1]
no_master=1
hostname=10.0.0.53
master_binlog_dir=/data/binlog_server/

# Restart MHA
[root@db03 bin]# masterha_stop --conf=/etc/mha/app1.cnf
[root@db03 bin]# nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &
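The starting file can also be derived instead of typed by hand; a minimal sketch that runs on db03, assuming the local instance is a slave of the current master and the output formats shown above:

cd /data/binlog_server/
# current master according to MHA (parsed from the check_status output shown above)
MASTER=$(masterha_check_status --conf=/etc/mha/app1.cnf | awk -F'master:' '{print $2}')
# binlog file the local slave has fetched up to
START_FILE=$(mysql -N -e "show slave status\G" | awk '/Master_Log_File:/{print $2; exit}')
echo "pulling from $MASTER starting at $START_FILE"
mysqlbinlog -R --host=$MASTER --user=mha --password=mha --raw --stop-never $START_FILE &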

1.5 MHA Maintenance: Online Switchover

[root@db03 ~]# masterha_stop --conf=/etc/mha/app1.cnf
# Switch roles only
masterha_master_switch --conf=/etc/mha/app1.cnf --master_state=alive --new_master_host=10.0.0.51 --orig_master_is_new_slave --running_updates_limit=10000
Note:
master_ip_online_change_script is not defined. If you do not disable writes
on the current master manually, applications keep writing on the current
master. Is it ok to proceed? (yes/NO): yes
1. With this method you must run FLUSH TABLES WITH READ LOCK on the original master yourself, otherwise the switchover can leave master and slave inconsistent.
2. Switch the VIP manually.
3. Re-pull the binlog from the new master.
4. Send the notification mail.

master_ip_online_change_script implementation
Purpose: during an online switchover, automatically lock the original master and move the VIP.

# Prepare the switchover script
vim /usr/local/bin/master_ip_online_change
my $vip = "10.0.0.55/24";
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key $vip down";
my $ssh_Bcast_arp= "/sbin/arping -I ens33 -c 3 -A 10.0.0.55";

# Update the MHA configuration file
vim /etc/mha/app1.cnf
master_ip_online_change_script=/usr/local/bin/master_ip_online_change

# Stop MHA
[root@db03 bin]# masterha_stop --conf=/etc/mha/app1.cnf

# Check replication
[root@db03 bin]# masterha_check_repl --conf=/etc/mha/app1.cnf

# Online switchover
masterha_master_switch --conf=/etc/mha/app1.cnf --master_state=alive --new_master_host=10.0.0.51 --orig_master_is_new_slave --running_updates_limit=10000

# Rebuild the binlog server
[root@db03 bin]# ps -ef |grep mysqlbinlog
root 28144 16272 0 17:50 pts/1 00:00:00 mysqlbinlog -R --host=10.0.0.52 --user=mha --password=x x --raw --stop-never mysql-bin.000005
root 28529 16272 0 18:03 pts/1 00:00:00 grep --color=auto mysqlbinlog
[root@db03 bin]# kill -9 28144
[root@db03 bin]# cd /data/binlog_server/
[root@db03 binlog_server]# ll
total 4
-rw-r----- 1 root root 194 Apr 1 17:50 mysql-bin.000005
[root@db03 binlog_server]# rm -rf *
[root@db03 binlog_server]# mysqlbinlog -R --host=10.0.0.51 --user=mha --password=mha --raw --stop-never mysql-bin.000009 &
[1] 28534

# Start MHA
[root@db03 bin]# nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &
[root@db03 binlog_server]# masterha_check_status --conf=/etc/mha/app1.cnf
app1 (pid:28535) is running(0:PING_OK), master:10.0.0.51

1.6 How MHA Prevents Split-Brain

# What is split-brain?
Multiple nodes compete for the master role, which leads to writes going to more than one database.
It occurs when there is a network fault between the manager and the current master; by default the manager relies on a single heartbeat check.

# Solutions:
1. Use a multi-heartbeat voting mechanism, e.g. several network paths (Ethernet, storage network).
2. STONITH, short for "shoot the other node in the head".
# shutdown_script
Use the server's remote management interface (iDRAC, etc.) with ipmitool to force the old master to power off, so it cannot come back and cause split-brain.
shutdown_script= /usr/local/bin/power_manager
fence: network fence (iLO, iDRAC), power fence

# secondary_check_script
secondary_check_script = "masterha_secondary_check -s 10.0.0.52 -s 10.0.0.53"

1.7 MHA Data Consistency Guarantees

#*************** rpl_semi_sync ***************
rpl_semi_sync_master_enabled =ON
rpl_semi_sync_master_timeout =5000
rpl_semi_sync_master_wait_for_slave_count =1
rpl_semi_sync_master_wait_no_slave =ON
rpl_semi_sync_master_wait_point =AFTER_SYNC
rpl_semi_sync_slave_enabled =ON

#*************** group commit ***************
binlog_group_commit_sync_delay =1
binlog_group_commit_sync_no_delay_count =1000

#*************** gtid ***************
gtid_mode =ON
enforce_gtid_consistency =ON
log_slave_updates =1

#*************** parallel replication ***************
slave_parallel_type =LOGICAL_CLOCK
slave_parallel_workers =4
master_info_repository =TABLE
relay_log_info_repository =TABLE
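Once the semisync plugins are loaded and these settings are active, it is worth confirming on the master that semi-synchronous replication is really in effect; a minimal check:

# run on the master: semi-sync status, number of semi-sync slaves, wait point
mysql -e "show global status like 'Rpl_semi_sync_master_status';"
mysql -e "show global status like 'Rpl_semi_sync_master_clients';"
mysql -e "show global variables like 'rpl_semi_sync_master_wait_point';"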

2. The ProxySQL Middleware

2.1 Introduction

ProxySQL is an open-source middleware product for MySQL: a flexible MySQL proxy layer that implements read/write splitting, supports query routing, can dynamically cache specific SQL statements, can reload configuration on the fly (no ProxySQL restart required), and provides failover support and SQL filtering.

Related ProxySQL sites:
https://www.proxysql.com/
https://github.com/sysown/proxysql/wiki

2.2 Installing ProxySQL

# Download ProxySQL
https://proxysql.com/
https://github.com/sysown/proxysql/releases

# Install ProxySQL
[root@db03 ~]# rpm -ivh proxysql-2.0.10-1-centos7.x86_64.rpm
[root@db03 ~]# systemctl start proxysql
[root@db03 ~]# netstat -tulnp
tcp 0 0 0.0.0.0:6032 0.0.0.0:* LISTEN 2115/proxysql
tcp 0 0 0.0.0.0:6033 0.0.0.0:* LISTEN 2115/proxysql
[root@db03 ~]# mysql -uadmin -padmin -h127.0.0.1 -P6032

2.3 ProxySQL Management Structure

ProxySQL's built-in system databases

On port 6032, ProxySQL exposes five databases: main, disk, stats, monitor and stats_history.
main:
The main database contains, among others:
mysql_servers: the list of backend MySQL servers ProxySQL can connect to
mysql_users: accounts for the backend databases and the monitoring account
mysql_query_rules: the rules that route queries to different backend servers
mysql_replication_hostgroups: hostgroup (node grouping) configuration

Note: tables whose names start with runtime_ reflect the configuration ProxySQL is currently running and cannot be modified directly; tables without the runtime_ prefix correspond to the MEMORY layer described below.
# disk:
configuration persisted to disk
# stats:
aggregated statistics
# monitor:
information collected by the monitor, e.g. backend health status
# stats_history: historical metrics about ProxySQL's internal operation

The multi-layer configuration model of the ProxySQL admin interface

The configuration system has three layers:
# top layer: RUNTIME
# middle layer: MEMORY (the layer you normally edit)
# persistence layer: DISK and CFG FILE

RUNTIME:
The configuration ProxySQL is currently using. It cannot be modified directly; changes must be "load"-ed into it from the layer below (MEMORY).

MEMORY:
The MEMORY layer sits between RUNTIME above and the DISK persistence layer below. The ProxySQL configuration can be edited here online as much as needed without affecting production; once verified, it is loaded into RUNTIME and persisted to disk. Changes are made with insert, update, delete and select statements.

DISK and CONFIG FILE:
Persisted configuration. On restart it can be loaded back quickly from disk.

Moving configuration between ProxySQL layers

To persist configuration to disk or apply it to runtime, the admin interface provides a set of management commands.
1. user-related configuration
## load MEM into RUNTIME
LOAD MYSQL USERS TO RUNTIME;
## save RUNTIME back to MEM
SAVE MYSQL USERS TO MEMORY;

## load DISK into MEM
LOAD MYSQL USERS FROM DISK;
## save MEM to DISK
SAVE MYSQL USERS TO DISK;
## load CFG into MEM
LOAD MYSQL USERS FROM CONFIG
=============================
2. server-related configuration
## load MEM into RUNTIME
LOAD MYSQL SERVERS TO RUNTIME;
## save RUNTIME back to MEM
SAVE MYSQL SERVERS TO MEMORY;
## load DISK into MEM
LOAD MYSQL SERVERS FROM DISK;
## save MEM to DISK
SAVE MYSQL SERVERS TO DISK;
## load CFG into MEM
LOAD MYSQL SERVERS FROM CONFIG
===============================
3. mysql query rules configuration
## load MEM into RUNTIME
LOAD MYSQL QUERY RULES TO RUNTIME;
## save RUNTIME back to MEM
SAVE MYSQL QUERY RULES TO MEMORY;
## load DISK into MEM
LOAD MYSQL QUERY RULES FROM DISK;
## save MEM to DISK
SAVE MYSQL QUERY RULES TO DISK;
## load CFG into MEM
LOAD MYSQL QUERY RULES FROM CONFIG
=================================
4. MySQL variables configuration
## load MEM into RUNTIME
LOAD MYSQL VARIABLES TO RUNTIME;
## save RUNTIME back to MEM
SAVE MYSQL VARIABLES TO MEMORY;
## load DISK into MEM
LOAD MYSQL VARIABLES FROM DISK;
## save MEM to DISK
SAVE MYSQL VARIABLES TO DISK;
## load CFG into MEM
LOAD MYSQL VARIABLES FROM CONFIG

Summary:
Day-to-day work is mostly done in the MEMORY layer, then loaded into RUNTIME and saved to DISK; the config file is rarely used.
For example:
load xxx to runtime;
save xxx to disk;
Note:
Configuration is only validated when it is loaded into RUNTIME. Saving to MEMORY or DISK never produces warnings or errors. If loading into RUNTIME fails, ProxySQL reverts to the previously saved state; check the error log in that case.
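A concrete end-to-end example of this workflow from the shell, against the admin interface on 6032 (the monitor account itself is created in the next section; the variable names are the ones used there):

mysql -uadmin -padmin -h127.0.0.1 -P6032 -e "
set mysql-monitor_username='monitor';
set mysql-monitor_password='123';
LOAD MYSQL VARIABLES TO RUNTIME;
SAVE MYSQL VARIABLES TO DISK;
SELECT variable_name, variable_value FROM runtime_global_variables WHERE variable_name LIKE 'mysql-monitor_user%';"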

2.4 ProxySQL in Practice: SQL-Based Read/Write Splitting

1. Configure the writer/reader hostgroup numbers in the mysql_replication_hostgroups table
db03 [main]>insert into
mysql_replication_hostgroups
(writer_hostgroup, reader_hostgroup, comment)
values (10,20,'proxy');
load mysql servers to runtime;
save mysql servers to disk;

db03 [main]>select * from mysql_replication_hostgroups\G
*************************** 1. row ***************************
writer_hostgroup: 10
reader_hostgroup: 20
check_type: read_only
comment: proxy
1 row in set (0.00 sec)
Explanation:
ProxySQL groups servers by their read_only value: a server with read_only=0 (the master) is placed in writer hostgroup 10, and servers with read_only=1 (the slaves) are placed in reader hostgroup 20. The slaves therefore need:
set global read_only=1;

2. Add the hosts to ProxySQL
insert into mysql_servers(hostgroup_id,hostname,port) values (10,'10.0.0.51',3306);
insert into mysql_servers(hostgroup_id,hostname,port) values (20,'10.0.0.52',3306);
insert into mysql_servers(hostgroup_id,hostname,port) values (20,'10.0.0.53',3306);
load mysql servers to runtime;
save mysql servers to disk;

3. Create the monitoring user and enable monitoring
# create the monitoring user on the master
create user monitor@'%' identified with mysql_native_password by '123';
grant replication client on *.* to monitor@'%';
# update the variables in ProxySQL
set mysql-monitor_username='monitor';
set mysql-monitor_password='123';
or:
UPDATE global_variables SET variable_value='monitor' WHERE variable_name='mysql-monitor_username';
UPDATE global_variables SET variable_value='123' WHERE variable_name='mysql-monitor_password';
load mysql variables to runtime;
save mysql variables to disk;
# query the monitoring logs
db03 [(none)]>select * from mysql_server_connect_log;
db03 [(none)]>select * from mysql_server_ping_log;
db03 [(none)]>select * from mysql_server_read_only_log;
db03 [(none)]>select * from mysql_server_replication_lag_log;

4. Configure the application user
create user root@'%' identified with mysql_native_password by '123';
grant all on *.* to root@'%';
insert into mysql_users(username,password,default_hostgroup) values('root','123',10);
load mysql users to runtime;
save mysql users to disk;
In early versions you also need to enable transaction persistence:
update mysql_users set transaction_persistent=1 where username='root';
load mysql users to runtime;
save mysql users to disk;

5. Practical read/write rules
insert into
mysql_query_rules(rule_id,active,match_pattern,destination_hostgroup,apply)
values (1,1,'^select.*for update$',10,1);
insert into
mysql_query_rules(rule_id,active,match_pattern,destination_hostgroup,apply)
values (2,1,'^select',20,1);
load mysql query rules to runtime;
save mysql query rules to disk;
Note: the rule_id of the select ... for update rule must be smaller than the rule_id of the plain select rule; ProxySQL matches rules in rule_id order.

6. Test read/write splitting
[root@db03 ~]# mysql -uroot -p123 -P 6033 -h 127.0.0.1 -e "begin;select @@server_id;commit"
[root@db03 ~]# mysql -uroot -p123 -P 6033 -h 127.0.0.1 -e "select @@server_id;"
db03 [(none)]>select * from stats_mysql_query_digest\G
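To see the split at a glance, repeated plain selects through port 6033 should return the reader nodes' server_id values, while the explicit transaction should return the writer's; a minimal sketch using the root account configured above:

for i in $(seq 1 6); do
  mysql -uroot -p123 -h 127.0.0.1 -P 6033 -N -e "select @@server_id;"
done
mysql -uroot -p123 -h 127.0.0.1 -P 6033 -N -e "begin; select @@server_id; commit;"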

2.5 ProxySQL Extensions: More Routing Rules

1. Port-based routing
## Change the ports ProxySQL listens on for SQL traffic so that it listens on several ports.
set mysql-interfaces='0.0.0.0:6033;0.0.0.0:6034';
save mysql variables to disk;

## A restart is required for this to take effect
systemctl restart proxysql

## Define the routing rules
delete from mysql_query_rules;   # clear existing rules for this test
insert into
mysql_query_rules(rule_id,active,proxy_port,destination_hostgroup,apply)
values(1,1,6033,10,1), (2,1,6034,20,1);
load mysql query rules to runtime;
save mysql query rules to disk;
Note:
Besides splitting by port, you can also route by listening address (set the proxy_addr field) or by client address (set the client_addr field).

2. User-based routing
insert into mysql_users(username,password,default_hostgroup)
values('writer','123',10),('reader','123',20);
load mysql users to runtime;
save mysql users to disk;

delete from mysql_query_rules;   # clear existing rules for this test

insert into
mysql_query_rules(rule_id,active,username,destination_hostgroup,apply)
values(1,1,'writer',10,1),(2,1,'reader',20,1);

load mysql query rules to runtime;
save mysql query rules to disk;
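Assuming the writer and reader accounts also exist on the backend MySQL servers with the same passwords, the user-based rules can be checked like this: connections as writer should land on the writer hostgroup, connections as reader on the readers.

mysql -uwriter -p123 -h 127.0.0.1 -P 6033 -N -e "select @@server_id, @@read_only;"
mysql -ureader -p123 -h 127.0.0.1 -P 6033 -N -e "select @@server_id, @@read_only;"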

3. ORCH: Visual High-Availability Cluster Management

3.1 Introduction to ORCH

Orchestrator (orch) is a MySQL high-availability and replication-topology management tool written in Go. It supports adjusting the replication topology, automatic failover and manual master/slave switchover. It stores its metadata in a MySQL or SQLite backend and provides a web UI that displays the replication topology and its status; through the web UI you can change replication relationships and some instance settings. It also offers a command-line client and an API for operations work. Compared with MHA, its most important advantage is that it removes the single point of failure of the management node: orchestrator itself is made highly available through the raft protocol. Part of GitHub's infrastructure is managed with this tool.

① Automatically discovers the MySQL replication topology and displays it in the web UI.

② Refactors replication: relationships can be changed by dragging nodes in the web UI.

③ Detects master failures and recovers automatically or manually, with custom scripts via hooks.

④ Supports managing replication from both the command line and the web UI.

Reference: Mysql orchestrator高可用_天道酬勤-明天会更好的博客-CSDN博客_mysql orchestrator

3.2 Deployment

a. Cluster plan

IP address   Hostname  Software installed       Database ports
10.0.0.51    db01      mysql, orchestrator      3306, 3307
10.0.0.52    db02      mysql, orchestrator      3306, 3307
10.0.0.53    db03      mysql, orchestrator      3306, 3307

Following the table above, install the software, configure addresses and hostnames, and initialize the six MySQL instances. orch manages instances by hostname by default, so the report_host and report_port parameters must be added to each MySQL configuration file.
[root@db01 ~]# mysql -V
mysql Ver 8.0.20 for Linux on x86_64 (MySQL Community Server - GPL)
[root@db02 ~]# mysql -V
mysql Ver 8.0.20 for Linux on x86_64 (MySQL Community Server - GPL)
[root@db03 ~]# mysql -V
mysql Ver 8.0.20 for Linux on x86_64 (MySQL Community Server - GPL)
[root@db01 ~]# hostname
db01
[root@db01 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.0.53 db03
10.0.0.52 db02
10.0.0.51 db01
[root@db02 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.0.53 db03
10.0.0.52 db02
10.0.0.51 db01
[root@db03 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.0.53 db03
10.0.0.52 db02
10.0.0.51 db01

# Deploy the six MySQL instances
pkill mysqld
rm -rf /etc/my3306.cnf
rm -rf /data/3306/data /data/3306/binlog*
mkdir -p /data/3306/data /data/3306/binlog
mysqld --initialize-insecure --user=mysql --basedir=/data/app/mysql --datadir=/data/3306/data
pkill mysqld
rm -rf /etc/my3307.cnf
rm -rf /data/3307/data /data/3307/binlog*
mkdir -p /data/3307/data /data/3307/binlog
mysqld --initialize-insecure --user=mysql --basedir=/data/app/mysql --datadir=/data/3307/data
cat > /etc/my3306.cnf <<EOF
[client]
socket=/tmp/mysql3306.sock
[mysqld]
user=mysql
basedir=/data/app/mysql
datadir=/data/3306/data
log_bin=/data/3306/binlog/mysql-bin
server_id=6
socket=/tmp/mysql3306.sock
gtid_mode=ON
enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
log_slave_updates=ON
tmpdir = /tmp
default-storage-engine=INNODB
disabled_storage_engines = MyISAM,BLACKHOLE,FEDERATED,CSV,ARCHIVE
max_connections=500
max_allowed_packet=32M
default_authentication_plugin=mysql_native_password
report_host=10.0.0.51
report_port=3306
mysqlx=0
EOF

cat > /etc/my3307.cnf <<EOF
[client]
socket=/tmp/mysql3307.sock
[mysqld]
user=mysql
basedir=/data/app/mysql
datadir=/data/3307/data
log_bin=/data/3307/binlog/mysql-bin
server_id=16
socket=/tmp/mysql3307.sock
gtid_mode=ON
enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
log_slave_updates=ON
tmpdir = /tmp
default-storage-engine=INNODB
disabled_storage_engines = MyISAM,BLACKHOLE,FEDERATED,CSV,ARCHIVE
max_connections=500
max_allowed_packet=32M
default_authentication_plugin=mysql_native_password
report_host=10.0.0.51
report_port=3307
port=3307
mysqlx=0
EOF

b. Get the software and related scripts

https://github.com/openark/orchestrator/releases

https://github.com/theTibi/orchestrator_vip

c. Install the software

yum install -y orchestrator-*

d. Configure the orch backend database and user (all 3306 instances)

CREATE DATABASE IF NOT EXISTS oldguo;
CREATE USER 'oldguo'@'127.0.0.1' IDENTIFIED BY '123456';
GRANT ALL ON oldguo.* TO 'oldguo'@'127.0.0.1';

e. Configure replication between the managed nodes (3307 instances)

# on the master
CREATE USER 'repl'@'%' IDENTIFIED BY '123456';
GRANT replication slave on *.* TO 'repl'@'%';

# on the slaves
change master to
master_host='10.0.0.51',master_port=3307,master_user='repl',master_password='123456',master_auto_position=1,MASTER_HEARTBEAT_PERIOD=2,MASTER_CONNECT_RETRY=1,MASTER_RETRY_COUNT=86400;
start slave;
set global slave_net_timeout=8;
set global read_only=1;
set global super_read_only=1;

# Notes:
1. orch detects a master failure through the slaves' IO threads. The default MASTER_HEARTBEAT_PERIOD set by CHANGE MASTER TO is quite long and delays the failover decision, so MASTER_HEARTBEAT_PERIOD is lowered to speed up failure detection.
2. slave_net_timeout defines how many seconds a slave waits for data from the master before it aborts the read, drops the connection, and tries to reconnect.

f. Create the management user on the managed nodes

CREATE USER 'oldboy'@'%' IDENTIFIED BY 'Aa123456';
GRANT SUPER, PROCESS, REPLICATION SLAVE, RELOAD ON *.* TO 'oldboy'@'%';
GRANT SELECT ON mysql.slave_master_info TO 'oldboy'@'%';
GRANT SELECT ON meta.* TO 'oldboy'@'%';

g. Customize the orch configuration file

cp orchestrator-sample.conf.json /etc/orchestrator.conf.json
vim /etc/orchestrator.conf.json
{
"Debug": true,
"EnableSyslog": false,
"ListenAddress": ":3000",
"MySQLTopologyUser": "orchestrator",
"MySQLTopologyPassword": "Aa123456",
"MySQLTopologyCredentialsConfigFile": "",
"MySQLTopologySSLPrivateKeyFile": "",
"MySQLTopologySSLCertFile": "",
"MySQLTopologySSLCAFile": "",
"MySQLTopologySSLSkipVerify": true,
"MySQLTopologyUseMutualTLS": false,
"MySQLOrchestratorHost": "127.0.0.1",
"MySQLOrchestratorPort": 3306,
"MySQLOrchestratorDatabase": "orchestrator",
"MySQLOrchestratorUser": "orchestrator",
"MySQLOrchestratorPassword": "123456",
"MySQLOrchestratorCredentialsConfigFile": "",
"MySQLOrchestratorSSLPrivateKeyFile": "",
"MySQLOrchestratorSSLCertFile": "",
"MySQLOrchestratorSSLCAFile": "",
"MySQLOrchestratorSSLSkipVerify": true,
"MySQLOrchestratorUseMutualTLS": false,
"MySQLConnectTimeoutSeconds": 1,
"MySQLTopologyReadTimeoutSeconds": 3,
"MySQLDiscoveryReadTimeoutSeconds": 3,
"DefaultInstancePort": 3306,
"DiscoverByShowSlaveHosts": true,
"InstancePollSeconds": 3,
"UnseenInstanceForgetHours": 240,
"SnapshotTopologiesIntervalHours": 0,
"InstanceBulkOperationsWaitTimeoutSeconds": 10,
"HostnameResolveMethod": "default",
"MySQLHostnameResolveMethod": "@@hostname",
"SkipBinlogServerUnresolveCheck": true,
"SkipMaxScaleCheck":true,
"ExpiryHostnameResolvesMinutes": 60,
"RejectHostnameResolvePattern": "",
"ReasonableReplicationLagSeconds": 10,
"ProblemIgnoreHostnameFilters": [],
"VerifyReplicationFilters": false,
"ReasonableMaintenanceReplicationLagSeconds": 20,
"CandidateInstanceExpireMinutes": 1440,
"AuditLogFile": "",
"AuditToSyslog": false,
"RemoveTextFromHostnameDisplay": ":3306",
"ReadOnly": false,
"AuthenticationMethod": "",
"HTTPAuthUser": "",
"HTTPAuthPassword": "",
"AuthUserHeader": "",
"PowerAuthUsers": ["*"],
"ClusterNameToAlias": {"127.0.0.1": "test suite"},
"SlaveLagQuery": "",
"DetectClusterAliasQuery": "SELECT cluster_name FROM meta.cluster WHERE
cluster_name = left(@@hostname,4) ",
"DetectClusterDomainQuery": "SELECT cluster_domain FROM meta.cluster WHERE
cluster_name = left(@@hostname,4) ",
"DetectInstanceAliasQuery": "SELECT @@hostname as instance_alias",
"DetectPromotionRuleQuery": "",
"DetectDataCenterQuery": "SELECT data_center FROM meta.cluster WHERE
cluster_name = left(@@hostname,4) ",
"PhysicalEnvironmentPattern": "",
"PromotionIgnoreHostnameFilters": [],
"DetachLostReplicasAfterMasterFailover": true,
"DetectSemiSyncEnforcedQuery": "SELECT 0 AS semisync FROM DUAL WHERE NOT
EXISTS (SELECT 1 FROM performance_schema.global_variables WHERE
VARIABLE_NAME = 'rpl_semi_sync_master_wait_no_slave' AND VARIABLE_VALUE =
'ON') UNION SELECT 1 FROM DUAL WHERE EXISTS (SELECT 1 FROM
performance_schema.global_variables WHERE VARIABLE_NAME =
'rpl_semi_sync_master_wait_no_slave' AND VARIABLE_VALUE = 'ON')",
"ServeAgentsHttp": false,
"AgentsServerPort": ":3001",
"AgentsUseSSL": false,
"AgentsUseMutualTLS": false,
"AgentSSLSkipVerify": false,
"AgentSSLPrivateKeyFile": "",
"AgentSSLCertFile": "",
"AgentSSLCAFile": "",
"AgentSSLValidOUs": [],
"UseSSL": false,
"UseMutualTLS": false,
"SSLSkipVerify": false,
"SSLPrivateKeyFile": "",
"SSLCertFile": "",
"SSLCAFile": "",
"SSLValidOUs": [],
"URLPrefix": "",
"StatusEndpoint": "/api/status",
"StatusSimpleHealth": true,
"StatusOUVerify": false,
"AgentPollMinutes": 60,
"UnseenAgentForgetHours": 6,
"StaleSeedFailMinutes": 60,
"SeedAcceptableBytesDiff": 8192,
"AutoPseudoGTID":true,
"PseudoGTIDPattern": "drop view if exists
`meta`.`_pseudo_gtid_hint__asc:",
"PseudoGTIDPatternIsFixedSubstring": true,
"PseudoGTIDMonotonicHint": "asc:",
"DetectPseudoGTIDQuery": "select count(*) as pseudo_gtid_exists from
meta.pseudo_gtid_status where anchor = 1 and time_generated > now() -
interval 2 hour",
"BinlogEventsChunkSize": 10000,
"SkipBinlogEventsContaining": [],
"ReduceReplicationAnalysisCount": true,
"FailureDetectionPeriodBlockMinutes": 60,
"RecoveryPeriodBlockSeconds": 31,
"RecoveryIgnoreHostnameFilters": [],
"RecoverMasterClusterFilters": ["*"],
"RecoverIntermediateMasterClusterFilters": ["*"],
"OnFailureDetectionProcesses": ["echo ' Detected {failureType} on
{failureCluster}. Affected replicas: {countSlaves}' >>
/tmp/recovery.log"],
"PreGracefulTakeoverProcesses": ["echo ' Planned takeover about to take
place on {failureCluster}. Master will switch to read_only' >>
/tmp/recovery.log"],
"PreFailoverProcesses": ["echo ' Will recover from {failureType} on
{failureCluster}' >> /tmp/recovery.log"],
"PostFailoverProcesses": ["echo ' (for all types) Recovered from
{failureType} on {failureCluster}. Failed: {failedHost}:{failedPort};
Successor: {successorHost}:{successorPort}; failureClusterAlias:
{failureClusterAlias}' >>
/tmp/recovery.log","/usr/local/orchestrator/orch_hook.sh {failureType}
{failureClusterAlias} {failedHost} {successorHost} >> /tmp/orch.log"],
"PostUnsuccessfulFailoverProcesses": [ "echo ' Unsuccessful Failover ' >>
/tmp/recovery.log"],
"PostMasterFailoverProcesses": ["echo ' Recovered from {failureType} on
{failureCluster}. Failed: {failedHost}:{failedPort}; Promoted:
{successorHost}:{successorPort}' >>/tmp/recovery.log"],
"PostIntermediateMasterFailoverProcesses": ["echo ' Recovered from
{failureType} on {failureCluster}. Failed: {failedHost}:{failedPort};
Successor: {successorHost}:{successorPort}' >> /tmp/recovery.log"],
"PostGracefulTakeoverProcesses": ["echo ' Planned takeover complete' >>
/tmp/recovery.log"],
"CoMasterRecoveryMustPromoteOtherCoMaster": true,
"DetachLostSlavesAfterMasterFailover": true,
"ApplyMySQLPromotionAfterMasterFailover": true,
"PreventCrossDataCenterMasterFailover": false,
"MasterFailoverDetachSlaveMasterHost": false,
"MasterFailoverLostInstancesDowntimeMinutes": 0,
"PostponeSlaveRecoveryOnLagMinutes": 0,
"OSCIgnoreHostnameFilters": [],
"GraphiteAddr": "",
"GraphitePath": "",
"GraphiteConvertHostnameDotsToUnderscores": true,
"RaftEnabled": true,
"BackendDB": "mysql",
"RaftBind": "10.0.0.51",
"RaftDataDir": "/usr/local/orchestrator",
"DefaultRaftPort": 10008,
"RaftNodes": ["10.0.0.51","10.0.0.52","10.0.0.53"],
"ConsulAddress": "",
"ConsulAclToken": ""
}

h. Start orch

cd /usr/local/orchestrator && nohup ./orchestrator --config=/etc/orchestrator.conf.json http &

i. Configure the automatic switchover hook scripts

"PostFailoverProcesses": ["echo ' (for all types) Recovered from
{failureType} on {failureCluster}. Failed: {failedHost}:{failedPort};
Successor: {successorHost}:{successorPort}; failureClusterAlias:
{failureClusterAlias}' >>
/tmp/recovery.log","/usr/local/orchestrator/orch_hook.sh {failureType}
{failureClusterAlias} {failedHost} {successorHost} >> /tmp/orch.log"],
"PostUnsuccessfulFailoverProcesses": [ "echo ' Unsuccessful Failover ' >>
/tmp/recovery.log"],

j. Managing orch non-interactively

# list the clusters
orchestrator-client -c clusters
# manually discover a node
orchestrator-client -c discover -i db01:3307
# forget a node
orchestrator-client -c forget -i db01:3307
# show the cluster topology
orchestrator-client -c topology-tabulated -i db03:3307
# manual graceful master takeover
orchestrator-client -c graceful-master-takeover -a db03:3307 -d db01:3307
# relocate a replica in the topology
orchestrator-client -c relocate -i db03:3307 -d db01:3307
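Since each raft-enabled manager also serves an HTTP API on port 3000 (the ListenAddress and StatusEndpoint configured above), its health can be checked without the web UI; a minimal sketch:

# ask any orch node for its status over the configured status endpoint
curl -s http://10.0.0.51:3000/api/status
# the web UI itself is reachable on the same port from a browser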

4. InnoDB Cluster

4.1 Introduction

MySQL InnoDB Cluster provides an integrated, native HA solution. It is built on Group Replication, which uses a Paxos-based protocol to guarantee data consistency; group replication supports both single-primary and multi-primary modes.
MySQL InnoDB Cluster consists of the following components:
- MySQL Servers with Group Replication: replicate data to all members of the cluster while providing fault tolerance, automatic failover and elasticity. MySQL Server 5.7.17 or later.
- MySQL Router: load-balances client requests and routes them to the right server when a database fails. MySQL Router 2.1.3 or later.
- MySQL Shell: creates and manages InnoDB clusters through its built-in AdminAPI. MySQL Shell 1.0.9 or later.

4.2 Build Process

4.2.1 Host Plan

IP          HOSTNAME  Host role    Software installed
10.0.0.151  master    mic-master   mysql, mysqlsh
10.0.0.152  slave1    mic-slave1   mysql, mysqlsh
10.0.0.153  slave2    mic-slave2   mysql, mysqlsh
10.0.0.154  manager   mic-manager  mysqlsh, mysql-router

4.2.2 Prepare the Base MIC Environment

Prepare the virtual machines and upload the software

# Following the plan, unpack all the software.
tar xf mysql-8.0.20-linux-glibc2.12-x86_64.tar.xz
tar xf mysql-shell-8.0.21-linux-glibc2.12-x86-64bit.tar.gz
tar xf mysql-router-8.0.21-linux-glibc2.12-x86_64.tar.xz
ln -s mysql-8.0.20-linux-glibc2.12-x86_64 mysql
ln -s mysql-shell-8.0.21-linux-glibc2.12-x86-64bit mysqlsh
ln -s mysql-router-8.0.21-linux-glibc2.12-x86_64 mysql-router
# Configure the environment variables
export PATH=/data/app/mysql/bin:/data/app/mysqlsh/bin:/data/app/mysql-router/bin:$PATH
# Following the plan, add all hostnames to the hosts file on every node.
cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.0.151 master
10.0.0.152 slave1
10.0.0.153 slave2
10.0.0.154 manager

# Disable Firewalld and SELinux on all nodes.
Omitted.

Prepare the three database nodes (151, 152, 153)

# Clean the environment and re-initialize the data
pkill mysqld
pkill mysqlsh
rm -rf /data/3306/data /data/3306/binlog*
mkdir /data/3306/data /data/3306/binlog
mysqld --initialize-insecure --user=mysql --basedir=/data/app/mysql --datadir=/data/3306/data
# Prepare the configuration files

## master node
[root@master app]# cat > /etc/my.cnf <<EOF
[client]
socket=/tmp/mysql.sock
[mysqld]
user=mysql
basedir=/data/app/mysql
datadir=/data/3306/data
log_bin=/data/3306/binlog
server_id=151
socket=/tmp/mysql.sock
gtid_mode=ON
enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
log_slave_updates=ON
tmpdir = /tmp
default-storage-engine=INNODB
disabled_storage_engines = MyISAM,BLACKHOLE,FEDERATED,CSV,ARCHIVE
max_connections=500
max_allowed_packet=32M
default_authentication_plugin=mysql_native_password
binlog_checksum=NONE
binlog_transaction_dependency_tracking = WRITESET
transaction_write_set_extraction=XXHASH64
loose-group_replication_group_name="ca842376-1c50-42ac-bb57-a5adc7da7a12"
loose-group_replication_start_on_boot=OFF
loose-group_replication_local_address= "master:33061"
loose-group_replication_group_seeds="master:33061,slave1:33062,slave2:33063"
loose-group_replication_bootstrap_group=OFF
loose-group_replication_ip_whitelist="master,slave1,slave2,manager"
EOF

## slave1 node
[root@slave1 data]# cat >/etc/my.cnf <<EOF
[client]
socket=/tmp/mysql.sock
[mysqld]
user=mysql
basedir=/data/app/mysql
datadir=/data/3306/data
log_bin=/data/3306/binlog
server_id=152
socket=/tmp/mysql.sock
gtid_mode=ON
enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
log_slave_updates=ON
tmpdir = /tmp
default-storage-engine=INNODB
disabled_storage_engines = MyISAM,BLACKHOLE,FEDERATED,CSV,ARCHIVE
max_connections=500
max_allowed_packet=16M
default_authentication_plugin=mysql_native_password
binlog_checksum=NONE
binlog_transaction_dependency_tracking = WRITESET
transaction_write_set_extraction=XXHASH64
loose-group_replication_group_name="ca842376-1c50-42ac-bb57-a5adc7da7a12"
loose-group_replication_start_on_boot=OFF
loose-group_replication_local_address= "slave1:33062"
loose-group_replication_group_seeds="master:33061,slave1:33062,slave2:33063"
loose-group_replication_bootstrap_group=OFF
loose-group_replication_ip_whitelist="master,slave1,slave2,manager"
loose-group_replication_allow_local_disjoint_gtids_join=ON
EOF

## slave2 node
[root@slave2 data]# cat > /etc/my.cnf <<EOF
[client]
socket=/tmp/mysql.sock
[mysqld]
user=mysql
basedir=/data/app/mysql
datadir=/data/3306/data
log_bin=/data/3306/binlog
server_id=153
socket=/tmp/mysql.sock
gtid_mode=ON
enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
log_slave_updates=ON
tmpdir = /tmp
default-storage-engine=INNODB
disabled_storage_engines = MyISAM,BLACKHOLE,FEDERATED,CSV,ARCHIVE
max_connections=200
max_allowed_packet=16M
default_authentication_plugin=mysql_native_password
binlog_checksum=NONE
binlog_transaction_dependency_tracking = WRITESET
transaction_write_set_extraction=XXHASH64
loose-group_replication_group_name="ca842376-1c50-42ac-bb57-a5adc7da7a12"
loose-group_replication_start_on_boot=OFF
loose-group_replication_local_address= "slave2:33063"
loose-group_replication_group_seeds="master:33061,slave1:33062,slave2:33063"
loose-group_replication_bootstrap_group=OFF
loose-group_replication_ip_whitelist="master,slave1,slave2,manager"
loose-group_replication_allow_local_disjoint_gtids_join=ON
EOF

4.2.3 Initialize the Cluster Instances on All Database Nodes

Set the root@'localhost' password

mysql> alter user root@'localhost' identified with mysql_native_password by '123';

Configure the cluster instances

mysqlsh
mysql-js> shell.connect('root@localhost:3306');
mysql-js> dba.configureLocalInstance();                            # see the screenshot below
mysql-js> dba.checkInstanceConfiguration("root@localhost:3306");   # see the screenshot below
![](https://cdn.fbbizyy.com/yuque/0/2022/png/1581532/1662218754065-3fbd608d-870b-47c0-8110-74098c19ca69.png)

![](https://cdn.fbbizyy.com/yuque/0/2022/png/1581532/1662218768038-615966d9-ecbc-47f5-84b6-6b8aef49fe19.png)

4.2.4 Create the Cluster and Add Nodes (154)

Create the cluster and add the primary node

mysqlsh
mysql-js> shell.connect('root@master:3306');
mysql-js> var cluster = dba.createCluster('oldguo');
mysql-js> cluster.status();
       ![](https://cdn.fbbizyy.com/yuque/0/2022/png/1581532/1662218825275-df2b5e2a-957d-4184-bd3b-92b05825d5ad.png)

       ![](https://cdn.fbbizyy.com/yuque/0/2022/png/1581532/1662218843093-31c7ccce-07d7-4016-b7db-6db654769d73.png)

Add the slave1 and slave2 nodes

mysqlsh
mysql-js> shell.connect('root@master:3306');
mysql-js> var cluster = dba.getCluster('oldguo');
mysql-js> cluster.addInstance('root@slave1:3306');
mysql-js> cluster.addInstance('root@slave2:3306');
mysql-js> cluster.status();
     ![](https://cdn.fbbizyy.com/yuque/0/2022/png/1581532/1662218888036-addddda1-df0e-4d16-9b87-686d431609f9.png)

    ![](https://cdn.fbbizyy.com/yuque/0/2022/png/1581532/1662218899554-fc49b44c-1f76-427a-a5d7-11b7655972d9.png)

Configure and start mysql-router (154)

# Register the router with the cluster; this generates the myrouter directory with its start scripts and configuration.
mysqlrouter --bootstrap root@master:3306 -d myrouter --user=root
# Start myrouter
myrouter/start.sh
        ![](https://cdn.fbbizyy.com/yuque/0/2022/png/1581532/1662218928961-7a72ce01-05cb-4037-a85e-c5c547234d2e.png)

Verify connections through the router

a) From the management node, connect with mysql-shell:
mysqlsh --uri root@localhost:6446
b) From the management node, connect with the mysql client:
mysql -u root -h 127.0.0.1 -P 6446 -p
c) From a remote client, connect to MySQL through the router:
mysql -u root -h manager_ip -P 6446 -p

Verify the cluster

1. Log in, create a table, insert some data, and check whether the data is replicated to the secondary nodes.
2. Stop the mysql service on master: the router should automatically switch the primary to slave1, and slave1 changes from read-only to read-write. After restarting mysql on master, master comes back in read-only mode.
3. Verify the read/write splitting behaviour (see the sketch below).
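A quick way to watch the failover from SQL is to query the group membership through the read/write router port before and after stopping the master; a minimal sketch using the root account and port configured above:

mysql -u root -h 127.0.0.1 -P 6446 -p -e "
SELECT MEMBER_HOST, MEMBER_PORT, MEMBER_STATE, MEMBER_ROLE
FROM performance_schema.replication_group_members;"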

Possible problems

On the primary node:
# clear the cluster metadata from mysqlsh
dba.dropMetadataSchema()
mysql> stop group_replication;
mysql> reset master;   (clears the binlogs; make sure this does not conflict with the other nodes)
mysql> reset slave;
On the other nodes (mainly to clear the replication information pointing at the primary and avoid conflicts):
mysql> stop group_replication;
mysql> reset master;
mysql> reset slave;

4.3 Summary of Common MIC Management Commands

4.3.1 Information Queries

# check whether an instance meets the cluster requirements
dba.checkInstanceConfiguration()
# describe the cluster structure
cluster.describe()
# cluster status overview
cluster.status()
# get the cluster (and its name)
dba.getCluster();
# list registered routers
cluster.listRouters()

4.3.2 Cluster Creation and Node Management

# dba.createCluster()
e.g.:
var cluster = dba.createCluster('oldguo');
# Cluster.addInstance()
e.g.:
cluster.addInstance('root@slave1:3306');
# Cluster.removeInstance()
e.g.:
cluster.removeInstance('root@slave1:3306');
# Cluster.rejoinInstance()
If an instance leaves the cluster (for example after losing its connection) and does not rejoin automatically, it can be added back with cluster.rejoinInstance().
# Cluster.setPrimaryInstance(instance)
Switch the primary instance online.

e.g.:
cluster.setPrimaryInstance('root@slave1:3306')
cluster.status()
cluster.setPrimaryInstance('root@master:3306')
# cluster.switchToMultiPrimaryMode()
Switch to multi-primary mode.
# cluster.switchToSinglePrimaryMode('root@master:3306')
Switch to single-primary mode.
Note: after switching to multi-primary mode, port 6447 (read-only by default) accepts reads and writes and can reach every cluster member, while port 6446 (read-write by default) only connects to one member (the previous primary).

4.3.3 Cluster Failure Handling

# dba.rebootClusterFromCompleteOutage()
Reconfigures the cluster after a complete outage. If dba.rebootClusterFromCompleteOutage() fails, the cluster metadata can be removed as follows:
# dba.dropMetadataSchema()
Drops the cluster metadata; afterwards rebuild the cluster with dba.createCluster().
# Remove a registered router
Cluster.removeRouterMetadata(router)