MySQL高可用架构中MHA的本质以及如何部署
这篇文章将为大家详细讲解有关MySQL高可用架构中MHA的本质以及如何部署,文章内容质量较高,因此小编分享给大家做个参考,希望大家阅读完这篇文章后对相关知识有一定的了解。
MySQL高可用架构之MHA1、关于MHAMHA(Master HA)是一款开源的MySQL的高可用程序,它为MySQL主从复制架构提供了automating master failover功能。MHA在监控到master节点故障时,会提升其中拥有的最新数据的slave节点成为新的master节点,在此期间,MHA会通过其它从节点获取额外信息来避免一致性方面的问题。MHA还提供了master节点的在线切换功能,即按需切换master/slave节点。
MHA服务有两种角色,MHA Manager(管理节点)和MHA Node(数据节点):
MHAManager:通常单独部署在一台独立机器上管理多个master/slave集群,每个master/slave集群称为一个application;MHAnode:运行在每台MySQL服务器上,它通过监控具备解析和清理log功能的脚本来加快故障转移2、MHA组件说明
Manager节点:
-masterha_check_ssh:MHA依赖的SSH环境检测工具;-masterha_check_repl:MySQL复制环境检测工具;-masterha_manager:MHA服务主程序;-masterha_check_status:MHA运行状态探测工具;-masterha_master_monitor:MySQLmaster节点可用性检测工具;-masterha_master_switch:master节点切换工具;-masterha_conf_host:添加或删除配置的节点;-masterha_stop:关闭MHA服务的工具;
Node节点:
-save_binary_logs:保存和复制master的二进制日志;-apply_diff_relay_logs:识别差异的中继日志事件并用于其他slave;-fiter_mysqlbinlog:去除不必要的ROLLBACK事件(MHA已不再使用这个工具);-purge_relay_logs:清除中继日志(不会阻塞SQL线程);
自定义扩展:
-secondary_check_script:通过多条网络路由检测master的可用性;-master_ip_failover_script:更新appliction使用的masterip;-shutdown_script:强制关闭master节点;-report_script:发送报告;-init_conf_load_script:加载初始配置参数;-master_ip_online_change_script:更新master节点ip地址3、部署及测试
实验拓扑:
node1:192.168.150.137MHAmanagernode2:192.168.150.138MHAnodemariadbmasternode3:192.168.150.139MHAnodemariadbslavecandidatenode4:192.168.150.140MHAnodemariadbslave
配置过程:
1、修改每台服务器的hosts文件/etc/hosts127.0.0.1localhostlocalhost.localdomainlocalhost4localhost4.localdomain4::1localhostlocalhost.localdomainlocalhost6localhost6.localdomain6192.168.150.137node1.comnode1192.168.150.138node2.comnode2192.168.150.139node3.comnode3192.168.150.140node4.comnode42、node2-node4进行mariadb的yum安装yum-yinstallmariadb-server3、配置ssh互信通信环境[root@node1~]#ssh-keygen-trsa-P''Generatingpublic/privatersakeypair.Enterfileinwhichtosavethekey(/root/.ssh/id_rsa):Createddirectory'/root/.ssh'.Youridentificationhasbeensavedin/root/.ssh/id_rsa.Yourpublickeyhasbeensavedin/root/.ssh/id_rsa.pub.Thekeyfingerprintis:a2:f2:10:28:cd:ea:7b:d8:f4:95:15:6e:73:a6:9d:4eroot@node1.comThekey'srandomartp_w_picpathis:+--[RSA2048]----+|||.||..||+=o||o+.S*.||..o.+.E||.*o.o||..*..||oo.|+-----------------+[root@node1~]#ls.ssh/id_rsaid_rsaid_rsa.pub[root@node1~]#cat.ssh/id_rsa.pub>.ssh/authorized_keys[root@node1~]#sshnode1Theauthenticityofhost'node1(192.168.150.137)'can'tbeestablished.ECDSAkeyfingerprintis2a:e3:03:52:8c:84:02:59:a2:26:a3:b2:f6:74:6c:3c.Areyousureyouwanttocontinueconnecting(yes/no)?yesWarning:Permanentlyadded'node1,192.168.150.137'(ECDSA)tothelistofknownhosts.Lastlogin:WedMar2915:06:522017from192.168.150.1[root@node1~]#ll.ssh/authorized_keys-rw-r--r--1rootroot3963月2915:35.ssh/authorized_keys[root@node1~]#chmodgo=.ssh/authorized_keys[root@node1~]#scp-p.ssh/id_rsa.ssh/authorized_keysnode2:/root/.sshroot@node2'spassword:id_rsa100%16751.6KB/s00:00authorized_keys100%3960.4KB/s00:00[root@node1~]#scp-p.ssh/id_rsa.ssh/authorized_keysnode3:/root/.sshTheauthenticityofhost'node3(192.168.150.139)'can'tbeestablished.ECDSAkeyfingerprintis2a:e3:03:52:8c:84:02:59:a2:26:a3:b2:f6:74:6c:3c.Areyousureyouwanttocontinueconnecting(yes/no)?yesWarning:Permanentlyadded'node3,192.168.150.139'(ECDSA)tothelistofknownhosts.root@node3'spassword:id_rsa100%16751.6KB/s00:00authorized_keys100%3960.4KB/s00:00[root@node1~]#scp-p.ssh/id_rsa.ssh/authorized_keysnode4:/root/.sshroot@node4'spassword:Permissiondenied,pleasetryagain.root@node4'spassword:id_rsa100%16751.6KB/s00:00authorized_keys100%3960.4KB/s00:00[root@node1~]#sshnode2Lastlogin:WedMar2915:07:052017from192.168.150.1[root@node2~]#exit登出Connectiontonode2closed.[root@node1~]#sshnode3Lastlogin:WedMar2915:07:182017from192.168.150.1[root@node3~]#exit登出Connectiontonode3closed.[root@node1~]#sshnode3Lastlogin:WedMar2915:40:052017fromnode1.com[root@node3~]#exit登出Connectiontonode3closed.[root@node1~]#sshnode4Lastfailedlogin:WedMar2915:39:53CST2017fromnode1.comonssh:nottyTherewas1failedloginattemptsincethelastsuccessfullogin.Lastlogin:WedMar2915:39:302017fromnode1.com[root@node4~]#exit登出Connectiontonode4closed.[root@node1~]#sshnode4Lastlogin:WedMar2915:40:132017fromnode1.com[root@node4~]#exit登出Connectiontonode4closed.4、修改mysql参数master:[mysqld]innodb_file_per_table=1skip_name_resolve=1log-bin=master-binrelay-log=relay-binserver_id=1slave:[mysqld]innode_file_per_table=1skip_name_resolve=1log-bin=master-binrelay-log=relay-binserver_id=2read_only=1relay_log_purge=05、主库开启并创建授权账号master:MariaDB[(none)]>SHOWMASTERSTATUS;+-------------------+----------+--------------+------------------+|File|Position|Binlog_Do_DB|Binlog_Ignore_DB|+-------------------+----------+--------------+------------------+|master-bin.000003|245|||+-------------------+----------+--------------+------------------+1rowinset(0.00sec)MariaDB[(none)]>GRANTREPLICATIONSLAVE,REPLICATIONCLIENTON*.*TO'repluser'@'192.168.%.%'IDENTIFIEDBY'replpass';MariaDB[(none)]>GRANTALLON*.*TO'mhauser'@'192.168.%.%'IDENTIFIEDBY'mhapass';#此为mha的管理账号MariaDB[(none)]>SHOWGLOBALVARIABLESLIKE'read_only';#此时主库是可写可读+---------------+-------+|Variable_name|Value|+---------------+-------+|read_only|OFF|+---------------+-------+1rowinset(0.00sec)6、从库进行主从功能开启node3、node4操作相同[root@node3~]#mysqlWelcometotheMariaDBmonitor.Commandsendwith;or\g.YourMariaDBconnectionidis2Serverversion:5.5.52-MariaDBMariaDBServerCopyright(c)2000,2016,Oracle,MariaDBCorporationAbandothers.Type'help;'or'\h'forhelp.Type'\c'toclearthecurrentinputstatement.MariaDB[(none)]>CHANGEMASTERTOMASTER_HOST='192.168.150.138',MASTER_USER='repluser',MSTER_PASSWORD='replpass',MASTER_LOG_FILE='master-bin.000003',MASTER_LOG_POS=245;QueryOK,0rowsaffected(0.01sec)MariaDB[(none)]>SHOWSLAVESTATUS\G***************************1.row***************************Slave_IO_State:Master_Host:192.168.150.138Master_User:repluserMaster_Port:3306Connect_Retry:60Master_Log_File:master-bin.000003Read_Master_Log_Pos:245Relay_Log_File:relay-bin.000001Relay_Log_Pos:4Relay_Master_Log_File:master-bin.000003Slave_IO_Running:NoSlave_SQL_Running:NoReplicate_Do_DB:Replicate_Ignore_DB:Replicate_Do_Table:Replicate_Ignore_Table:Replicate_Wild_Do_Table:Replicate_Wild_Ignore_Table:Last_Errno:0Last_Error:Skip_Counter:0Exec_Master_Log_Pos:245Relay_Log_Space:245Until_Condition:NoneUntil_Log_File:Until_Log_Pos:0Master_SSL_Allowed:NoMaster_SSL_CA_File:Master_SSL_CA_Path:Master_SSL_Cert:Master_SSL_Cipher:Master_SSL_Key:Seconds_Behind_Master:NULLMaster_SSL_Verify_Server_Cert:NoLast_IO_Errno:0Last_IO_Error:Last_SQL_Errno:0Last_SQL_Error:Replicate_Ignore_Server_Ids:Master_Server_Id:01rowinset(0.00sec)MariaDB[(none)]>STARTSLAVE;QueryOK,0rowsaffected(0.00sec)MariaDB[(none)]>SHOWSLAVESTATUS\G***************************1.row***************************Slave_IO_State:WaitingformastertosendeventMaster_Host:192.168.150.138Master_User:repluserMaster_Port:3306Connect_Retry:60Master_Log_File:master-bin.000003Read_Master_Log_Pos:497Relay_Log_File:relay-bin.000002Relay_Log_Pos:782Relay_Master_Log_File:master-bin.000003Slave_IO_Running:YesSlave_SQL_Running:YesReplicate_Do_DB:Replicate_Ignore_DB:Replicate_Do_Table:Replicate_Ignore_Table:Replicate_Wild_Do_Table:Replicate_Wild_Ignore_Table:Last_Errno:0Last_Error:Skip_Counter:0Exec_Master_Log_Pos:497Relay_Log_Space:1070Until_Condition:NoneUntil_Log_File:Until_Log_Pos:0Master_SSL_Allowed:NoMaster_SSL_CA_File:Master_SSL_CA_Path:Master_SSL_Cert:Master_SSL_Cipher:Master_SSL_Key:Seconds_Behind_Master:0Master_SSL_Verify_Server_Cert:NoLast_IO_Errno:0Last_IO_Error:Last_SQL_Errno:0Last_SQL_Error:Replicate_Ignore_Server_Ids:Master_Server_Id:11rowinset(0.00sec)MariaDB[(none)]>SHOWGLOBALVARIABLESLIKE'read_only';#此时从库是只读模式+---------------+-------+|Variable_name|Value|+---------------+-------+|read_only|ON|+---------------+-------+1rowinset(0.00sec)7、此时一主两从架构已经配置完成,安装MHA包manager节点(node1):安装mha4mysql-manager-0.56-0.el6.noarch.rpm和mha4mysql-node-0.56-0.el6.noarch.rpmyuminstallmha4mysql*-ynode节点(node2-node4):安装mha4mysql-node-0.56-0.el6.noarch.rpmyum-yiinstallmha4mysql-node-0.56-0.el6.noarch.rpm8、初始化MHA创建配置目录及配置文件(在node1上执行)[root@node1~]#mkdir/etc/masterha[root@node1~]#vim/etc/masterha/app1.cnf[serverdefault]user=mhauserpassword=mhapassmanager_workdir=/data/masterha/app1master_log=/data/masterha/app1/manager.logremote_workdir=/data/masterha/app1ssh_user=rootrepl_user=repluserrepl_password=replpassping_interval=1[server1]hostname=192.168.150.138candidate_master=1[server2]hostname=192.168.150.139candidate_master=1[server3]hostname=192.168.150.1409、启动前检测ssh互信配置是否OK[root@node1~]#masterha_check_ssh--conf=/etc/masterha/app1.cnf......Warning:Permanentlyadded'192.168.150.139'(ECDSA)tothelistofknownhosts.WedMar2917:03:032017-[debug]ok.WedMar2917:03:032017-[info]AllSSHconnectiontestspassedsuccessfully.mysql复制集群的连接配置是否OK[root@node1~]#masterha_check_repl--conf=/etc/masterha/app1.cnf......WedMar2917:04:402017-[info]192.168.150.138(192.168.150.138:3306)(currentmaster)+--192.168.150.139(192.168.150.139:3306)+--192.168.150.140(192.168.150.140:3306)WedMar2917:04:402017-[info]Checkingreplicationhealthon192.168.150.139..WedMar2917:04:402017-[info]ok.WedMar2917:04:402017-[info]Checkingreplicationhealthon192.168.150.140..WedMar2917:04:402017-[info]ok.WedMar2917:04:402017-[warning]master_ip_failover_scriptisnotdefined.WedMar2917:04:402017-[warning]shutdown_scriptisnotdefined.WedMar2917:04:402017-[info]Gotexitcode0(Notmasterdead).MySQLReplicationHealthisOK.10、启动MHA[root@node1~]#nohupmasterha_manager--conf=/etc/masterha/app1.cnf>/data/masterha/app1/manager.log2>&1&[1]16989[root@node1~]#tail-f/data/masterha/app1/manager.log192.168.150.138(192.168.150.138:3306)(currentmaster)+--192.168.150.139(192.168.150.139:3306)+--192.168.150.140(192.168.150.140:3306)WedMar2921:51:582017-[warning]master_ip_failover_scriptisnotdefined.WedMar2921:51:582017-[warning]shutdown_scriptisnotdefined.WedMar2921:51:582017-[info]Setmasterpinginterval1seconds.WedMar2921:51:582017-[warning]secondary_check_scriptisnotdefined.Itishighlyrecommendedsettingittocheckmasterreachabilityfromtwoormoreroutes.WedMar2921:51:582017-[info]Startingpinghealthcheckon192.168.150.138(192.168.150.138:3306)..WedMar2921:51:582017-[info]Ping(SELECT)succeeded,waitinguntilMySQLdoesn'trespond..11、启动后查看master节点状态[root@node1~]#masterha_check_status--conf=/etc/masterha/app1.cnfapp1(pid:16623)isrunning(0:PING_OK),master:192.168.150.13812、进行故障转移测试(1)maser节点关闭mariadb[root@node2~]#killallmysqldmysqld_safe(2)此时在manager上可以看到转移的日志-----FailoverReport-----app1:MySQLMasterfailover192.168.150.138(192.168.150.138:3306)to192.168.150.139(192.168.150.139:3306)succeededMaster192.168.150.138(192.168.150.138:3306)isdown!CheckMHAManagerlogsatnode1.com:/data/masterha/app1/manager.logfordetails.Startedautomated(non-interactive)failover.Thelatestslave192.168.150.139(192.168.150.139:3306)hasallrelaylogsforrecovery.Selected192.168.150.139(192.168.150.139:3306)asanewmaster.192.168.150.139(192.168.150.139:3306):OK:Applyingalllogssucceeded.192.168.150.140(192.168.150.140:3306):Thishosthasthelatestrelaylogevents.Generatingrelaydifffilesfromthelatestslavesucceeded.192.168.150.140(192.168.150.140:3306):OK:Applyingalllogssucceeded.Slavestarted,replicatingfrom192.168.150.139(192.168.150.139:3306)192.168.150.139(192.168.150.139:3306):Resettingslaveinfosucceeded.Masterfailoverto192.168.150.139(192.168.150.139:3306)completedsuccessfully.故障转移后,manager会自动停止,此时查看master状态[root@node1~]#masterha_check_status--conf=/etc/masterha/app1.cnfapp1isstopped(2:NOT_RUNNING).(3)查看其它两个库状态node3已成功接管master,并可读写MariaDB[(none)]>SHOWMASTERSTATUS;+-------------------+----------+--------------+------------------+|File|Position|Binlog_Do_DB|Binlog_Ignore_DB|+-------------------+----------+--------------+------------------+|master-bin.000003|245|||+-------------------+----------+--------------+------------------+1rowinset(0.00sec)MariaDB[(none)]>SHOWGLOBALVARIABLESLIKE'read_only';+---------------+-------+|Variable_name|Value|+---------------+-------+|read_only|OFF|+---------------+-------+1rowinset(0.00sec)node4MariaDB[(none)]>SHOWSLAVESTATUS\G***************************1.row***************************Slave_IO_State:WaitingformastertosendeventMaster_Host:192.168.150.139Master_User:repluserMaster_Port:3306Connect_Retry:60Master_Log_File:master-bin.000003Read_Master_Log_Pos:245Relay_Log_File:relay-bin.000002Relay_Log_Pos:530Relay_Master_Log_File:master-bin.000003Slave_IO_Running:YesSlave_SQL_Running:YesReplicate_Do_DB:Replicate_Ignore_DB:Replicate_Do_Table:Replicate_Ignore_Table:Replicate_Wild_Do_Table:Replicate_Wild_Ignore_Table:Last_Errno:0Last_Error:Skip_Counter:0Exec_Master_Log_Pos:245Relay_Log_Space:818Until_Condition:NoneUntil_Log_File:Until_Log_Pos:0Master_SSL_Allowed:NoMaster_SSL_CA_File:Master_SSL_CA_Path:Master_SSL_Cert:Master_SSL_Cipher:Master_SSL_Key:Seconds_Behind_Master:0Master_SSL_Verify_Server_Cert:NoLast_IO_Errno:0Last_IO_Error:Last_SQL_Errno:0Last_SQL_Error:Replicate_Ignore_Server_Ids:Master_Server_Id:21rowinset(0.00sec)MariaDB[(none)]>SHOWGLOBALVARIABLESLIKE'read_only';+---------------+-------+|Variable_name|Value|+---------------+-------+|read_only|ON|+---------------+-------+1rowinset(0.00sec)(4)提供新的从节点已修复复制集群master界定啊故障后,需要重新准备好一个新的MySQL节点。基于来自于master节点的备份恢复后,将其重新配置为mster的从节点即可。新加入节点IP为原master节点IP,否则还得修改appl.cnf中相应的设置,最后再次启动manager,并再次检查状态。[root@node2~]#rm-rf/var/lib/mysql/*[root@node2~]#vim/etc/my.cnf添加从库两选项read_only=1relay_log_purge=0[root@node2~]#systemctlstartmariadb.service[root@node2~]#mysqlWelcometotheMariaDBmonitor.Commandsendwith;or\g.YourMariaDBconnectionidis2Serverversion:5.5.52-MariaDBMariaDBServerCopyright(c)2000,2016,Oracle,MariaDBCorporationAbandothers.Type'help;'or'\h'forhelp.Type'\c'toclearthecurrentinputstatement.MariaDB[(none)]>GRANTREPLICATIONSLAVE,REPLICATIONCLIENTON*.*TO'repluser'@'192.168.%.%'IDENTIFIEDBY'replpass';QueryOK,0rowsaffected(0.00sec)MariaDB[(none)]>GRANTALLON*.*TO'mhauser'@'192.168.%.%'IDENTIFIEDBY'mhapass';QueryOK,0rowsaffected(0.00sec)MariaDB[(none)]>FLUSHPRIVILEGES;QueryOK,0rowsaffected(0.00sec)MariaDB[(none)]>CHANGEMASTERTOMASTER_HOST='192.168.150.139',MASTER_USER='repluser',MASTER_PASSWORD='replpass',MASTER_LOG_FILE='master-bin.000003',MASTER_LOG_POS=245;QueryOK,0rowsaffected(0.01sec)MariaDB[(none)]>SHOWSLAVE\GERROR1064(42000):YouhaveanerrorinyourSQLsyntax;checkthemanualthatcorrespondstoyourMariaDBserverversionfortherightsyntaxtousenear''atline1MariaDB[(none)]>STARTSLAVE;QueryOK,0rowsaffected(0.01sec)MariaDB[(none)]>SHOWSLAVESTATUS\G***************************1.row***************************Slave_IO_State:WaitingformastertosendeventMaster_Host:192.168.150.139Master_User:repluserMaster_Port:3306Connect_Retry:60Master_Log_File:master-bin.000003Read_Master_Log_Pos:245Relay_Log_File:relay-bin.000002Relay_Log_Pos:530Relay_Master_Log_File:master-bin.000003Slave_IO_Running:YesSlave_SQL_Running:YesReplicate_Do_DB:Replicate_Ignore_DB:Replicate_Do_Table:Replicate_Ignore_Table:Replicate_Wild_Do_Table:Replicate_Wild_Ignore_Table:Last_Errno:0Last_Error:Skip_Counter:0Exec_Master_Log_Pos:245Relay_Log_Space:818Until_Condition:NoneUntil_Log_File:Until_Log_Pos:0Master_SSL_Allowed:NoMaster_SSL_CA_File:Master_SSL_CA_Path:Master_SSL_Cert:Master_SSL_Cipher:Master_SSL_Key:Seconds_Behind_Master:0Master_SSL_Verify_Server_Cert:NoLast_IO_Errno:0Last_IO_Error:Last_SQL_Errno:0Last_SQL_Error:Replicate_Ignore_Server_Ids:Master_Server_Id:21rowinset(0.00sec)再次开启manager查看状态,状态全部OK,主库变更为node3[root@node1~]#nohupmasterha_manager--conf=/etc/masterha/app1.cnf>/data/masterha/app1/manager.log2>&1&[root@node1~]#masterha_check_status--conf=/etc/masterha/app1.cnfapp1(pid:17229)isrunning(0:PING_OK),master:192.168.150.139补充:
提供额外机制,防止对master的监控做出误判、VIP添加、在进行故障转移时对原有master节点执行STONITH操作避免脑裂,可通过shutdown_scrip实现、必要时,进行在线master节点转换;
关于MySQL高可用架构中MHA的本质以及如何部署就分享到这里了,希望以上内容可以对大家有一定的帮助,可以学到更多知识。如果觉得文章不错,可以把它分享出去让更多的人看到。
声明:本站所有文章资源内容,如无特殊说明或标注,均为采集网络资源。如若本站内容侵犯了原著者的合法权益,可联系本站删除。