以下將介紹 repmgr Cluster 設定
Step 1. 建立repmgr replication superuser account以及repmgr同步資料庫
postgres=# create user repmgr superuser password 'password'; CREATE ROLE postgres=# create database repmgr owner repmgr; CREATE DATABASE
Step 2. 調整~/.pgpass內容
[pgadm@pgrep1 ~]$ cat ~/.pgpass pgrep1:5699:postgres:pgadm:password pgrep2:5699:postgres:pgadm:password pgrep1:5699:repmgr:repmgr:password pgrep2:5699:repmgr:repmgr:password
Step 3. 調整pg_hba.conf能讓repmgr連線,順序由上往下、由左往右很重要 (兩邊主機的連線都要填上)
[pgadm@pgrep1 ~]$ vim $PGDATA/pg_hba.conf local all all trust host all repmgr 172.30.0.21/32 trust host all repmgr 172.30.0.22/32 trust ... local replication all trust host replication repmgr 172.30.0.21/32 trust host replication repmgr 172.30.0.22/32 trust
Step 4. 設定完後重讀即可 (不需重啟)
[pgadm@pgrep1 ~]$ pg_ctl reload
Step 5. 在另外一台測試連線
[pgadm@pgrep2 ~]$ psql 'host=pgrep1 user=repmgr dbname=repmgr connect_timeout=2' psql (14.2) Type "help" for help. repmgr=# \q
Step 6. 在節點1,建立repmgr.conf設定內容,檔案存放於 /pgdata/repmgr.conf,也可以存放到 /etc/repmgr.conf
[pgadm@pgrep1 ~]$ vim /pgdata/repmgr.conf node_id=1 node_name='pgrep1' conninfo='host=pgrep1 user=repmgr dbname=repmgr connect_timeout=2' data_directory='/pgdata/dbdata' replication_user='repmgr' replication_type='physical' repmgr_bindir='/pgbin/pghome_1/bin' pg_bindir='/pgbin/pghome_1/bin' log_level='info' log_file='/tmp/repmgr.log' connection_check_type=ping reconnect_attempts=3 reconnect_interval=10 monitoring_history=yes monitor_interval_secs=5 failover='automatic' promote_command='/pgdata/dbdata/repmgr_promote.sh' follow_command='/pgdata/dbdata/repmgr_follow.sh %n'
Step 7. 在節點1,建立script /pgdata/dbdata/repmgr_follow.sh
#! /bin/bash echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: follow $1" >> /tmp/repmgr.log /pgbin/pghome_1/bin/repmgr standby follow -f /pgdata/repmgr.conf --upstream-node-id=$1 --log-to-file
Step 8. 在節點1,建立script /pgdata/dbdata/repmgr_promote.sh
#!/bin/bash REPMGRLOG_PATH=/tmp/repmgr.log IFCONFIG_IF=ens192 #failover function failover(){ echo "failover Start" >> $REPMGRLOG_PATH /pgbin/pghome_1/bin/repmgr standby promote -f /pgdata/repmgr.conf --log-to-file >> $REPMGRLOG_PATH echo "failover Success." >> $REPMGRLOG_PATH } #remove VIP function delete_vip(){ /bin/ssh -t pgadm@pgrep2 "/bin/sudo /usr/sbin/ip addr del 172.30.0.23/24 dev " $IFCONFIG_IF >> $REPMGRLOG_PATH } #add VIP function add_vip(){ /bin/sudo /usr/sbin/ip addr add 172.30.0.23/24 dev $IFCONFIG_IF >> $REPMGRLOG_PATH } #check connection using telnet, if failed then failover result=`echo -e "\n" | telnet pgrep2 5699 2>/dev/null | grep Connected | wc -l` #telnet normal then pass #if connection failed then start failover. if [ $result -eq 1 ]; then # echo "Database Normal." >> $REPMGRLOG_PATH exit 0 else echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: remove VIP start." >> $REPMGRLOG_PATH delete_vip echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: remove VIP finish." >> $REPMGRLOG_PATH echo "Connection fail, failover start." >> $REPMGRLOG_PATH echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: promote start" >> $REPMGRLOG_PATH failover echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: promote finish" >> $REPMGRLOG_PATH fi #check database when standby conevert to primary standby_flg=`psql -p 5699 -U repmgr -h localhost -At -c "SELECT pg_is_in_recovery();"` if [ ${standby_flg} == 'f' ]; then echo "database switch to primary. delete VIP" echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: remove VIP start." >> $REPMGRLOG_PATH delete_vip echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: remove VIP finish." >> $REPMGRLOG_PATH echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: add VIP start." >> $REPMGRLOG_PATH add_vip echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: add VIP finish." >> $REPMGRLOG_PATH exit 0 elif [ ${standby_flg} == 't' ]; then echo e "`date +%F\ %T`: database switch failed, current database is standby.!\n" >>$REPMGRLOG_PATH exit 0 fi
Step 9. 在節點2,建立repmgr.conf設定內容
[pgadm@pgrep2 ~]$ vim /pgdata/repmgr.conf node_id=2 node_name='pgrep2' conninfo='host=pgrep2 user=repmgr dbname=repmgr connect_timeout=2' data_directory='/pgdata/dbdata' replication_user='repmgr' replication_type='physical' repmgr_bindir='/pgbin/pghome_1/bin' pg_bindir='/pgbin/pghome_1/bin' log_level='debug' log_file='/tmp/repmgr.log' connection_check_type=ping reconnect_attempts=3 reconnect_interval=10 monitoring_history=yes monitor_interval_secs=5 failover='manual' promote_command='/pgdata/repmgr_promote.sh' follow_command='/pgdata/repmgr_follow.sh %n'
Step 10. 在節點2,建立script /pgdata/dbdata/repmgr_follow.sh
#! /bin/bash echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: follow $1" >> /tmp/repmgr.log /pgbin/pghome_1/bin/repmgr standby follow -f /pgdata/repmgr.conf --upstream-node-id=$1 --log-to-file
Step 11. 在節點2,建立script /pgdata/dbdata/repmgr_promote.sh
#!/bin/bash REPMGRLOG_PATH=/tmp/repmgr.log IFCONFIG_IF=ens192 #failover function failover(){ echo "failover Start" >> $REPMGRLOG_PATH /pgbin/pghome_1/bin/repmgr standby promote -f /pgdata/repmgr.conf --log-to-file >> $REPMGRLOG_PATH echo "failover Success." >> $REPMGRLOG_PATH } #remove VIP function delete_vip(){ /bin/ssh -t pgadm@pgrep1 "/bin/sudo /usr/sbin/ip addr del 172.30.0.23/24 dev " $IFCONFIG_IF >> $REPMGRLOG_PATH } #add VIP function add_vip(){ /bin/sudo /usr/sbin/ip addr add 172.30.0.23/24 dev $IFCONFIG_IF >> $REPMGRLOG_PATH } #check connection using telnet, if failed then failover result=`echo -e "\n" | telnet pgrep1 5699 2>/dev/null | grep Connected | wc -l` #telnet normal then pass #if connection failed then start failover. if [ $result -eq 1 ]; then # echo "Database Normal." >> $REPMGRLOG_PATH exit 0 else echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: remove VIP start." >> $REPMGRLOG_PATH delete_vip echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: remove VIP finish." >> $REPMGRLOG_PATH echo "Connection fail, failover start." >> $REPMGRLOG_PATH echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: promote start" >> $REPMGRLOG_PATH failover echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: promote finish" >> $REPMGRLOG_PATH fi #check database when standby conevert to primary standby_flg=`psql -p 5699 -U repmgr -h localhost -At -c "SELECT pg_is_in_recovery();"` if [ ${standby_flg} == 'f' ]; then echo "database switch to primary. delete VIP" echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: remove VIP start." >> $REPMGRLOG_PATH delete_vip echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: remove VIP finish." >> $REPMGRLOG_PATH echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: add VIP start." >> $REPMGRLOG_PATH add_vip echo "["`date "+%Y-%m-%d %H:%M:%S"`"]: add VIP finish." >> $REPMGRLOG_PATH exit 0 elif [ ${standby_flg} == 't' ]; then echo e "`date +%F\ %T`: database switch failed, current database is standby.!\n" >>$REPMGRLOG_PATH exit 0 fi
Step 12. 註冊Primary Database
[pgadm@pgrep1 ~]$ repmgr -f /pgdata/repmgr.conf primary register INFO: connecting to primary database... DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=pgrep1 fallback_application_name=repmgr options=-csearch_path=" NOTICE: attempting to install extension "repmgr" NOTICE: "repmgr" extension successfully installed NOTICE: primary node record (ID: 1) registered
Step 13. 確認Cluster狀態
[pgadm@pgrep1 ~]$ repmgr -f /pgdata/repmgr.conf cluster show DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=pgrep1 fallback_application_name=repmgr options=-csearch_path=" DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=pgrep1 fallback_application_name=repmgr options=-csearch_path=" ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+--------+---------+-----------+----------+----------+----------+----------+--------------------------------------------------------- 1 | pgrep1 | primary | * running | | default | 100 | 1 | host=pgrep1 user=repmgr dbname=repmgr connect_timeout=2
Step 14. 在節點2,Clone Primary Database (測試)
[pgadm@pgrep2 ~]$ repmgr -h pgrep1 -U repmgr -d repmgr -f /pgdata/repmgr.conf standby clone --dry-run NOTICE: destination directory "/pgdata/dbdata" provided INFO: connecting to source node DETAIL: connection string is: host=pgrep1 user=repmgr dbname=repmgr DETAIL: current installation size is 32 MB INFO: "repmgr" extension is installed in database "repmgr" DEBUG: 1 node records returned by source node DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=pgrep1 fallback_application_name=repmgr options=-csearch_path=" WARNING: target data directory appears to be a PostgreSQL data directory DETAIL: target data directory is "/pgdata/dbdata" HINT: use -F/--force to overwrite the existing data directory DEBUG: upstream_node_id determined as 1 INFO: replication slot usage not requested; no replication slot will be set up for this standby INFO: parameter "max_wal_senders" set to 10 NOTICE: checking for available walsenders on the source node (2 required) INFO: sufficient walsenders available on the source node DETAIL: 2 required, 10 available NOTICE: checking replication connections can be made to the source server (2 required) INFO: required number of replication connections could be made to the source server DETAIL: 2 replication connections required NOTICE: standby will attach to upstream node 1 HINT: consider using the -c/--fast-checkpoint option INFO: would execute: /pgbin/pghome_1/bin/pg_basebackup -l "repmgr base backup" -D /pgdata/dbdata -h pgrep1 -p 5699 -U repmgr -X stream INFO: all prerequisites for "standby clone" are met
Step 15. 在節點2,Clone Primary Database (正式執行)
[pgadm@pgrep2 ~]$ repmgr -h pgrep1 -U repmgr -d repmgr -f /pgdata/repmgr.conf standby clone NOTICE: destination directory "/pgdata/dbdata" provided INFO: connecting to source node DETAIL: connection string is: host=pgrep1 user=repmgr dbname=repmgr DETAIL: current installation size is 32 MB DEBUG: 1 node records returned by source node DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=pgrep1 fallback_application_name=repmgr options=-csearch_path=" DEBUG: upstream_node_id determined as 1 INFO: replication slot usage not requested; no replication slot will be set up for this standby NOTICE: checking for available walsenders on the source node (2 required) NOTICE: checking replication connections can be made to the source server (2 required) INFO: checking and correcting permissions on existing directory "/pgdata/dbdata" NOTICE: starting backup (using pg_basebackup)... HINT: this may take some time; consider using the -c/--fast-checkpoint option INFO: executing: /pgbin/pghome_1/bin/pg_basebackup -l "repmgr base backup" -D /pgdata/dbdata -h pgrep1 -p 5699 -U repmgr -X stream NOTICE: standby clone (using pg_basebackup) complete NOTICE: you can now start your PostgreSQL server HINT: for example: pg_ctl -D /pgdata/dbdata start HINT: after starting the server, you need to register this standby with "repmgr standby register"
Step 16. 啟動Standby服務
[pgadm@pgrep2 ~]$ pg_ctl start -D $PGDATA waiting for server to start....2022-03-21 13:41:50.923 CST [18049] LOG: redirecting log output to logging collector process 2022-03-21 13:41:50.923 CST [18049] HINT: Future log output will appear in directory "/pgdata/pglog". done server started
Step 17. 確認Streaming Replication狀態 (Primary)
postgres=# \x Expanded display is on. postgres=# SELECT * FROM pg_stat_replication; -[ RECORD 1 ]----+------------------------------ pid | 57278 usesysid | 16385 usename | repmgr application_name | pgrep2 client_addr | 172.30.0.22 client_hostname | client_port | 53436 backend_start | 2022-03-21 13:41:53.330463+08 backend_xmin | state | streaming sent_lsn | 0/5000358 write_lsn | 0/5000358 flush_lsn | 0/5000358 replay_lsn | 0/5000358 write_lag | flush_lag | replay_lag | sync_priority | 0 sync_state | async reply_time | 2022-03-21 13:49:07.501854+08
Step 18. 確認Streaming Replication狀態 (Standby)
postgres=# \x Expanded display is on. postgres=# SELECT * FROM pg_stat_wal_receiver; -[ RECORD 1 ]---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- pid | 18055 status | streaming receive_start_lsn | 0/5000000 receive_start_tli | 1 written_lsn | 0/5000358 flushed_lsn | 0/5000358 received_tli | 1 last_msg_send_time | 2022-03-21 13:51:00.128178+08 last_msg_receipt_time | 2022-03-21 13:50:57.754794+08 latest_end_lsn | 0/5000358 latest_end_time | 2022-03-21 13:43:59.154199+08 slot_name | sender_host | pgrep1 sender_port | 5699 conninfo | user=repmgr passfile=/home/pgadm/.pgpass channel_binding=disable connect_timeout=2 dbname=replication host=pgrep1 port=5699 application_name=pgrep2 fallback_application_name=repmgrclu sslmode=disable sslcompression=0 sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any
Step 19. 註冊Standby Database
[pgadm@pgrep2 pgdata]$ repmgr -f /pgdata/repmgr.conf standby register INFO: connecting to local node "pgrep2" (ID: 2) DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=pgrep2 fallback_application_name=repmgr options=-csearch_path=" INFO: connecting to primary database DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=pgrep1 fallback_application_name=repmgr options=-csearch_path=" WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID: 1) INFO: standby registration complete NOTICE: standby node "pgrep2" (ID: 2) successfully registered
Step 20. 確認Cluster狀態
[pgadm@pgrep2 pgdata]$ repmgr -f /pgdata/repmgr.conf cluster show DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=pgrep2 fallback_application_name=repmgr options=-csearch_path=" DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=pgrep1 fallback_application_name=repmgr options=-csearch_path=" DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=pgrep2 fallback_application_name=repmgr options=-csearch_path=" DEBUG: connecting to: "user=repmgr connect_timeout=2 dbname=repmgr host=pgrep1 fallback_application_name=repmgr options=-csearch_path=" ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+--------+---------+-----------+----------+----------+----------+----------+--------------------------------------------------------- 1 | pgrep1 | primary | * running | | default | 100 | 1 | host=pgrep1 user=repmgr dbname=repmgr connect_timeout=2 2 | pgrep2 | standby | running | pgrep1 | default | 100 | 1 | host=pgrep2 user=repmgr dbname=repmgr connect_timeout=2
參考來源:
0 留言