Preface

Previously deployed: a Redis cluster of 4 master-slave pairs, plus 1 master-slave pair serving as the session store with master-slave backup. If a node in the Redis cluster goes down, how do we keep the service available? This article starts a Sentinel service on the session servers to test the cluster's disaster recovery.

1. Add Sentinel

a. reorganize the cluster

The cluster is only accessed over the internal network (the same as in the actual deployment), so modify the container startup commands and configuration files to remove the public port mappings and the redis-cli connection password. In addition, too many nodes are inconvenient to manage, so scale the cluster down.
Adjust it to 3 master-slave pairs without a password, deleting clm4 and cls4:

/ # redis-cli --cluster del-node 172.1.30.21:6379 c2b42a6c35ab6afb1f360280f9545b3d1761725e
>>> Removing node c2b42a6c35ab6afb1f360280f9545b3d1761725e from cluster 172.1.30.21:6379
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
/ # redis-cli --cluster del-node 172.1.50.21:6379 6d1b7a14a6d0be55a5fcb9266358bd1a42244d47
>>> Removing node 6d1b7a14a6d0be55a5fcb9266358bd1a42244d47 from cluster 172.1.50.21:6379
[ERR] Node 172.1.50.21:6379 is not empty! Reshard data away and try again.
#The node's hash slots must be emptied first (rebalance its weight to 0)
/ # redis-cli --cluster rebalance 172.1.50.21:6379 --cluster-weight 6d1b7a14a6d0be55a5fcb9266358bd1a42244d47=0
Moving 2186 slots from 172.1.50.21:6379 to 172.1.30.11:6379
###
Moving 2185 slots from 172.1.50.21:6379 to 172.1.50.11:6379
###
Moving 2185 slots from 172.1.50.21:6379 to 172.1.50.12:6379
###
/ # redis-cli --cluster del-node 172.1.50.21:6379 6d1b7a14a6d0be55a5fcb9266358bd1a42244d47
>>> Removing node 6d1b7a14a6d0be55a5fcb9266358bd1a42244d47 from cluster 172.1.50.21:6379
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.

Scale-down succeeded.
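To double-check the result, the remaining topology can be inspected from any surviving node; a minimal sketch, assuming 172.1.30.11 is still one of the masters:

/ # redis-cli --cluster check 172.1.30.11:6379    # slot coverage and node count after the removal
/ # redis-cli -h 172.1.30.11 cluster nodes        # the deleted clm4/cls4 should no longer be listed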

b. start Sentinel

Modify the startup commands of the rm/rs containers here:

docker run --name rm \
    --restart=always \
    --network=mybridge --ip=172.1.13.11 \
    -v /root/tmp/dk/redis/data:/data \
    -v /root/tmp/dk/redis/redis.conf:/etc/redis/redis.conf \
    -v /root/tmp/dk/redis/sentinel.conf:/etc/redis/sentinel.conf \
    -d cffycls/redis5:1.7
docker run --name rs \
    --restart=always \
    --network=mybridge --ip=172.1.13.12 \
    -v /root/tmp/dk/redis_slave/data:/data \
    -v /root/tmp/dk/redis_slave/redis.conf:/etc/redis/redis.conf \
    -v /root/tmp/dk/redis_slave/sentinel.conf:/etc/redis/sentinel.conf \
    -d cffycls/redis5:1.7
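
If the old rm/rs containers are still running with their previous options, they presumably need to be removed before the commands above can be re-run; a minimal sketch, using the container names from this article:

# stop and delete the old containers before recreating them with the new mounts
docker stop rm rs
docker rm rm rs
# then run the two docker run commands above again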

Referring to <Redis cluster implementation (VI): disaster recovery and downtime recovery> and <Detailed description of Redis and Sentinel configuration items>, modify the Sentinel configuration file:

#Storage path for the data Sentinel generates
dir /data/sentinel

#<master-name> <ip> <redis-port> <quorum>
#Monitored name, master ip, port, and quorum (minimum number of sentinels that must agree the master is down)
sentinel monitor mymaster1 172.1.50.11 6379 2
sentinel monitor mymaster2 172.1.50.12 6379 2
sentinel monitor mymaster3 172.1.50.13 6379 2

#sentinel down-after-milliseconds <master-name> <milliseconds>
#Time in milliseconds after which an unreachable node is considered (subjectively) down
# Default is 30 seconds.
sentinel down-after-milliseconds mymaster1 30000
sentinel down-after-milliseconds mymaster2 30000
sentinel down-after-milliseconds mymaster3 30000

#sentinel parallel-syncs <master-name> <numslaves>
#Number of slaves that can be reconfigured to sync with the new master at the same time during a failover; 1 means only one slave at a time is unavailable while it resyncs
sentinel parallel-syncs mymaster1 1
sentinel parallel-syncs mymaster2 1
sentinel parallel-syncs mymaster3 1

#sentinel failover-timeout <master-name> <milliseconds>
#Failover timeout. Default is 3 minutes.
sentinel failover-timeout mymaster1 180000
sentinel failover-timeout mymaster2 180000
sentinel failover-timeout mymaster3 180000
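
Note that the file above does not set a listening port; Sentinel defaults to 26379. If you want it stated explicitly (an assumed addition, not in the original file), it can be added:

#Sentinel's default port, made explicit (assumed addition)
port 26379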

Create the corresponding folder (xx/data/sentinel), restart the two containers, and enter rm:

/ # redis-sentinel /etc/redis/sentinel.conf
... ... 
14:X 11 Jul 2019 18:25:24.418 # +monitor master mymaster3 172.1.50.13 6379 quorum 2
14:X 11 Jul 2019 18:25:24.419 # +monitor master mymaster1 172.1.50.11 6379 quorum 2
14:X 11 Jul 2019 18:25:24.419 # +monitor master mymaster2 172.1.50.12 6379 quorum 2
14:X 11 Jul 2019 18:25:24.421 * +slave slave 172.1.30.12:6379 172.1.30.12 6379 @ mymaster1 172.1.50.11 6379
14:X 11 Jul 2019 18:25:24.425 * +slave slave 172.1.30.13:6379 172.1.30.13 6379 @ mymaster2 172.1.50.12 6379
14:X 11 Jul 2019 18:26:14.464 # +sdown master mymaster3 172.1.50.13 6379 

"You don't need to monitor the slave. If you monitor the master, the slave will be added to the sentinel automatically."