Redis provides a replication feature: when data in one Redis instance changes, the change is automatically synchronized to other Redis instances.
When Redis is deployed on multiple machines, the nodes fall into two categories: master nodes and slave nodes. Generally, the master node handles both reads and writes, while a slave node handles reads only. A master can have multiple slaves, but a slave has only one master; this is the so-called one-master, many-slaves structure.
Master-slave replication is supported: the master automatically synchronizes data to its slaves, so reads and writes can be separated;
The master serves its slaves in a non-blocking manner, so clients can still submit queries or modifications while master-slave synchronization is in progress;
The slave also performs synchronization in a non-blocking way. If a client submits a query during synchronization, Redis returns the data as it was before the synchronization.
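The one-master, many-slaves structure and read-write separation described above can be illustrated with a toy in-memory model. This is only a sketch: the class names Master and Slave are hypothetical, propagation here is a plain synchronous function call, and nothing in it talks to a real Redis server.

```python
class Slave:
    """Toy read-only copy of a master's data (not the real Redis protocol)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        # Slaves reject client writes in this sketch.
        raise PermissionError("slave is read-only")

    def _apply(self, key, value):
        # Called only by the master's replication stream.
        self.data[key] = value


class Master:
    """Toy master that accepts reads and writes and feeds its slaves."""

    def __init__(self):
        self.data = {}
        self.slaves = []

    def attach(self, slave):
        # Initial full copy; the slave then receives every later write.
        slave.data = dict(self.data)
        self.slaves.append(slave)

    def set(self, key, value):
        self.data[key] = value
        for slave in self.slaves:
            slave._apply(key, value)

    def get(self, key):
        return self.data.get(key)


master = Master()
master.set("user:1", "alice")

slave = Slave()
master.attach(slave)           # the slave starts with a full copy
master.set("user:2", "bob")    # later writes are propagated to it

print(slave.get("user:1"), slave.get("user:2"))
```

Reads can be served by any slave, while every write has to go through the master.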
Redis has no automatic fault tolerance or recovery in this mode. If the master or a slave goes down, some front-end read and write requests fail until the machine restarts or the front end is manually switched to another IP;
If the master goes down before some data has been synchronized to the slaves, switching IPs introduces data inconsistency, which reduces the availability of the system;
Online scaling is hard to support: once the cluster reaches its capacity limit, expanding it online becomes very complicated;
The master and slave nodes hold the same data, which lowers memory utilization.
In actual production, we give priority to Sentinel mode. In this mode, when the master goes down, the sentinels automatically elect a new master and point the other slaves at it.
On top of master-slave mode, Redis provides the sentinel program redis-sentinel, which runs as an independent process. The principle is that the sentinel process sends commands to all Redis servers and waits for their responses, thereby monitoring multiple running Redis instances. An odd number of sentinels is generally used to make decisions and elections easier. Several sentinels form a sentinel cluster, and the sentinels communicate with each other directly to check that they are all operating normally. The sentinels that detect that the master is down also decide on and elect a new master.
The role of Sentinel mode:
By sending commands and waiting for the Redis servers to reply, it monitors their running status, including the master server and the slave servers;
However, a single sentinel process monitoring the Redis servers may itself fail. For this reason, we can use multiple sentinels for monitoring. The sentinels also monitor each other, forming a multi-sentinel mode.
Sentinel plays a role very similar to that of ZooKeeper in a Kafka cluster.
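Why an odd number of sentinels is convenient can be seen with a small majority-vote sketch. It assumes a plain majority rule; real Sentinel deployments use a configurable quorum, and the function names below are hypothetical.

```python
def majority(total):
    """Smallest number of sentinels that forms a majority."""
    return total // 2 + 1


def tolerated_failures(total):
    """How many sentinels can fail while a majority can still be formed."""
    return total - majority(total)


def master_considered_down(reports):
    """reports holds one bool per sentinel: True means 'I cannot reach the master'."""
    return sum(reports) >= majority(len(reports))


for total in (3, 4, 5):
    print(total, "sentinels tolerate", tolerated_failures(total), "failure(s)")

print(master_considered_down([True, True, False]))
print(master_considered_down([True, False, False]))
```

Under this rule, three and four sentinels both tolerate only one failed sentinel, so the fourth sentinel adds cost without adding fault tolerance.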
Sentinel mode is based on master-slave mode and has all the advantages of master-slave.
The master and slave can be switched automatically, which makes the system more robust and more available.
It also inherits the disadvantage of master-slave mode: the data on every machine is the same, so memory utilization is low.
Online scaling is hard to support: once the cluster reaches its capacity limit, expanding it online becomes very complicated.
Redis Cluster mode does not use a consistent hashing algorithm; it uses hash slots instead.
Redis Sentinel mode can basically achieve high availability and read-write separation, but in this mode every Redis server stores the same data, which wastes memory. Cluster mode was therefore added in Redis 3.0 to provide distributed storage, that is, each Redis node stores different content. Each node communicates with the other nodes over the cluster bus. This communication uses a dedicated port: the client-facing port plus 10000. For example, if a node serves clients on port 6379, it talks to other nodes on port 16379. Communication between nodes uses a special binary protocol.
For the client, the cluster appears as a single whole: the client can connect to any node and operate on it just as it would on a single Redis instance. When the key a client operates on is not assigned to the node it is connected to, Redis returns a redirection (MOVED) pointing to the correct node, a bit like a 302 redirect in a browser.
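The redirection behaviour can be sketched with a toy cluster in which a node that does not own a key's slot answers with a MOVED error and the client retries on the node it is pointed to. The ToyNode class, the cluster_call helper and the 127.0.0.1:700x addresses are hypothetical, and the slot split is just one possible three-master layout.

```python
import binascii

TOTAL_SLOTS = 16384


def key_slot(key):
    # CRC16 of the key modulo 16384 (hash tags are ignored in this sketch).
    return binascii.crc_hqx(key.encode("utf-8"), 0) % TOTAL_SLOTS


class MovedError(Exception):
    """Mimics a 'MOVED <slot> <host:port>' error reply."""

    def __init__(self, slot, address):
        super().__init__(f"MOVED {slot} {address}")
        self.slot = slot
        self.address = address


class ToyNode:
    def __init__(self, address, slots, cluster):
        self.address = address
        self.slots = slots          # range of slots owned by this node
        self.cluster = cluster      # shared dict: address -> ToyNode
        self.data = {}
        cluster[address] = self

    def _check_owner(self, key):
        slot = key_slot(key)
        if slot not in self.slots:
            owner = next(n for n in self.cluster.values() if slot in n.slots)
            raise MovedError(slot, owner.address)

    def set(self, key, value):
        self._check_owner(key)
        self.data[key] = value
        return "OK"

    def get(self, key):
        self._check_owner(key)
        return self.data.get(key)


def cluster_call(cluster, start_address, method, *args, max_redirects=5):
    """Send a command to any node and follow MOVED redirections."""
    address = start_address
    for _ in range(max_redirects):
        try:
            return getattr(cluster[address], method)(*args), address
        except MovedError as moved:
            address = moved.address   # retry on the node that owns the slot
    raise RuntimeError("too many redirections")


cluster = {}
ToyNode("127.0.0.1:7000", range(0, 5461), cluster)
ToyNode("127.0.0.1:7001", range(5461, 10923), cluster)
ToyNode("127.0.0.1:7002", range(10923, 16384), cluster)

reply, served_by = cluster_call(cluster, "127.0.0.1:7000", "set", "user:1", "alice")
print(reply, "served by", served_by)
```

The caller only needs to know one address; the redirection tells it where the key actually lives.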
According to the official recommendation, a cluster deployment needs at least three master nodes, and it is best to use six nodes: three masters and three slaves.
Each Redis node holds two things. One is the slots, whose numbers range from 0 to 16383. From the output of redis-trib.rb above, we can see how these 16384 slots are distributed across the three hosts. The other is the cluster component, which can be understood as a plug-in for cluster management, similar to Sentinel.
When a key is accessed, Redis computes its CRC16 checksum and takes the remainder modulo 16384, so every key maps to a hash slot numbered between 0 and 16383. With this value, Redis can find the node that owns the slot and jump directly to that node to perform the operation.
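The slot calculation can be reproduced in a few lines. This sketch assumes the CRC16 variant is the one computed by Python's binascii.crc_hqx with an initial value of 0 (polynomial 0x1021), and it ignores hash tags. The three-node slot layout and the node names are hypothetical.

```python
import binascii

TOTAL_SLOTS = 16384

# Hypothetical layout: three masters, each owning a contiguous slot range.
SLOT_RANGES = {
    "node-a": range(0, 5461),
    "node-b": range(5461, 10923),
    "node-c": range(10923, 16384),
}


def key_slot(key: bytes) -> int:
    """CRC16 of the key, modulo 16384."""
    return binascii.crc_hqx(key, 0) % TOTAL_SLOTS


def node_for_key(key: bytes) -> str:
    """Find the node whose slot range contains the key's slot."""
    slot = key_slot(key)
    for name, slots in SLOT_RANGES.items():
        if slot in slots:
            return name
    raise LookupError(f"slot {slot} is not assigned to any node")


print(key_slot(b"123456789"))      # 12739 (CRC16 is 0x31C3)
print(node_for_key(b"123456789"))  # node-c
```

Because the mapping depends only on the key, every client and every node computes the same slot without consulting a central directory.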
In order to ensure high availability, Redis Cluster introduces the master-slave model: each master node has one or more slave nodes. When the other master nodes ping a master node, master 1, and more than half of the master nodes time out, master 1 is considered down. Its slave node, slave 1, is then promoted to master and continues to provide service.
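The majority rule and the promotion step can be sketched as a single function. The function name and the node names are hypothetical, and promoting the first listed slave is a simplifying assumption; the real election among slaves is more involved.

```python
def failover_if_down(masters, slaves_of, target, timed_out_reporters):
    """Return the master list after applying the majority rule to `target`.

    masters: list of current master names
    slaves_of: dict mapping a master name to a list of its slave names
    timed_out_reporters: masters whose pings to `target` timed out
    """
    reporters = {m for m in timed_out_reporters if m in masters and m != target}
    if len(reporters) <= len(masters) / 2:
        return list(masters)               # not enough reports: nothing changes
    candidates = slaves_of.get(target, [])
    if not candidates:
        # No slave to promote: the master is simply gone.
        return [m for m in masters if m != target]
    promoted = candidates[0]               # assumption: pick the first slave
    return [promoted if m == target else m for m in masters]


masters = ["master1", "master2", "master3"]
slaves_of = {"master1": ["slave1"], "master2": ["slave2"], "master3": ["slave3"]}

# Only one master reports a timeout: master1 stays in place.
print(failover_if_down(masters, slaves_of, "master1", {"master2"}))
# Two of three masters report a timeout: slave1 takes over.
print(failover_if_down(masters, slaves_of, "master1", {"master2", "master3"}))
```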
If both master 1 and its slave node slave 1 fail, the whole cluster enters the fail state because its slot mapping is no longer complete. If more than half of the master nodes in the cluster go down, the cluster enters the fail state whether or not slave nodes exist.
Redis Cluster is decentralized and has no central node. The client connects directly to the Redis nodes, and no intermediate proxy layer is needed. The client does not need to connect to every node in the cluster; connecting to any available node is enough.
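These two failure conditions can be written as a small check. It is a sketch that follows the rules exactly as stated here; the groups structure, its field names and the slot layout are hypothetical.

```python
TOTAL_SLOTS = 16384


def cluster_state(groups):
    """Return "ok" or "fail" for a toy cluster description.

    groups maps a group name to a dict with:
      slots       range of slots served by the group
      master_up   whether the group's master is alive
      slave_up    whether at least one slave of the group is alive
    """
    masters_down = sum(1 for g in groups.values() if not g["master_up"])
    if masters_down > len(groups) / 2:
        return "fail"                      # more than half of the masters are down
    covered = set()
    for g in groups.values():
        if g["master_up"] or g["slave_up"]:
            covered.update(g["slots"])     # someone can still serve these slots
    return "ok" if len(covered) == TOTAL_SLOTS else "fail"


def make_groups():
    return {
        "group-1": {"slots": range(0, 5461), "master_up": True, "slave_up": True},
        "group-2": {"slots": range(5461, 10923), "master_up": True, "slave_up": True},
        "group-3": {"slots": range(10923, 16384), "master_up": True, "slave_up": True},
    }


groups = make_groups()
groups["group-1"]["master_up"] = False
print(cluster_state(groups))               # slave 1 can take over

groups["group-1"]["slave_up"] = False
print(cluster_state(groups))               # slots 0-5460 are no longer covered
```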
Expanding a Redis cluster means adding machines to it, while contracting it means removing machines; in both cases the 16384 slots are redistributed among the nodes in the cluster (data migration).
Both expansion and contraction use the cluster management tool redis-trib.rb.
When expanding, first use redis-trib.rb add-node to add the new machine to the cluster. Although the new machine is then part of the cluster, it is of no use until slots are allocated to it. redis-trib.rb reshard is then used to rehash slot by slot (data migration); only after slots from the old nodes have been assigned to the new node does the new node become usable.
When contracting, first use redis-trib.rb reshard to move the slots off the machine, and then use redis-trib.rb del-node to remove the machine.
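The order of operations for expansion and contraction can be sketched on a plain slot-assignment table. The functions below are named after the redis-trib.rb subcommands only for readability; they are hypothetical stand-ins that move slot numbers between lists and do not migrate any real data.

```python
TOTAL_SLOTS = 16384


def add_node(assignment, node):
    # The node joins the cluster but owns no slots yet.
    assignment[node] = []


def reshard(assignment, source, target, count):
    # Move `count` slots from source to target.
    if count > len(assignment[source]):
        raise ValueError("source does not own that many slots")
    for _ in range(count):
        assignment[target].append(assignment[source].pop())


def del_node(assignment, node):
    # A node can only leave once it owns no slots.
    if assignment[node]:
        raise ValueError("node still owns slots; reshard them away first")
    del assignment[node]


assignment = {
    "node-a": list(range(0, 5461)),
    "node-b": list(range(5461, 10923)),
    "node-c": list(range(10923, 16384)),
}

# Expansion: add the node, then give it slots from each old node.
add_node(assignment, "node-d")
for source in ("node-a", "node-b", "node-c"):
    reshard(assignment, source, "node-d", 1365)
print({name: len(slots) for name, slots in assignment.items()})

# Contraction: move the slots away first, then remove the node.
for target in ("node-a", "node-b", "node-c"):
    reshard(assignment, "node-d", target, 1365)
del_node(assignment, "node-d")
print({name: len(slots) for name, slots in assignment.items()})
```

The total number of slots stays at 16384 throughout; only their owners change.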
Being decentralized, the cluster stores data across multiple nodes according to slots; data is shared between nodes, and its distribution can be adjusted dynamically.
Scalability: it can scale linearly up to 1000 nodes, and nodes can be added or removed dynamically;
High availability: when some nodes are unavailable, the cluster is still available. Adding slaves as backup copies of the data makes automatic failover possible; nodes exchange status information through the gossip protocol, and the promotion of a slave to master is decided through a voting mechanism.
Reduce the operation and maintenance cost and improve the scalability and availability of the system.
1. Redis Cluster is an architecture without a central node, which repairs the state of the cluster automatically by relying on the gossip protocol (rumor propagation). But gossip suffers from message delay and message redundancy. When there are too many nodes in the cluster, the nodes have to exchange PING/PONG messages constantly, and this unnecessary traffic occupies a lot of network resources. Although Redis 4.0 has optimized this, the problem still exists.
2. Data migration issues
Redis Cluster can scale out and in dynamically, but the process is still semi-automatic and needs manual intervention. Scaling in either direction requires data migration.
In order to ensure the consistency of migration, Redis performs the migration as a synchronous operation. During migration, the Redis instances at both ends are blocked for varying lengths of time. For a small key this time is negligible, but if a key occupies a lot of memory, the blocking can be long enough to trigger failover in the cluster and cause unnecessary master switches.
Master-slave mode: after the master node goes down, a new master node has to be specified manually. It is not highly available and is basically not used.
Sentinel mode: after the master node goes down, the sentinel processes automatically elect a new master node, so it is highly available, but every node stores the same data, which wastes memory. It is used when the data volume is small, the cluster is not very large, and automatic fault tolerance and disaster recovery are needed.
Cluster mode: used when the data volume is relatively large and QPS requirements are high. Redis Cluster was only officially released in Redis 3.0, which is relatively late. At present there are not many proven success cases in large-scale production environments, and it still needs time to be tested.