Commit 525551a

Backport - Fix replica can't finish failover when config epoch is outdated (#2178) to 7.2 (#2232)
When the primary changes its config epoch and then goes down immediately, the replica may not update its config epoch in time. Although the change is broadcast to the cluster (see #1813), there may still be a race in the network or in the code. In this case, the replica will never finish the failover, because the other primaries refuse to vote for it: the replica's slot config epoch is older than the one they know.

We need a way to let the replica finish the failover in this case.

When a primary refuses to vote because the replica's config epoch is less than the dead primary's config epoch, it can send an UPDATE packet to the replica to inform it about the dead primary. The UPDATE message contains the dead primary's config epoch and owned slots. The current failover attempt will still time out, but the replica can later try again with the updated config epoch and succeed.

Fixes #2169.

---------

Signed-off-by: Ran Shidlansik <[email protected]>
1 parent 5dc6632 commit 525551a
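
The refuse-then-UPDATE-then-retry flow described in the commit message can be sketched as a small standalone simulation. This is only an illustration under simplifying assumptions: `node_view`, `vote_result`, `handle_auth_request`, `apply_update` and `simulate_failover` are hypothetical names that do not exist in src/cluster.c, a single epoch value stands in for the per-slot configuration, and the election timeout is modelled as simply moving on to the next attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a node's view of the cluster:
 * the config epoch it believes the (dead) primary's slots carry. */
typedef struct {
    uint64_t slot_owner_epoch;
} node_view;

/* Hypothetical result of one vote request. */
typedef struct {
    bool vote_granted;
    bool update_sent;
    uint64_t update_epoch; /* meaningful only when update_sent is true */
} vote_result;

/* Voter side: refuse a request carrying a stale epoch, but answer with an
 * UPDATE that carries the epoch the voter knows about. */
vote_result handle_auth_request(const node_view *voter, uint64_t request_epoch) {
    vote_result r = { false, false, 0 };
    assert(voter != NULL);
    if (request_epoch < voter->slot_owner_epoch) {
        r.update_sent = true;
        r.update_epoch = voter->slot_owner_epoch;
        return r;
    }
    r.vote_granted = true;
    return r;
}

/* Replica side: adopt the epoch from the UPDATE if it is newer. */
void apply_update(node_view *replica, uint64_t update_epoch) {
    assert(replica != NULL);
    if (update_epoch > replica->slot_owner_epoch)
        replica->slot_owner_epoch = update_epoch;
}

/* Runs at most two election attempts and returns the number of the attempt
 * that won the vote, or 0 if neither did. */
int simulate_failover(uint64_t voter_epoch, uint64_t replica_epoch) {
    node_view voter = { voter_epoch };
    node_view replica = { replica_epoch };
    for (int attempt = 1; attempt <= 2; attempt++) {
        vote_result r = handle_auth_request(&voter, replica.slot_owner_epoch);
        if (r.vote_granted) return attempt;
        if (r.update_sent) apply_update(&replica, r.update_epoch);
    }
    return 0;
}
```

With a voter at epoch 5 and a replica stuck at epoch 4, `simulate_failover(5, 4)` loses the first attempt and wins the second, mirroring the "time out, then retry with the updated epoch" behaviour. Without the UPDATE step the replica's epoch would never change and every attempt would be refused.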

1 file changed: +15 -3 lines changed

src/cluster.c

Lines changed: 15 additions & 3 deletions
@@ -3145,9 +3145,11 @@ int clusterProcessPacket(clusterLink *link) {
                         senderConfigEpoch)
                 {
                     serverLog(LL_VERBOSE,
-                        "Node %.40s has old slots configuration, sending "
-                        "an UPDATE message about %.40s",
-                        sender->name, server.cluster->slots[j]->name);
+                        "Node %.40s (%s) has old slots configuration, sending "
+                        "an UPDATE message about %.40s (%s)",
+                        sender->name, sender->human_nodename,
+                        server.cluster->slots[j]->name,
+                        server.cluster->slots[j]->human_nodename);
                     clusterSendUpdate(sender->link,
                         server.cluster->slots[j]);

@@ -4080,6 +4082,16 @@ void clusterSendFailoverAuthIfNeeded(clusterNode *node, clusterMsg *request) {
                     node->name, node->human_nodename, j,
                     (unsigned long long) server.cluster->slots[j]->configEpoch,
                     (unsigned long long) requestConfigEpoch);
+
+            /* Send an UPDATE message to the replica. After receiving the UPDATE message,
+             * the replica will update the slots config so that it can initiate a failover
+             * again later. Otherwise the replica will never get votes if the primary is down. */
+            serverLog(LL_VERBOSE,
+                      "Node %.40s (%s) has old slots configuration, sending "
+                      "an UPDATE message about %.40s (%s)",
+                      node->name, node->human_nodename,
+                      server.cluster->slots[j]->name, server.cluster->slots[j]->human_nodename);
+            clusterSendUpdate(node->link, server.cluster->slots[j]);
             return;
         }
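
Both hunks log nodes with the format `%.40s (%s)`. A minimal sketch of what that format produces follows, assuming the node ID is a fixed 40-character field (which the `%.40s` precision suggests) and the human-readable name is an ordinary NUL-terminated string. `format_node_label` is a hypothetical helper written for this illustration; the patch itself passes the format straight to `serverLog`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "<first 40 chars of name> (<human_nodename>)"
 * into buf and returns the length snprintf would have written. */
int format_node_label(char *buf, size_t buflen,
                      const char *name, const char *human_nodename) {
    assert(buf != NULL && buflen > 0);
    /* The ".40" precision caps how many bytes are read from name, so the
     * ID is cut at 40 characters even if more bytes follow it. */
    return snprintf(buf, buflen, "%.40s (%s)", name, human_nodename);
}
```

If `human_nodename` is empty, the label ends in `()`, so the log line keeps the same shape whether or not a human-readable name is configured.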
