Skip to content

Commit f342755

Browse files
committed
Drop the MEET packet if the link node is in handshake state
After valkey-io#1307 got merged, we notice there is a assert happen in setClusterNodeToInboundClusterLink: ``` === ASSERTION FAILED === ==> '!link->node' is not true ``` In valkey-io#778, we will call setClusterNodeToInboundClusterLink to attach the node to the link during the MEET processing, so if we receive a another MEET packet in a short time, the node is still in handshake state, we will meet this assert and crash the server. If the link is bound to a node and the node is in the handshake state, and we receive a MEET packet, it may be that the sender sent multiple MEET packets when reconnecting, and in here we are dropping the MEET. Note that in getNodeFromLinkAndMsg, the node in the handshake state has a random name and not truly "known", so we don't know the sender. Dropping the MEET packet can prevent us from creating a random node, avoid incorrect link binding, and avoid duplicate MEET packet eliminate the handshake state. Signed-off-by: Binbin <[email protected]>
1 parent 32f2c73 commit f342755

File tree

1 file changed

+17
-4
lines changed

1 file changed

+17
-4
lines changed

src/cluster_legacy.c

Lines changed: 17 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -3004,7 +3004,8 @@ int clusterIsValidPacket(clusterLink *link) {
30043004
}
30053005

30063006
if (type == server.cluster_drop_packet_filter || server.cluster_drop_packet_filter == -2) {
3007-
serverLog(LL_WARNING, "Dropping packet that matches debug drop filter");
3007+
serverLog(LL_WARNING, "Dropping packet of type %s that matches debug drop filter",
3008+
clusterGetMessageTypeString(type));
30083009
return 0;
30093010
}
30103011

@@ -3095,7 +3096,7 @@ int clusterProcessPacket(clusterLink *link) {
30953096
if (server.debug_cluster_close_link_on_packet_drop &&
30963097
(type == server.cluster_drop_packet_filter || server.cluster_drop_packet_filter == -2)) {
30973098
freeClusterLink(link);
3098-
serverLog(LL_WARNING, "Closing link for matching packet type %hu", type);
3099+
serverLog(LL_WARNING, "Closing link for matching packet type %s", clusterGetMessageTypeString(type));
30993100
return 0;
31003101
}
31013102
return 1;
@@ -3111,8 +3112,8 @@ int clusterProcessPacket(clusterLink *link) {
31113112
freeClusterLink(link);
31123113
serverLog(
31133114
LL_NOTICE,
3114-
"Closing link for node that sent a lightweight message of type %hu as its first message on the link",
3115-
type);
3115+
"Closing link for node that sent a lightweight message of type %s as its first message on the link",
3116+
clusterGetMessageTypeString(type));
31163117
return 0;
31173118
}
31183119
clusterNode *sender = link->node;
@@ -3121,6 +3122,18 @@ int clusterProcessPacket(clusterLink *link) {
31213122
return 1;
31223123
}
31233124

3125+
if (type == CLUSTERMSG_TYPE_MEET && link->node && nodeInHandshake(link->node)) {
3126+
/* If the link is bound to a node and the node is in the handshake state, and we receive
3127+
* a MEET packet, it may be that the sender sent multiple MEET packets when reconnecting,
3128+
* and in here we are dropping the MEET. Note that in getNodeFromLinkAndMsg, the node in
3129+
* the handshake state has a random name and not truly "known", so we don't know the sender.
3130+
* Dropping the MEET packet can prevent us from creating a random node, avoid incorrect
3131+
* link binding, and avoid duplicate MEET packet eliminate the handshake state. */
3132+
serverLog(LL_NOTICE, "Dropping MEET packet from node %.40s because the node is already in handshake state",
3133+
link->node->name);
3134+
return 1;
3135+
}
3136+
31243137
uint16_t flags = ntohs(hdr->flags);
31253138
uint64_t sender_claimed_current_epoch = 0, sender_claimed_config_epoch = 0;
31263139
clusterNode *sender = getNodeFromLinkAndMsg(link, hdr);

0 commit comments

Comments
 (0)