Description
The problem/use-case that the feature addresses
Currently, executing REPLICAOF NO ONE on a node in a cluster-mode-disabled setup turns that node into a primary. However, the node might not be up to date and may lag behind other replicas. This can lead to partial data loss, and it can force replicas with higher replication offsets to perform a full sync with the new primary.
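To make the problem concrete, here is a minimal sketch that models nodes by their replication offset only. The `Replica` class and both helper functions are hypothetical names used for illustration; they are not part of any server or client API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Replica:
    name: str
    repl_offset: int  # how much of the replication stream this node has applied


def lost_bytes(primary_offset: int, promoted: Replica) -> int:
    """Replication-stream bytes the old primary had that the promoted node never applied."""
    return max(0, primary_offset - promoted.repl_offset)


def replicas_ahead(promoted: Replica, others: List[Replica]) -> List[str]:
    """Names of replicas whose offset is higher than the newly promoted node's offset.

    These are the nodes described above as having to full sync with the new primary.
    """
    return [r.name for r in others if r.repl_offset > promoted.repl_offset]
```

For example, if the old primary is at offset 1000 and a replica at offset 900 is promoted, `lost_bytes` reports 100, and any sibling replica at offset 950 or 1000 appears in `replicas_ahead`.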
Description of the feature
I would like input on extending REPLICAOF NO ONE to use a coordinated failover approach, similar to the FAILOVER command: pause writes on the primary, let the chosen node (currently a replica) sync fully with the current primary, demote the primary, and have the chosen node perform a PSYNC failover to promote itself to primary.
This would limit data loss and cluster unavailability during a failover performed with REPLICAOF NO ONE.
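The proposed sequence can be sketched as a small in-memory simulation. This is only an illustration of the ordering of the steps, not the server's implementation: the `Node` class and `coordinated_promote` function are hypothetical, and catching up is modeled as simply copying the primary's offset.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    role: str  # "primary" or "replica"
    repl_offset: int = 0
    writes_paused: bool = False


def coordinated_promote(primary: Node, target: Node) -> None:
    """Promote `target` only after it has caught up with `primary`."""
    if primary.role != "primary" or target.role != "replica":
        raise ValueError("expected a primary and one of its replicas")

    # 1. Pause writes on the primary so its offset stops advancing.
    primary.writes_paused = True

    # 2. Let the chosen replica catch up (assumption: modeled as an instant copy).
    target.repl_offset = primary.repl_offset

    # 3. Demote the old primary.
    primary.role = "replica"

    # 4. Promote the chosen node (the PSYNC failover step in the proposal).
    target.role = "primary"
```

Because writes are paused before the offsets are compared, the promoted node ends at the same offset as the old primary, which is the property that avoids the data loss described earlier. The sketch stops at promotion; how and when the old primary resumes serving clients is not covered here.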
The primary use case is for systems that are using the REPLICAOF NO ONE command to issue failovers on clusters in a cluster mode disabled setting.
Alternatives you've considered
Upgrading systems to use the FAILOVER command, which already uses the coordinated failover approach.
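For reference, a minimal sketch of how a system might build the arguments for this alternative. The helper name `failover_args` is hypothetical; the argument order follows the documented `FAILOVER [TO host port] [TIMEOUT milliseconds]` form, and the resulting list would be passed to whatever generic command method the client library provides.

```python
from typing import List, Optional


def failover_args(host: str, port: int, timeout_ms: Optional[int] = None) -> List[str]:
    """Build the argument list for a FAILOVER command targeting a specific replica."""
    args = ["FAILOVER", "TO", host, str(port)]
    if timeout_ms is not None:
        # Optional upper bound on how long to wait for the replica to catch up.
        args += ["TIMEOUT", str(timeout_ms)]
    return args
```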