[NEW] REPLICAOF NO ONE using a coordinated failover #2587

@KarthikSubbarao


The problem/use-case that the feature addresses

Currently, when we execute REPLICAOF NO ONE on a node in a cluster-mode-disabled setup, the node becomes a primary immediately. However, the node might not be up to date and can lag behind other replicas. This can lead to partial data loss, and it can force replicas with higher replication offsets to perform a full sync with the new primary.
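To make the risk concrete, here is a minimal Python sketch. It is a toy model rather than Valkey code: the function name `promotion_impact`, the node names, and the offsets are all hypothetical and chosen only for illustration. Given a set of replication offsets, it reports how many bytes of the replication stream the promoted node is missing and which replicas are ahead of it (the ones that, per the description above, may end up doing a full sync).

```python
def promotion_impact(primary_offset, replica_offsets, promoted):
    """Toy model: estimate the effect of promoting `promoted` without coordination.

    Returns a tuple of:
      - bytes of the replication stream missing on the promoted node
      - sorted names of replicas whose offset is ahead of the promoted node
    """
    promoted_offset = replica_offsets[promoted]
    missing = max(primary_offset - promoted_offset, 0)
    ahead = sorted(
        name
        for name, offset in replica_offsets.items()
        if name != promoted and offset > promoted_offset
    )
    return missing, ahead


# Hypothetical offsets, for illustration only.
offsets = {"replica-a": 1000, "replica-b": 940, "replica-c": 1000}
missing, ahead = promotion_impact(1000, offsets, "replica-b")
print(missing, ahead)  # 60 ['replica-a', 'replica-c']
```

In this made-up scenario, promoting `replica-b` drops the last 60 bytes of writes, while promoting `replica-a` or `replica-c` would drop nothing.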

Description of the feature

I would like input on extending REPLICAOF NO ONE to use a coordinated failover, similar to the FAILOVER command: pause writes on the primary, allow the chosen replica to fully sync with the current primary, demote the primary, and perform a PSYNC FAILOVER on the chosen node to promote it to primary.
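The sequence proposed above can be sketched as a small in-memory simulation. This is only an illustration of the ordering of the steps under simplifying assumptions: `Node`, `coordinated_promote`, `max_rounds`, and `chunk` are hypothetical names, replication is modelled as a counter, and nothing here talks to a real server or reflects how Valkey implements FAILOVER internally.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy in-memory stand-in for a server; not a real Valkey client."""
    name: str
    role: str                   # "primary" or "replica"
    offset: int = 0             # replication offset
    writes_paused: bool = False


def coordinated_promote(primary, target, max_rounds=100, chunk=50):
    """Hypothetical flow for a coordinated REPLICAOF NO ONE on `target`."""
    steps = []

    # 1. Pause writes on the current primary so its offset stops advancing.
    primary.writes_paused = True
    steps.append("pause-writes")

    # 2. Let the target replica catch up, `chunk` bytes per simulated round.
    rounds = 0
    while target.offset < primary.offset and rounds < max_rounds:
        target.offset = min(target.offset + chunk, primary.offset)
        rounds += 1

    if target.offset < primary.offset:
        # Catch-up did not finish in time: resume writes and give up.
        primary.writes_paused = False
        steps.append("abort")
        return steps
    steps.append("caught-up")

    # 3. Demote the old primary, then 4. promote the target.
    primary.role = "replica"
    primary.writes_paused = False
    steps.append("demote-primary")
    target.role = "primary"
    steps.append("promote-target")
    return steps


old = Node("node-1", "primary", offset=1000)
new = Node("node-2", "replica", offset=940)
print(coordinated_promote(old, new))
print(new.role, new.offset, old.role)  # primary 1000 replica
```

The abort branch models a timeout: if the replica cannot catch up within the allowed rounds, writes resume on the original primary and no roles change.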

This would limit data loss and unavailability during a failover triggered with REPLICAOF NO ONE.

The primary use case is systems that use the REPLICAOF NO ONE command to trigger failovers in cluster-mode-disabled deployments.

Alternatives you've considered

Upgrading systems to use the FAILOVER command, since it already uses the coordinated failover approach.
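For reference, a minimal sketch of what the alternative looks like from a client's side. The documented syntax is `FAILOVER [TO host port [FORCE]] [ABORT] [TIMEOUT milliseconds]`, and the command is sent to the current primary. The helper `build_failover_args` below is a hypothetical name that only assembles the argument list for the `TO` and `TIMEOUT` options; the host, port, and timeout values are placeholders.

```python
def build_failover_args(host=None, port=None, timeout_ms=None):
    """Build the argument list for FAILOVER [TO host port] [TIMEOUT milliseconds]."""
    if (host is None) != (port is None):
        raise ValueError("host and port must be given together")
    args = ["FAILOVER"]
    if host is not None:
        args += ["TO", host, str(port)]
    if timeout_ms is not None:
        args += ["TIMEOUT", str(timeout_ms)]
    return args


# Placeholder address and timeout, for illustration only.
print(build_failover_args("10.0.0.2", 6379, timeout_ms=5000))
# ['FAILOVER', 'TO', '10.0.0.2', '6379', 'TIMEOUT', '5000']
```

The resulting list could then be passed to a client's generic command call against the current primary, for example `execute_command(*args)` in redis-py, assuming the client in use exposes something similar.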


Labels: enhancement (New feature or request)
