Skip to content

Conversation

@ralcolea
Copy link
Contributor

@ralcolea ralcolea commented Oct 28, 2025

This PR introduces several reliability and observability improvements to uRaft module, addressing multiple edge cases around promotion/demotion handling, deadlock prevention, and cluster recovery.

The main motivation was caused by issue #576 (promotion deadlocks), but there were introduced other improvements in the high-availability mechanism like logging and fixes when handling overlapping transitions or loss of quorum.

Key Changes

  • Improved handling of promotion timeouts and failures

    • Introduces handlePromotionFailure() and cleanupDirtyPromotion() in uRaftController.
    • Automatically triggers cleanup of stale temporary metadata file (metadata.sfs.tmp).
    • Ensures the node can rejoin as a shadow after failed or interrupted promotions.
    • Floating IP manager now starts only after a successful promotion.
  • Removed redundant set_block_promotion() calls

    • Prevented unnecessary election blocking in saunafs-uraft-helper and controller logic.
    • Resolved potential promotion deadlocks caused by overlapping state blocking.
  • Logging and Observability

    • Added structured syslog messages for:

      • Promotion, demotion, and quorum-related transitions.
      • Manual demotions and heartbeat-based state changes.
      • Election victories and higher-term detections.
    • Introduced nodeToString() helper for consistent and readable node identifiers in logs.

    • Enhanced logs in saunafs_drop_ip() for clarity during IP removal and state cleanup.

  • Deferred promotions during active demotion

    • Added force_promote_ flag to defer a promotion request until demotion completes.
    • Ensures that overlapping commands no longer cause promotion loss or inconsistent states.
    • Logs forced transitions for traceability.

Testing

The following Molecule scenarios were added and refactored to test the new changes:

  • uraft_kill_sfsmaster: This scenario was refactored to restart uRaft in the server where the sfsmaster process is killed, ensuring the cluster finishes in a stable state.
  • uraft_loss_quorum: This scenario was modified to simulate a quorum loss and validate that the cluster can continue serving I/O operations once the quorum is restored.
  • uraft_deadlock_multiple_elections: This is a new scenario introduced to validate that the uRaft cluster remains stable and free of deadlocks during consecutive election cycles.
  • uraft_failed_promotion: This is a new scenario introduced to verify that the uRaft cluster can gracefully recover when a promotion attempt fails. It also ensures that handlePromotionFailure() works as expected in case of failed promotions.

For more details, see fix-saunafs-issue-576 branch in deployment (Ansible) repository.

@ralcolea ralcolea requested a review from Copilot October 28, 2025 18:05
@ralcolea ralcolea self-assigned this Oct 28, 2025
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR enhances the uRaft module's reliability by addressing promotion deadlocks, improving state transition handling, and adding comprehensive observability through structured logging. The changes ensure proper recovery from failed promotions, prevent concurrent state transitions, and improve cluster stability during quorum loss scenarios.

Key Changes:

  • Implemented handlePromotionFailure() and cleanupDirtyPromotion() to recover from failed/interrupted promotions
  • Added deferred promotion/demotion mechanism using force_promote_ flag to handle overlapping state transitions
  • Enhanced logging throughout the codebase with structured syslog messages and a new nodeToString() helper
  • Fixed election deadlock by resetting current_term and leader_id when quorum is lost

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/uraft/uraftcontroller.h Added new cleanup methods and force_promote_ flag for managing state transitions
src/uraft/uraftcontroller.cc Implemented promotion failure handling, deferred transitions, enhanced logging, and improved timeout handling
src/uraft/uraft.h Added nodeToString() helper method declaration
src/uraft/uraft.cc Implemented nodeToString(), added syslog messages for election events and state transitions, fixed quorum loss deadlock
src/uraft/saunafs-uraft-helper.in Added saunafs_cleanup_dirty_state() function and enhanced saunafs_drop_ip() with better error handling and logging

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link

Copilot AI commented Oct 28, 2025

@ralcolea I've opened a new pull request, #657, to work on those changes. Once the pull request is ready, I'll request review from you.

@ralcolea
Copy link
Contributor Author

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This PR introduces reliability and observability improvements to the uRaft module, addressing promotion/demotion handling, deadlock prevention, and cluster recovery. The changes include improved handling of promotion timeouts and failures, removal of redundant set_block_promotion() calls, enhanced logging and observability, fixes for election deadlocks, and deferred promotions during active demotion. The review focuses on identifying potential issues related to correctness and maintainability, particularly in the areas of error handling, logging consistency, and code clarity, with severity levels of medium, high, and critical.

@ralcolea ralcolea marked this pull request as ready for review October 29, 2025 10:47
Copy link
Contributor

@lgsilva3087 lgsilva3087 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is looking great. Please see my comments.

Copy link
Contributor

@lgsilva3087 lgsilva3087 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job. Please see my small suggestion.

Copy link
Collaborator

@rolysr rolysr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job @ralcolea, consider my minor suggestions.

This commit applies the following changes intended to fix issue #576:
- Improve states transition during promotion by ensuring that after
  starting one demotion/promotion operation, it is not interrupted
  by other demotion/promotion operation.

- Trigger cleanup after promotion timeout to ensure a temporary master
  can be restarted as a shadow after an interrupted metadata dump or
  a failed promotion that creates a remanent metadata.sfs.tmp file that
  prevent a master restart.

- Fix Floating IP timing to ensure the manager is only started after
  a successful promotion.

Signed-off-by: Crash <[email protected]>
Previously, certain execution paths could permanently block elections
by calling `set_block_promotion(true)`, leading to a deadlock scenario
where no Leader could be elected. Some of these calls were unnecessary,
as the uRaft mechanism based on the loyalty agreement is already
designed to prevent election loops correctly.

This commit addresses issue #576 by:
- Removing redundant calls to `set_block_promotion()` to avoid the
  deadlock scenario described above.
- Improving logging in the `saunafs_drop_ip` function to provide more
  detailed information about the status of the floating IP removal.

Signed-off-by: Crash <[email protected]>
This commit adds detailed logs for promotion and demotion events in
uRaft that were not previously recorded. These logs improve visibility
into node state transitions and help identify issues during debugging.

Signed-off-by: Crash <[email protected]>
Previously, if a promotion was requested while a demotion was still in
progress, the request was ignored. This could leave the node stuck in
a follower state even when a promotion was desired.

This commit adds a `force_promote_` flag to defer the promotion until
the current demotion completes. Once demotion finishes, the controller
automatically triggers a forced promotion and logs the transition.

This ensures consistent role transitions and prevents promotion loss
during overlapping commands.

Signed-off-by: Crash <[email protected]>
This commit introduces the following review suggestions:
- Updated the `cleanup` command in saunafs-uraft-helper to remove the
  temporary metadata.sfs.tmp file only if it is not holded by any
  process, and removed redundant logic for deleting `.sfsmaster.lock`.

- Refactor `saunafs_drop_ip` command in saunafs-uraft-helper to make
  the code more readable.

- Renamed `force_demote_` and `force_promote_` to `is_demote_pending_`
  and `is_promote_pending_` for clearer intent and consistency with
  their behavior.

- Added documentation for newly introduced functions and attributes
  to improve code readability and maintenance.

Co-authored-by: guillex <[email protected]>
Co-authored-by: Rolando Sánchez Ramos <[email protected]>
Signed-off-by: Crash <[email protected]>
@ralcolea ralcolea merged commit 1094009 into dev Nov 1, 2025
9 checks passed
@ralcolea ralcolea deleted the fix-issue-576 branch November 1, 2025 14:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants