-
Notifications
You must be signed in to change notification settings - Fork 622
[Garnet.Worker] feat: Improve graceful shutdown by add AOF handling or take checkpoint if config enabled #1382 #1383
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit enhances the asynchronous shutdown process in `Worker.cs` with the following changes: - The `StopAsync` method now waits up to 30 seconds for existing connections to complete before termination. - Added logic to flush the AOF (Append-Only File) buffer and create a checkpoint on shutdown. This commit operation is only performed if AOF is enabled. - Implemented the new `WaitForActiveConnectionsToComplete` method to check the status of active connections with a retry mechanism. - Called `GC.SuppressFinalize(this)` in the `Dispose` method to prevent unnecessary finalization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enhances the graceful shutdown process for the Garnet worker service by implementing a more sophisticated shutdown sequence that waits for active connections to complete before termination and handles AOF (Append-Only File) operations during shutdown.
- Adds a 30-second timeout mechanism to wait for active connections to complete during shutdown
- Implements AOF buffer flushing and checkpoint creation during the shutdown process
- Adds proper resource cleanup with GC.SuppressFinalize call in the Dispose method
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Co-authored-by: Copilot <[email protected]>
Simplifies the polling interval logic for active connections by using a delay array and removing consecutive error tracking. Extracts active connection count retrieval into a new GetActiveConnectionCount() helper method for clarity and reuse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 7 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Moved Dispose() call after base.StopAsync in the finally block to ensure proper resource cleanup order. Added null checks for server and server.Metrics in GetActiveConnectionCount to prevent possible null reference exceptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
umm, I hope my code change didn't impact on cluster test result CI only failure on test.cluster with windows-latest, net8.0, Release So how can I find to resolve this issue? I don't have a mind to locally test about clusttering garnet in windows When before running is successful (https://github.com/microsoft/garnet/actions/runs/18344516902) |
|
In my assumption with Local Visual Studio Test excution passed about failed test item, and only failed only one target failure(.net 8, windows, release) |
During graceful shutdown, the worker now checks if tiered storage is enabled and takes a checkpoint using StoreWrapper if so. If tiered storage is not enabled, it falls back to flushing the AOF buffer as before. This ensures data consistency for both storage modes.
|
I found a issue that I confused AOF and garnet's tiered storage |
save AOF commit first and secondary take checkpoint to ensure state preserving
| { | ||
| try | ||
| { | ||
| var storeWrapperField = server.GetType().GetField("storeWrapper", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of trying to operate from outside using reflection, why not add a StopAsync method to GarnetServer, so that it can perform the graceful shutdown without reflection?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, Your suggestion is reasonable!
Instead of using complex reflexion in Worker, Graceful Shutdown logic insertion into Garnet Server more simply maintanable and understandable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, look forward to the update!
feat: Improve graceful shutdown and add AOF handling
This commit(PR) enhances the asynchronous shutdown process in
Worker.cswith the following changes:StopAsyncmethod now waits up to 30 seconds for existing connections to complete before termination.WaitForActiveConnectionsToCompletemethod to check the status of active connections with a retry mechanism.GC.SuppressFinalize(this)in theDisposemethod to prevent unnecessary finalization.then this PR will Close #1382 Issue
(Tested in My side, if you hope you can check also)