@hemantk-12 hemantk-12 commented May 8, 2023

What changes were proposed in this pull request?

OM was crashing on restart because the snapshot chain was getting corrupted. A deeper investigation found the cause: snapshot creation times were not unique across snapshots, which led to the failure. More details are in the Jira: https://issues.apache.org/jira/browse/HDDS-8530.
This change passes the creation time through preExecute instead of having each OM node use its own current time as the creation time. The creation time is passed in the same way as snapshotId.
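
To illustrate the idea, here is a minimal sketch, not the actual Ozone code. All class, field, and method names below (`SnapshotCreateSketch`, `CreateSnapshotRequest`, `Node`, `apply`) are hypothetical; only the notion of a `preExecute` step that fixes `snapshotId` and the creation time before the request is applied comes from this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.function.LongSupplier;

// Hypothetical names throughout; this is an illustration, not the Ozone API.
final class SnapshotCreateSketch {

  /** Request that is replicated to every node. Its fields are fixed once. */
  static final class CreateSnapshotRequest {
    final String name;
    final UUID snapshotId;
    final long creationTime;

    CreateSnapshotRequest(String name, UUID snapshotId, long creationTime) {
      this.name = name;
      this.snapshotId = snapshotId;
      this.creationTime = creationTime;
    }
  }

  /** In this sketch, runs once before the request is replicated. */
  static CreateSnapshotRequest preExecute(String name, LongSupplier clock) {
    // The clock is read exactly once here, alongside the generated ID.
    return new CreateSnapshotRequest(name, UUID.randomUUID(), clock.getAsLong());
  }

  /** One OM node applying replicated requests. */
  static final class Node {
    final List<Long> creationTimes = new ArrayList<>();

    void apply(CreateSnapshotRequest request) {
      // Use the time carried by the request, never this node's local clock.
      creationTimes.add(request.creationTime);
    }
  }

  public static void main(String[] args) {
    CreateSnapshotRequest request = preExecute("snap1", System::currentTimeMillis);
    Node om1 = new Node();
    Node om2 = new Node();
    om1.apply(request);
    om2.apply(request);
    // Both nodes record the same creation time for the same snapshot.
    System.out.println(om1.creationTimes.equals(om2.creationTimes));
  }
}
```

Because every node reads the creation time from the request rather than from its own clock, all nodes record the same value for a given snapshot.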

This change also includes logging changes in SnapshotChainManager so that more information is available for future debugging.

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-8530

How was this patch tested?

Added an integration test to verify the fix.

@hemantk-12 hemantk-12 marked this pull request as ready for review May 8, 2023 21:42
@adoroszlai adoroszlai added the snapshot https://issues.apache.org/jira/browse/HDDS-6517 label May 9, 2023
@hemantk-12 hemantk-12 changed the title HDDS-8530. [Snapshot] Fix for OM crash on restart due to snapshot chain manager corruption HDDS-8530. [Snapshot] Fix for OM crash on restart due to snapshot chain manager corruption May 9, 2023

@smengcl smengcl left a comment

Thanks @hemantk-12 for the fix!

@aswinshakil aswinshakil left a comment

Thanks for working on this, @hemantk-12. LGTM. Just have a few minor comments.

Co-authored-by: Aswin Shakil Balasubramanian <[email protected]>
@smengcl smengcl merged commit 10aaa57 into apache:master May 18, 2023

smengcl commented May 18, 2023

Thanks @aswinshakil for reviewing this.
