-
Notifications
You must be signed in to change notification settings - Fork 592
HDDS-8389. [Snapshot] Added integration test for SnapDiff when OM leader failover happens #4657
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
1313363 to
116cd7c
Compare
…der failover happens
|
@swamirishi Could you take a look at this? |
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOmSnapshot.java
Outdated
Show resolved
Hide resolved
...ne/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOzoneManagerHASnapshot.java
Outdated
Show resolved
Hide resolved
...ne/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOzoneManagerHASnapshot.java
Outdated
Show resolved
Hide resolved
...one/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/SnapshotDiffManager.java
Outdated
Show resolved
Hide resolved
smengcl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @hemantk-12 . Great test additions. Nicely written.
swamirishi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOmSnapshot.java
Outdated
Show resolved
Hide resolved
|
|
||
| // If job was IN_PROGRESS or DONE state when OM restarted, it should be | ||
| // DONE by this time. | ||
| // If job FAILED during crash (which mostly happens in the test because |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Do we have a unit test case for the FAILED state?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No integration test for FAILED state. I added few tests as unit test in PR: https://github.com/apache/ozone/pull/4716/files#diff-11392a9810b911a6b87f9f4440cb5e5342b2a9c112851e7dceafa5103d999e1dR427
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOmSnapshot.java
Outdated
Show resolved
Hide resolved
| }); | ||
| } | ||
|
|
||
| assertEquals(1, snapshotIds.stream().distinct().count()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is snapshotId assertion enough? I guess we should also check rocks db contents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking only snapshotId should be enough. Test is to make sure that snapshotId is consistent among all the OM nodes. We don't care about it value.
What do you mean by "check rocks db contents"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the list of sst files should be same I believe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that should be tested as part of this test.
...ne/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOzoneManagerHASnapshot.java
Outdated
Show resolved
Hide resolved
|
There are some conflicts in the test due to HDDS-8645. |
Resolved conflicts and also ran test 100 times to make sure there is no flakiness. It passed all the time (Workflow failed because of timeout) : https://github.com/hemantk-12/ozone/actions/runs/5049594952/jobs/9059656428 |
smengcl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
|
Thanks @hemantk-12 for the test addition, and @swamirishi for the review. |
What changes were proposed in this pull request?
snapshotDiffManagerand invalidatesnapshotCacheuponOmSnapshotManager#close.snapDiffExecutorandsstDumpToolExecutorinSnapshotDiffManager#close.What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-8389
How was this patch tested?
Ran newly added tests locally and verified they pass.