HBASE-25854 Remove redundant AM in-memory state changes in CatalogJanitor #3234
This change is moved here from #3230. We know from the test report there that
@Apache9 added some context on #3230
Although the unit test will fail, when I tried this change out on a cluster in a very write-heavy ingestion test, the end result was good. The ingestion test completes successfully and without weird artifacts in the logging. All split regions are GCed by procedures. In-memory state aligns with filesystem state.
💔 -1 overall
This message was automatically generated.
🎊 +1 overall
This message was automatically generated.
saintstack
left a comment
Yes to one-place only (in the procedure).
I've lost context on why CJ does the redundant removes. I defer to the test you ran at scale, Andrew.
(I just noticed the wonky split message -- have started a bit of ITBLL over here...)
💔 -1 overall
This message was automatically generated.
I've committed a few changes over the past couple of days that address two issues here. One was an accumulation of SPLIT state regionStates for as long as CatalogJanitor is deferring cleanup, because daughters still have references, because compaction is backed up. The accumulation is fine but it caused concerning warnings at each balancer iteration. Another was a race condition between split handling and regionserver report handling that could cause multiple split procedures to get scheduled for the same split request. Only one could succeed. The others would cause log noise but their failures were harmless. Would be good to get a second opinion.
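For readers unfamiliar with this kind of race, here is a minimal sketch of one common way to keep a second request for the same region from scheduling a second procedure. It is a toy model under stated assumptions, not HBase code and not the change that was committed: `SplitDedupSketch`, `requestSplit`, `splitFinished`, and `scheduledCount` are all hypothetical names invented for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only: not HBase code and not the committed fix.
class SplitDedupSketch {
  // Regions that currently have a (hypothetical) split procedure in flight.
  private final Set<String> inFlight = ConcurrentHashMap.newKeySet();
  // Counts how many procedures this sketch would have scheduled.
  private final AtomicInteger scheduled = new AtomicInteger();

  /** Returns true only for the first caller per region; later callers are ignored. */
  boolean requestSplit(String region) {
    // Set.add is atomic on a concurrent key set, so exactly one thread wins.
    if (!inFlight.add(region)) {
      return false;
    }
    scheduled.incrementAndGet();
    return true;
  }

  /** Called when the split completes or fails, so the region can be split again later. */
  void splitFinished(String region) {
    inFlight.remove(region);
  }

  int scheduledCount() {
    return scheduled.get();
  }

  public static void main(String[] args) {
    SplitDedupSketch sketch = new SplitDedupSketch();
    // Two paths (for example, split handling and report handling) ask for the same split.
    System.out.println(sketch.requestSplit("region-a"));
    System.out.println(sketch.requestSplit("region-a"));
    System.out.println(sketch.scheduledCount());
  }
}
```

The point of the sketch is only that the check and the registration happen in one atomic step; a separate "contains, then add" sequence would reintroduce the race.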
Ok, let me rebase this and fix the unit tests.
…itor In CatalogJanitor we schedule GCRegionProcedure to clean up both filesystem and in-memory state after a split, and GCMultipleMergedRegionsProcedure to do the same for merges. Both of these procedures clean up in-memory state, but CatalogJanitor also does this redundantly just after scheduling the procedures. The cleanup should be done in only one place. Presumably we are using the procedures to do it in a principled way. Remove the redundancy in CatalogJanitor and fix any follow on issues, like test failures.
🎊 +1 overall
This message was automatically generated.
💔 -1 overall
This message was automatically generated.
Actual failing test
🎊 +1 overall
This message was automatically generated.
virajjasani
left a comment
So we are just removing what the scheduled procs are going to do anyway, right? Sounds good. +1
…itor (#3234) In CatalogJanitor we schedule GCRegionProcedure to clean up both filesystem and in-memory state after a split, and GCMultipleMergedRegionsProcedure to do the same for merges. Both of these procedures clean up in-memory state, but CatalogJanitor also does this redundantly just after scheduling the procedures. The cleanup should be done in only one place. Presumably we are using the procedures to do it in a principled way. Remove the redundancy in CatalogJanitor and fix any follow on issues, like test failures. Signed-off-by: Duo Zhang <[email protected]> Signed-off-by: Michael Stack <[email protected]> Signed-off-by: Viraj Jasani <[email protected]>
In CatalogJanitor we schedule GCRegionProcedure to clean up both filesystem and in-memory state after a split, and GCMultipleMergedRegionsProcedure to do the same for merges. Both of these procedures clean up in-memory state, but CatalogJanitor also does this redundantly just after scheduling the procedures. The cleanup should be done in only one place. Presumably we are using the procedures to do it in a principled way. This is at least a nit, but probably a source of future bugs. Remove the redundancy in CatalogJanitor and fix any follow-on issues, like test failures.
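The "one place only" idea can be sketched with a toy model. This is not the HBase implementation: `RegionStates`, `GcProcedure`, and `Janitor` below are simplified, hypothetical stand-ins written for illustration, and the filesystem is modeled as a plain set of region names.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Toy model only: none of these types are real HBase classes.
class GcOwnershipSketch {

  /** Hypothetical stand-in for the master's in-memory region bookkeeping. */
  static class RegionStates {
    private final Set<String> regions = new HashSet<>();
    void add(String region) { regions.add(region); }
    void delete(String region) { regions.remove(region); }
    boolean contains(String region) { return regions.contains(region); }
  }

  /** Hypothetical GC procedure: the single owner of both cleanup steps. */
  static class GcProcedure implements Runnable {
    private final String region;
    private final Set<String> filesystem;
    private final RegionStates states;

    GcProcedure(String region, Set<String> filesystem, RegionStates states) {
      this.region = region;
      this.filesystem = filesystem;
      this.states = states;
    }

    @Override
    public void run() {
      filesystem.remove(region); // filesystem cleanup
      states.delete(region);     // in-memory cleanup, done only here
    }
  }

  /** Hypothetical janitor: schedules the procedure and touches nothing else. */
  static class Janitor {
    private final Queue<Runnable> procedures = new ArrayDeque<>();
    private final Set<String> filesystem;
    private final RegionStates states;

    Janitor(Set<String> filesystem, RegionStates states) {
      this.filesystem = filesystem;
      this.states = states;
    }

    void scheduleGc(String region) {
      procedures.add(new GcProcedure(region, filesystem, states));
      // Deliberately no states.delete(region) here: removing the entry now would
      // let in-memory state run ahead of the filesystem until the procedure runs.
    }

    void runPending() {
      Runnable next;
      while ((next = procedures.poll()) != null) {
        next.run();
      }
    }
  }

  public static void main(String[] args) {
    Set<String> fs = new HashSet<>();
    RegionStates states = new RegionStates();
    fs.add("parent");
    states.add("parent");

    Janitor janitor = new Janitor(fs, states);
    janitor.scheduleGc("parent");
    System.out.println("in memory after scheduling: " + states.contains("parent"));
    janitor.runPending();
    System.out.println("in memory after procedure: " + states.contains("parent"));
    System.out.println("on filesystem after procedure: " + fs.contains("parent"));
  }
}
```

In this sketch the in-memory entry survives scheduling and disappears only when the procedure runs, so the two views of the region change together.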