-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-28669: fix leaks a connection to ZooKeeper #6147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
When I turn on replication and create a peer, if a RegionServer goes down, after recovery, I delete a peer, but the connection between the peer and zookeeper will not be disconnected. |
|
When I checked the code, I found that when RecoveredReplicationSource executed tryFinish, only ReplicationSource was deleted from oldsources, but terminate was not executed, resulting in connection leaks. |
| () -> { | ||
| if (workerThreads.isEmpty()) { | ||
| this.getSourceMetrics().clear(); | ||
| this.terminate("Finished recovering queue"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think here we just need to call replicationEndpoint.stop? The terminate method is designed for terminate a replication source when removing a peer or refreshing a source when there are still data need to be replicated in the source, so it contains a lot of unnecessary logics when we just want to remove a finished recovered source.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, calling replicationEndpoint.stop can solve this bug. Don't the workerThreads created during the initialization of RecoveredReplicationSource need to be stopped? I think the context simply removes it from the map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ReplicationSourceShipper is just a thread, it will quit after we returning from the run method, so we do not need to stop it again. And even if you call terminate method here, there is no way to close these threads as there is a workerThreads.isEmpty check above...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thanks for your help. It is more reasonable to call replicationEndpoint.stop here without call terminate.
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
…ks a connection to ZooKeeper (#6147) Co-authored-by: rodenli <[email protected]> Signed-off-by: Duo Zhang <[email protected]> (cherry picked from commit 73651dc)
fix HBASE-28669