-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-29796 Allow sleepForRetry replication config to be overridden by replication peers #7577
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 8 commits
f405fb8
d97bf18
ff20206
207dacd
e40230a
0487429
251d4f0
cae2efb
49f6d52
03eaa64
92f68af
de0064e
7e4afc3
db8fa8c
72e9655
851c737
d75920c
13b7661
2d412fe
46b1533
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
|
|
@@ -99,8 +99,6 @@ public class ReplicationSource implements ReplicationSourceInterface { | |||||||
| protected ReplicationSourceManager manager; | ||||||||
| // Should we stop everything? | ||||||||
| protected Server server; | ||||||||
| // How long should we sleep for each retry | ||||||||
| private long sleepForRetries; | ||||||||
| protected FileSystem fs; | ||||||||
| // id of this cluster | ||||||||
| private UUID clusterId; | ||||||||
|
|
@@ -204,8 +202,6 @@ public void init(Configuration conf, FileSystem fs, ReplicationSourceManager man | |||||||
| this.waitOnEndpointSeconds = | ||||||||
| this.conf.getInt(WAIT_ON_ENDPOINT_SECONDS, DEFAULT_WAIT_ON_ENDPOINT_SECONDS); | ||||||||
| decorateConf(); | ||||||||
| // 1 second | ||||||||
| this.sleepForRetries = this.conf.getLong("replication.source.sleepforretries", 1000); | ||||||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: do you think this so, can we keep this variable and just call |
||||||||
| // 5 minutes @ 1 sec per | ||||||||
| this.maxRetriesMultiplier = this.conf.getInt("replication.source.maxretriesmultiplier", 300); | ||||||||
| this.queueSizePerGroup = this.conf.getInt("hbase.regionserver.maxlogs", 32); | ||||||||
|
|
@@ -526,6 +522,19 @@ private long getCurrentBandwidth() { | |||||||
| return peerBandwidth != 0 ? peerBandwidth : defaultBandwidth; | ||||||||
| } | ||||||||
|
|
||||||||
| /** | ||||||||
| * Get the sleep time for retries. Check peer config first, if set use it, otherwise fall back to | ||||||||
| * global configuration. | ||||||||
| * @return sleep time in milliseconds | ||||||||
| */ | ||||||||
|
||||||||
| */ | |
| */ | |
| @Override |
Outdated
Copilot
AI
Dec 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fallback logic is inconsistent with the existing getCurrentBandwidth() pattern. The bandwidth implementation (lines 519-523) treats 0 as "use default", but this implementation treats values > 0 as valid peer settings and 0 as "use default". This creates an inconsistency where setting sleepForRetries to 0 means "use global default" but for bandwidth, 0 means "use default". Consider using the same pattern as bandwidth (checking != 0 instead of > 0) for consistency, or add explicit documentation about this behavior difference.
Outdated
Copilot
AI
Dec 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing test coverage for the new getSleepForRetries() method and its fallback logic. While the existing tests mock getSleepForRetries() to return 0L, there's no test that verifies:
- When peer config has a positive value (e.g., 2000), it returns that value
- When peer config is 0, it falls back to the global configuration value
- The integration between peer-level and global configuration
Consider adding a dedicated unit test in TestReplicationSource.java that verifies this fallback behavior, similar to how bandwidth is tested in other parts of the codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The public getter method getSleepForRetries() is missing Javadoc documentation. For consistency with other public methods in this class and to help users understand the purpose and behavior of this configuration property, add Javadoc that explains: