-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-27798: Client side should back off based on wait interval in RpcThrottlingException #5226
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| // If, after the planned sleep, there won't be enough time left, we stop now. | ||
| long duration = singleCallDuration(expectedSleep); | ||
| if (duration > callTimeout) { | ||
| String msg = "callTimeout=" + callTimeout + ", callDuration=" + duration + ": " | ||
| + t.getMessage() + " " + callable.getExceptionMessageAdditionalDetail(); | ||
| throw (SocketTimeoutException) new SocketTimeoutException(msg).initCause(t); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving this check inside of the else block is pretty consequential, and I'm not sure which way we'd prefer.
If we leave this logic outside of the if (t instanceof RTE) else block then significant wait intervals — those which would extend past the call timeout — may be totally ignored. We'd instead just throw quickly which would probably manifest as retries, and consequently sort of produce the opposite of a back off.
Alternatively, by isolating this logic to only non-RTE exceptions, we're more likely to respect the back off requested — even if it will inevitably result in the timeout being exceeded.
Maybe we want to calculate the min of the requested wait interval and the duration remaining prior to timeout, and somewhat guarantee that we wait for that? I'm open to other suggestions too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this method here is where we do retries. The fact that we're throwing a SocketTimeoutException means that the retries will stop.
I think we should move this logic outside the if. Let's say the wait interval is 10s, but operation timeout is 5s. Honoring the wait interval will cause us to exceed the timeout, which is not good. At the very least we should do a Math.min, but we've also been told by the server that the quota won't be available until 10s. So we know that if we retry in 5s (due to a Math.min), we will fail. So with that said, we should just move the logic outside the if so that we throw an exception if the wait interval extends beyond operation timeout.
Regarding the quota -- if the server says 10s, i dont think it will ever be ready in, say, 8s. The wait interval is calculated based on the the rate of quota refill, so it should be accurate as a lower threshold. What we can't be sure of is if the quota will actually be ready in 10s (may be longer). If there were only 1 caller, the wait interval would be a great indicator of exactly when to retry. But in the case of N > 1 callers, those other callers may also be consuming the quota and the client may retry in 10s but it's still been saturated in the interim and wasn't able to refill enough to serve the request.
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
bbeaudreault
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think we want to add some tests for this. See TestAsyncClientPauseForServerOverloaded.java as an example where we can create a mock server which throws an RpcThrottlingException. We can have it throw such an exception with a large enough wait interval and set the hbase.client.pause to 0 or something, so that way it's clear that if the method takes longer than a few seconds then it honored our wait interval.
You could also consider extracting this "if throttle do this, otherwise do this" logic into a separate class and just testing that directly. Or you could try keeping the test just on the client side. For the async client, you could create a new test class which extends AsyncRpcRetryingCaller and the doCall just throws an throttle exception. For the non-async client you could just test RpcRetryingCallerImpl directly and pass a callable into callWithRetries which throws a throttle exception.
In terms of master branch, we do need to port this there. It should be relatively simple -- the async client is pretty similar between the two. You'd just exclude the changes to RpcRetryingCallerImpl, since that class doesn't exist in master.
| boolean isServerOverloaded = false; | ||
| if (error instanceof RpcThrottlingException) { | ||
| RpcThrottlingException rpcThrottlingException = (RpcThrottlingException) error; | ||
| pauseNsToUse = rpcThrottlingException.getWaitInterval() * 1000; // wait interval is in millis |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you use TimeUnit to convert? Then you prob dont need the comment
| // If, after the planned sleep, there won't be enough time left, we stop now. | ||
| long duration = singleCallDuration(expectedSleep); | ||
| if (duration > callTimeout) { | ||
| String msg = "callTimeout=" + callTimeout + ", callDuration=" + duration + ": " | ||
| + t.getMessage() + " " + callable.getExceptionMessageAdditionalDetail(); | ||
| throw (SocketTimeoutException) new SocketTimeoutException(msg).initCause(t); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this method here is where we do retries. The fact that we're throwing a SocketTimeoutException means that the retries will stop.
I think we should move this logic outside the if. Let's say the wait interval is 10s, but operation timeout is 5s. Honoring the wait interval will cause us to exceed the timeout, which is not good. At the very least we should do a Math.min, but we've also been told by the server that the quota won't be available until 10s. So we know that if we retry in 5s (due to a Math.min), we will fail. So with that said, we should just move the logic outside the if so that we throw an exception if the wait interval extends beyond operation timeout.
Regarding the quota -- if the server says 10s, i dont think it will ever be ready in, say, 8s. The wait interval is calculated based on the the rate of quota refill, so it should be accurate as a lower threshold. What we can't be sure of is if the quota will actually be ready in 10s (may be longer). If there were only 1 caller, the wait interval would be a great indicator of exactly when to retry. But in the case of N > 1 callers, those other callers may also be consuming the quota and the client may retry in 10s but it's still been saturated in the interim and wasn't able to refill enough to serve the request.
...ent/src/main/java/org/apache/hadoop/hbase/client/AsyncScanSingleRegionRpcRetryingCaller.java
Show resolved
Hide resolved
f40a6f6 to
cac69f0
Compare
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
cac69f0 to
77efdc8
Compare
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
77efdc8 to
6c864e6
Compare
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
6c864e6 to
6066db1
Compare
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
I'd be surprised if my changes were causing the unit tests to fail to run 🤔 |
| long pauseMsForServerOverloaded = TimeUnit.NANOSECONDS.toMillis(pauseNsForServerOverloaded); | ||
| long basePauseMs = TimeUnit.NANOSECONDS.toMillis(pauseNs); | ||
| long pauseMillis = getPauseMillis(t, null, pauseMsForServerOverloaded, basePauseMs); | ||
| return TimeUnit.MILLISECONDS.toNanos(pauseMillis); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can definitely see an argument for some duplicative logic rather than these superfluous conversions. happy to make that change if people think it'd be preferable/meaningful
| long remainingTimeNs = remainingTimeNs(); | ||
| long maxDelayNs = remainingTimeNs - SLEEP_DELTA_NS; | ||
| if (maxDelayNs <= 0 || maxDelayNs <= pauseNsToUse) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trying to parse this change... i feel like we can ditch this diff, since we handle the completeExceptionally for throttling above? That was the only change I wanted here, not sure if this is accomplishing something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah you're right
6066db1 to
99a5708
Compare
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
72b2c49 to
707b90d
Compare
|
💔 -1 overall
This message was automatically generated. |
707b90d to
e2c9c1f
Compare
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
bbeaudreault
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking really good, just a couple more thoughts (really). As you are using the new manager now, it could effectively be static. I was thinking you'd pass in some of the configs in the constructor, but I don't have a strong opinion. As it stands now it might be lighter weight to simple put in ConnectionUtils rather than create a new class with a single static-like method.
hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncBatchRpcRetryingCaller.java
Show resolved
Hide resolved
| public class HBaseServerExceptionPauseManager { | ||
| private static final Logger LOG = LoggerFactory.getLogger(HBaseServerExceptionPauseManager.class); | ||
|
|
||
| private HBaseServerExceptionPauseManager() {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could pass the Optional in here. You could also pass in the pauseNs values if you want, since they don't change. But not a huge deal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which optional are you referring to? I'm envisioning something like:
private final long pauseNs;
private final long pauseNsForServerOverloaded;
public HBaseServerExceptionPauseManager(long pauseNs, long pauseNsForServerOverloaded) {
this.pauseNs = pauseNs;
this.pauseNsForServerOverloaded = pauseNsForServerOverloaded;
}
public OptionalLong getPauseNsFromException(Throwable error, long remainingTimeNs) {
long expectedSleepNs;
if (error instanceof RpcThrottlingException) {
RpcThrottlingException rpcThrottlingException = (RpcThrottlingException) error;
expectedSleepNs = TimeUnit.MILLISECONDS.toNanos(rpcThrottlingException.getWaitInterval());
if (expectedSleepNs > remainingTimeNs) {
return OptionalLong.empty();
}
if (LOG.isDebugEnabled()) {
LOG.debug("Sleeping for {}ms after catching RpcThrottlingException", expectedSleepNs,
rpcThrottlingException);
}
} else {
expectedSleepNs =
HBaseServerException.isServerOverloaded(error) ? pauseNsForServerOverloaded : pauseNs;
}
return OptionalLong.of(expectedSleepNs);
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In one of my other most recent comments I suggested doing the metrics incrementing for server overloaded in here so we don't have to handle server overloaded in two places. Not sure if that works, but if so the metrics instance could get passed in here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah, I see what you mean! That would be nice. I'm going to omit it here because it would require a somewhat larger diff; the metric reported is the delayNs rather than just the pauseNs, there's some nuance in whether the metric gets reported in the fast failure cases, etc. so I think it would complicate the changeset
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
a261cfa to
28f77ff
Compare
28f77ff to
ac668cb
Compare
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
…cThrottlingException (apache#5226) Signed-off-by: Bryan Beaudreault <[email protected]>
…cThrottlingException (apache#5226) Signed-off-by: Bryan Beaudreault <[email protected]>
…it interval in RpcThrottlingException (apache#5226) Signed-off-by: Bryan Beaudreault <[email protected]>
The RpcThrottlingException tells the client how much to back off, but right now the recommendation is ignored. This PR introduces logic that respects said back off recommendation.
I've tested the synchronous client implementation on a cluster at my day job. Here is the log output from a single client thread that's being throttled by a strict quota: https://gist.github.com/rmdmattingly/4c8e63bec16cfa2e4324ff6590a5f8ea. It demonstrates that the thread is backing off for the given wait interval.
I'm very open to adding more explicit unit testing of this behavior, and/or gating behind some new Configuration option, and would appreciate any suggestions regarding where said tests/config should live.
This also didn't apply particularly cleanly to the master branch, but I'm particularly interested in getting this feature on branch-2. How should I approach this? I'm open to figuring out an equivalent solution against master.
@bbeaudreault