HBASE-27088 IntegrationLoadTestCommonCrawl async load improvements #4488
🎊 +1 overall
This message was automatically generated.
Needs to go back to draft. Got clever at the end and broke something, back soon.
Good to go.
💔 -1 overall
This message was automatically generated.
- Use an async client and a work-stealing executor for parallelism during loads.
- Remove the verification read retries; these are not that effective during replication lag anyway.
- Increase max task attempts because S3 might throttle.
- Implement a side task that exercises Increments by extracting URLs from content and updating a cf that tracks referrer counts. These are not validated at this time. It could be possible to log the increments, sum them with a reducer, and then verify the total, but this is left as a future exercise.
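The first bullet can be pictured with a minimal JDK-only sketch. It is not the code from this PR: `AsyncLoadSketch`, `fakePut`, and `loadAll` are hypothetical names, and `fakePut` merely stands in for an async client write by completing a future on the pool. It only illustrates the shape of submitting work to a work-stealing pool and chaining a completion that sums the bytes written.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

final class AsyncLoadSketch {
  // Hypothetical stand-in for an async client write; not an HBase API.
  // It "acknowledges" the write by returning the number of bytes submitted.
  static CompletableFuture<Integer> fakePut(String row, byte[] value, ExecutorService pool) {
    return CompletableFuture.supplyAsync(() -> value.length, pool);
  }

  // Submits one write per row and returns the total bytes acknowledged.
  static long loadAll(List<String> rows) {
    ExecutorService pool = Executors.newWorkStealingPool();
    AtomicLong bytesWritten = new AtomicLong();
    List<CompletableFuture<Void>> pending = new ArrayList<>();
    try {
      for (String row : rows) {
        byte[] value = row.getBytes(StandardCharsets.UTF_8);
        // Chain the accounting step onto the write's completion.
        pending.add(fakePut(row, value, pool).thenAccept(n -> bytesWritten.addAndGet(n)));
      }
      // Block until every chained completion has run.
      CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
      return bytesWritten.get();
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println(loadAll(List.of("a", "bb", "ccc")));
  }
}
```

Joining on the chained futures rather than on the raw write futures ensures the accounting has finished before the total is read.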
…tes submitted. Expose as job counters.
…4488)
* HBASE-27088 IntegrationLoadTestCommonCrawl async load improvements
- Use an async client and a work-stealing executor for parallelism during loads.
- Remove the verification read retries; these are not that effective during replication lag anyway.
- Increase max task attempts because S3 might throttle.
- Implement a side task that exercises Increments by extracting URLs from content and updating a cf that tracks referrer counts. These are not validated at this time. It could be possible to log the increments, sum them with a reducer, and then verify the total, but this is left as a future exercise.
Signed-off-by: Viraj Jasani <[email protected]>
* Sum RPC time for writes (loader) and reads (verifier) and mutation bytes submitted. Expose as job counters.
* Fix an issue with completion chaining
* Pause loading if too many operations are in flight
Manually verified with test executions on a small cluster.
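For readers wondering what "pause loading if too many operations are in flight" can look like, here is one common way to do it, sketched with a `java.util.concurrent.Semaphore`. This is an assumption about the general technique, not the implementation in this PR; `InFlightLimitSketch`, `run`, and the empty task body are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

final class InFlightLimitSketch {
  // Runs `total` simulated async operations with at most `maxInFlight`
  // outstanding, and returns the highest number observed in flight at once.
  static int run(int total, int maxInFlight) {
    ExecutorService pool = Executors.newWorkStealingPool();
    Semaphore permits = new Semaphore(maxInFlight);
    AtomicInteger inFlight = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    try {
      for (int i = 0; i < total; i++) {
        // The submitter pauses here once the limit is reached.
        permits.acquireUninterruptibly();
        peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
        CompletableFuture
          .runAsync(() -> { /* hypothetical stand-in for an async write */ }, pool)
          .whenComplete((v, err) -> {
            // Release on success or failure so the submitter can resume.
            inFlight.decrementAndGet();
            permits.release();
          });
      }
      // Reacquiring every permit waits for all operations to complete.
      permits.acquireUninterruptibly(maxInFlight);
      return peak.get();
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println("peak in flight: " + run(1000, 8));
  }
}
```

Releasing the permit in `whenComplete` rather than in a success-only callback keeps a failed operation from permanently shrinking the window.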