HADOOP-18797. Support Concurrent Writes With S3A Magic Committer #6122
…che#6006) Jobs which commit their work to S3 through the magic committer now use a unique magic path containing the job ID: __magic_job-${jobid}. This allows multiple jobs to write to the same destination simultaneously. Contributed by Syed Shameerur Rahman
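As a rough illustration of why a per-job magic path lets jobs share a destination, here is a minimal sketch. It is not the S3A committer code: the class `MagicPathSketch`, the helper `magicDir`, the bucket name and the job IDs are all hypothetical. Only the `__magic_job-` prefix comes from the description above.

```java
import java.util.Objects;

public class MagicPathSketch {

    // Prefix taken from the PR description: __magic_job-${jobid}
    static final String MAGIC_PREFIX = "__magic_job-";

    /**
     * Hypothetical helper (not a Hadoop API): builds the per-job magic
     * directory directly under the job's destination directory.
     */
    static String magicDir(String destination, String jobId) {
        Objects.requireNonNull(destination, "destination");
        Objects.requireNonNull(jobId, "jobId");
        // Drop a single trailing slash so the result never contains "//".
        String base = destination.endsWith("/")
            ? destination.substring(0, destination.length() - 1)
            : destination;
        return base + "/" + MAGIC_PREFIX + jobId;
    }

    public static void main(String[] args) {
        // Assumed destination and job IDs, for illustration only.
        String dest = "s3a://example-bucket/tables/events";
        // Two jobs writing to the same destination get distinct magic
        // directories, so their pending uploads are kept apart.
        System.out.println(magicDir(dest, "jobA"));
        System.out.println(magicDir(dest, "jobB"));
    }
}
```

With a single shared magic directory, one job cleaning up or committing could interfere with another job's pending uploads under the same destination. Including the job ID in the directory name avoids that overlap.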
|
FYI: @steveloughran |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
looks good and I'm not going to review backports except for backporting issues. which s3 region did you test against, and with what parameters? |
How was this patch tested?
[INFO] Running org.apache.hadoop.fs.s3a.commit.staging.TestDirectoryCommitterScale
[INFO] Running org.apache.hadoop.fs.s3a.commit.staging.TestPaths
[INFO] Running org.apache.hadoop.fs.s3a.commit.TestMagicCommitPaths
[INFO] Running org.apache.hadoop.fs.s3a.commit.staging.TestStagingCommitter
[INFO] Running org.apache.hadoop.fs.s3a.commit.staging.TestStagingPartitionedFileListing
[INFO] Running org.apache.hadoop.fs.s3a.commit.staging.TestStagingDirectoryOutputCommitter
[INFO] Running org.apache.hadoop.fs.s3a.commit.staging.TestStagingPartitionedJobCommit
[INFO] Running org.apache.hadoop.fs.s3a.commit.staging.TestStagingPartitionedTaskCommit
[INFO] Tests run: 28, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.473 s - in org.apache.hadoop.fs.s3a.commit.TestMagicCommitPaths
[INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.611 s - in org.apache.hadoop.fs.s3a.commit.staging.TestPaths
[INFO] Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.965 s - in org.apache.hadoop.fs.s3a.commit.staging.TestStagingDirectoryOutputCommitter
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 23.474 s - in org.apache.hadoop.fs.s3a.commit.staging.TestStagingPartitionedFileListing
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 27.627 s - in org.apache.hadoop.fs.s3a.commit.staging.TestStagingPartitionedJobCommit
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 29.249 s - in org.apache.hadoop.fs.s3a.commit.staging.TestStagingPartitionedTaskCommit
[INFO] Tests run: 63, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 61.11 s - in org.apache.hadoop.fs.s3a.commit.staging.TestStagingCommitter
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 84.193 s - in org.apache.hadoop.fs.s3a.commit.staging.TestDirectoryCommitterScale
[INFO] Running org.apache.hadoop.fs.s3a.commit.staging.integration.ITestStagingCommitProtocol
[INFO] Running org.apache.hadoop.fs.s3a.commit.staging.integration.ITestDirectoryCommitProtocol
[INFO] Running org.apache.hadoop.fs.s3a.commit.staging.integration.ITestPartitionedCommitProtocol
[INFO] Running org.apache.hadoop.fs.s3a.commit.staging.integration.ITestStagingCommitProtocolFailure
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.901 s - in .hadoop.fs.s3a.commit.staging.integration.ITestStagingCommitProtocolFailure
[INFO] Running org.apache.hadoop.fs.s3a.commit.magic.ITestMagicCommitProtocol
[INFO] Running org.apache.hadoop.fs.s3a.commit.magic.ITestMagicCommitProtocolFailure
[INFO] Running org.apache.hadoop.fs.s3a.commit.integration.ITestS3ACommitterMRJob
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.225 s - in org.apache.hadoop.fs.s3a.commit.magic.ITestMagicCommitProtocolFailure
[INFO] Running org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory
[INFO] Running org.apache.hadoop.fs.s3a.commit.ITestCommitOperations
[INFO] Running org.apache.hadoop.fs.s3a.commit.ITestCommitOperationCost
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.05 s - in org.apache.hadoop.fs.s3a.commit.ITestS3ACommitterFactory
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 43.398 s - in org.apache.hadoop.fs.s3a.commit.ITestCommitOperationCost
[INFO] Tests run: 18, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 143.308 s - in org.apache.hadoop.fs.s3a.commit.ITestCommitOperations
[INFO] Running org.apache.hadoop.fs.s3a.auth.ITestAssumedRoleCommitOperations
[INFO] Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 218.466 s - in org.apache.hadoop.fs.s3a.commit.integration.ITestS3ACommitterMRJob
[WARNING] Tests run: 18, Failures: 0, Errors: 0, Skipped: 18, Time elapsed: 62.036 s - in org.apache.hadoop.fs.s3a.auth.ITestAssumedRoleCommitOperations
[WARNING] Tests run: 24, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 429.367 s - in .hadoop.fs.s3a.commit.staging.integration.ITestPartitionedCommitProtocol
[INFO] Tests run: 24, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 472.06 s - in .hadoop.fs.s3a.commit.staging.integration.ITestStagingCommitProtocol
[INFO] Tests run: 25, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 498.078 s - in .hadoop.fs.s3a.commit.staging.integration.ITestDirectoryCommitProtocol
[INFO] Tests run: 23, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 861.607 s - in org.apache.hadoop.fs.s3a.commit.magic.ITestMagicCommitProtocol
[INFO] Running org.apache.hadoop.fs.s3a.commit.magic.ITestS3AHugeMagicCommits
[WARNING] Tests run: 10, Failures: 0, Errors: 0, Skipped: 10, Time elapsed: 29.141 s - in org.apache.hadoop.fs.s3a.commit.magic.ITestS3AHugeMagicCommits
[INFO] Running org.apache.hadoop.fs.s3a.commit.terasort.ITestTerasortOnS3A
[WARNING] Tests run: 14, Failures: 0, Errors: 0, Skipped: 14, Time elapsed: 47.485 s - in org.apache.hadoop.fs.s3a.commit.terasort.ITestTerasortOnS3A |
|
thanks. we don't need the whole trace, just region, maven args (-Dscale -Dprefetch ...) and whether any tests failed. if a test fails, look in jira to see if it's known |
|
@shameersss1 @steveloughran Was this patch released as part of any 3.3.x hadoop-aws release? I see that the JIRA references |