-
Notifications
You must be signed in to change notification settings - Fork 9.2k
HADOOP-18104: S3A: Add configs to configure minSeekForVectorReads and maxReadSizeForVectorReads #3964
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HADOOP-18104: S3A: Add configs to configure minSeekForVectorReads and maxReadSizeForVectorReads #3964
Conversation
|
CC @mehakmeet |
ac08a25 to
87682c9
Compare
… maxReadSizeForVectorReads Part of HADOOP-18103. Introducing fs.s3a.min.seek.vectored.read and fs.s3a.max.readsize.vectored.read to configure min seek and max read during a vectored IO operation in S3A connector. These propeties actually define how the ranges will be merged. To completely disable merging set fs.s3a.max.readsize.vectored.read to 0.
48ba464 to
99c983e
Compare
|
💔 -1 overall
This message was automatically generated. |
steveloughran
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor changes, name of the options the most significant
...mon-project/hadoop-common/src/test/java/org/apache/hadoop/fs/impl/TestVectoredReadUtils.java
Outdated
Show resolved
Hide resolved
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/test/MoreAsserts.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
Outdated
Show resolved
Hide resolved
|
|
||
| /** | ||
| * What is the largest size that we should group ranges | ||
| * together during vectored read? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure that "readsize" is the best term here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can say merged.size?
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AReadOpContext.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AReadOpContext.java
Outdated
Show resolved
Hide resolved
| import java.util.ArrayList; | ||
| import java.util.List; | ||
|
|
||
| import org.junit.Test; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think the ordering of imports is already mixed up here; should be
java
other
org.apache
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how it got messed up. Sorry
...hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractVectoredRead.java
Outdated
Show resolved
Hide resolved
...hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractVectoredRead.java
Outdated
Show resolved
Hide resolved
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
...mon-project/hadoop-common/src/test/java/org/apache/hadoop/fs/impl/TestVectoredReadUtils.java
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AReadOpContext.java
Outdated
Show resolved
Hide resolved
| The S3A FileSystem supports implementation of vectored read api using which | ||
| a client can provide a list of file ranges to read returning a future read | ||
| object associated with each range. For full api specification please see | ||
| [FSDataInputStream](hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use a .html suffix
use mvn site:site to generate the site to check links
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/performance.md
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/performance.md
Outdated
Show resolved
Hide resolved
| S3ATestUtils.removeBaseAndBucketOverrides(conf, | ||
| Constants.AWS_S3_VECTOR_READS_MIN_SEEK_SIZE, | ||
| Constants.AWS_S3_VECTOR_READS_MAX_MERGED_READ_SIZE); | ||
| S3ATestUtils.disableFilesystemCaching(conf); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not needed
| */ | ||
| public static <T> void assertEqual(T actual, T expected, String message) { | ||
| Assertions.assertThat(actual) | ||
| .describedAs("Mismatch in " + message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- use a %s
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
Outdated
Show resolved
Hide resolved
|
🎊 +1 overall
This message was automatically generated. |
mehakmeet
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, pending one small doubt of how we are populating the vectored IO configs.
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
Outdated
Show resolved
Hide resolved
|
🎊 +1 overall
This message was automatically generated. |
|
+1 merge at your leisure. you will also need to rebase the branch. you don't need to do a pr for that, just rebase/fix/retest locally and then do a force push with the update. that is allowed on this branch (anything which doesn't match the pattern branch-*) |
… maxReadSizeForVectorReads (#3964) Part of HADOOP-18103. Introducing fs.s3a.vectored.read.min.seek.size and fs.s3a.vectored.read.max.merged.size to configure min seek and max read during a vectored IO operation in S3A connector. These properties actually define how the ranges will be merged. To completely disable merging set fs.s3a.max.readsize.vectored.read to 0. Contributed By: Mukund Thakur
… maxReadSizeForVectorReads (#3964) Part of HADOOP-18103. Introducing fs.s3a.vectored.read.min.seek.size and fs.s3a.vectored.read.max.merged.size to configure min seek and max read during a vectored IO operation in S3A connector. These properties actually define how the ranges will be merged. To completely disable merging set fs.s3a.max.readsize.vectored.read to 0. Contributed By: Mukund Thakur
… maxReadSizeForVectorReads (#3964) Part of HADOOP-18103. Introducing fs.s3a.vectored.read.min.seek.size and fs.s3a.vectored.read.max.merged.size to configure min seek and max read during a vectored IO operation in S3A connector. These properties actually define how the ranges will be merged. To completely disable merging set fs.s3a.max.readsize.vectored.read to 0. Contributed By: Mukund Thakur
… maxReadSizeForVectorReads (#3964) Part of HADOOP-18103. Introducing fs.s3a.vectored.read.min.seek.size and fs.s3a.vectored.read.max.merged.size to configure min seek and max read during a vectored IO operation in S3A connector. These properties actually define how the ranges will be merged. To completely disable merging set fs.s3a.max.readsize.vectored.read to 0. Contributed By: Mukund Thakur
… maxReadSizeForVectorReads (#3964) Part of HADOOP-18103. Introducing fs.s3a.vectored.read.min.seek.size and fs.s3a.vectored.read.max.merged.size to configure min seek and max read during a vectored IO operation in S3A connector. These properties actually define how the ranges will be merged. To completely disable merging set fs.s3a.max.readsize.vectored.read to 0. Contributed By: Mukund Thakur
… maxReadSizeForVectorReads (apache#3964) Part of HADOOP-18103. Introducing fs.s3a.vectored.read.min.seek.size and fs.s3a.vectored.read.max.merged.size to configure min seek and max read during a vectored IO operation in S3A connector. These properties actually define how the ranges will be merged. To completely disable merging set fs.s3a.max.readsize.vectored.read to 0. Contributed By: Mukund Thakur
Description of PR
Part of HADOOP-18103.
Introducing fs.s3a.min.seek.vectored.read and fs.s3a.max.readsize.vectored.read
to configure min seek and max read during a vectored IO operation in S3A connector.
These propeties actually define how the ranges will be merged. To completely
disable merging set fs.s3a.max.readsize.vectored.read to 0.
How was this patch tested?
For code changes:
LICENSE,LICENSE-binary,NOTICE-binaryfiles?