-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-24289 Heterogeneous Storage for Date Tiered Compaction #1730
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
Can you get the scope document into a markdown file in |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
OK, I add it to |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
| avoid moving data blocks at the HDFS level, we can set the file parent directory to | ||
| the storage policy we need before writing data. The new file automatically inherits the | ||
| storage policy of the parent directory, and is written according to the correct disk | ||
| type when writing. So as to avoid later data movement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great.
| |hbase.hstore.compaction.date.tiered.cold.window.age.millis|Long.MAX|| | ||
| |hbase.hstore.compaction.date.tiered.cold.window.storage.policy|HOT|| | ||
|
|
||
| The original date tiered compaction related configuration has the same meaning and maintains |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will happen if the CF config storage policy and enable this feature too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If hbase.hstore.compaction.date.tiered.storage.policy.enable is true, this will override CF config storage policy, and hbase.hstore.block.storage.policy does not work. Because storefile must belong to one window and will use window storage policy
| } else { | ||
| writer = writerFactory.createWriter(); | ||
| } | ||
| //writer = writerFactory.createWriter(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will remove later.
| } | ||
| CommonFSUtils.setStoragePolicy(this.fs, dir, fileStoragePolicy); | ||
| if (LOG.isDebugEnabled()) { | ||
| LOG.debug( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only log once when region created? If so, can use info log.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I think each type storage policy will create tmp dir once. CommonFSUtils.setStoragePolicy(this.fs, dir, fileStoragePolicy); should follow HRegionFileSystem.mkdirs?
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
Apache9
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Let's have a try.
|
|
||
| private final boolean needEmptyFile; | ||
|
|
||
| private final Map<Long, String> lowerBoundariesPolicies; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ImmutableMap?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lowerBoundariesPolicies is HashMap
| throws IOException { | ||
| if (request instanceof DateTieredCompactionRequest) { | ||
| return compactor.compact(request, ((DateTieredCompactionRequest) request).getBoundaries(), | ||
| ((DateTieredCompactionRequest) request).getBoundariesPolicies(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cast it to DateTieredCompactionRequest with a locl variable and then make use of the casted instance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah... I follow the use of ((DateTieredCompactionRequest) request).getBoundaries() previous line of code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use a local variable and no need to cast twice...
| // before version 3.3.0 (See HDFS-13209). Use child dir here is to make stored files | ||
| // satisfy the specific storage policy when writing. So as to avoid later data movement. | ||
| // We don't want to change whole temp dir to 'fileStoragePolicy'. | ||
| if (fileStoragePolicy != null && !fileStoragePolicy.isEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just use Strings.isNullOrEmpty in guava.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
| * @throws IOException with error | ||
| */ | ||
| @Test | ||
| public void incomingWindowHot() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Method name start with "test"? incomingWindowHot => testIncomingWindowHot
| String prefix = HConstants.STORAGE_POLICY_PREFIX; | ||
| for (Path newFile : newFiles) { | ||
| if (newFile.getParent().getName().startsWith(prefix)) { | ||
| CommonFSUtils.setStoragePolicy(fs.getFileSystem(), newFile, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Set storage policy for a file? This is not work before hdfs 3.3.0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newFiles是已经compact结束将要被rename到region目录下的文件,但它们的storage policy还未指定,文件storage policy属性保存在INode中。rename之后,未指定storage policy的文件自动继承新的父目录。在rename之前调用setStoragePolicy()是为了使数据和storage policy保持一致,避免storage policy因为继承发生变化。HDFS允许对已经close的文件设置storage policy, 并不会发生数据移动。而且newFiles的数据和临时父目录的storage policy已经是一致的了(因为在compact之前tmp下已创建了不同存储策略的临时目录)
newFiles are files that have been compacted and will be renamed to the region directory, but their storage policy has not been specified. The file storage policy attribute is saved in INode. After renaming, files that do not specify a storage policy automatically inherit the new parent directory. Before renaming call to setStoragePolicy() here is to keep the data consistent with the storage policy and avoid the storage policy changing due to inheritance. HDFS allows storage policy to be set on files that have been closed, and no data movement will occur. Moreover, the data of newFiles and the storage policy of tmp parent dir are already consistent (because tmp directories of different storage strategies have been created under tmp dir before compact).
| long[] sizes = new long[] { 30, 31, 32, 33, 34, 20, 21, 22, 23, 24, 25, 10, 11, 12, 13 }; | ||
| Map<Long, String> expected = new HashMap<>(); | ||
| // boundaries = { Long.MIN_VALUE, 12 } | ||
| expected.put(12L, HOT_WINDOW_SP); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explain more about the unit test? What is the different between this and other tests? And there are some duplicate code in these test methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix duplicate code
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
infraio
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Signed-off-by: Guanghao Zhang <[email protected]> Signed-off-by: Duo Zhang <[email protected]>
…1730) Signed-off-by: Guanghao Zhang <[email protected]> Signed-off-by: Duo Zhang <[email protected]>
…1730) Signed-off-by: Guanghao Zhang <[email protected]> Signed-off-by: Duo Zhang <[email protected]>
No description provided.