HBASE-29218: Reduce calls to Configuration#get() in decompression path #6857
Conversation
… to Configuration.get()
Force-pushed from d424ae6 to e8f9f96
Force-pushed from e8f9f96 to 99f82e4
I've added a commit to switch uses of the Caffeine cache back to Guava, in modules that leak into the client. I should have considered that using an un-shaded Caffeine would pollute client classpaths.
ndimiduk left a comment:
Some questions / nits. Otherwise looks good to me.
  <groupId>org.apache.commons</groupId>
  <artifactId>commons-crypto</artifactId>
</dependency>
<dependency>
Thanks for fixing this.
hbase-common/src/main/java/org/apache/hadoop/hbase/io/compress/DictionaryCache.java
...mmon/src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDefaultDecodingContext.java
public static final String ZSTD_DICTIONARY_KEY = "hbase.io.compress.zstd.dictionary";

private static final Cache<String, Pair<ZstdDictDecompress, Integer>> DECOMPRESS_DICT_CACHE =
  CacheBuilder.newBuilder().maximumSize(100).expireAfterAccess(10, TimeUnit.MINUTES).build();
Should the details of this cache be user-configurable?
My preference is for fewer low-value configuration options. But I'm happy to do it either way; what do you prefer?
Nope, I'm fine with leaving them as hard-coded for now. Just asking the question to prompt the discussion.
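If these cache parameters were ever made configurable, a conventional approach would be to read them from the configuration with defaults matching the current hard-coded values. The sketch below is illustrative only: the property names and the CacheTuning class are hypothetical, not part of the patch, and it uses java.util.Properties as a stand-in for Hadoop's Configuration.

```java
import java.util.Properties;

// Hypothetical sketch: make the dictionary-cache sizing configurable.
// Property names and this class are invented for illustration; the
// patch keeps these values hard-coded.
class CacheTuning {
    static final String SIZE_KEY = "hbase.io.compress.zstd.dict.cache.size";
    static final String EXPIRY_KEY = "hbase.io.compress.zstd.dict.cache.expiry.minutes";

    final long maximumSize;
    final long expireAfterAccessMinutes;

    CacheTuning(Properties conf) {
        // Defaults mirror the hard-coded values in the patch:
        // 100 entries, expiry 10 minutes after last access.
        this.maximumSize = Long.parseLong(conf.getProperty(SIZE_KEY, "100"));
        this.expireAfterAccessMinutes = Long.parseLong(conf.getProperty(EXPIRY_KEY, "10"));
    }
}
```

The resulting values would then feed `CacheBuilder.newBuilder().maximumSize(...).expireAfterAccess(...)` in place of the literals.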
...td/src/main/java/org/apache/hadoop/hbase/io/compress/zstd/ZstdHFileDecompressionContext.java
Oh sorry. Please also address the checkstyle nits. Thanks @charlesconnell!
🎊 +1 overall
This message was automatically generated.
💔 -1 overall
This message was automatically generated.
#6857) Signed-off-by: Nick Dimiduk <[email protected]> Co-authored-by: Charles Connell <[email protected]>
Using the org.apache.hadoop.conf.Configuration class to look up values is not especially fast. That's fine most of the time, but in a very hot code path it consumes noticeable CPU time. ByteBuffDecompressors are pooled and reused to avoid garbage-collection churn, which means their settings are sometimes wrong for the block they're being asked to decompress. To handle this, before every decompression we call ByteBuffDecompressor#reinit(Configuration) so it can pull settings from the Configuration in preparation for the decompression it's about to do. The Configuration#get() inside reinit() happens once per block, even though the settings it reads are consistent across an entire table.

Because the settings used by a ByteBuffDecompressor don't actually change within a table, we can pull the settings it needs from a Configuration once when opening the HFile, and never check again.
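The idea above can be sketched in a few lines. The class names below are illustrative, not the actual HBase API: a minimal stand-in replaces Hadoop's Configuration, and a hypothetical context object snapshots its settings once at construction (i.e., at HFile-open time) instead of on every per-block reinit.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal stand-in for org.apache.hadoop.conf.Configuration. The real
// class performs deprecation checks and variable expansion on every
// get(), which is what makes per-block lookups costly.
class SimpleConf {
    private final Map<String, String> props = new ConcurrentHashMap<>();

    void set(String key, String value) { props.put(key, value); }

    String get(String key, String defaultValue) {
        return props.getOrDefault(key, defaultValue);
    }
}

// Hypothetical decompression context that reads its settings once at
// construction, so no Configuration lookup happens per decompressed block.
class ZstdContextSketch {
    private final String dictionaryPath;

    ZstdContextSketch(SimpleConf conf) {
        // One lookup here replaces one lookup per block in reinit().
        this.dictionaryPath = conf.get("hbase.io.compress.zstd.dictionary", null);
    }

    String dictionaryPath() { return dictionaryPath; }
}
```

The pooled decompressor can then be handed the pre-built context for each block of a given file, since the settings are invariant within a table.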