@charlesconnell
Contributor

Looking up values through the org.apache.hadoop.conf.Configuration class is not especially fast. That is fine most of the time, but in a very hot code path it takes up noticeable CPU time.

ByteBuffDecompressors are pooled and reused to avoid garbage collection churn. This means that a pooled instance's settings are sometimes not right for the block it is being asked to decompress. To handle this, before every decompression we call ByteBuffDecompressor#reinit(Configuration), so it can pull settings from the Configuration in preparation for the decompression it is about to do. As a result, the Configuration#get() inside reinit() runs once per block, even though the settings it reads are consistent across an entire table.

Because the settings used by a ByteBuffDecompressor don't actually change within a table, we can pull the settings it needs from a Configuration when opening the HFile, and then not check again.
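
To make the idea concrete, here is a minimal sketch of the pattern, not the code in this PR. `CountingConfig`, `FileSettings`, `PooledDecompressor`, and the key `example.buffer.size` are hypothetical stand-ins for Configuration, the per-file settings object, and a pooled ByteBuffDecompressor. It only assumes that settings are constant for a file, and compares how many configuration lookups each approach performs.

```java
import java.util.HashMap;
import java.util.Map;

class DecompressionContextSketch {

  // Hypothetical key, used only for this illustration.
  static final String BUFFER_SIZE_KEY = "example.buffer.size";

  /** Hypothetical stand-in for a Configuration-like object; counts get() calls. */
  static final class CountingConfig {
    private final Map<String, String> values = new HashMap<>();
    private int lookups = 0;

    void set(String key, String value) {
      values.put(key, value);
    }

    String get(String key, String defaultValue) {
      lookups++;
      return values.getOrDefault(key, defaultValue);
    }

    int lookups() {
      return lookups;
    }
  }

  /** Hypothetical immutable settings, resolved once when a file is opened. */
  static final class FileSettings {
    final int bufferSize;

    private FileSettings(int bufferSize) {
      this.bufferSize = bufferSize;
    }

    static FileSettings fromConfig(CountingConfig conf) {
      return new FileSettings(Integer.parseInt(conf.get(BUFFER_SIZE_KEY, "4096")));
    }
  }

  /** Hypothetical pooled decompressor that must be re-initialized before each block. */
  static final class PooledDecompressor {
    private int bufferSize;

    // Per-block lookup: every call goes back to the configuration object.
    void reinit(CountingConfig conf) {
      bufferSize = Integer.parseInt(conf.get(BUFFER_SIZE_KEY, "4096"));
    }

    // Pre-resolved settings: a plain field copy, no configuration lookup.
    void reinit(FileSettings settings) {
      bufferSize = settings.bufferSize;
    }

    int bufferSize() {
      return bufferSize;
    }
  }

  /** Number of configuration lookups when reinit reads the config for every block. */
  static int lookupsWithPerBlockReinit(int blocks) {
    CountingConfig conf = new CountingConfig();
    PooledDecompressor decompressor = new PooledDecompressor();
    for (int i = 0; i < blocks; i++) {
      decompressor.reinit(conf);
    }
    return conf.lookups();
  }

  /** Number of configuration lookups when settings are resolved once per file. */
  static int lookupsWithPerFileSettings(int blocks) {
    CountingConfig conf = new CountingConfig();
    FileSettings settings = FileSettings.fromConfig(conf); // done at file open
    PooledDecompressor decompressor = new PooledDecompressor();
    for (int i = 0; i < blocks; i++) {
      decompressor.reinit(settings);
    }
    return conf.lookups();
  }

  public static void main(String[] args) {
    System.out.println("per-block lookups: " + lookupsWithPerBlockReinit(1000));
    System.out.println("per-file lookups:  " + lookupsWithPerFileSettings(1000));
  }
}
```

In this sketch the pooled object is still re-initialized before every block, so a reused instance never keeps stale settings; only the source of those settings changes from a lookup to an already-resolved object.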

@charlesconnell charlesconnell marked this pull request as ready for review March 26, 2025 16:28
@charlesconnell charlesconnell force-pushed the HBASE-29218/hfile-decompressing-context branch 2 times, most recently from d424ae6 to e8f9f96 Compare April 15, 2025 14:11
@charlesconnell charlesconnell force-pushed the HBASE-29218/hfile-decompressing-context branch from e8f9f96 to 99f82e4 Compare April 15, 2025 14:17
@charlesconnell
Contributor Author

charlesconnell commented Apr 15, 2025

I've added a commit to switch uses of the Caffeine cache back to Guava, in modules that leak into the client. I should have considered that using an un-shaded Caffeine would pollute client classpaths.

Member

@ndimiduk ndimiduk left a comment

Some questions / nits. Otherwise looks good to me.

<groupId>org.apache.commons</groupId>
<artifactId>commons-crypto</artifactId>
</dependency>
<dependency>
Member

Thanks for fixing this.

public static final String ZSTD_DICTIONARY_KEY = "hbase.io.compress.zstd.dictionary";

private static final Cache<String, Pair<ZstdDictDecompress, Integer>> DECOMPRESS_DICT_CACHE =
  CacheBuilder.newBuilder().maximumSize(100).expireAfterAccess(10, TimeUnit.MINUTES).build();
Member

Should the details of this cache be user-configurable?

Contributor Author

My preference is for fewer low-value configuration options. But I'm happy to do it either way; what do you prefer?

Member

Nope, I'm fine with leaving them as hard-coded for now. Just asking the question to prompt the discussion.

@ndimiduk
Member

Oh sorry. Please also address the checkstyle nits. Thanks @charlesconnell !

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
| --- | --- | --- | --- | --- |
| +0 🆗 | reexec | 0m 50s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | hbaseanti | 0m 0s | | Patch does not have any anti-patterns. |
| | | | | _ master Compile Tests _ |
| +0 🆗 | mvndep | 0m 12s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 4m 46s | | master passed |
| +1 💚 | compile | 5m 6s | | master passed |
| +1 💚 | checkstyle | 1m 19s | | master passed |
| +1 💚 | spotbugs | 3m 26s | | master passed |
| +1 💚 | spotless | 1m 9s | | branch has no errors when running spotless:check. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 16s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 4m 45s | | the patch passed |
| +1 💚 | compile | 5m 27s | | the patch passed |
| +1 💚 | javac | 5m 27s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 1m 42s | | the patch passed |
| +1 💚 | xmllint | 0m 0s | | No new issues. |
| +1 💚 | spotbugs | 4m 33s | | the patch passed |
| +1 💚 | hadoopcheck | 16m 20s | | Patch does not cause any errors with Hadoop 3.3.6 3.4.0. |
| +1 💚 | spotless | 0m 56s | | patch has no errors when running spotless:check. |
| | | | | _ Other Tests _ |
| +1 💚 | asflicense | 0m 37s | | The patch does not generate ASF License warnings. |
| | | 61m 52s | | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6857/5/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | #6857 |
| JIRA Issue | HBASE-29218 |
| Optional Tests | dupname asflicense javac codespell detsecrets xmllint hadoopcheck spotless compile spotbugs checkstyle hbaseanti |
| uname | Linux af3550b1c32f 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 835b934 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 83 (vs. ulimit of 30000) |
| modules | C: hbase-common hbase-server hbase-compression/hbase-compression-zstd U: . |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6857/5/console |
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 xmllint=20913 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@Apache-HBase

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
| --- | --- | --- | --- | --- |
| +0 🆗 | reexec | 5m 49s | | Docker mode activated. |
| -0 ⚠️ | yetus | 0m 3s | | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
| | | | | _ Prechecks _ |
| | | | | _ master Compile Tests _ |
| +0 🆗 | mvndep | 1m 3s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 3m 51s | | master passed |
| +1 💚 | compile | 1m 40s | | master passed |
| +1 💚 | javadoc | 1m 6s | | master passed |
| +1 💚 | shadedjars | 6m 19s | | branch has no errors when building our shaded downstream artifacts. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 13s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 5m 57s | | the patch passed |
| +1 💚 | compile | 1m 35s | | the patch passed |
| +1 💚 | javac | 1m 35s | | the patch passed |
| +1 💚 | javadoc | 0m 56s | | the patch passed |
| +1 💚 | shadedjars | 6m 18s | | patch has no errors when building our shaded downstream artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 3m 13s | | hbase-common in the patch passed. |
| -1 ❌ | unit | 398m 20s | /patch-unit-hbase-server.txt | hbase-server in the patch failed. |
| +1 💚 | unit | 9m 20s | | hbase-compression-zstd in the patch passed. |
| | | 459m 55s | | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6857/5/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile |
| GITHUB PR | #6857 |
| JIRA Issue | HBASE-29218 |
| Optional Tests | javac javadoc unit shadedjars compile |
| uname | Linux d9c61612bc3a 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 835b934 |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6857/5/testReport/ |
| Max. process+thread count | 4329 (vs. ulimit of 30000) |
| modules | C: hbase-common hbase-server hbase-compression/hbase-compression-zstd U: . |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6857/5/console |
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@ndimiduk ndimiduk changed the title from "HBASE-29218: Pass around an HFileDecompressionContext to reduce calls to Configuration.get()" to "HBASE-29218: Reduce calls to Configuration#get() in decompression path" Apr 24, 2025
@ndimiduk ndimiduk merged commit 763093a into apache:master Apr 24, 2025
1 check failed
@ndimiduk ndimiduk deleted the HBASE-29218/hfile-decompressing-context branch April 24, 2025 09:13
ndimiduk pushed a commit to ndimiduk/hbase that referenced this pull request Apr 24, 2025
ndimiduk pushed a commit to ndimiduk/hbase that referenced this pull request Apr 24, 2025
ndimiduk pushed a commit that referenced this pull request Apr 24, 2025
ndimiduk added a commit that referenced this pull request Apr 25, 2025
ndimiduk added a commit to ndimiduk/hbase that referenced this pull request Apr 25, 2025
ndimiduk added a commit to ndimiduk/hbase that referenced this pull request Apr 25, 2025
ndimiduk added a commit that referenced this pull request Apr 26, 2025
ndimiduk added a commit that referenced this pull request Apr 26, 2025
mokai87 pushed a commit to mokai87/hbase that referenced this pull request Aug 7, 2025