HBASE-29644: Refresh_meta triggering compaction on user table #7385

sharmaar12 · 2025-10-14T12:18:32Z

Link to JIRA: https://issues.apache.org/jira/browse/HBASE-29644

Description:
Consider a two-cluster setup with one active cluster and one read replica. The active cluster creates a table with FILE-based SFT. If you add a few rows through the active cluster and flush to create a few HFiles, then running refresh_meta from the read replica triggers a minor compaction. This should not happen on the read replica, and it may create inconsistencies because the active cluster is not aware of that event.

Cause:
This happens because the compaction event should be blocked in ReadOnlyController, but we missed adding the read-only guard to the preCompactSelection() function.

Fix:
Add internalReadOnlyGuard to preCompactSelection() in ReadOnlyController

wchevreuil

It's nice to have such a safeguard in case we face any unexpected attempt to perform a write operation in a read-only cluster, but it's not an acceptable solution for the use case here. We know something triggers compaction when the refresh_meta command is executed on a read replica cluster, so we should find out where that's being triggered and put a check there to avoid wasting resources, rather than relying on an exception being thrown. That would cause log pollution and could create confusion for operators.
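To illustrate the distinction drawn in this comment, here is a minimal standalone sketch contrasting the two styles. It is not HBase code: the class name, method names, and the exception type are all hypothetical placeholders chosen for illustration.

```java
/** Hypothetical sketch; none of these names come from the HBase code base. */
class ReadOnlySafeguardSketch {

  /**
   * Safeguard style: the operation is attempted and rejected by throwing.
   * The exception type here is a placeholder, not what ReadOnlyController uses.
   */
  static void guard(boolean readOnly) {
    if (readOnly) {
      throw new IllegalStateException("Operation not allowed in read-only mode");
    }
  }

  /**
   * Trigger-site style: decide up front whether a compaction should be
   * requested at all, so no work is queued and no exception is logged.
   */
  static boolean shouldRequestCompaction(boolean readOnly) {
    return !readOnly;
  }
}
```

With the second style, a caller skips the request quietly, which is the behavior the comment argues for on a read replica.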

sharmaar12 · 2025-10-14T13:44:23Z

@wchevreuil Thanks for the suggestion, we will check what the root cause is.

anmolnar · 2025-10-17T16:20:27Z

@sharmaar12 Try the following: create a unit test which triggers the problem, attach a debugger, and set a breakpoint in your event handler preCompactSelection. From the stack trace you will see the root cause of the compaction.

sharmaar12 · 2025-10-28T14:48:29Z

@wchevreuil @anmolnar
The current fix discards the compaction request whenever read-only mode is on. Do you think we need to find all the callers which can execute the compaction thread and block the request at that level?

wchevreuil · 2025-10-29T11:21:08Z

hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplit.java

+    if (isReadOnlyEnabled()) {
+      LOG.info("Ignoring compaction request for " + region + ",because read-only mode is on.");
+      return;
+    }
+


Why don't we simply disable compaction altogether in the read replica cluster? See line #343 in CompactSplit; there's already a check for the compaction enabled flag. I would rather refrain from polluting CompactSplit code with logic for read replicas.

We can use that approach, but one issue I can think of is that the hbase.global.readonly.enabled property is dynamically configurable using update_all_config. Is that also true for hbase.hstore.compaction.enabled?

I like @wchevreuil 's idea.
How about adding the read-only check to the getter?

public boolean isCompactionsEnabled() { return compactionsEnabled && !isReadOnlyEnabled(); }

You don't need to dynamically change the compaction flag.
wdyt?

Then we may need to at least modify the log messages to mention that either compaction is disabled or read-only mode is on. Otherwise compaction may be enabled, but we would log it as disabled because of read-only mode.

LOG.info("Ignoring compaction request for " + region + (!isReadOnlyEnabled() ? ", because compaction is disabled." : " in read-only mode"));

or just leave it as is, not a biggy
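A minimal sketch of the message selection suggested above, pulled into a standalone helper so both branches are easy to see. The class and method names are hypothetical, and the exact wording of the read-only branch is an assumption rather than what the patch ended up logging.

```java
/** Hypothetical helper; not part of CompactSplit. */
class CompactionLogMessages {

  /** Builds the "ignored" message, naming the reason that actually applies. */
  static String ignoredMessage(String region, boolean readOnlyEnabled) {
    return "Ignoring compaction request for " + region
      + (readOnlyEnabled
        ? ", because read-only mode is on."
        : ", because compaction is disabled.");
  }
}
```

Naming the reason keeps operators from concluding that the compaction switch was turned off when only read-only mode is in effect.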

hbase.hstore.compaction.enabled

The actual property name is hbase.regionserver.compaction.enabled. Compaction is actually "switchable" via the Admin.compactionSwitch() method (we also expose an hbase shell command for that). The CompactSplit thread itself exposes a switchCompaction method which could be called both on RS startup and from the dynamic config handler for the hbase.global.readonly.enabled property.

Be careful with switching the property directly. The user might have intentionally disabled it, and you should not enable it when going from R/O -> R/W mode. My approach seems safer to me.

Good point. Let's just do all checks inside isCompactionsEnabled, as @anmolnar suggested.
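The design agreed on here can be sketched with a simplified stand-in for the compaction gate. This is an illustration under assumptions, not the real CompactSplit class: the class name, constructor, and setters are hypothetical, and only the getter expression is taken from the suggestion earlier in the thread.

```java
/** Simplified stand-in for the compaction gate; not the real CompactSplit. */
class CompactionGate {
  // Mirrors the user-controlled compaction switch.
  private volatile boolean compactionsEnabled;
  // Mirrors the dynamically reloadable read-only flag.
  private volatile boolean readOnlyEnabled;

  CompactionGate(boolean compactionsEnabled, boolean readOnlyEnabled) {
    this.compactionsEnabled = compactionsEnabled;
    this.readOnlyEnabled = readOnlyEnabled;
  }

  void setCompactionsEnabled(boolean enabled) {
    this.compactionsEnabled = enabled;
  }

  void setReadOnlyEnabled(boolean enabled) {
    this.readOnlyEnabled = enabled;
  }

  /** The effective state is derived on every call; the user's flag is never overwritten. */
  boolean isCompactionsEnabled() {
    return compactionsEnabled && !readOnlyEnabled;
  }

  /** Returns true if the request would be accepted, false if it is ignored. */
  boolean requestCompaction(String region) {
    return isCompactionsEnabled();
  }
}
```

Because read-only mode never writes to the user's flag, a cluster whose compactions were deliberately switched off stays that way after an R/O -> R/W transition, which is the concern raised above about flipping the switch directly.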

anmolnar · 2025-11-10T20:45:58Z

@sharmaar12 Do you think that the unit test failure is related to the patch?

sharmaar12 · 2025-11-11T03:26:40Z

@anmolnar It may not be, because it passes on my local setup. Also, our code only executes when read-only mode is on.

In the previous run, https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7385/3/, there were also 8 failures, different from these, and they also passed on my local setup.

Could you please help me rerun the job?

anmolnar · 2025-11-11T15:47:50Z

Looks like the build has been restarted. @wchevreuil would you please review again?

Apache-HBase · 2025-11-12T11:27:06Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 31s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 0s		codespell was not available.
+0 🆗	detsecrets	0m 0s		detect-secrets was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	hbaseanti	0m 0s		Patch does not have any anti-patterns.
			_ HBASE-29081 Compile Tests _
+1 💚	mvninstall	3m 41s		HBASE-29081 passed
+1 💚	compile	3m 23s		HBASE-29081 passed
-0 ⚠️	checkstyle	0m 15s	/buildtool-branch-checkstyle-hbase-server.txt	The patch fails to run checkstyle in hbase-server
+1 💚	spotbugs	1m 37s		HBASE-29081 passed
+1 💚	spotless	0m 52s		branch has no errors when running spotless:check.
			_ Patch Compile Tests _
+1 💚	mvninstall	3m 9s		the patch passed
+1 💚	compile	3m 24s		the patch passed
+1 💚	javac	3m 24s		the patch passed
+1 💚	blanks	0m 0s		The patch has no blanks issues.
-0 ⚠️	checkstyle	0m 12s	/buildtool-patch-checkstyle-hbase-server.txt	The patch fails to run checkstyle in hbase-server
+1 💚	spotbugs	1m 42s		the patch passed
+1 💚	hadoopcheck	12m 20s		Patch does not cause any errors with Hadoop 3.3.6 3.4.0.
+1 💚	spotless	0m 45s		patch has no errors when running spotless:check.
			_ Other Tests _
+1 💚	asflicense	0m 11s		The patch does not generate ASF License warnings.
		39m 59s

Subsystem	Report/Notes
Docker	ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7385/9/artifact/yetus-general-check/output/Dockerfile
GITHUB PR	#7385
JIRA Issue	HBASE-29644
Optional Tests	dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless
uname	Linux 786d00ed274e 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/hbase-personality.sh
git revision	HBASE-29081 / `e6d2559`
Default Java	Eclipse Adoptium-17.0.11+9
Max. process+thread count	85 (vs. ulimit of 30000)
modules	C: hbase-server U: hbase-server
Console output	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7385/9/console
versions	git=2.34.1 maven=3.9.8 spotbugs=4.7.3
Powered by	Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

Apache-HBase · 2025-11-12T15:16:26Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	0m 31s		Docker mode activated.
-0 ⚠️	yetus	0m 3s		Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
			_ Prechecks _
			_ HBASE-29081 Compile Tests _
+1 💚	mvninstall	3m 36s		HBASE-29081 passed
+1 💚	compile	0m 59s		HBASE-29081 passed
+1 💚	javadoc	0m 28s		HBASE-29081 passed
+1 💚	shadedjars	6m 22s		branch has no errors when building our shaded downstream artifacts.
			_ Patch Compile Tests _
+1 💚	mvninstall	3m 17s		the patch passed
+1 💚	compile	1m 0s		the patch passed
+1 💚	javac	1m 0s		the patch passed
+1 💚	javadoc	0m 28s		the patch passed
+1 💚	shadedjars	6m 13s		patch has no errors when building our shaded downstream artifacts.
			_ Other Tests _
+1 💚	unit	240m 49s		hbase-server in the patch passed.
		269m 22s

Subsystem	Report/Notes
Docker	ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7385/9/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR	#7385
JIRA Issue	HBASE-29644
Optional Tests	javac javadoc unit compile shadedjars
uname	Linux bf0c2b90e7ef 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/hbase-personality.sh
git revision	HBASE-29081 / `e6d2559`
Default Java	Eclipse Adoptium-17.0.11+9
Test Results	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7385/9/testReport/
Max. process+thread count	4194 (vs. ulimit of 30000)
modules	C: hbase-server U: hbase-server
Console output	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7385/9/console
versions	git=2.34.1 maven=3.9.8
Powered by	Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

sharmaar12 force-pushed the meta_compaction branch from 10afd02 to 3a989c8 Compare October 14, 2025 12:21

sharmaar12 changed the title ~~Refresh_meta triggering compaction on user table~~ HBASE-29644: Refresh_meta triggering compaction on user table Oct 14, 2025