HBASE-25929 RegionServer JVM crash when compaction #3318
Conversation
Mind explaining a bit about the fix?
mymeiyi left a comment:
> Mind explaining a bit about the fix?
Sure, it is not easy to explain. Let me draw a simple picture to show how this error could happen.
```java
      // may clear prevBlocks list.
      kvs.shipped();
      bytesWrittenProgressForShippedCall = 0;
    }
```
Move this out of the for loop, because kvs.shipped() will release the previous blocks held in the HFileReader, but some cells in the current batch may not yet have been written out by writer.append(c) at line 441.
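To make the shape of the change concrete, here is a condensed sketch of a compaction write loop after the move, loosely modeled on the diff hunks in this review; the class, method signature, and surrounding scaffolding are illustrative and simplified, not the real Compactor#performCompaction.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.KeyValueUtil;
import org.apache.hadoop.hbase.regionserver.CellSink;
import org.apache.hadoop.hbase.regionserver.InternalScanner;
import org.apache.hadoop.hbase.regionserver.KeyValueScanner;
import org.apache.hadoop.hbase.regionserver.ShipperListener;

// Illustrative sketch only: it shows why the shipped() check has to sit outside the for loop.
public class CompactionLoopSketch {
  boolean writeBatches(InternalScanner scanner, CellSink writer, KeyValueScanner kvs,
      long shippedCallSizeLimit) throws IOException {
    List<Cell> cells = new ArrayList<>();
    long bytesWrittenProgressForShippedCall = 0;
    boolean hasMore;
    do {
      hasMore = scanner.next(cells);
      for (Cell c : cells) {
        // Every cell in the batch is written (or copied by the writer) first...
        writer.append(c);
        bytesWrittenProgressForShippedCall += KeyValueUtil.length(c);
      }
      // ...and only then may the reader release its blocks. If this check sat inside the
      // for loop, kvs.shipped() could free off-heap blocks that are still referenced by
      // cells later in the same batch.
      if (kvs != null && bytesWrittenProgressForShippedCall > shippedCallSizeLimit) {
        if (writer instanceof ShipperListener) {
          ((ShipperListener) writer).beforeShipped();
        }
        kvs.shipped();
        bytesWrittenProgressForShippedCall = 0;
      }
      cells.clear();
    } while (hasMore);
    return true;
  }
}
```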
```java
import org.junit.rules.TestName;

@Category(LargeTests.class)
public class TestCompactionWithByteBuff {
```
Does this test guarantee that the early release of the ByteBuff happens, and so possibly hits that issue? I don't think so. Maybe I am missing something!
```java
conf.setInt(ByteBuffAllocator.BUFFER_SIZE_KEY, 1024 * 5);
conf.setInt(CompactSplit.SMALL_COMPACTION_THREADS, REGION_COUNT * 2);
conf.setInt(CompactSplit.LARGE_COMPACTION_THREADS, REGION_COUNT * 2);
conf.set(HConstants.BUCKET_CACHE_IOENGINE_KEY, "offheap");
```
Is the bucket cache actually coming into the picture here? I think not. We write some data and flush; two files are created, and then we compact them. What we want is for the compaction to read the file blocks from the bucket cache, but cache-on-write is false by default (hbase.rs.cacheblocksonwrite).
It is not related to the bucket cache. It is set to offheap to make sure the blocks are read into an off-heap ByteBuffer; see HFileReaderImpl#shouldUseHeap.
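For reference, a minimal sketch of a configuration that pushes compaction reads onto off-heap buffers, based on the test snippet above; the bucket cache size line is an added assumption (the off-heap engine needs some size to be enabled), not part of this diff.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.io.ByteBuffAllocator;

public class OffheapReadConfigSketch {
  static Configuration buildConf() {
    Configuration conf = HBaseConfiguration.create();
    // Small pooled buffers so HFile blocks are served from the off-heap allocator.
    conf.setInt(ByteBuffAllocator.BUFFER_SIZE_KEY, 1024 * 5);
    // Off-heap bucket cache: as noted above, this makes the reader keep data blocks
    // in off-heap ByteBuffers (see HFileReaderImpl#shouldUseHeap).
    conf.set(HConstants.BUCKET_CACHE_IOENGINE_KEY, "offheap");
    // Assumed extra setting for illustration: give the bucket cache a size in MB.
    conf.setInt(HConstants.BUCKET_CACHE_SIZE_KEY, 32);
    return conf;
  }
}
```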
```java
        progress.cancel();
        return false;
      }
      if (kvs != null && bytesWrittenProgressForShippedCall > shippedCallSizeLimit) {
```
nit: This seems like only a format change. Can you avoid it?
It is not a format change; this code was moved out of the for loop.
I wrote up a case showing how this error could happen in this doc: https://docs.google.com/document/d/1_3HXgOSGHsHFqLiOWUsE3m6pHKjebSwrRObcPTqkSTE/edit?usp=sharing. Please have a look, thanks @Apache9 @anoopsjohn.
Yeah, the issue is very clear to me now. Actually, in the code we clone the cells for which we keep a reference; that seems to have been missed in this place. BTW, in your first patch that clone of the Cell was there but has been removed now? And instead you moved shipped() out of that for loop?
> BTW, in your first patch that clone of the Cell was there but has been removed now? And instead you moved shipped() out of that for loop?
Thanks for your reply.
In the 1st patch, I modified two places:
- Fix 1: moved the shipped() call to avoid the RS crash in the StoreFileWriter.append method.
- Fix 2: copied the first cell in the block to avoid the RS crash in the HFileWriterImpl.getMidpoint method.
But I found out that, with fix 1, the first cell must be copied by ((ShipperListener) writer).beforeShipped() before calling kvs.shipped() to release the read blocks, so problem 2 cannot happen anymore; therefore I removed fix 2.
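To illustrate what fix 1 relies on, here is a rough sketch of a writer-side beforeShipped() that copies retained cells onto the heap before the reader's blocks go away; the class and field names are made up for illustration and this is not the actual HFileWriterImpl code.

```java
import java.io.IOException;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.KeyValueUtil;
import org.apache.hadoop.hbase.regionserver.ShipperListener;

// Hypothetical writer: it holds on to cells that may be backed by reader-owned,
// possibly off-heap blocks, so it must deep-copy them before shipped() releases them.
public class SketchWriter implements ShipperListener {
  private Cell firstCellInBlock; // may point into an off-heap block
  private Cell lastCell;         // may point into an off-heap block

  @Override
  public void beforeShipped() throws IOException {
    if (firstCellInBlock != null) {
      firstCellInBlock = KeyValueUtil.copyToNewKeyValue(firstCellInBlock);
    }
    if (lastCell != null) {
      lastCell = KeyValueUtil.copyToNewKeyValue(lastCell);
    }
  }
}
```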
Are there any other problems with this issue? @Apache9 @anoopsjohn
So the latest version of the change moves the check for a possible shipped() call out of the for loop. The con here is that when we have wide rows with many cells in them, this will delay the release of blocks, and so their release from the cache. That is why it was kept inside the for loop.
Yes, the block release may be delayed. There is a configuration named hbase.hstore.compaction.kv.max, with a default value of 10, that limits how many cells are scanned per batch during compaction. But if a single cell is huge, we cannot avoid the delay.
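A minimal sketch of tuning that batch bound; the value shown is just the default mentioned above.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class CompactionBatchSizeSketch {
  static Configuration buildConf() {
    Configuration conf = HBaseConfiguration.create();
    // Caps how many cells a compaction scanner returns per batch (default 10), which also
    // bounds how long the shipped() check can be deferred now that it runs once per batch.
    conf.setInt("hbase.hstore.compaction.kv.max", 10);
    return conf;
  }
}
```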
Hi, Anoop, after reviewing the code, I do not think the problem is lastCleanCell. The problem here is that, after calling kvs.shipped(), the blocks backing the cells that have not yet been passed to writer.append(c) may be released. So here we can not call kvs.shipped() inside the for loop, before the whole batch has been written out. @mymeiyi Do I understand correctly? Thanks.
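For readers following along, a simplified sketch of the two interfaces this contract is built on; the signatures are reduced to the methods discussed here and the comments paraphrase the discussion above.

```java
import java.io.IOException;

// Simplified sketches of org.apache.hadoop.hbase.regionserver.Shipper and ShipperListener.
interface Shipper {
  // Signals that everything returned so far has been consumed, so the backing
  // (possibly off-heap, pooled) blocks can be released.
  void shipped() throws IOException;
}

interface ShipperListener {
  // Called before shipped(): implementations copy any cells they still reference
  // onto the heap so they survive the block release.
  void beforeShipped() throws IOException;
}
```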
Thanks Duo. Yeah, I got it. I was looking around that lastCleanCell ref, as that was what was initially referred to. It became very clear to me from @mymeiyi's reply. Yeah, the change looks good.
Yes, this explanation is very clear.
Apache9 left a comment:
+1, nice catch!
Signed-off-by: Duo Zhang <[email protected]>
No description provided.