HBASE-25929 RegionServer JVM crash when compaction #3318
Conversation
Mind explaining a bit about the fix?
mymeiyi left a comment:
> Mind explaining a bit about the fix?
Sure, it is not easy to explain. Let me draw a simple picture to show how this error could happen.
```java
      // may clear prevBlocks list.
      kvs.shipped();
      bytesWrittenProgressForShippedCall = 0;
    }
```
Move this out of the for loop, because kvs.shipped() will release the previous blocks held in the HFileReader, but some cells in the current batch may not yet have been written out by writer.append(c) at line 441.
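To make the shape of the change concrete, here is a condensed sketch of a compaction write loop after the move, loosely modeled on the diff hunks in this review; the class, method signature, and surrounding scaffolding are illustrative and simplified, not the real Compactor#performCompaction.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.KeyValueUtil;
import org.apache.hadoop.hbase.regionserver.CellSink;
import org.apache.hadoop.hbase.regionserver.InternalScanner;
import org.apache.hadoop.hbase.regionserver.KeyValueScanner;
import org.apache.hadoop.hbase.regionserver.ShipperListener;

// Illustrative sketch only: it shows why the shipped() check has to sit outside the for loop.
public class CompactionLoopSketch {
  boolean writeBatches(InternalScanner scanner, CellSink writer, KeyValueScanner kvs,
      long shippedCallSizeLimit) throws IOException {
    List<Cell> cells = new ArrayList<>();
    long bytesWrittenProgressForShippedCall = 0;
    boolean hasMore;
    do {
      hasMore = scanner.next(cells);
      for (Cell c : cells) {
        // Every cell in the batch is written (or copied by the writer) first...
        writer.append(c);
        bytesWrittenProgressForShippedCall += KeyValueUtil.length(c);
      }
      // ...and only then may the reader release its blocks. If this check sat inside the
      // for loop, kvs.shipped() could free off-heap blocks that are still referenced by
      // cells later in the same batch.
      if (kvs != null && bytesWrittenProgressForShippedCall > shippedCallSizeLimit) {
        if (writer instanceof ShipperListener) {
          ((ShipperListener) writer).beforeShipped();
        }
        kvs.shipped();
        bytesWrittenProgressForShippedCall = 0;
      }
      cells.clear();
    } while (hasMore);
    return true;
  }
}
```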
```java
import org.junit.rules.TestName;

@Category(LargeTests.class)
public class TestCompactionWithByteBuff {
```
Does this test guarantee that the early release of the ByteBuff happens, and so possibly hits that issue? I don't think so. Maybe I am missing something!
```java
conf.setInt(ByteBuffAllocator.BUFFER_SIZE_KEY, 1024 * 5);
conf.setInt(CompactSplit.SMALL_COMPACTION_THREADS, REGION_COUNT * 2);
conf.setInt(CompactSplit.LARGE_COMPACTION_THREADS, REGION_COUNT * 2);
conf.set(HConstants.BUCKET_CACHE_IOENGINE_KEY, "offheap");
```
Is the bucket cache actually coming into the picture here? I think not. We write some data and flush; two files are created, and then we compact them. What we want is for the compaction to read the file blocks from the bucket cache, but cache-on-write is false by default (hbase.rs.cacheblocksonwrite).
It is not related to the bucket cache. It is set to offheap to make sure the blocks are read into an off-heap ByteBuffer; see HFileReaderImpl#shouldUseHeap.
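For reference, a minimal sketch of a configuration that pushes compaction reads onto off-heap buffers, based on the test snippet above; the bucket cache size line is an added assumption (the off-heap engine needs some size to be enabled), not part of this diff.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.io.ByteBuffAllocator;

public class OffheapReadConfigSketch {
  static Configuration buildConf() {
    Configuration conf = HBaseConfiguration.create();
    // Small pooled buffers so HFile blocks are served from the off-heap allocator.
    conf.setInt(ByteBuffAllocator.BUFFER_SIZE_KEY, 1024 * 5);
    // Off-heap bucket cache: as noted above, this makes the reader keep data blocks
    // in off-heap ByteBuffers (see HFileReaderImpl#shouldUseHeap).
    conf.set(HConstants.BUCKET_CACHE_IOENGINE_KEY, "offheap");
    // Assumed extra setting for illustration: give the bucket cache a size in MB.
    conf.setInt(HConstants.BUCKET_CACHE_SIZE_KEY, 32);
    return conf;
  }
}
```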
```java
        progress.cancel();
        return false;
      }
      if (kvs != null && bytesWrittenProgressForShippedCall > shippedCallSizeLimit) {
```
nit: This seems like only a format change. Can you avoid it?
It is not a format change; this code was moved out of the for loop.
I wrote up a case showing how this error could happen in this doc: https://docs.google.com/document/d/1_3HXgOSGHsHFqLiOWUsE3m6pHKjebSwrRObcPTqkSTE/edit?usp=sharing. Please have a look, thanks @Apache9 @anoopsjohn.
Yeah, the issue is very clear to me now. Actually, in the code we clone the cells for which we keep a reference; that seems to have been missed in this place. BTW, in your first patch that clone of the Cell was there but has been removed now? And instead you moved shipped() out of that for loop?
> BTW, in your first patch that clone of the Cell was there but has been removed now? And instead you moved shipped() out of that for loop?
Thanks for your reply.
In the 1st patch, I modified two places:
- Fix 1: moved the shipped() call to avoid the RS crash in the StoreFileWriter.append method.
- Fix 2: copied the first cell in the block to avoid the RS crash in the HFileWriterImpl.getMidpoint method.
But I found out that, with fix 1, the first cell must be copied by ((ShipperListener) writer).beforeShipped() before calling kvs.shipped() to release the read blocks, so problem 2 cannot happen anymore; therefore I removed fix 2.
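To illustrate what fix 1 relies on, here is a rough sketch of a writer-side beforeShipped() that copies retained cells onto the heap before the reader's blocks go away; the class and field names are made up for illustration and this is not the actual HFileWriterImpl code.

```java
import java.io.IOException;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.KeyValueUtil;
import org.apache.hadoop.hbase.regionserver.ShipperListener;

// Hypothetical writer: it holds on to cells that may be backed by reader-owned,
// possibly off-heap blocks, so it must deep-copy them before shipped() releases them.
public class SketchWriter implements ShipperListener {
  private Cell firstCellInBlock; // may point into an off-heap block
  private Cell lastCell;         // may point into an off-heap block

  @Override
  public void beforeShipped() throws IOException {
    if (firstCellInBlock != null) {
      firstCellInBlock = KeyValueUtil.copyToNewKeyValue(firstCellInBlock);
    }
    if (lastCell != null) {
      lastCell = KeyValueUtil.copyToNewKeyValue(lastCell);
    }
  }
}
```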
Are there any other problems with this issue? @Apache9 @anoopsjohn
So the latest version of the change moves the check for a possible shipped() call out of the for loop. The con here is that when we have wide rows with many cells in them, this will delay the release of blocks, and so their release from the cache. That is why it was kept inside the for loop.
Yes, the block release may be delayed. There is a configuration named hbase.hstore.compaction.kv.max, with a default value of 10, that limits how many cells are scanned per batch during compaction. But if a single cell is huge, we cannot avoid the delay.
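A minimal sketch of tuning that batch bound; the value shown is just the default mentioned above.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class CompactionBatchSizeSketch {
  static Configuration buildConf() {
    Configuration conf = HBaseConfiguration.create();
    // Caps how many cells a compaction scanner returns per batch (default 10), which also
    // bounds how long the shipped() check can be deferred now that it runs once per batch.
    conf.setInt("hbase.hstore.compaction.kv.max", 10);
    return conf;
  }
}
```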
Hi, Anoop, after reviewing the code, I do not think the problem is lastCleanCell. The problem here is that, after calling kvs.shipped(), the blocks backing the cells that have not yet been passed to writer.append(c) may be released. So here we can not call kvs.shipped() inside the for loop, before the whole batch has been written out. @mymeiyi Do I understand correctly? Thanks.
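For readers following along, a simplified sketch of the two interfaces this contract is built on; the signatures are reduced to the methods discussed here and the comments paraphrase the discussion above.

```java
import java.io.IOException;

// Simplified sketches of org.apache.hadoop.hbase.regionserver.Shipper and ShipperListener.
interface Shipper {
  // Signals that everything returned so far has been consumed, so the backing
  // (possibly off-heap, pooled) blocks can be released.
  void shipped() throws IOException;
}

interface ShipperListener {
  // Called before shipped(): implementations copy any cells they still reference
  // onto the heap so they survive the block release.
  void beforeShipped() throws IOException;
}
```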
Thanks Duo. Yeah, I got it. I was looking around that lastCleanCell ref, as that was what was initially referred to. It became very clear to me from @mymeiyi's reply. Yeah, the change looks good.
Yes, this explanation is very clear.
Apache9 left a comment:
+1, nice catch!
Signed-off-by: Duo Zhang <[email protected]>
No description provided.