-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-24382 Flush partial stores of region filtered by seqId when arc… #1737
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
The change is straight forward but I'm not sure whether it is worth to do this optimization, as it does increase the complexity of code. Have you met the problem introduced by the old simple logic? Thanks. |
|
It is from a discussion in our team, about whether a table could be set many column family(10~100) or not. |
|
OK, sounds reasonable. On the patch, please just use byte[] as the family name? We use byte[] or String as family name everywhere, especially when passing as a parameter, so using ImmutableByteArray seems a bit strange. |
|
Ok, will fix tomorrow. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
@Apache9 Could you help to review it? Thanks. |
Apache9
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So why we choose String instead of byte[] here?
| Collection<HStore> specificStoresToFlush = null; | ||
| if (!forceFlushAllStores && families != null) { | ||
| specificStoresToFlush = stores.entrySet().stream() | ||
| .filter(e -> families.contains(Bytes.toString(e.getKey()))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to use contains then we better use a Set instead of a List?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So why we choose String instead of byte[] here?
1、The family name stored as private byte[] in ImmutableByteArray, if use byte[], it need twice convertion.
2、We use the method of contains to filter them later, use string seems esaliy to do.
Correct me if i have some mistake.
If we want to use contains then we better use a Set instead of a List?
Yeah, you are right, will fix.
Thanks a lot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the String is ok, i will push a commit later, or else i will research more.
Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the WAL interface, we all use byte[] as family name, so I prefer we also use byte[] here. You could use TreeSet with BytesComparator to hold byte[].
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, will use byte[] instead of String.
In addition,since we do not need a sorted collection, i could iterate the families to avoid use contains.
Thanks.
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
rebase |
| encodedRegionName, r); | ||
| return; | ||
| } | ||
| // force flushing all stores to clean old logs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change the comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
| for (Map.Entry<ImmutableByteArray, Long> me : m.entrySet()) { | ||
| if (me.getValue() <= e.getValue()) { | ||
| if (toFlush == null) { | ||
| toFlush = new HashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better use a TreeMap with BytesComparator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
Thanks a lot.
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
FlushPolicy does seem to cut across what you are trying to do here where you only flush a subset.... even though reading the original issue, it seems to be trying to do subset only: https://issues.apache.org/jira/browse/HBASE-10201 We should change flushpolicy to accommodate your need? If we flush all regardless of the policy, that sounds like a problem. Thanks. |
|
Skimed HBASE-10201, still can not get the point that how to change the flushpolicy... |
|
@saintstack Ping |
Apache9
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with the approach but I guess there are still checkstyle issues? Please fix them before merging.
Thanks.
| */ | ||
| package org.apache.hadoop.hbase.regionserver; | ||
|
|
||
| import java.util.ArrayList; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should remove these imports? Seems we add a method in this class and then moved it out but forgot to remove these imports?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, you are right, fixed.
Thanks.
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
bq. The former is about how to select stores that could be flushed, and the latter is about select stores that must be flushed. Ok. Did you stamp this into the code as comments? Let me look. |
Will do it later, thanks. |
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/FlushRequester.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/FlushRequester.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
Show resolved
Hide resolved
| // force flushing all stores to clean old logs | ||
| requester.requestFlush(r, true, FlushLifeCycleTracker.DUMMY); | ||
| // force flushing specified stores to clean old logs | ||
| requester.requestFlush(r, false, families, FlushLifeCycleTracker.DUMMY); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The flag is set when we are in a critical condition -- the WAL count is in excess of our WAL limit. The flag's intent IIRC is that we flush all stores regardless of what determination is made at flush time as to which stores are in need of flush or not; the old edit may actually be hanging out in a store that is small and not in need of flush normally or in accordance w/ some flush policy. The flag says 'force' the flush. My understanding is that this is a FlushRequest usually but the flag changes the request to a demand.
Here you are passing a set of stores instead where the stores chosen are the ones that will free up WALs. So, this flag is now redundant? We should purge it. Now it will confuse since the 'force' instead is a list of families to flush -- else its null if old behavior.
| // force flushing all stores to clean old logs | ||
| requester.requestFlush(r, true, FlushLifeCycleTracker.DUMMY); | ||
| // force flushing specified stores to clean old logs | ||
| requester.requestFlush(r, false, families, FlushLifeCycleTracker.DUMMY); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking around, this is the only time forceFlushAllStores is set to true -- when we have too many WALs and we need to clear the old ones out.
It also set to true in RSRpcServices.flushRegion, would be called by user. |
Yeah, i am agree with you, maybe the forceFlushAllStores should be enum, named selectType that include "FORCE_ALL","FORCE_WITH_SPECIFIED","FLUSH_POLICY"? And it is a long story yet, i can do it in follow-on jira, WDYT? |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
bq. It also set to true in RSRpcServices.flushRegion, would be called by user. ok. The user is asking to flush all regions. If user sets it, could convert it to a list of all families? bq. Yeah, i am agree with you, maybe the forceFlushAllStores should be enum, named selectType that include "FORCE_ALL","FORCE_WITH_SPECIFIED","FLUSH_POLICY"? And it is a long story yet, i can do it in follow-on jira, WDYT? Why not just remove it? The above seems to be offering too much choice. Navigating all of the options in flushcache will be awkward. Thanks. |
Ok, remove it make sense, let me do it later. |
… how to select stores to flush
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
| families.addAll(this.getTableDescriptor().getColumnFamilyNames()); | ||
| } | ||
| return this.flushcache(families, writeFlushRequestWalMarker, tracker); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good
|
Thanks all the comments. |
…hive wal due to too many wals