Skip to content

Conversation

@bsglz
Copy link
Contributor

@bsglz bsglz commented Apr 11, 2023

No description provided.

@Apache-HBase

This comment has been minimized.

@Apache9
Copy link
Contributor

Apache9 commented Apr 11, 2023

I used to consider the same but soon I found that there are some tricks in our implementation, as we may generate some fake cell for seeking.

I saw that you have already handled the case where left family length or right family length are zero, this is for some key only cell, but I'm afraid there could still be other type of fake cells, for example, when querying on a bloom filter enabled region, we may use bloom filter to generate a fake cell, if the type of the bloom filter is ROWCOL, maybe we could also include a fake family?

@bsglz
Copy link
Contributor Author

bsglz commented Apr 11, 2023

I used to consider the same but soon I found that there are some tricks in our implementation, as we may generate some fake cell for seeking.

I saw that you have already handled the case where left family length or right family length are zero, this is for some key only cell, but I'm afraid there could still be other type of fake cells, for example, when querying on a bloom filter enabled region, we may use bloom filter to generate a fake cell, if the type of the bloom filter is ROWCOL, maybe we could also include a fake family?

Yeah, this is a fairly important class and does require detailed testing.
Let me see the case you mentioned first.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@bsglz
Copy link
Contributor Author

bsglz commented Apr 11, 2023

@Apache9 Checked the code, the bloom filter use empty family both in storing and checking, so seems ok.

 case ROWCOL:
        if (!scan.isGetScan()) {
          return true;
        }
        if (columns != null && columns.size() == 1) {
          byte[] column = columns.first();
          // create the required fake key
          Cell kvKey = PrivateCellUtil.createFirstOnRow(row, HConstants.EMPTY_BYTE_ARRAY, column);
          return passesGeneralRowColBloomFilter(kvKey);
        }

        // For multi-column queries the Bloom filter is checked from the
        // seekExact operation.
        return true;
/**
 * An hash key for ROWCOL bloom. This assumes the cells to be serialized in the Keyvalue
 * serialization format with Empty column family. Note that the byte representing the family length
 * is considered to be 0
 */
@InterfaceAudience.Private
public class RowColBloomHashKey extends CellHashKey {

BTW, the method of checkGeneralBloomFilter use CellComparator.getInstance() directly which return CellComparatorImpl, will change them later.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

if (comparator == MetaCellComparator.META_COMPARATOR) {
this.comparator = comparator;
} else {
this.comparator = InnerStoreCellComparator.INNER_STORE_COMPARATOR;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit strange, as we may not use the comparator passed in. We should use a new pattern to set the comparator here, otherwise it will confuse developers...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, will refactor here.
Thanks.

@Apache-HBase

This comment has been minimized.

@bsglz
Copy link
Contributor Author

bsglz commented Apr 12, 2023

Writing a perf test class for cellComparator, don't merge yet even if the tests pass.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@bsglz
Copy link
Contributor Author

bsglz commented Apr 12, 2023

Added perf test class named PerfTestCellComparator.
Below is once running result:

<style type='text/css'></style>
leftFamLen rightFamLen comparator cost(ms)
0 0 CellComparatorImpl 12738
0 0 InnerStoreCellComparator 12548
0 4 CellComparatorImpl 8541
0 4 InnerStoreCellComparator 8142
4 0 CellComparatorImpl 8435
4 0 InnerStoreCellComparator 8327
4 4 CellComparatorImpl 13662
4 4 InnerStoreCellComparator 11551

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@bsglz bsglz requested a review from Apache9 April 13, 2023 10:56
@Apache-HBase

This comment has been minimized.

@bsglz
Copy link
Contributor Author

bsglz commented Apr 14, 2023

More detailed test report, compareCnt is 1 billion.

<style type='text/css'></style>
compareMethod leftFamLen rightFamLen comparator cost(ms) diff
compareKV 0 0 CellComparatorImpl 28850  
compareKV 0 0 InnerStoreCellComparator 27478 -5.00%
compareKV 0 4 CellComparatorImpl 19041  
compareKV 0 4 InnerStoreCellComparator 17391 -9.00%
compareKV 4 0 CellComparatorImpl 18988  
compareKV 4 0 InnerStoreCellComparator 17375 -8.00%
compareKV 4 4 CellComparatorImpl 33360  
compareKV 4 4 InnerStoreCellComparator 27083 -19.00%
compareBBKV 0 0 CellComparatorImpl 34014  
compareBBKV 0 0 InnerStoreCellComparator 31660 -7.00%
compareBBKV 0 4 CellComparatorImpl 20780  
compareBBKV 0 4 InnerStoreCellComparator 20847 0.00%
compareBBKV 4 0 CellComparatorImpl 23540  
compareBBKV 4 0 InnerStoreCellComparator 21751 -8.00%
compareBBKV 4 4 CellComparatorImpl 40192  
compareBBKV 4 4 InnerStoreCellComparator 31522 -22.00%
compareKVVsBBKV 0 0 CellComparatorImpl 30979  
compareKVVsBBKV 0 0 InnerStoreCellComparator 29827 -4.00%
compareKVVsBBKV 0 4 CellComparatorImpl 21918  
compareKVVsBBKV 0 4 InnerStoreCellComparator 19143 -13.00%
compareKVVsBBKV 4 0 CellComparatorImpl 22605  
compareKVVsBBKV 4 0 InnerStoreCellComparator 20952 -7.00%
compareKVVsBBKV 4 4 CellComparatorImpl 35561  
compareKVVsBBKV 4 4 InnerStoreCellComparator 29150 -18.00%

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@bsglz bsglz requested a review from bbeaudreault April 27, 2023 08:01
@bsglz
Copy link
Contributor Author

bsglz commented Apr 28, 2023

Any other comments? @Apache9 @bbeaudreault
Thanks.


public static CellComparator getInnerStoreCellComparator(Configuration conf, byte[] tableName) {
if (
conf != null && conf.getBoolean(HConstants.USE_META_CELL_COMPARATOR,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this judgement in upper layer so we do not need to move USE_META_CELL_COMPARATOR to HConstants? It is just for internal use, we'd better not put it into an IA.Public class...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me try.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@bsglz bsglz requested a review from Apache9 May 1, 2023 10:12
*/
@Category({ MediumTests.class })
@RunWith(Parameterized.class)
public class PerfTestCellComparator {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'd better just post the JMH code on the jira issue? Without JMH, a micro bench is not very stable...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

*/
// We could write the actual class name from 2.0 onwards and handle BC
private String comparatorClassName = CellComparator.getInstance().getClass().getName();
private String comparatorClassName =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No compatibility issues?

Copy link
Contributor Author

@bsglz bsglz May 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No compatibility issues?

Thought about it and i think so, because we do the conversion when we save and read the hfile, the actual store is the KVComparator, we can't change this since we want hbase1.x to be able to read the files generated in the new version. See the comment of getHBase1CompatibleName for more detail.
Thanks.

@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 0s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
_ master Compile Tests _
+0 🆗 mvndep 0m 10s Maven dependency ordering for branch
+1 💚 mvninstall 3m 50s master passed
+1 💚 compile 3m 7s master passed
+1 💚 checkstyle 0m 49s master passed
+1 💚 spotless 0m 43s branch has no errors when running spotless:check.
+1 💚 spotbugs 2m 10s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 10s Maven dependency ordering for patch
+1 💚 mvninstall 3m 36s the patch passed
+1 💚 compile 3m 2s the patch passed
+1 💚 javac 3m 2s the patch passed
+1 💚 checkstyle 0m 48s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 hadoopcheck 13m 1s Patch does not cause any errors with Hadoop 3.2.4 3.3.4.
+1 💚 spotless 0m 42s patch has no errors when running spotless:check.
+1 💚 spotbugs 2m 15s the patch passed
_ Other Tests _
+1 💚 asflicense 0m 20s The patch does not generate ASF License warnings.
43m 18s
Subsystem Report/Notes
Docker ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5171/18/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #5171
Optional Tests dupname asflicense javac spotbugs hadoopcheck hbaseanti spotless checkstyle compile
uname Linux 8457a4f9a2ea 5.4.0-144-generic #161-Ubuntu SMP Fri Feb 3 14:49:04 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 21d61cf
Default Java Eclipse Adoptium-11.0.17+8
Max. process+thread count 86 (vs. ulimit of 30000)
modules C: hbase-common hbase-server U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5171/18/console
versions git=2.34.1 maven=3.8.6 spotbugs=4.7.3
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 24s Docker mode activated.
-0 ⚠️ yetus 0m 2s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 18s Maven dependency ordering for branch
+1 💚 mvninstall 2m 47s master passed
+1 💚 compile 0m 49s master passed
+1 💚 shadedjars 4m 38s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 33s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 11s Maven dependency ordering for patch
+1 💚 mvninstall 2m 49s the patch passed
+1 💚 compile 0m 48s the patch passed
+1 💚 javac 0m 48s the patch passed
+1 💚 shadedjars 4m 39s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 32s the patch passed
_ Other Tests _
+1 💚 unit 1m 41s hbase-common in the patch passed.
+1 💚 unit 208m 33s hbase-server in the patch passed.
232m 42s
Subsystem Report/Notes
Docker ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5171/18/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile
GITHUB PR #5171
Optional Tests javac javadoc unit shadedjars compile
uname Linux 03ca98afe57b 5.4.0-1099-aws #107~18.04.1-Ubuntu SMP Fri Mar 17 16:49:05 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 21d61cf
Default Java Temurin-1.8.0_352-b08
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5171/18/testReport/
Max. process+thread count 2849 (vs. ulimit of 30000)
modules C: hbase-common hbase-server U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5171/18/console
versions git=2.34.1 maven=3.8.6
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 48s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 19s Maven dependency ordering for branch
+1 💚 mvninstall 3m 21s master passed
+1 💚 compile 1m 5s master passed
+1 💚 shadedjars 4m 24s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 40s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 13s Maven dependency ordering for patch
+1 💚 mvninstall 3m 21s the patch passed
+1 💚 compile 1m 4s the patch passed
+1 💚 javac 1m 4s the patch passed
+1 💚 shadedjars 4m 25s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 41s the patch passed
_ Other Tests _
+1 💚 unit 2m 6s hbase-common in the patch passed.
+1 💚 unit 212m 44s hbase-server in the patch passed.
239m 52s
Subsystem Report/Notes
Docker ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5171/18/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile
GITHUB PR #5171
Optional Tests javac javadoc unit shadedjars compile
uname Linux 342853696fbb 5.4.0-137-generic #154-Ubuntu SMP Thu Jan 5 17:03:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 21d61cf
Default Java Eclipse Adoptium-11.0.17+8
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5171/18/testReport/
Max. process+thread count 2995 (vs. ulimit of 30000)
modules C: hbase-common hbase-server U: .
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5171/18/console
versions git=2.34.1 maven=3.8.6
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@bsglz bsglz requested a review from Apache9 May 3, 2023 04:55
@bsglz
Copy link
Contributor Author

bsglz commented May 5, 2023

Any more comments? @Apache9 @bbeaudreault

@bsglz bsglz merged commit 5d82d4f into apache:master May 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants