Skip to content

Conversation

@charlesconnell
Copy link
Contributor

I've looked at a lot of allocation profiles of RegionServers doing a read-heavy workload. Some allocations that dominate the chart can be easily avoided. The method StoreScanner#read() contains this code

        heap.recordBlockSize(blockSize -> {
          if (rpcCall.isPresent()) {
            rpcCall.get().incrementBlockBytesScanned(blockSize);
          }
          scannerContext.incrementBlockProgress(blockSize);
        });

that runs for every iteration of its main loop. A closure can be created before the loop and re-used instead.

@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 28s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
_ master Compile Tests _
+1 💚 mvninstall 3m 22s master passed
+1 💚 compile 3m 11s master passed
+1 💚 checkstyle 0m 39s master passed
+1 💚 spotbugs 1m 38s master passed
+1 💚 spotless 0m 49s branch has no errors when running spotless:check.
_ Patch Compile Tests _
+1 💚 mvninstall 3m 4s the patch passed
+1 💚 compile 3m 13s the patch passed
+1 💚 javac 3m 13s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 38s the patch passed
+1 💚 spotbugs 1m 44s the patch passed
+1 💚 hadoopcheck 12m 9s Patch does not cause any errors with Hadoop 3.3.6 3.4.0.
+1 💚 spotless 0m 49s patch has no errors when running spotless:check.
_ Other Tests _
+1 💚 asflicense 0m 11s The patch does not generate ASF License warnings.
40m 38s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6901/1/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #6901
JIRA Issue HBASE-29253
Optional Tests dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless
uname Linux 03a4f2bba85c 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 75faaa7
Default Java Eclipse Adoptium-17.0.11+9
Max. process+thread count 86 (vs. ulimit of 30000)
modules C: hbase-server U: hbase-server
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6901/1/console
versions git=2.34.1 maven=3.9.8 spotbugs=4.7.3
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 30s Docker mode activated.
-0 ⚠️ yetus 0m 2s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+1 💚 mvninstall 3m 15s master passed
+1 💚 compile 0m 57s master passed
+1 💚 javadoc 0m 28s master passed
+1 💚 shadedjars 5m 57s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 3m 3s the patch passed
+1 💚 compile 0m 57s the patch passed
+1 💚 javac 0m 57s the patch passed
+1 💚 javadoc 0m 27s the patch passed
+1 💚 shadedjars 5m 49s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
+1 💚 unit 241m 39s hbase-server in the patch passed.
267m 55s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6901/1/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR #6901
JIRA Issue HBASE-29253
Optional Tests javac javadoc unit compile shadedjars
uname Linux 8f9ea8ed4c0a 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 75faaa7
Default Java Eclipse Adoptium-17.0.11+9
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6901/1/testReport/
Max. process+thread count 4968 (vs. ulimit of 30000)
modules C: hbase-server U: hbase-server
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6901/1/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@Apache9
Copy link
Contributor

Apache9 commented Apr 12, 2025

The java compiler will not do it automatically? Do you have microbenmark to show the improvement?

Thanks.

@charlesconnell
Copy link
Contributor Author

charlesconnell commented Apr 12, 2025

Sure, here's a microbenchmark:


@State(Scope.Benchmark)
public class ClosureBenchmark {

  @Param({ "10000" })
  public int loops;

  @Param({ "true", "false" })
  public boolean createNewClosure;

  @Benchmark
  public void test(Blackhole blackhole) {
    IntConsumer savedClosure = x -> {
      // Do some work that captures a reference to a variable from outside the lambda, in this case blackhole,
      // thus forcing the allocation of a closure.
      blackhole.consume(x);
    };
    for (int i = 0; i < loops; i++) {
      if (createNewClosure) {
        IntConsumer newClosure = x -> {
          // Do some work that captures a reference to a variable from outside the lambda, in this case blackhole,
          // thus forcing the allocation of a closure.
          // But the work inside the lambda doesn't change based on the value of i,
          // so in theory the compiler could avoid creating a new closure on each loop iteration
          blackhole.consume(x);
        };
        blackhole.consume(newClosure);
      } else {
        blackhole.consume(savedClosure);
      }
    }
  }

  public static void main(String[] args) throws RunnerException, IOException {
    org.openjdk.jmh.Main.main(args);
  }
}

which produced these results for me:


Benchmark                                               (createNewClosure)  (loops)   Mode  Cnt       Score     Error   Units
ClosureBenchmark.test                                                 true    10000  thrpt   25   33571.747 ± 331.844   ops/s
ClosureBenchmark.test:·gc.alloc.rate                                  true    10000  thrpt   25    4876.137 ±  48.299  MB/sec
ClosureBenchmark.test:·gc.alloc.rate.norm                             true    10000  thrpt   25  160003.792 ±   0.577    B/op
ClosureBenchmark.test:·gc.churn.G1_Eden_Space                         true    10000  thrpt   25    4875.828 ±  55.586  MB/sec
ClosureBenchmark.test:·gc.churn.G1_Eden_Space.norm                    true    10000  thrpt   25  159993.999 ± 945.061    B/op
ClosureBenchmark.test:·gc.churn.G1_Survivor_Space                     true    10000  thrpt   25       0.006 ±   0.001  MB/sec
ClosureBenchmark.test:·gc.churn.G1_Survivor_Space.norm                true    10000  thrpt   25       0.201 ±   0.030    B/op
ClosureBenchmark.test:·gc.count                                       true    10000  thrpt   25    1423.000            counts
ClosureBenchmark.test:·gc.time                                        true    10000  thrpt   25     865.000                ms
ClosureBenchmark.test                                                false    10000  thrpt   25   37243.796 ± 233.865   ops/s
ClosureBenchmark.test:·gc.alloc.rate                                 false    10000  thrpt   25       0.541 ±   0.004  MB/sec
ClosureBenchmark.test:·gc.alloc.rate.norm                            false    10000  thrpt   25      16.015 ±   0.022    B/op
ClosureBenchmark.test:·gc.churn.G1_Eden_Space                        false    10000  thrpt   25       0.761 ±   1.164  MB/sec
ClosureBenchmark.test:·gc.churn.G1_Eden_Space.norm                   false    10000  thrpt   25      22.485 ±  34.382    B/op
ClosureBenchmark.test:·gc.churn.G1_Survivor_Space                    false    10000  thrpt   25       0.142 ±   0.217  MB/sec
ClosureBenchmark.test:·gc.churn.G1_Survivor_Space.norm               false    10000  thrpt   25       4.192 ±   6.411    B/op
ClosureBenchmark.test:·gc.count                                      false    10000  thrpt   25       5.000            counts
ClosureBenchmark.test:·gc.time                                       false    10000  thrpt   25      11.000                ms

The test runs somewhat faster when it doesn't create a new lambda on each loop iteration (33571 op/sec versus 37243 op/sec). The allocation rate is vastly lower (4876 MB/sec versus 0.5 MB/sec).

My Java version:

openjdk version "21.0.6" 2025-01-21 LTS
OpenJDK Runtime Environment Temurin-21.0.6+7 (build 21.0.6+7-LTS)
OpenJDK 64-Bit Server VM Temurin-21.0.6+7 (build 21.0.6+7-LTS, mixed mode, sharing)

@Apache9
Copy link
Contributor

Apache9 commented Apr 12, 2025

Better put the if (createNewClosure) outside the for loop? Like

if (createNewClosure) {
  for (int i = 0; i < loops; i++) {
    blackhole.consume(x -> {
      blackhole.consume(x);
    });
  }
} else {
  IntConsumer savedClosure = x -> {
      blackhole.consume(x);
    };
  for (int i = 0; i < loops; i++) {
    blackhole.consume(savedClosure);
  }
}

@charlesconnell
Copy link
Contributor Author

Sure. The full code is now

@State(Scope.Benchmark)
public class ClosureBenchmark {

  @Param({ "10000" })
  public int loops;

  @Param({ "true", "false" })
  public boolean createNewClosure;

  @Benchmark
  public void test(Blackhole blackhole) {
    if (createNewClosure) {
      for (int i = 0; i < loops; i++) {
        IntConsumer newClosure = x -> {
          blackhole.consume(x);
        };
        blackhole.consume(newClosure);
      }
    } else {
      IntConsumer savedClosure = x -> {
        blackhole.consume(x);
      };
      for (int i = 0; i < loops; i++) {
        blackhole.consume(savedClosure);
      }
    }
  }

  public static void main(String[] args) throws RunnerException, IOException {
    org.openjdk.jmh.Main.main(args);
  }
}

and the results are similar:

Benchmark                                               (createNewClosure)  (loops)   Mode  Cnt       Score     Error   Units
ClosureBenchmark.test                                                 true    10000  thrpt   25   33490.255 ± 278.683   ops/s
ClosureBenchmark.test:·gc.alloc.rate                                  true    10000  thrpt   25    4863.875 ±  40.472  MB/sec
ClosureBenchmark.test:·gc.alloc.rate.norm                             true    10000  thrpt   25  160003.763 ±   0.593    B/op
ClosureBenchmark.test:·gc.churn.G1_Eden_Space                         true    10000  thrpt   25    4863.175 ±  43.757  MB/sec
ClosureBenchmark.test:·gc.churn.G1_Eden_Space.norm                    true    10000  thrpt   25  159981.872 ± 719.752    B/op
ClosureBenchmark.test:·gc.churn.G1_Survivor_Space                     true    10000  thrpt   25       0.006 ±   0.001  MB/sec
ClosureBenchmark.test:·gc.churn.G1_Survivor_Space.norm                true    10000  thrpt   25       0.204 ±   0.040    B/op
ClosureBenchmark.test:·gc.count                                       true    10000  thrpt   25    1411.000            counts
ClosureBenchmark.test:·gc.time                                        true    10000  thrpt   25     852.000                ms
ClosureBenchmark.test                                                false    10000  thrpt   25   37070.851 ± 195.065   ops/s
ClosureBenchmark.test:·gc.alloc.rate                                 false    10000  thrpt   25       0.539 ±   0.003  MB/sec
ClosureBenchmark.test:·gc.alloc.rate.norm                            false    10000  thrpt   25      16.016 ±   0.022    B/op
ClosureBenchmark.test:·gc.churn.G1_Eden_Space                        false    10000  thrpt   25       0.761 ±   1.164  MB/sec
ClosureBenchmark.test:·gc.churn.G1_Eden_Space.norm                   false    10000  thrpt   25      22.643 ±  34.623    B/op
ClosureBenchmark.test:·gc.churn.G1_Survivor_Space                    false    10000  thrpt   25       0.142 ±   0.216  MB/sec
ClosureBenchmark.test:·gc.churn.G1_Survivor_Space.norm               false    10000  thrpt   25       4.212 ±   6.441    B/op
ClosureBenchmark.test:·gc.count                                      false    10000  thrpt   25       5.000            counts
ClosureBenchmark.test:·gc.time                                       false    10000  thrpt   25      10.000                ms

@Apache9
Copy link
Contributor

Apache9 commented Apr 13, 2025

Please use this form

    blackhole.consume(x -> {
      blackhole.consume(x);
    });

Not

        IntConsumer newClosure = x -> {
          blackhole.consume(x);
        };
        blackhole.consume(newClosure);

If compile can not pass, use casting instead of storing it to a local variable.

Copy link
Contributor

@Apache9 Apache9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, asked chatgpt, if a lambda captures external variables, even if these vars are not changed during the whole loop, it will still create a instance every time...

@charlesconnell
Copy link
Contributor Author

Yeah. To be sure, I just tried your suggestion like so:

@State(Scope.Benchmark)
public class ClosureBenchmark {

  @Param({ "10000" })
  public int loops;

  @Param({ "true", "false" })
  public boolean createNewClosure;

  @Benchmark
  public void test(Blackhole blackhole) {
    if (createNewClosure) {
      for (int i = 0; i < loops; i++) {
        blackhole.consume(
          (IntConsumer) (x -> {
              blackhole.consume(x);
            })
        );
      }
    } else {
      IntConsumer savedClosure = x -> {
        blackhole.consume(x);
      };
      for (int i = 0; i < loops; i++) {
        blackhole.consume(savedClosure);
      }
    }
  }

  public static void main(String[] args) throws RunnerException, IOException {
    org.openjdk.jmh.Main.main(args);
  }
}

and the results stay the same.

@Apache9 Apache9 merged commit abc8b43 into apache:master Apr 20, 2025
1 check passed
Apache9 pushed a commit that referenced this pull request Apr 20, 2025
…StoreScanner (#6901)

Signed-off-by: Duo Zhang <[email protected]>
(cherry picked from commit abc8b43)
Apache9 pushed a commit that referenced this pull request Apr 20, 2025
…StoreScanner (#6901)

Signed-off-by: Duo Zhang <[email protected]>
(cherry picked from commit abc8b43)
Apache9 pushed a commit that referenced this pull request Apr 20, 2025
…StoreScanner (#6901)

Signed-off-by: Duo Zhang <[email protected]>
(cherry picked from commit abc8b43)
mokai87 pushed a commit to mokai87/hbase that referenced this pull request Aug 7, 2025
…StoreScanner (apache#6901)

Signed-off-by: Duo Zhang <[email protected]>
(cherry picked from commit abc8b43)
@charlesconnell charlesconnell deleted the HBASE-29253/store-scanner-lambda branch September 19, 2025 15:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants