@duongkame
Contributor

What changes were proposed in this pull request?

A simple and efficient (to be verified) way to detect leaks without relying on finalize().

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-9528

How was this patch tested?

CI.

@duongkame duongkame marked this pull request as ready for review December 21, 2023 21:13
@duongkame
Contributor Author

I think this would be much simpler to implement within the RocksDB Java library (and would be a good contribution to RocksDB too). I will submit a patch to RocksDB, but we need to proceed with this change too.

Contributor

@adoroszlai adoroszlai left a comment

Thanks @duongkame for the patch, LGTM.

A few trivial items noted. I'll commit them to the PR soon to trigger CI.

I also have a few ideas for minor improvements to be done as a follow-up. Filed the task for myself for now.

Contributor

@szetszwo szetszwo left a comment

@duongkame, thanks a lot for working on this! Please see the inline comments and also https://issues.apache.org/jira/secure/attachment/13065566/5853_review.patch

BTW, have you done some benchmarks to see if this can improve the performance?

* limitations under the License.
*
*/
package org.apache.hadoop.hdds.resource;
Contributor

It should use the org.apache.hadoop.hdds.utils package.

Contributor

@duongkame , any comments on this?

Contributor Author

Moved.

static void assertClosed(RocksObject rocksObject, String stackTrace) {
private static final LeakDetector LEAK_DETECTOR = new LeakDetector("ManagedRocksObject");

static LeakTracker track(AutoCloseable object, @Nullable StackTraceElement[] stackTrace) {
Contributor

  • Use RocksObject instead of AutoCloseable.
  • Get stackTrace inside this method instead of passing it in order to reduce code duplication.

Contributor

  • Get stackTrace inside this method instead of passing it in order to reduce code duplication.

I also included this one in HDDS-10000. Since the stackTrace members already exist (the duplication is not introduced here), I thought it could be done as a follow-up.

Contributor Author

There are also RocksMutableObject and other custom objects in the same package to be tracked here, so AutoCloseable is the most convenient type. Btw, the method is package-private and only used for managed objects in this package.

Contributor Author

  • Use RocksObject instead of AutoCloseable.
  • Get stackTrace inside this method instead of passing it in order to reduce code duplication.

This is great. Thanks.

Contributor

Increment startIndex (to 4) here:

static String formatStackTrace(@Nullable StackTraceElement[] elements) {
return HddsUtils.formatStackTrace(elements, 3);
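
The link between where the trace is captured and the start index can be illustrated with a minimal sketch. It assumes the trace is taken with new Throwable().getStackTrace(), which may not be how the PR captures it. StackTraceSketch, format, track and createResource are hypothetical names, and format only stands in for HddsUtils.formatStackTrace, whose implementation is not shown in this thread.

```java
final class StackTraceSketch {
  private StackTraceSketch() { }

  /** Joins the frames from startIndex onward, one per line (hypothetical stand-in helper). */
  static String format(StackTraceElement[] elements, int startIndex) {
    if (elements == null || elements.length <= startIndex) {
      return "";
    }
    StringBuilder sb = new StringBuilder();
    for (int i = startIndex; i < elements.length; i++) {
      sb.append(elements[i]).append('\n');
    }
    return sb.toString();
  }

  /** Captures the trace itself, so callers no longer pass one in. */
  static String track() {
    // Frame 0 of a Throwable's trace is typically the method that created it
    // (track), so skipping one frame makes the report start at the caller.
    StackTraceElement[] trace = new Throwable().getStackTrace();
    return format(trace, 1);
  }

  /** Stands in for the constructor of a managed object. */
  static String createResource() {
    return track();
  }

  public static void main(String[] args) {
    System.out.print(createResource());
  }
}
```

Each extra method between the capture and the code that should appear first in the report adds one frame to skip, which is why moving the capture one level deeper requires a larger start index.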

if (rocksObject.isOwningHandle()) {
reportLeak(rocksObject, stackTrace);
}
final String name = object.getClass().getSimpleName();
Contributor

Use object.toString() instead.

Contributor

@duongkame , any comments on this?

Contributor Author

I don't think we need anything specific in toString(). Plus, toString() has unpredictable performance.

Contributor

I am fine with using Class.getSimpleName(), although it is known to have bad performance; see https://stackoverflow.com/questions/17369304/why-is-class-getsimplename-not-cached
It seems minor.
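
If the cost of getSimpleName() ever did matter here, one possible mitigation is a per-class cache. The sketch below is only an illustration and not part of this PR: SimpleNameCache and of are hypothetical names, and it uses the JDK's java.lang.ClassValue to compute each simple name once.

```java
final class SimpleNameCache {
  // ClassValue computes the value lazily, once per class, and caches it.
  private static final ClassValue<String> NAMES = new ClassValue<String>() {
    @Override
    protected String computeValue(Class<?> type) {
      return type.getSimpleName();
    }
  };

  private SimpleNameCache() { }

  /** Returns the cached simple name of the object's runtime class. */
  static String of(Object object) {
    return NAMES.get(object.getClass());
  }

  public static void main(String[] args) {
    System.out.println(of(new StringBuilder())); // prints "StringBuilder"
  }
}
```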

*
* }</pre>
*/
public class LeakDetector implements Runnable {
Contributor

Use a generic type and don't implement Runnable.

Comment on lines 67 to 75
Thread t = new Thread(this);
t.setName(LeakDetector.class.getSimpleName() + "-" + name);
t.setDaemon(true);
LOG.info("Starting leak detector thread {}.", name);
t.start();
}

@Override
public void run() {
Contributor

Make it private and pass it using new Thread(this::run).

Contributor Author

@duongkame duongkame Dec 23, 2023

Makes sense.
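
The suggestion above can be sketched as follows. This is a minimal illustration, not the PR's LeakDetector: DetectorThreadSketch, start and awaitStarted are hypothetical names, and the loop body is left out. The class does not implement Runnable, so run() stays out of the public API and is handed to the thread as a method reference.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class DetectorThreadSketch {
  private final String name;
  private final CountDownLatch started = new CountDownLatch(1);

  DetectorThreadSketch(String name) {
    this.name = name;
  }

  /** Starts the daemon thread; run() is passed as a private method reference. */
  Thread start() {
    Thread t = new Thread(this::run);
    t.setName(DetectorThreadSketch.class.getSimpleName() + "-" + name);
    t.setDaemon(true);
    t.start();
    return t;
  }

  // Private: callers cannot invoke it directly or wrap the object in another thread.
  private void run() {
    started.countDown();
    // A real detector would loop here, waiting on its reference queue.
  }

  /** Waits until run() has been entered; returns false on timeout or interrupt. */
  boolean awaitStarted(long timeoutMillis) {
    try {
      return started.await(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    DetectorThreadSketch sketch = new DetectorThreadSketch("demo");
    Thread t = sketch.start();
    System.out.println(t.getName() + " started: " + sketch.awaitStarted(1000));
  }
}
```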

@adoroszlai adoroszlai requested a review from szetszwo December 28, 2023 10:00
Contributor

@szetszwo szetszwo left a comment

... have you done some benchmarks to see if this can improve the performance?

@duongkame , have you got a chance to run some benchmarks?

@duongkame
Contributor Author

duongkame commented Jan 3, 2024

... have you done some benchmarks to see if this can improve the performance?

@duongkame , have you got a chance to run some benchmarks?

I didn't run benchmarks on any Ozone scenarios. However, I ran a simple benchmark against each solution, creating about 100M managed objects of which a small fraction are leaked. The new solution consistently finishes in about 33% less time than the old one.

New solution result:

Finish benchmarking. Total time (ms): 38008
Total objects: 100001000
Total leaks found: 1000

Old solution (with finalizer) result:

Finish benchmarking. Total time (ms): 56527
Total objects: 100001000
Total leaks found: 1000

Benchmarking code:

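// Imports assume the package layout at the time of this PR.
import org.apache.hadoop.hdds.utils.db.managed.ManagedRocksObjectMetrics;
import org.apache.hadoop.hdds.utils.db.managed.ManagedWriteBatch;
import org.rocksdb.NativeLibraryLoader;

public class ManagedObjectBenchmarking {
  public static void main(String[] args) throws Exception {
    // Load the RocksDB native library, using /tmp as the extraction directory.
    NativeLibraryLoader.getInstance().loadLibrary("/tmp");
    long start = System.currentTimeMillis();

    // Each round creates 1,000,000 closed objects and 10 leaked ones,
    // i.e. 100,001,000 objects and 1,000 leaks over 100 rounds.
    int rounds = 100;
    for (int i = 0; i < rounds; i++) {
      create(1_000_000, true);
      create(10, false);
      System.gc();
    }

    long duration = System.currentTimeMillis() - start;
    // Give leak detection time to catch up before reading the metrics;
    // this wait is not included in the measured duration.
    Thread.sleep(1000);

    System.out.println("Finish benchmarking. Total time (ms): " + duration);
    System.out.println("Total objects: " + ManagedRocksObjectMetrics.INSTANCE.totalManagedObjects());
    System.out.println("Total leaks found: " + ManagedRocksObjectMetrics.INSTANCE.totalLeakObjects());
  }

  // Creates n managed objects; objects left unclosed are the intended leaks.
  private static void create(int n, boolean close) {
    for (int i = 0; i < n; i++) {
      ManagedWriteBatch object = new ManagedWriteBatch();
      if (close) {
        object.close();
      }
    }
  }
}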

In theory, this change removes the finalizer work that was holding up GC and makes leak detection cheaper, because only unclosed resources are processed (closed ones are never enqueued).
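
The idea that only unclosed resources reach the detector can be sketched with a weak reference and a reference queue. This is a minimal illustration under stated assumptions, not the LeakDetector from this PR: RefQueueLeakSketch, Tracker, track and drain are hypothetical names, and the demo calls Reference.enqueue() to stand in for the garbage collector so that the outcome does not depend on GC timing.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class RefQueueLeakSketch {
  private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
  // Strongly holds the trackers of open resources; the resources
  // themselves are only weakly referenced, so GC is never held back.
  private final Set<Tracker> open = ConcurrentHashMap.newKeySet();

  final class Tracker extends WeakReference<Object> {
    private final String description;

    Tracker(Object resource, String description) {
      super(resource, queue);
      this.description = description;
    }

    /** Called from the resource's close(): nothing is left to report for it. */
    void close() {
      open.remove(this);
      clear();
    }
  }

  Tracker track(Object resource, String description) {
    Tracker tracker = new Tracker(resource, description);
    open.add(tracker);
    return tracker;
  }

  /** Reports every enqueued tracker that was never closed; returns the count. */
  int drain() {
    int leaks = 0;
    Reference<?> ref;
    while ((ref = queue.poll()) != null) {
      if (open.remove(ref)) {
        leaks++;
        System.err.println("Leak detected: " + ((Tracker) ref).description);
      }
    }
    return leaks;
  }

  public static void main(String[] args) {
    RefQueueLeakSketch detector = new RefQueueLeakSketch();
    Object closed = new Object();
    Object leaked = new Object();
    detector.track(closed, "closed resource").close();
    Tracker leak = detector.track(leaked, "leaked resource");
    // A real detector waits for GC; enqueue() only simulates it here.
    leak.enqueue();
    System.out.println("Leaks found: " + detector.drain());
  }
}
```

In this sketch a properly closed resource costs one set removal and never involves the queue, whereas a finalizer-based check runs for every object regardless of whether it was closed.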

Contributor

@szetszwo szetszwo left a comment

+1 the change looks good. Just some very minor comments.

Comment on lines 1 to 22
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

/**
* Contains utilities for resource management.
*/
package org.apache.hadoop.hdds.resource;
Contributor

This file should be removed.

@szetszwo
Contributor

szetszwo commented Jan 3, 2024

... The new solution consistently beats the old one by a 33% gap.

Good to know, thanks!

(Sorry, I thought you would run the same OM performance benchmarks. I understand that a reference queue is a better solution than a finalizer. However, I am not sure whether it would help OM performance.)

@adoroszlai adoroszlai merged commit b13f01c into apache:master Jan 3, 2024
@adoroszlai
Contributor

Thanks @duongkame for the patch, @szetszwo for the review.

adoroszlai pushed a commit to adoroszlai/ozone that referenced this pull request Jan 25, 2024
@duongkame duongkame deleted the HDDS-9528 branch April 12, 2025 00:12