
Conversation

@alexeykudinkin
Contributor

@alexeykudinkin alexeykudinkin commented Sep 28, 2022

Change Logs

Currently, HoodieTable holds a HoodieBackedTableMetadata that is set up not to reuse the actual LogScanner and HFileReader used to read the MT itself.

This has already proven wasteful on a number of occasions.

Changes

  • Eliminated the reuse flag from the HoodieBackedTableMetadata constructor, so that readers are always reused
  • Held HoodieBackedTableMetadata as a Broadcast variable in HoodieTable, so that it is deserialized once per executor (instead of once per task, as is currently the case)
  • Made HoodieBackedTableMetadata a serializable object (still omitting the readers)
  • Restored HoodieSparkKryoRegistrar so that a serializer can be registered for SerializableConfiguration
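The third bullet (a serializable metadata object that omits its readers) can be illustrated with a minimal sketch. The `MetadataHandle` class below is invented for illustration and is not a Hudi class; it only shows the pattern of cheap state traveling with the object while the heavyweight reader is marked `transient` and re-created lazily on the deserializing side.

```java
import java.io.*;

// Hypothetical stand-in for a serializable metadata object whose
// heavyweight readers are excluded from serialization.
class MetadataHandle implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String basePath;            // cheap state travels with the object
    private transient BufferedReader reader;  // reader is NOT serialized

    MetadataHandle(String basePath) {
        this.basePath = basePath;
    }

    // The reader is re-created lazily on the deserializing side (e.g. an executor).
    synchronized BufferedReader getReader() {
        if (reader == null) {
            reader = new BufferedReader(new StringReader("stub for " + basePath));
        }
        return reader;
    }

    // Serialize and deserialize the handle, simulating shipping it to an executor.
    static MetadataHandle roundTrip(MetadataHandle in) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(in);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
            return (MetadataHandle) ois.readObject();
        }
    }
}
```

After the round trip the `transient` reader field is null, and the first `getReader()` call on the executor side rebuilds it from the serialized `basePath`.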

Impact

Risk level: Low

Documentation Update

A documentation update is now required, specifying that the --conf spark.kryo.registrator=org.apache.spark.HoodieSparkKryoRegistrar property is mandatory.
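Assuming a standard spark-submit invocation, the requirement could look like the config fragment below (the application jar name is a placeholder; the spark.serializer line reflects the usual expectation that a Kryo registrator only takes effect when Kryo serialization is enabled):

```shell
spark-submit \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.kryo.registrator=org.apache.spark.HoodieSparkKryoRegistrar \
  my-hudi-app.jar
```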

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@nsivabalan
Contributor

Hey Alexey, I am not sure we can completely remove the reuse flag.
For example, even within the lifespan of a single commit: when the write client starts, the tableMetadata is instantiated; then come the index lookup, the partitioner, and the actual write (by the executors); and in preCommit we do conflict resolution. So even if we have not refreshed the Hoodie timeline so far (and, implicitly, the file readers from the metadata table), we need to refresh it at that point to get the latest data.

Vinoth was involved in designing the reuse flag.

CC @prasannarajaperumal @vinothchandar
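The concern above — that a reused reader can keep serving a stale view after the timeline advances — can be sketched with an invented `CachedLookup` class (not Hudi code). With reuse on, the expensive open happens once and is amortized across lookups, but an explicit refresh hook must still discard the cached reader when the underlying state changes; with reuse off, every lookup pays the open cost.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of the reuse-flag semantics discussed above.
class CachedLookup {
    // Counts expensive reader opens across all instances.
    static final AtomicInteger OPEN_COUNT = new AtomicInteger();

    private final boolean reuse;
    private Object reader;  // stands in for a log scanner / HFile reader

    CachedLookup(boolean reuse) {
        this.reuse = reuse;
    }

    Object lookup() {
        if (reader == null) {
            OPEN_COUNT.incrementAndGet();  // expensive open
            reader = new Object();
        }
        Object result = reader;
        if (!reuse) {
            reader = null;  // reuse=false: discard the reader after every call
        }
        return result;
    }

    // Even with reuse=true, a timeline change must invalidate the cached reader,
    // otherwise lookups keep serving the stale view.
    void refresh() {
        reader = null;
    }
}
```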

@alexeykudinkin
Contributor Author

@hudi-bot run azure

Contributor

@nsivabalan nsivabalan left a comment


This needs to be thoroughly vetted. We have some code paths where we disabled reuse explicitly.

Can we list all callers where reuse is false, and confirm that enabling reuse is the right fix for each of them?

From what I can glean, there are two places.

  1. The metadata table writer is initialized with reuse = false, which makes sense:
  static HoodieTableMetadata create(HoodieEngineContext engineContext, HoodieMetadataConfig metadataConfig, String datasetBasePath,
                                    String spillableMapPath) {
    return create(engineContext, metadataConfig, datasetBasePath, spillableMapPath, false);
  }

But this method has quite a few callers.
[Screenshot (2022-10-07): list of callers of the HoodieTableMetadata.create overload]

Let's sync up sometime and go over the list, to ensure we don't cause a regression via an unintentional revert.

@xushiyan xushiyan added priority:high Significant impact; potential bugs metadata labels Oct 31, 2022
@alexeykudinkin alexeykudinkin added priority:critical Production degraded; pipelines stalled and removed priority:high Significant impact; potential bugs labels Nov 30, 2022
@nsivabalan nsivabalan added the release-0.12.2 Patches targetted for 0.12.2 label Dec 6, 2022
@alexeykudinkin alexeykudinkin removed the release-0.12.2 Patches targetted for 0.12.2 label Dec 6, 2022
@alexeykudinkin alexeykudinkin force-pushed the ak/mt-repars-fix branch 2 times, most recently from 13fb788 to 3d90e88 Compare January 18, 2023 02:10
@alexeykudinkin alexeykudinkin added priority:blocker Production down; release blocker and removed priority:critical Production degraded; pipelines stalled labels Jan 18, 2023

private static final Logger LOG = Logger.getLogger(LeakTrackingFSDataInputStream.class.getName());

private final StackTraceElement[] callSite;
Contributor


This is only for debugging purposes, right?

Contributor Author


Correct
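The pattern under discussion — capturing the call site at construction so a leaked stream can be traced back to wherever it was opened — can be sketched roughly as below. This wrapper is a simplified, hypothetical stand-in, not the actual LeakTrackingFSDataInputStream.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Optional;

// Simplified stand-in for a leak-tracking input stream: it remembers where it
// was opened, so an unclosed instance can be reported for debugging.
class LeakTrackingStream extends FilterInputStream {
    private final StackTraceElement[] callSite;  // captured purely for debugging
    private volatile boolean closed;

    LeakTrackingStream(InputStream in) {
        super(in);
        // Record the stack at construction time; this is the "call site".
        this.callSite = Thread.currentThread().getStackTrace();
    }

    @Override
    public void close() throws IOException {
        closed = true;
        super.close();
    }

    // Returns the recorded call site if the stream was never closed.
    Optional<String> leakReport() {
        if (closed) {
            return Optional.empty();
        }
        StringBuilder sb = new StringBuilder("unclosed stream opened at:\n");
        for (StackTraceElement frame : callSite) {
            sb.append("  at ").append(frame).append('\n');
        }
        return Optional.of(sb.toString());
    }
}
```

Capturing `Thread.currentThread().getStackTrace()` is relatively expensive, which is why this kind of tracking is typically kept behind a debug flag rather than enabled in production paths.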

@alexeykudinkin alexeykudinkin force-pushed the ak/mt-repars-fix branch 3 times, most recently from f8826f0 to 987bcb8 Compare January 19, 2023 01:25
@alexeykudinkin alexeykudinkin changed the title [HUDI-4937] Fix HoodieTable injecting non-reusable HoodieBackedTableMetadata aggressively flushing MT readers [HUDI-4937][Stacked on 7702] Fix HoodieTable injecting non-reusable HoodieBackedTableMetadata aggressively flushing MT readers Jan 19, 2023
@alexeykudinkin alexeykudinkin force-pushed the ak/mt-repars-fix branch 7 times, most recently from 031dc62 to 97af245 Compare January 20, 2023 06:16
}
}

public static <T> HoodieSparkTable<T> create(HoodieWriteConfig config, HoodieEngineContext context) {
Contributor Author


These were moved as-is.

super(config, context, metaClient);
}

@Override
Contributor Author


This has just been moved; no changes (there were other changes, but those were reverted).

}

@Override
protected HoodieIndex getIndex(HoodieWriteConfig config, HoodieEngineContext context) {
Contributor Author


No changes

@hudi-bot
Collaborator

CI report:

Bot commands: @hudi-bot supports the following commands:
  • @hudi-bot run azure — re-run the last Azure build

Contributor

@nsivabalan nsivabalan left a comment


LGTM except for one comment.

protected static final long MAX_MEMORY_SIZE_IN_BYTES = 1024 * 1024 * 1024;
// NOTE: Buffer-size is deliberately set pretty low, since MT internally is relying
// on HFile (serving as persisted binary key-value mapping) to do caching
protected static final int BUFFER_SIZE = 10 * 1024; // 10Kb
Contributor


Did you get a chance to run any benchmarks with this fix?
Even with HFile, if the input-stream buffer size is set to 10 KB while the original HFile block size is 64 KB, we might make several remote calls just to fetch a single HFile block, instead of just one.
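The reviewer's arithmetic can be made concrete with a back-of-the-envelope calculation: at a 64 KB block size and a 10 KB buffer, fetching one block takes ceil(64/10) = 7 buffered reads rather than 1. This is only a rough sketch — actual read patterns depend on the filesystem client and readahead behavior.

```java
// Back-of-the-envelope: number of buffered reads needed to pull one HFile block,
// assuming each read fetches at most one buffer's worth of data.
class ReadAmplification {
    static int readsPerBlock(int blockSizeBytes, int bufferSizeBytes) {
        // ceiling division
        return (blockSizeBytes + bufferSizeBytes - 1) / bufferSizeBytes;
    }
}
```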

@alexeykudinkin alexeykudinkin added priority:critical Production degraded; pipelines stalled and removed priority:blocker Production down; release blocker labels Jan 24, 2023
@alexeykudinkin
Contributor Author

This change is currently blocked on HUDI-3397: now that HoodieTable is going to become a stateful object, we need to make sure its resources are cleaned up appropriately. This in turn requires ensuring that all of the actual write operations (other than being committed to the timeline) complete by the time the WriteClient returns from its operations, which is currently not the case.
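The lifecycle concern can be sketched with try-with-resources: once HoodieTable holds open readers, it effectively becomes an AutoCloseable resource whose close must be guaranteed when the write client is done with it. The class below is illustrative only, not the actual Hudi API.

```java
// Illustrative only: a stateful table-like object whose readers must be
// released deterministically when the owning client finishes.
class StatefulTable implements AutoCloseable {
    private boolean readersOpen = true;  // pretend heavyweight readers are open

    boolean readersOpen() {
        return readersOpen;
    }

    @Override
    public void close() {
        // Release log scanners / HFile readers here.
        readersOpen = false;
    }
}
```

Usage would follow the standard pattern: `try (StatefulTable t = new StatefulTable()) { /* write */ }` guarantees `close()` runs even if the write operation throws, which is exactly the guarantee a stateful HoodieTable would need.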

@parisni
Contributor

parisni commented Jun 28, 2023

Hi @nsivabalan! Is this supposed to be integrated into 0.14? As of 0.13.1, cleaning with the MDT is still incredibly slow in this context:

  • a huge number of partitions (>= 100k)
  • a huge number of affected partitions (>= 10k), OR incremental cleaning does not kick in because there are no files to clean at all

We recently had a cleaning job with the MDT enabled run for 24 hours, whereas after dropping the MDT the job finished in a few minutes.

Thanks

Contributor

@yihua yihua left a comment


The problems addressed by this PR are already resolved on the latest master. Closing this PR.

@yihua yihua closed this Dec 13, 2025
@github-project-automation github-project-automation bot moved this from 🏗 Under discussion to ✅ Done in Hudi PR Support Dec 13, 2025

Labels

priority:critical Production degraded; pipelines stalled size:L PR with lines of changes in (300, 1000]

Projects

Status: ✅ Done

Development


8 participants