EntityTree: only check for entity deletions when necessary #8103

Merged
teh-cmc merged 4 commits into main from cmc/entity_tree_ddos
Nov 12, 2024

@teh-cmc (Member) commented Nov 12, 2024

Before: [screenshot]

After: [screenshot]

Checklist

  • I have read and agree to Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • I have tested the web demo (if applicable):
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG
  • If applicable, add a new check to the release checklist!
  • I have noted any breaking changes to the log API in CHANGELOG.md and the migration guide

To run all checks from main, comment on the PR with @rerun-bot full-check.

@teh-cmc added the 🪳 bug, 📉 performance, 🦟 regression, and include in changelog labels Nov 12, 2024
  self.children.retain(|_, entity| {
      // this is placed first, because we'll only know if the child entity is empty after telling it to clear itself.
-     entity.on_store_deletions(engine, events);
+     entity.on_store_deletions(engine, entity_paths_with_deletions, events);
Member

I'm not totally following how this improves performance.

The inclusion of entity_paths_with_deletions doesn't change the fact that on_store_deletions does an entire tree-walk.

Is the whole point of this optimization to bypass the overhead of the is_empty() call in cases where we know that the intermediate child couldn't have been deleted?

That said, this seems like this will no longer successfully delete intermediate children that only existed as containers for other entities but don't have their own data.

Member Author

> The inclusion of entity_paths_with_deletions doesn't change the fact that on_store_deletions does an entire tree-walk.
>
> Is the whole point of this optimization to bypass the overhead of the is_empty() call in cases where we know that the intermediate child couldn't have been deleted?

Yeah, the tree walk itself is imperceptible (it's just a few thousand recursions in the worst case, barely measurable) -- a few thousand is_empty() calls, on the other hand, are extremely costly.

This PR basically brings the latency down from several seconds (ever increasing) to a constant 10ms (using the benchmark script in the issue).
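
To make the trade-off concrete, here is a minimal, self-contained sketch of the idea, not the actual rerun code. `Store`, `Node`, `check_if_empty` and `demo` are hypothetical names invented for illustration; the only thing taken from the PR is the shape of the change: the walk still visits every node, but the costly emptiness query only runs for paths listed in the set of entity paths with deletions.

```rust
use std::cell::Cell;
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for the store (NOT the real rerun API).
struct Store {
    /// Number of rows of data per entity path.
    rows: BTreeMap<String, usize>,
    /// Counts how often the pretend-expensive query was executed.
    expensive_checks: Cell<usize>,
}

impl Store {
    /// Pretend this scans the store; the counter makes the cost visible.
    fn check_if_empty(&self, path: &str) -> bool {
        self.expensive_checks.set(self.expensive_checks.get() + 1);
        self.rows.get(path).copied().unwrap_or(0) == 0
    }
}

/// Hypothetical tree node, keyed by the last path component.
struct Node {
    path: String,
    children: BTreeMap<String, Node>,
}

impl Node {
    fn new(path: &str) -> Self {
        Node { path: path.to_owned(), children: BTreeMap::new() }
    }

    /// Inserts the (relative) path `parts` below this node.
    fn insert(&mut self, parts: &[&str]) {
        if let Some((first, rest)) = parts.split_first() {
            let child_path = format!("{}/{}", self.path, first);
            self.children
                .entry((*first).to_owned())
                .or_insert_with(|| Node::new(&child_path))
                .insert(rest);
        }
    }

    fn on_store_deletions(&mut self, store: &Store, paths_with_deletions: &BTreeSet<String>) {
        self.children.retain(|_, child| {
            // Recurse first: a child can only become removable after its own subtree was pruned.
            child.on_store_deletions(store, paths_with_deletions);

            // The cheap set lookup gates the expensive query (`&&` short-circuits).
            let remove = paths_with_deletions.contains(&child.path)
                && child.children.is_empty()
                && store.check_if_empty(&child.path);
            !remove
        });
    }
}

/// Returns (number of expensive checks, number of surviving children of `/points`).
fn demo() -> (usize, usize) {
    let mut root = Node::new("");
    let mut rows = BTreeMap::new();
    for i in 0..1000 {
        let name = i.to_string();
        root.insert(&["points", name.as_str()]);
        rows.insert(format!("/points/{}", name), 1usize);
    }
    // Simulate a deletion that emptied exactly one entity.
    rows.insert("/points/7".to_owned(), 0);

    let store = Store { rows, expensive_checks: Cell::new(0) };
    let mut deleted = BTreeSet::new();
    deleted.insert("/points/7".to_owned());

    root.on_store_deletions(&store, &deleted);
    (store.expensive_checks.get(), root.children["points"].children.len())
}

fn main() {
    let (checks, remaining) = demo();
    println!("expensive checks: {}, remaining children: {}", checks, remaining);
}
```

In this toy setup the walk still touches all 1001 nodes, but the expensive query runs once instead of once per node.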

> That said, this seems like this will no longer successfully delete intermediate children that only existed as containers for other entities but don't have their own data.

Haa! I didn't even know that that was the point of this thing. We need to tweak this slightly then.
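
The container concern can be reproduced in a small sketch. Everything here is hypothetical (`Node`, `has_data`, `exact_match`, `touched_by_deletion`, `surviving`), and the second gate is only one possible tweak, not necessarily what was merged: instead of requiring the node's own path to be in the deletion set, it also accepts nodes that are ancestors of a deleted path.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical node; `has_data` stands in for the costly store query.
struct Node {
    path: String,
    has_data: bool,
    children: BTreeMap<String, Node>,
}

/// Decides whether a node is worth checking for removal at all.
type Gate = fn(&str, &BTreeSet<String>) -> bool;

/// Gate 1: only nodes whose own path saw a deletion.
fn exact_match(path: &str, deleted: &BTreeSet<String>) -> bool {
    deleted.contains(path)
}

/// Gate 2 (assumed tweak): the node itself or any of its descendants saw a deletion.
fn touched_by_deletion(path: &str, deleted: &BTreeSet<String>) -> bool {
    deleted.iter().any(|d| {
        d.as_str() == path
            || d.strip_prefix(path).map_or(false, |rest| rest.starts_with('/'))
    })
}

impl Node {
    fn prune(&mut self, deleted: &BTreeSet<String>, gate: Gate) {
        self.children.retain(|_, child| {
            // Children first, so that emptied containers become visible to their parent.
            child.prune(deleted, gate);
            let remove = child.children.is_empty()
                && gate(&child.path, deleted)
                && !child.has_data;
            !remove
        });
    }
}

fn leaf(path: &str, has_data: bool) -> Node {
    Node { path: path.to_owned(), has_data, children: BTreeMap::new() }
}

/// `/world` is a pure container; the data of `/world/points` was just deleted.
fn build() -> Node {
    let mut world = leaf("/world", false);
    world.children.insert("points".to_owned(), leaf("/world/points", false));
    let mut root = leaf("", false);
    root.children.insert("world".to_owned(), world);
    root.children.insert("camera".to_owned(), leaf("/camera", true));
    root
}

/// Returns the names of the root's children that survive pruning with `gate`.
fn surviving(gate: Gate) -> Vec<String> {
    let mut deleted = BTreeSet::new();
    deleted.insert("/world/points".to_owned());
    let mut root = build();
    root.prune(&deleted, gate);
    root.children.keys().cloned().collect()
}

fn main() {
    println!("exact match:  {:?}", surviving(exact_match));
    println!("prefix match: {:?}", surviving(touched_by_deletion));
}
```

With the exact-match gate the empty `/world` container is left behind; with the prefix gate it is pruned together with `/world/points`, while `/camera` is never checked.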

Member

> a few thousand is_empty() calls, on the other hand, are extremely costly

Got it -- that wasn't obvious at all and definitely warrants a comment then. I wonder if is_empty() should be renamed to something like check_if_empty() to better imply there's an active cost to be paid and it's not just accessing some pre-computed state.
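
As an illustration of the naming point, the sketch below contrasts a field read with an active scan. `Chunk`, `EntityData` and both methods are hypothetical and unrelated to the real store types; the sketch only assumes that emptiness can either be answered from state maintained on every write or recomputed by walking the data.

```rust
/// Hypothetical batch of rows belonging to one entity.
struct Chunk {
    num_rows: usize,
}

/// Hypothetical per-entity storage.
struct EntityData {
    chunks: Vec<Chunk>,
    /// Kept up to date on every insert/removal.
    total_rows: usize,
}

impl EntityData {
    fn new() -> Self {
        EntityData { chunks: Vec::new(), total_rows: 0 }
    }

    fn insert(&mut self, chunk: Chunk) {
        self.total_rows += chunk.num_rows;
        self.chunks.push(chunk);
    }

    fn remove_last(&mut self) -> Option<Chunk> {
        let chunk = self.chunks.pop()?;
        self.total_rows -= chunk.num_rows;
        Some(chunk)
    }

    /// O(1): reads pre-computed state, which is what the name `is_empty` usually suggests.
    fn is_empty(&self) -> bool {
        self.total_rows == 0
    }

    /// O(number of chunks): walks the data; the name hints that work is being done.
    fn check_if_empty(&self) -> bool {
        self.chunks.iter().all(|chunk| chunk.num_rows == 0)
    }
}

fn main() {
    let mut data = EntityData::new();
    data.insert(Chunk { num_rows: 3 });
    println!("{} {}", data.is_empty(), data.check_if_empty());
    data.remove_last();
    println!("{} {}", data.is_empty(), data.check_if_empty());
}
```

Both methods return the same answer; the difference is only in what the caller is led to expect about cost.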

@teh-cmc marked this pull request as draft November 12, 2024 17:19
@teh-cmc marked this pull request as ready for review November 12, 2024 17:54
@teh-cmc merged commit f9eb660 into main Nov 12, 2024
@teh-cmc deleted the cmc/entity_tree_ddos branch November 12, 2024 18:14

Development

Successfully merging this pull request may close these issues.

Chunk ingestion performance regression because of compaction logic

2 participants