Skip to content

Conversation

@shanecelis
Copy link
Contributor

@shanecelis shanecelis commented Oct 21, 2025

Objective

Fix the stackoverflow on asset reload when asset contains its own path as a dependency.

Problem

There is a way to create a circular dependency graph with an asset loader. I don't know how I managed to do it. I have tried to create a minimal example, but it has not exhibited the error yet. But I do have a means of exhibiting the error with my project Nano-9, reproduction details below.

Solution

This commit has two fixes: one at the insertion point (introducing self-reference), and one at the recursion point (following self-reference).

Insertion point

Issue warning when an asset wants to mark itself as a dependency and do not allow inserting itself as a dependent.

Recursion point

Check for self loops. Warn on self detection and do not loop.

It's likely that if you fix it at the insertion point, you don't need to worry about it at the recursion point. You will have stopped the cause of the issue. I left both in for transparency about where the issue lies so far as I could see.

Testing

I can reproduce this error with an example from my Nano-9 project. I wish it were a minimal example. It's not, but I have put it on a branch to isolate this issue. It uses my Bevy fork that is v0.16.1 plus a commit tagged v0.16.1b, which was required for it to build. This PR is a cherry pick of the fix commit against Bevy's main branch.

I can reproduce this error by doing the following:

git clone -b bevy-asset-stackoverflow https://github.com/shanecelis/nano-9.git
cd nano-9
cargo run --example sprite --features watcher --no-default-features &
touch assets/BirdSprite.png

Here is an excerpt of the crash report on macOS 15.6.1, M4 Max:

...
Thread 0 Crashed:: main Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib        	       0x18cc2a388 __pthread_kill + 8
1   libsystem_pthread.dylib       	       0x18cc6388c pthread_kill + 296
2   libsystem_c.dylib             	       0x18cb6ca3c abort + 124
3   sprite                        	       0x10598d09c std::sys::pal::unix::abort_internal::h1edcc850f5dec78e + 12
4   sprite                        	       0x10598c5b0 std::process::abort::hffd6db68ff0662a6 + 12
5   sprite                        	       0x10580d464 std::sys::pal::unix::stack_overflow::imp::signal_handler::h7b8eae417c5ee98d + 604
6   libsystem_platform.dylib      	       0x18cc9d6a4 _sigtramp + 56
7   sprite                        	       0x10573f890 _$LT$std..path..Path$u20$as$u20$core..hash..Hash$GT$::hash::h732e05949b6a170e + 136
8   sprite                        	       0x10573f890 _$LT$std..path..Path$u20$as$u20$core..hash..Hash$GT$::hash::h732e05949b6a170e + 136
9   sprite                        	       0x104d5f278 _$LT$atomicow..CowArc$LT$T$GT$$u20$as$u20$core..hash..Hash$GT$::hash::h6c45383281764a05 + 40
10  sprite                        	       0x104ed62d8 _$LT$bevy_asset..path..AssetPath$u20$as$u20$core..hash..Hash$GT$::hash::h11b348528182d76d + 52
11  sprite                        	       0x104ded598 hashbrown::map::make_hash::hb7812997186aa817 + 56
12  sprite                        	       0x104de8eec hashbrown::map::HashMap$LT$K$C$V$C$S$C$A$GT$::insert::h3a8b107dcf9615e9 + 64
13  sprite                        	       0x104da7264 hashbrown::set::HashSet$LT$T$C$S$C$A$GT$::insert::h44e988e7a7752688 + 24
14  sprite                        	       0x104e7b15c bevy_platform::collections::hash_set::HashSet$LT$T$C$S$GT$::insert::hb1d23d506d548fcc + 24
15  sprite                        	       0x104dba770 bevy_asset::server::handle_internal_asset_events::_$u7b$$u7b$closure$u7d$$u7d$::queue_ancestors::hd1670687bb18ae9f + 216
16  sprite                        	       0x104dba780 bevy_asset::server::handle_internal_asset_events::_$u7b$$u7b$closure$u7d$$u7d$::queue_ancestors::hd1670687bb18ae9f + 232
17  sprite                        	       0x104dba780 bevy_asset::server::handle_internal_asset_events::_$u7b$$u7b$closure$u7d$$u7d$::queue_ancestors::hd1670687bb18ae9f + 232
18  sprite                        	       0x104dba780 bevy_asset::server::handle_internal_asset_events::_$u7b$$u7b$closure$u7d$$u7d$::queue_ancestors::hd1670687bb18ae9f + 232
19  sprite                        	       0x104dba780 bevy_asset::server::handle_internal_asset_events::_$u7b$$u7b$closure$u7d$$u7d$::queue_ancestors::hd1670687bb18ae9f + 232
...

Exercising the fix

Alter Nano-9's Cargo file to use the fix branch:

-bevy = { git = "https://github.com/shanecelis/bevy.git", tag = "v0.16.1b" }
+bevy = { git = "https://github.com/shanecelis/bevy.git", branch = "fix/asset-reload-overflow" }

Run the same command again, and you will see a warning:

2025-10-21T04:10:52.348726Z  WARN bevy_asset::server::info: Asset 'BirdSprite.png' wants to treat itself as a dependency

Alternative Solution

This commit fixes the immediate stackoverflow issue; however, it would be even better if the user were prevented from making a circular dependency graph in the first place.

@alice-i-cecile alice-i-cecile added C-Bug An unexpected or incorrect behavior A-Assets Load files from disk to use for things like images, models, and sounds S-Waiting-on-Author The author needs to make changes or address concerns before this can be merged labels Oct 21, 2025
@alice-i-cecile
Copy link
Member

This seems like a reasonable fix but I'd really like to add a test along with this PR.

Copy link
Contributor

@andriyDev andriyDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also seems sensible to me, but I would also like to block on a test for this.

@shanecelis
Copy link
Contributor Author

Understood. Are there any asset-loader tests that I might emulate?

@andriyDev
Copy link
Contributor

Pretty much all the tests in crates/bevy_asset/src/lib.rs are sensible tests to emulate. Here's a test for nested immediate loading

fn error_on_nested_immediate_load_of_subasset() {
which could help.

@shanecelis
Copy link
Contributor Author

Thank you for the test reference. They've been very helpful.

Coming up with a test has been tricky but instructive. I finally figured out how to provoke the error. If I use an immediate load in an asset loader that tries to load its own asset path, it will cause a stack overflow, which is the behavior I originally saw.

A asset loader with a self-path deferred load's does not provoke an error or overflow, but it does behave differently than a normal asset. It will emit asset Added and Modified but there won't be any LoadedWithDependencies event.

I'll try to get these tests in with some better fixes tomorrow night.

@shanecelis shanecelis force-pushed the fix/asset-reload-overflow-v0.17 branch from 90a37d4 to 36c5537 Compare October 24, 2025 04:37
Copy link
Contributor

@andriyDev andriyDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems good to me! Thank you!

@andriyDev
Copy link
Contributor

Might need a rebase after #21626.

@alice-i-cecile
Copy link
Member

@shanecelis let me know when CI is green and I'll do a proper review for you.

@shanecelis shanecelis marked this pull request as draft October 24, 2025 06:05
@shanecelis
Copy link
Contributor Author

Thanks. It looks better. In the tests I've tried to exercise asset loader that loads its own path in deferred, immediate, and unknown type, but I fear I've made it too restrictive now. If you exercise the tests with only the first commit, you'll see a stackoverflow error. I'm not certain it's the same stackoverflow I observed in the wild.

I'm marking this as a draft for the moment because I'm running into an issue when I use it in Nano-9. I have an asset loader for a SpriteSheet that loads an image path. In the indexed image case, it reads the PNG directly from the byte stream. In another case, it does an immediate load::<Image>() on that same path. This is would work previously but on reload it provoked a stackoverflow. With this change, that immediate load is prevented and a LoadDirectError is returned for loading the same path. I prefer failing fast to stackoverflow later, but I feel like I could make this avoid the reload error without restricting some useful behavior.

I think if a path is being loaded from a different AssetLoader or different Asset type, then we maybe should permit it. I'm not sure I can assess that where I need to yet. Perhaps it's premature to disallow same path loads except where they provoke serious errors.

Let me tinker on this a little longer.

@shanecelis shanecelis force-pushed the fix/asset-reload-overflow-v0.17 branch 3 times, most recently from 83df589 to ab71ad1 Compare October 27, 2025 03:01
@shanecelis
Copy link
Contributor Author

@alice-i-cecile, this PR is now ready. I added a test to cover the read_asset_bytes(). I had originally thought read_asset_bytes() ought to error with the AssetDependentOnSelf similarly to an immediate load, but after exercising the test, it seems reasonable to permit self-path byte loads so long as sure we ensure the self-path is not inserted into the loader_dependencies.

I left an assert! at my original fix point, which is where the stackoverflow for reloading was first exhibited. Perhaps it could be a debug_assert!. If it's triggered, I'd think of that as an internal error where we somehow polluted our dependency list with ourselves that this PR is trying to guard against happening in general. But an assert panic seems much better to deal with than a stackoverflow.

I'm not certain AssetDependentOnSelf is the best name. Perhaps another contender is AssetLoadedSelfPath. That is only emitted by the immediate loaders.

The deferred loaders will work with self-paths and be given the same handle. This PR merely emit a warning when this is detected:

Asset from path {} loaded the same path as a dependent, ignoring.

I'm not entirely convinced we ought to emit this warning since it works and we're now avoiding the reload bug. Also the message could be clearer. What's it ignoring? (Adding itself as a dependent.) This depends mostly on whether and how strongly you want to discourage self-path loads.

@shanecelis shanecelis marked this pull request as ready for review October 27, 2025 03:19
@shanecelis shanecelis requested a review from andriyDev October 27, 2025 03:22
@shanecelis
Copy link
Contributor Author

I took my own suggestions from my last comments and applied them thusly:

  • Removed struct AssetDependentOnSelf.
  • Added direct enum variant LoadSelfPath to LoadDirectError.
  • Changed warn!s in deferred loaders to debug!s with this message:

Asset from path {:?} loaded its self path

I changed the error name from being about the negative effect of polluting dependencies that caused the stackoverflow, which are now avoided in this PR, to being about the behavior—a load of self-path—that causes the error.

Copy link
Contributor

@andriyDev andriyDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that I think is missing here is that it only fixes the "shallow" case. If instead we have something like a load chain like root -> intermediate -> root, suddenly we're back to the stack overflow right?

@shanecelis
Copy link
Contributor Author

Correct.

@andriyDev
Copy link
Contributor

I'm fine to merge this as an easy fix without dealing with the "deep" case, but can you file an issue to track the deep case and then add a TODO on the loading checks to indicate we need a more comprehensive fix?

@shanecelis
Copy link
Contributor Author

Will file an issue. I changed the extension in the tests to '.rsp' (Recursive Self Path). I changed the error name from LoadSelfPath to RequestedSelfPath, which is a better description and is in line with the RequestedSubAsset variant, and I strived to make the error message clearer too:

The asset at path {0:?} requested to immediately load itself recursively, but this is not supported

Copy link
Contributor

@andriyDev andriyDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So sorry for the delay. I should have gotten to this sooner! I think it's really close now though. Just need to fixup that unknown test.

@shanecelis
Copy link
Contributor Author

Thanks, @andriyDev! The tests are working now.

@shanecelis
Copy link
Contributor Author

Looks like I spoke too soon. I don't understand how the CI tests are failing. They complain that LoadContext doesn't have the method asset_path(), but that's public and not conditional as far as I can tell. Any idea what's wrong? Should I rebase, @andriyDev?

@shanecelis shanecelis force-pushed the fix/asset-reload-overflow-v0.17 branch from de0c07b to 3881c8c Compare November 6, 2025 04:56
@shanecelis
Copy link
Contributor Author

I rebased, updated, and this is mostly ready to go. However, my last two commits basically made a platform-specific carve out for windows on the no_error_on_unknown_type_deferred_load_of_self_path test. On windows it seems I can request a self-path, wait for it to be in state Loaded, but then I can't retrieve the asset from the handle, which I didn't think was possible. Currently I've made the test error if it ever works on Windows.

@alice-i-cecile alice-i-cecile added S-Needs-Review Needs reviewer attention (from anyone!) to move forward and removed S-Waiting-on-Author The author needs to make changes or address concerns before this can be merged labels Nov 6, 2025
@shanecelis shanecelis requested a review from andriyDev November 14, 2025 20:22
Copy link
Contributor

@andriyDev andriyDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last change. Sorry I missed this!

@alice-i-cecile alice-i-cecile added S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Dec 14, 2025
@alice-i-cecile alice-i-cecile added this pull request to the merge queue Dec 14, 2025
Merged via the queue into bevyengine:main with commit ac7ee14 Dec 14, 2025
38 checks passed
andriyDev added a commit to andriyDev/bevy that referenced this pull request Dec 16, 2025
andriyDev added a commit to andriyDev/bevy that referenced this pull request Dec 16, 2025
github-merge-queue bot pushed a commit that referenced this pull request Dec 16, 2025
This reverts commit ac7ee14.

# Objective

- Our CI has been much more flaky with this PR.

## Solution

- Temporarily revert this PR while we investigate the flakes.

## Testing

- Running the asset tests on repeat fairly quickly results in a deadlock
with this PR. Without it is **much much** less likely.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-Assets Load files from disk to use for things like images, models, and sounds C-Bug An unexpected or incorrect behavior S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants