-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Refactor internals of table initialization and management #11416
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor internals of table initialization and management #11416
Conversation
This commit refactors the evaluation of constant expressions during instantiation for example to use `Val` instead of `ValRaw`. Previously the usage of `ValRaw` meant that wasm was disallowed from performing a GC during const evaluation, but currently constant expressions can indeed perform a GC. The goal of this commit is to lift this limitation. This is expected to be a minor slowdown for modules that hit this path, but most modules shouldn't hit this in a hot loop since LLVM doesn't generate modules that use this branch of const eval. The usage of `Val` brings a number of benefits and refactorings associated with it: * Const-evaluation is generally safer than before since everything is higher-level. * GC types in const-eval were almost already using `Val` meaning that there's actually fewer conversions now. * Instantiation code was refactored to use `wasmtime::*`-API types instead of low-level VM types. This deduplicates a good deal and lifts complexity out of the raw VM bits. Another issue that this commit fixes is to change how table initialization is modeled internally in `vm::Instance::table_init_segment`. Previously this was done by removing the tables from an instance to get a split borrow into the store and the table. This is not valid though because if, during initialization, a GC is performed then the table is not present to find roots through the table. This function is refactored to scope borrows to within a loop instead of over a loop via various refactorings and such and usage of higher level APIs. This is again, like above, expected to pessimize performance but this is also not known to be a hot path for modules at this time.
This commit is a refactoring of how tables work within Wasmtime to avoid
funneling table elements through a `TableElement` enum internally.
Instead methods are "exploded" to `grow_{gc_ref,func,cont}` which means,
for example, funcrefs don't need a GcStore. The main motivation for this
change was to avoid the idiom where `TableElement` represents a cloned,
but unrooted, GC reference.
Prior to this commit there were a number of subtle bugs in the table
code for Wasmtime where write barriers were forgotten on `table.init`,
`table.set` (via the embedder API), and `table.grow`. While `table.fill`
correctly handled the GC references it was awkward to get everything
else working consistently so I opted to remove `TableElement` entirely
to make it more clear that `&VMGcRef` is ubiquitously used meaning that
the write barriers, for example, are the same as other parts of the
Wasmtime codebase.
This has a few extra tests for "make sure this doesn't leak" to ensure
that GC works correctly with new barriers in place.
Subscribe to Label Actioncc @fitzgen
This issue or pull request has been labeled: "wasmtime:api", "wasmtime:ref-types"
Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but withholding approval until we figure out the perf regression, since I expect that I will probably need to give this another once-over after resolving that
| // FIXME: `grow_gc_ref` shouldn't require the whole store | ||
| // and should require `AutoAssertNoGc`. For now though we | ||
| // know that table growth doesn't trigger GC so it should be | ||
| // ok to create a copy of the GC reference even though it's | ||
| // not tracked anywhere. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we make grow_gc_ref take an AutoAssertNoGc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That can't work right now as growth requires the whole dyn VMStore to perform a fiber suspension but AutoAssertNoGc only requires the StoreOpaque. Soon though...
| Val::AnyRef(None) | ||
| } | ||
|
|
||
| pub(crate) const fn null_top(top: WasmHeapTopType) -> Val { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside: we should probably add a wasmtime::HeapTopType because the environ version is so useful all over the place and it is really nice to exhaustive match on top types. Then we could expose this helper to embedders too, which seems like it would be useful.
copying a message I sent Alex in private chat here for posterity:
|
|
@fitzgen ok with the most recent commits I've clawed back all of the performance and then some, so should be good to go. This makes me want to generate a Cranelift-compiled function for wasm instance initialization really. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
…liance#11416) * Refactor const-eval to use `Val`, not `ValRaw` This commit refactors the evaluation of constant expressions during instantiation for example to use `Val` instead of `ValRaw`. Previously the usage of `ValRaw` meant that wasm was disallowed from performing a GC during const evaluation, but currently constant expressions can indeed perform a GC. The goal of this commit is to lift this limitation. This is expected to be a minor slowdown for modules that hit this path, but most modules shouldn't hit this in a hot loop since LLVM doesn't generate modules that use this branch of const eval. The usage of `Val` brings a number of benefits and refactorings associated with it: * Const-evaluation is generally safer than before since everything is higher-level. * GC types in const-eval were almost already using `Val` meaning that there's actually fewer conversions now. * Instantiation code was refactored to use `wasmtime::*`-API types instead of low-level VM types. This deduplicates a good deal and lifts complexity out of the raw VM bits. Another issue that this commit fixes is to change how table initialization is modeled internally in `vm::Instance::table_init_segment`. Previously this was done by removing the tables from an instance to get a split borrow into the store and the table. This is not valid though because if, during initialization, a GC is performed then the table is not present to find roots through the table. This function is refactored to scope borrows to within a loop instead of over a loop via various refactorings and such and usage of higher level APIs. This is again, like above, expected to pessimize performance but this is also not known to be a hot path for modules at this time. * Remove the `TableElement` type This commit is a refactoring of how tables work within Wasmtime to avoid funneling table elements through a `TableElement` enum internally. Instead methods are "exploded" to `grow_{gc_ref,func,cont}` which means, for example, funcrefs don't need a GcStore. The main motivation for this change was to avoid the idiom where `TableElement` represents a cloned, but unrooted, GC reference. Prior to this commit there were a number of subtle bugs in the table code for Wasmtime where write barriers were forgotten on `table.init`, `table.set` (via the embedder API), and `table.grow`. While `table.fill` correctly handled the GC references it was awkward to get everything else working consistently so I opted to remove `TableElement` entirely to make it more clear that `&VMGcRef` is ubiquitously used meaning that the write barriers, for example, are the same as other parts of the Wasmtime codebase. This has a few extra tests for "make sure this doesn't leak" to ensure that GC works correctly with new barriers in place. * Fix some lints and warnings * Fix wmemcheck build * Review comments * Optimize const eval and global initialization * Fix compile * Fix lints

This is a pair of commit inspired by semi-related work in Wasmtime to make progress on #11262. I got stuck in one particular area and as I stared more at the code I realized there were deeper issues that needed fixing before I start tackling #11262. Namely this PR fixes bugs such as:
table.init, meaning that overwritten values were leaked.table.setvia the embedder API, meaning overwritten values were leaked.table.growwas not properly dropped via the embedder API, meaning that whatever reference that was passed in was leaked.table.initbecause an instances tables were temporarily removed to help work within the borrow checker.All of the above have now been addressed with a set of new tests that assert no leaks and which trigger GC-during-const-eval. More details can be found in the commit messages but my hope with this refactoring is to close out this class of issues or make it more obvious when they show up in the future. Some algorithms were pessimized here by having to repeatedly re-access things from the store, but it's expected that these are not performance hot-spots and feature-complete-ness and correctness is more paramount.