aarch64: Add support for "near" in LoadExtName
#11570
The current aarch64 backend does not support using `symbol_value` with a
"near" symbol, for example to get the address of a function through a
relative relocation. It currently uses an `Abs8` relocation, which
makes it unsuitable for use in Wasmtime, for example.
This commit refactors relocation/external name support in the aarch64
backend to support this mode of relocation. The previous `LoadExtName`
was split into `LoadExtName{Got,Near,Far}` where the "near" bit is
what's new to the backend. The preexisting `symbol-value.clif`-style
tests were updated to match the x64 backend which has a more
comprehensive suite of examples of what it looks like to refer to
various symbols.
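As a rough illustration of the three-way split described above, here is a minimal sketch. Only the `Got`/`Near`/`Far` names come from this description; the `LoadExtNameKind` enum, the `select_load_ext_name` function, its `colocated` and `pic` parameters, and the order in which the cases are checked are all assumptions made for illustration and may not match the backend's actual lowering rules.

```rust
/// How a symbol's address is materialized into a register.
/// Variant names follow the PR description; everything else is illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LoadExtNameKind {
    /// Load the address from a GOT entry.
    Got,
    /// Compute the address relative to the referencing code, so no
    /// absolute relocation is needed.
    Near,
    /// Load a full 8-byte absolute address (an `Abs8`-style relocation).
    Far,
}

/// Hypothetical selection logic (an assumption, not the backend's code):
/// `colocated` means the symbol is known to live near the referencing code,
/// `pic` means position-independent code was requested.
fn select_load_ext_name(colocated: bool, pic: bool) -> LoadExtNameKind {
    if colocated {
        LoadExtNameKind::Near
    } else if pic {
        LoadExtNameKind::Got
    } else {
        LoadExtNameKind::Far
    }
}
```

In this sketch a colocated symbol never needs an absolute address, which is the property the "near" mode is meant to provide.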
The goal of this commit is to enable Wasmtime to generate code which
refers to a relative point elsewhere in the code (e.g. an exception
handler) and load the value into a register. This part isn't filled out
yet, but it seemed good to at least in the meantime fill out these
missing relocations in the backend.
cfallin
left a comment
Thanks! This definitely makes sense to have.
; VCode:
;   stp fp, lr, [sp, #-16]!
;   mov fp, sp
Sort of curious that the leaf-function optimization is no longer happening here, and we're getting a frame now -- won't matter for Wasmtime since we always force frames, but is there a reason you're aware of that this is happening?
On looking a bit further, it seems that Function::is_leaf only looks to see if there are signatures -- so func_addr triggers the "not a leaf function" mode and forces the frame.
Mind filing an issue that we should probably determine leaf-ness by scanning VCode instead for any instructions that claim to be calls? It should be machine-dependent anyway since libcalls can happen even in functions without IR-level calls. (I suppose that is_leaf and the no-frame ABI optimization isn't even right, in that case -- but we get away with it because the aarch64 backend doesn't fall back to any libcalls for FP stuff, and also because this configuration isn't exposed to Wasmtime?)
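A minimal sketch of the suggestion above, determining leaf-ness from the lowered instructions rather than from IR-level signatures. The `MachInst` enum and its variants are hypothetical stand-ins, not Cranelift's actual VCode types.

```rust
/// Toy stand-in for a backend's machine-instruction type (hypothetical).
#[derive(Debug, Clone, PartialEq, Eq)]
enum MachInst {
    Mov,
    Add,
    /// Materializes a function's address without transferring control,
    /// as `func_addr` would.
    LoadFuncAddr,
    Call,
    CallIndirect,
    Ret,
}

impl MachInst {
    /// Whether this instruction claims to be a call.
    fn is_call(&self) -> bool {
        matches!(self, MachInst::Call | MachInst::CallIndirect)
    }
}

/// A function is a leaf if none of its lowered instructions is a call.
fn is_leaf(insts: &[MachInst]) -> bool {
    !insts.iter().any(|inst| inst.is_call())
}
```

Under this scheme a function that only takes a function's address stays a leaf, while a libcall emitted during lowering would correctly mark the function as a non-leaf.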
Sure thing! #11573
And yeah AFAIK we're ok in Wasmtime, but I'm not 100% sure in that judgment. Your reasoning sounds reasonable to me, however.
This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:machinst", "cranelift:area:x64", "isle"
Needed for filetests
10d2cbc
This is the same as bytecodealliance#11570 but for the riscv64 backend. The intention is to support "near" relocations which don't require `Abs8` relocations for upcoming use in Wasmtime. The same design as bytecodealliance#11570 is used here.
Since Wasmtime's inception it's used the `setjmp` and `longjmp`
functions in C to implement handling of traps. While this solution was
easy to implement, relatively portable, and performant enough, there are
a number of downsides that have evolved over time to make this an
unattractive approach in the long run:
* Using `setjmp` fundamentally requires using C because Rust does not
understand a function that returns twice. It's fundamentally unsound
to invoke `setjmp` in Rust meaning that Wasmtime has forever needed a
C compiler configured and set up to build. This notably means that
`cargo check` cannot check other targets easily.
* Using `longjmp` means that Rust function frames are unwound on the
stack without running destructors. This is a dangerous operation that
the compiler gives us no protection against. Frames entering wasm and
frames exiting wasm are all skipped. Absolutely
minimizing this has been beneficial for portability to platforms such
as Pulley.
* Currently the no_std implementation of Wasmtime requires embedders to
provide `wasmtime_{setjmp,longjmp}` which is a thorn in the side of
what is otherwise an almost entirely independent implementation of
Wasmtime.
* There is a performance floor to using `setjmp` and `longjmp`. Calling
`setjmp` requires using C but Wasmtime is otherwise written in Rust
meaning that there's a Rust->C->Rust->Wasm boundary which
fundamentally can't be inlined without cross-language LTO which is
difficult to configure.
* With the implementation of the WebAssembly exceptions proposal
Wasmtime now has two means of unwinding the stack. Ideally Wasmtime
would only have one, and the more general one is the method of
exceptions.
* Jumping out of a signal handler on Unix is tricky business. While
we've made it work it's generally most robust if the signal handler
simply returns, which it now does.
With all of that in mind the purpose of this commit is to replace the
setjmp/longjmp mechanism of handling traps with the recently implemented
support for exceptions in Cranelift. That is intended to resolve all of
the above points in one fell swoop.
One point in particular though that's nice about setjmp/longjmp is that
unwinding the stack on a trap is an O(1) operation. For situations such
as stack overflow that's a particularly nice property to have as we can
guarantee embedders that traps are a constant time (albeit somewhat
expensive with signals) operation. Exceptions naively require unwinding
the entire stack, and although frame pointers mean we're just traversing
a linked list I wanted to preserve the O(1) property here nonetheless.
To achieve this a solution is implemented where the array-to-wasm
(host-to-wasm) trampolines set up state in `VMStoreContext` so looking up
the current trap handler frame is an O(1) operation. Namely the sp/fp/pc
values for a `Handler` are stored inline.
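The O(1) lookup described above can be sketched roughly as follows. `Handler` and the sp/fp/pc triple are named in this message; the `StoreContext` type, its `current_handler` field, and the `enter_wasm`/`exit_wasm`/`trap_handler` methods are hypothetical stand-ins, and the real `VMStoreContext` layout is not shown here.

```rust
/// Register state needed to resume at a trap handler.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Handler {
    sp: usize,
    fp: usize,
    pc: usize,
}

/// Hypothetical stand-in for the per-store context.
#[derive(Debug, Default)]
struct StoreContext {
    /// Handler of the innermost host-to-wasm entry, stored inline.
    current_handler: Option<Handler>,
}

impl StoreContext {
    /// Called by the host-to-wasm trampoline on entry. Records the new
    /// handler and returns the previous one so it can be restored on exit.
    fn enter_wasm(&mut self, handler: Handler) -> Option<Handler> {
        self.current_handler.replace(handler)
    }

    /// Called when the host-to-wasm call returns.
    fn exit_wasm(&mut self, previous: Option<Handler>) {
        self.current_handler = previous;
    }

    /// Constant-time lookup: no stack walk, just a field read.
    fn trap_handler(&self) -> Option<Handler> {
        self.current_handler
    }
}
```

Because each entry saves the previous handler and restores it on exit, nested host-to-wasm calls behave like a stack while the lookup at trap time remains a single field read.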
Implementing this feature required supporting
relocations-to-offsets-in-functions which was not previously supported
by Wasmtime. This required Cranelift refactorings such as bytecodealliance#11570, bytecodealliance#11585,
and bytecodealliance#11576. This then additionally required some more refactoring in
this commit which was difficult to split out as it otherwise wouldn't be
tested.
Apart from the relocation-related business much of this change is about
updating the platform signal handlers to use exceptions instead of
longjmp to return. For example on Unix this means updating the
`ucontext_t` with register values that the handler specifies. Windows
involves updating similar contexts, and macOS mach ports ended up not
needing too many changes.
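The idea of returning from the signal handler into the trap handler, instead of calling `longjmp`, can be modeled as below. This is only a simplified sketch: `SavedRegisters` is a hypothetical stand-in for the platform-specific register state inside `ucontext_t` (or the Windows equivalent), and `TrapHandler` and `redirect_to_handler` are illustrative names, not Wasmtime's API.

```rust
/// Simplified model of the thread's saved registers that the OS hands to a
/// signal handler. The real `ucontext_t` layout is platform-specific.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SavedRegisters {
    pc: usize,
    sp: usize,
    fp: usize,
}

/// Where execution should resume after the trap (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TrapHandler {
    pc: usize,
    sp: usize,
    fp: usize,
}

/// Overwrite the saved registers so that, when the signal handler returns
/// normally, the OS resumes the thread at the trap handler.
fn redirect_to_handler(regs: &mut SavedRegisters, handler: &TrapHandler) {
    regs.pc = handler.pc;
    regs.sp = handler.sp;
    regs.fp = handler.fp;
}
```

The signal handler itself simply returns after this rewrite, which is the behavior described above as more robust than jumping out of it.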
In terms of overall performance the relevant benchmark from this
repository, compared to before this commit, is:
sync/no-hook/core - host-to-wasm - typed - nop
time: [10.552 ns 10.561 ns 10.571 ns]
change: [−7.5238% −7.4011% −7.2786%] (p = 0.00 < 0.05)
Performance has improved.
Closes bytecodealliance#3927
cc bytecodealliance#10923
prtest:full
This is the same as bytecodealliance#11570 but for the riscv64 backend. The intention is to support "near" relocations which don't require `Abs8` relocations for upcoming use in Wasmtime. The same design as bytecodealliance#11570 is used here.
This is the same as bytecodealliance#11570 but for the riscv64 backend. The intention is to support "near" relocations which don't require `Abs8` relocations for upcoming use in Wasmtime. The same design as bytecodealliance#11570 is used here.
Since Wasmtime's inception it's used the `setjmp` and `longjmp`
functions in C to implement handling of traps. While this solution was
easy to implement, relatively portable, and performant enough, there are
a number of downsides that have evolved over time to make this an
unattractive approach in the long run:
* Using `setjmp` fundamentally requires using C because Rust does not
understand a function that returns twice. It's fundamentally unsound
to invoke `setjmp` in Rust meaning that Wasmtime has forever needed a
C compiler configured and set up to build. This notably means that
`cargo check` cannot check other targets easily.
* Using `longjmp` means that Rust function frames are unwound on the
stack without running destructors. This is a dangerous operation of
which we get no protection from the compiler about. Both frames
entering wasm and frames exiting wasm are all skipped. Absolutely
minimizing this has been beneficial for portability to platforms such
as Pulley.
* Currently the no_std implementation of Wasmtime requires embedders to
provide `wasmtime_{setjmp,longjmp}` which is a thorn in the side of
what is otherwise a mostly entirely independent implementation of
Wasmtime.
* There is a performance floor to using `setjmp` and `longjmp`. Calling
`setjmp` requires using C but Wasmtime is otherwise written in Rust
meaning that there's a Rust->C->Rust->Wasm boundary which
fundamentally can't be inlined without cross-language LTO which is
difficult to configure.
* With the implementation of the WebAssembly exceptions proposal
Wasmtime now has two means of unwinding the stack. Ideally Wasmtime
would only have one, and the more general one is the method of
exceptions.
* Jumping out of a signal handler on Unix is tricky business. While
we've made it work it's generally most robust of the signal handler
simply returns which it now does.
With all of that in mind the purpose of this commit is to replace the
setjmp/longjmp mechanism of handling traps with the recently implemented
support for exceptions in Cranelift. That is intended to resolve all of
the above points in one swoop.
One point in particular though that's nice about setjmp/longjmp is that
unwinding the stack on a trap is an O(1) operation. For situations such
as stack overflow that's a particularly nice property to have as we can
guarantee embedders that traps are a constant time (albeit somewhat
expensive with signals) operation. Exceptions naively require unwinding
the entire stack, and although frame pointers mean we're just traversing
a linked list I wanted to preserve the O(1) property here nonetheless.
To achieve this a solution is implemented where the array-to-wasm
(host-to-wasm) trampolines setup state in `VMStoreContext` so looking up
the current trap handler frame is an O(1) operation. Namely the sp/fp/pc
values for a `Handler` are stored inline.
Implementing this feature required supporting
relocations-to-offsets-in-functions which was not previously supported
by Wasmtime. This required Cranelift refactorings such as bytecodealliance#11570, bytecodealliance#11585,
and bytecodealliance#11576. This then additionally required some more refactoring in
this commit which was difficult to split out as it otherwise wouldn't be
tested.
Apart from the relocation-related business much of this change is about
updating the platform signal handlers to use exceptions instead of
longjmp to return. For example on Unix this means updating the
`ucontext_t` with register values that the handler specifies. Windows
involves updating similar contexts, and macOS mach ports ended up not
needing too many changes.
In terms of overall performance the relevant benchmark from this
repository, compared to before this commit, is:
sync/no-hook/core - host-to-wasm - typed - nop
time: [10.552 ns 10.561 ns 10.571 ns]
change: [−7.5238% −7.4011% −7.2786%] (p = 0.00 < 0.05)
Performance has improved.
Closes bytecodealliance#3927
cc bytecodealliance#10923
prtest:full
Since Wasmtime's inception it's used the `setjmp` and `longjmp`
functions in C to implement handling of traps. While this solution was
easy to implement, relatively portable, and performant enough, there are
a number of downsides that have evolved over time to make this an
unattractive approach in the long run:
* Using `setjmp` fundamentally requires using C because Rust does not
understand a function that returns twice. It's fundamentally unsound
to invoke `setjmp` in Rust meaning that Wasmtime has forever needed a
C compiler configured and set up to build. This notably means that
`cargo check` cannot check other targets easily.
* Using `longjmp` means that Rust function frames are unwound on the
stack without running destructors. This is a dangerous operation of
which we get no protection from the compiler about. Both frames
entering wasm and frames exiting wasm are all skipped. Absolutely
minimizing this has been beneficial for portability to platforms such
as Pulley.
* Currently the no_std implementation of Wasmtime requires embedders to
provide `wasmtime_{setjmp,longjmp}` which is a thorn in the side of
what is otherwise a mostly entirely independent implementation of
Wasmtime.
* There is a performance floor to using `setjmp` and `longjmp`. Calling
`setjmp` requires using C but Wasmtime is otherwise written in Rust
meaning that there's a Rust->C->Rust->Wasm boundary which
fundamentally can't be inlined without cross-language LTO which is
difficult to configure.
* With the implementation of the WebAssembly exceptions proposal
Wasmtime now has two means of unwinding the stack. Ideally Wasmtime
would only have one, and the more general one is the method of
exceptions.
* Jumping out of a signal handler on Unix is tricky business. While
we've made it work it's generally most robust of the signal handler
simply returns which it now does.
With all of that in mind the purpose of this commit is to replace the
setjmp/longjmp mechanism of handling traps with the recently implemented
support for exceptions in Cranelift. That is intended to resolve all of
the above points in one swoop.
One point in particular though that's nice about setjmp/longjmp is that
unwinding the stack on a trap is an O(1) operation. For situations such
as stack overflow that's a particularly nice property to have as we can
guarantee embedders that traps are a constant time (albeit somewhat
expensive with signals) operation. Exceptions naively require unwinding
the entire stack, and although frame pointers mean we're just traversing
a linked list I wanted to preserve the O(1) property here nonetheless.
To achieve this a solution is implemented where the array-to-wasm
(host-to-wasm) trampolines setup state in `VMStoreContext` so looking up
the current trap handler frame is an O(1) operation. Namely the sp/fp/pc
values for a `Handler` are stored inline.
Implementing this feature required supporting
relocations-to-offsets-in-functions which was not previously supported
by Wasmtime. This required Cranelift refactorings such as bytecodealliance#11570, bytecodealliance#11585,
and bytecodealliance#11576. This then additionally required some more refactoring in
this commit which was difficult to split out as it otherwise wouldn't be
tested.
Apart from the relocation-related business much of this change is about
updating the platform signal handlers to use exceptions instead of
longjmp to return. For example on Unix this means updating the
`ucontext_t` with register values that the handler specifies. Windows
involves updating similar contexts, and macOS mach ports ended up not
needing too many changes.
In terms of overall performance the relevant benchmark from this
repository, compared to before this commit, is:
sync/no-hook/core - host-to-wasm - typed - nop
time: [10.552 ns 10.561 ns 10.571 ns]
change: [−7.5238% −7.4011% −7.2786%] (p = 0.00 < 0.05)
Performance has improved.
Closes bytecodealliance#3927
cc bytecodealliance#10923
prtest:full
Since Wasmtime's inception it's used the `setjmp` and `longjmp`
functions in C to implement handling of traps. While this solution was
easy to implement, relatively portable, and performant enough, there are
a number of downsides that have evolved over time to make this an
unattractive approach in the long run:
* Using `setjmp` fundamentally requires using C because Rust does not
understand a function that returns twice. It's fundamentally unsound
to invoke `setjmp` in Rust meaning that Wasmtime has forever needed a
C compiler configured and set up to build. This notably means that
`cargo check` cannot check other targets easily.
* Using `longjmp` means that Rust function frames are unwound on the
stack without running destructors. This is a dangerous operation of
which we get no protection from the compiler about. Both frames
entering wasm and frames exiting wasm are all skipped. Absolutely
minimizing this has been beneficial for portability to platforms such
as Pulley.
* Currently the no_std implementation of Wasmtime requires embedders to
provide `wasmtime_{setjmp,longjmp}` which is a thorn in the side of
what is otherwise a mostly entirely independent implementation of
Wasmtime.
* There is a performance floor to using `setjmp` and `longjmp`. Calling
`setjmp` requires using C but Wasmtime is otherwise written in Rust
meaning that there's a Rust->C->Rust->Wasm boundary which
fundamentally can't be inlined without cross-language LTO which is
difficult to configure.
* With the implementation of the WebAssembly exceptions proposal
Wasmtime now has two means of unwinding the stack. Ideally Wasmtime
would only have one, and the more general one is the method of
exceptions.
* Jumping out of a signal handler on Unix is tricky business. While
we've made it work it's generally most robust of the signal handler
simply returns which it now does.
With all of that in mind the purpose of this commit is to replace the
setjmp/longjmp mechanism of handling traps with the recently implemented
support for exceptions in Cranelift. That is intended to resolve all of
the above points in one swoop.
One point in particular though that's nice about setjmp/longjmp is that
unwinding the stack on a trap is an O(1) operation. For situations such
as stack overflow that's a particularly nice property to have as we can
guarantee embedders that traps are a constant time (albeit somewhat
expensive with signals) operation. Exceptions naively require unwinding
the entire stack, and although frame pointers mean we're just traversing
a linked list I wanted to preserve the O(1) property here nonetheless.
To achieve this a solution is implemented where the array-to-wasm
(host-to-wasm) trampolines setup state in `VMStoreContext` so looking up
the current trap handler frame is an O(1) operation. Namely the sp/fp/pc
values for a `Handler` are stored inline.
Implementing this feature required supporting
relocations-to-offsets-in-functions which was not previously supported
by Wasmtime. This required Cranelift refactorings such as bytecodealliance#11570, bytecodealliance#11585,
and bytecodealliance#11576. This then additionally required some more refactoring in
this commit which was difficult to split out as it otherwise wouldn't be
tested.
Apart from the relocation-related business much of this change is about
updating the platform signal handlers to use exceptions instead of
longjmp to return. For example on Unix this means updating the
`ucontext_t` with register values that the handler specifies. Windows
involves updating similar contexts, and macOS mach ports ended up not
needing too many changes.
In terms of overall performance the relevant benchmark from this
repository, compared to before this commit, is:
sync/no-hook/core - host-to-wasm - typed - nop
time: [10.552 ns 10.561 ns 10.571 ns]
change: [−7.5238% −7.4011% −7.2786%] (p = 0.00 < 0.05)
Performance has improved.
Closes bytecodealliance#3927
cc bytecodealliance#10923
prtest:full
Since Wasmtime's inception it's used the `setjmp` and `longjmp`
functions in C to implement handling of traps. While this solution was
easy to implement, relatively portable, and performant enough, there are
a number of downsides that have evolved over time to make this an
unattractive approach in the long run:
* Using `setjmp` fundamentally requires using C because Rust does not
understand a function that returns twice. It's fundamentally unsound
to invoke `setjmp` in Rust meaning that Wasmtime has forever needed a
C compiler configured and set up to build. This notably means that
`cargo check` cannot check other targets easily.
* Using `longjmp` means that Rust function frames are unwound on the
stack without running destructors. This is a dangerous operation of
which we get no protection from the compiler about. Both frames
entering wasm and frames exiting wasm are all skipped. Absolutely
minimizing this has been beneficial for portability to platforms such
as Pulley.
* Currently the no_std implementation of Wasmtime requires embedders to
provide `wasmtime_{setjmp,longjmp}` which is a thorn in the side of
what is otherwise a mostly entirely independent implementation of
Wasmtime.
* There is a performance floor to using `setjmp` and `longjmp`. Calling
`setjmp` requires using C but Wasmtime is otherwise written in Rust
meaning that there's a Rust->C->Rust->Wasm boundary which
fundamentally can't be inlined without cross-language LTO which is
difficult to configure.
* With the implementation of the WebAssembly exceptions proposal
Wasmtime now has two means of unwinding the stack. Ideally Wasmtime
would only have one, and the more general one is the method of
exceptions.
* Jumping out of a signal handler on Unix is tricky business. While
we've made it work it's generally most robust of the signal handler
simply returns which it now does.
With all of that in mind the purpose of this commit is to replace the
setjmp/longjmp mechanism of handling traps with the recently implemented
support for exceptions in Cranelift. That is intended to resolve all of
the above points in one swoop.
One point in particular though that's nice about setjmp/longjmp is that
unwinding the stack on a trap is an O(1) operation. For situations such
as stack overflow that's a particularly nice property to have as we can
guarantee embedders that traps are a constant time (albeit somewhat
expensive with signals) operation. Exceptions naively require unwinding
the entire stack, and although frame pointers mean we're just traversing
a linked list I wanted to preserve the O(1) property here nonetheless.
To achieve this a solution is implemented where the array-to-wasm
(host-to-wasm) trampolines setup state in `VMStoreContext` so looking up
the current trap handler frame is an O(1) operation. Namely the sp/fp/pc
values for a `Handler` are stored inline.
Implementing this feature required supporting
relocations-to-offsets-in-functions which was not previously supported
by Wasmtime. This required Cranelift refactorings such as bytecodealliance#11570, bytecodealliance#11585,
and bytecodealliance#11576. This then additionally required some more refactoring in
this commit which was difficult to split out as it otherwise wouldn't be
tested.
Apart from the relocation-related business much of this change is about
updating the platform signal handlers to use exceptions instead of
longjmp to return. For example on Unix this means updating the
`ucontext_t` with register values that the handler specifies. Windows
involves updating similar contexts, and macOS mach ports ended up not
needing too many changes.
In terms of overall performance the relevant benchmark from this
repository, compared to before this commit, is:
sync/no-hook/core - host-to-wasm - typed - nop
time: [10.552 ns 10.561 ns 10.571 ns]
change: [−7.5238% −7.4011% −7.2786%] (p = 0.00 < 0.05)
Performance has improved.
Closes bytecodealliance#3927
cc bytecodealliance#10923
prtest:full
Since Wasmtime's inception it's used the `setjmp` and `longjmp`
functions in C to implement handling of traps. While this solution was
easy to implement, relatively portable, and performant enough, there are
a number of downsides that have evolved over time to make this an
unattractive approach in the long run:
* Using `setjmp` fundamentally requires using C because Rust does not
understand a function that returns twice. It's fundamentally unsound
to invoke `setjmp` in Rust meaning that Wasmtime has forever needed a
C compiler configured and set up to build. This notably means that
`cargo check` cannot check other targets easily.
* Using `longjmp` means that Rust function frames are unwound on the
stack without running destructors. This is a dangerous operation of
which we get no protection from the compiler about. Both frames
entering wasm and frames exiting wasm are all skipped. Absolutely
minimizing this has been beneficial for portability to platforms such
as Pulley.
* Currently the no_std implementation of Wasmtime requires embedders to
provide `wasmtime_{setjmp,longjmp}` which is a thorn in the side of
what is otherwise a mostly entirely independent implementation of
Wasmtime.
* There is a performance floor to using `setjmp` and `longjmp`. Calling
`setjmp` requires using C but Wasmtime is otherwise written in Rust
meaning that there's a Rust->C->Rust->Wasm boundary which
fundamentally can't be inlined without cross-language LTO which is
difficult to configure.
* With the implementation of the WebAssembly exceptions proposal
Wasmtime now has two means of unwinding the stack. Ideally Wasmtime
would only have one, and the more general one is the method of
exceptions.
* Jumping out of a signal handler on Unix is tricky business. While
we've made it work it's generally most robust of the signal handler
simply returns which it now does.
With all of that in mind the purpose of this commit is to replace the
setjmp/longjmp mechanism of handling traps with the recently implemented
support for exceptions in Cranelift. That is intended to resolve all of
the above points in one swoop.
One point in particular though that's nice about setjmp/longjmp is that
unwinding the stack on a trap is an O(1) operation. For situations such
as stack overflow that's a particularly nice property to have as we can
guarantee embedders that traps are a constant time (albeit somewhat
expensive with signals) operation. Exceptions naively require unwinding
the entire stack, and although frame pointers mean we're just traversing
a linked list I wanted to preserve the O(1) property here nonetheless.
To achieve this a solution is implemented where the array-to-wasm
(host-to-wasm) trampolines setup state in `VMStoreContext` so looking up
the current trap handler frame is an O(1) operation. Namely the sp/fp/pc
values for a `Handler` are stored inline.
Implementing this feature required supporting
relocations-to-offsets-in-functions which was not previously supported
by Wasmtime. This required Cranelift refactorings such as bytecodealliance#11570, bytecodealliance#11585,
and bytecodealliance#11576. This then additionally required some more refactoring in
this commit which was difficult to split out as it otherwise wouldn't be
tested.
Apart from the relocation-related business much of this change is about
updating the platform signal handlers to use exceptions instead of
longjmp to return. For example on Unix this means updating the
`ucontext_t` with register values that the handler specifies. Windows
involves updating similar contexts, and macOS mach ports ended up not
needing too many changes.
In terms of overall performance the relevant benchmark from this
repository, compared to before this commit, is:
sync/no-hook/core - host-to-wasm - typed - nop
time: [10.552 ns 10.561 ns 10.571 ns]
change: [−7.5238% −7.4011% −7.2786%] (p = 0.00 < 0.05)
Performance has improved.
Closes bytecodealliance#3927
cc bytecodealliance#10923
prtest:full
Since Wasmtime's inception it's used the `setjmp` and `longjmp`
functions in C to implement handling of traps. While this solution was
easy to implement, relatively portable, and performant enough, there are
a number of downsides that have evolved over time to make this an
unattractive approach in the long run:
* Using `setjmp` fundamentally requires using C because Rust does not
understand a function that returns twice. It's fundamentally unsound
to invoke `setjmp` in Rust meaning that Wasmtime has forever needed a
C compiler configured and set up to build. This notably means that
`cargo check` cannot check other targets easily.
* Using `longjmp` means that Rust function frames are unwound on the
stack without running destructors. This is a dangerous operation of
which we get no protection from the compiler about. Both frames
entering wasm and frames exiting wasm are all skipped. Absolutely
minimizing this has been beneficial for portability to platforms such
as Pulley.
* Currently the no_std implementation of Wasmtime requires embedders to
provide `wasmtime_{setjmp,longjmp}` which is a thorn in the side of
what is otherwise a mostly entirely independent implementation of
Wasmtime.
* There is a performance floor to using `setjmp` and `longjmp`. Calling
`setjmp` requires using C but Wasmtime is otherwise written in Rust
meaning that there's a Rust->C->Rust->Wasm boundary which
fundamentally can't be inlined without cross-language LTO which is
difficult to configure.
* With the implementation of the WebAssembly exceptions proposal
Wasmtime now has two means of unwinding the stack. Ideally Wasmtime
would only have one, and the more general one is the method of
exceptions.
* Jumping out of a signal handler on Unix is tricky business. While
we've made it work it's generally most robust of the signal handler
simply returns which it now does.
With all of that in mind the purpose of this commit is to replace the
setjmp/longjmp mechanism of handling traps with the recently implemented
support for exceptions in Cranelift. That is intended to resolve all of
the above points in one swoop.
One point in particular though that's nice about setjmp/longjmp is that
unwinding the stack on a trap is an O(1) operation. For situations such
as stack overflow that's a particularly nice property to have as we can
guarantee embedders that traps are a constant time (albeit somewhat
expensive with signals) operation. Exceptions naively require unwinding
the entire stack, and although frame pointers mean we're just traversing
a linked list I wanted to preserve the O(1) property here nonetheless.
To achieve this a solution is implemented where the array-to-wasm
(host-to-wasm) trampolines setup state in `VMStoreContext` so looking up
the current trap handler frame is an O(1) operation. Namely the sp/fp/pc
values for a `Handler` are stored inline.
Implementing this feature required supporting
relocations-to-offsets-in-functions which was not previously supported
by Wasmtime. This required Cranelift refactorings such as bytecodealliance#11570, bytecodealliance#11585,
and bytecodealliance#11576. This then additionally required some more refactoring in
this commit which was difficult to split out as it otherwise wouldn't be
tested.
Apart from the relocation-related business much of this change is about
updating the platform signal handlers to use exceptions instead of
longjmp to return. For example on Unix this means updating the
`ucontext_t` with register values that the handler specifies. Windows
involves updating similar contexts, and macOS mach ports ended up not
needing too many changes.
In terms of overall performance the relevant benchmark from this
repository, compared to before this commit, is:
sync/no-hook/core - host-to-wasm - typed - nop
time: [10.552 ns 10.561 ns 10.571 ns]
change: [−7.5238% −7.4011% −7.2786%] (p = 0.00 < 0.05)
Performance has improved.
Closes bytecodealliance#3927
cc bytecodealliance#10923
prtest:full
Since Wasmtime's inception it's used the `setjmp` and `longjmp`
functions in C to implement handling of traps. While this solution was
easy to implement, relatively portable, and performant enough, there are
a number of downsides that have evolved over time to make this an
unattractive approach in the long run:
* Using `setjmp` fundamentally requires using C because Rust does not
understand a function that returns twice. It's fundamentally unsound
to invoke `setjmp` in Rust meaning that Wasmtime has forever needed a
C compiler configured and set up to build. This notably means that
`cargo check` cannot check other targets easily.
* Using `longjmp` means that Rust function frames are unwound on the
stack without running destructors. This is a dangerous operation
against which the compiler gives us no protection. Both frames
entering wasm and frames exiting wasm are skipped. Absolutely
minimizing this has been beneficial for portability to platforms such
as Pulley.
* Currently the no_std implementation of Wasmtime requires embedders to
provide `wasmtime_{setjmp,longjmp}` which is a thorn in the side of
what is otherwise a mostly entirely independent implementation of
Wasmtime.
* There is a performance floor to using `setjmp` and `longjmp`. Calling
`setjmp` requires using C but Wasmtime is otherwise written in Rust
meaning that there's a Rust->C->Rust->Wasm boundary which
fundamentally can't be inlined without cross-language LTO which is
difficult to configure.
* With the implementation of the WebAssembly exceptions proposal
Wasmtime now has two means of unwinding the stack. Ideally Wasmtime
would only have one, and the more general one is the method of
exceptions.
* Jumping out of a signal handler on Unix is tricky business. While
we've made it work it's generally most robust if the signal handler
simply returns, which it now does.
With all of that in mind the purpose of this commit is to replace the
setjmp/longjmp mechanism of handling traps with the recently implemented
support for exceptions in Cranelift. That is intended to resolve all of
the above points in one swoop.
One point in particular though that's nice about setjmp/longjmp is that
unwinding the stack on a trap is an O(1) operation. For situations such
as stack overflow that's a particularly nice property to have as we can
guarantee embedders that traps are a constant time (albeit somewhat
expensive with signals) operation. Exceptions naively require unwinding
the entire stack, and although frame pointers mean we're just traversing
a linked list I wanted to preserve the O(1) property here nonetheless.
To achieve this a solution is implemented where the array-to-wasm
(host-to-wasm) trampolines set up state in `VMStoreContext` so looking up
the current trap handler frame is an O(1) operation. Namely the sp/fp/pc
values for a `Handler` are stored inline.
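As a rough illustration of the O(1) lookup described above, here is a minimal sketch in Rust. The `StoreContext` type, its `current_handler` field, and the `enter_wasm`/`trap_handler` methods are hypothetical names invented for this example; they are not Wasmtime's real `VMStoreContext` layout, and the real trampolines are generated code rather than a Rust closure. The only idea taken from the text is that the sp/fp/pc triple is stored inline, so a trap reads it directly instead of walking frames.

```rust
/// Hypothetical stand-in for the sp/fp/pc triple stored for a handler.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Handler {
    pub sp: usize,
    pub fp: usize,
    pub pc: usize,
}

/// Hypothetical per-store context. This is NOT Wasmtime's real
/// `VMStoreContext`; it only models "handler stored inline".
#[derive(Default)]
pub struct StoreContext {
    current_handler: Option<Handler>,
}

impl StoreContext {
    /// Models a host-to-wasm trampoline: install `handler` for the
    /// duration of `body`, then restore whatever was installed before
    /// so that nested host-to-wasm entries behave like a stack.
    pub fn enter_wasm<R>(
        &mut self,
        handler: Handler,
        body: impl FnOnce(&mut StoreContext) -> R,
    ) -> R {
        let previous = self.current_handler.replace(handler);
        let result = body(self);
        self.current_handler = previous;
        result
    }

    /// Looking up the active trap handler is a single field read,
    /// independent of how deep the wasm stack currently is.
    pub fn trap_handler(&self) -> Option<Handler> {
        self.current_handler
    }
}
```

Saving and restoring the previous value on entry and exit is one way to keep the lookup constant-time even when host and wasm frames are interleaved; whether the actual change does it this way is an assumption of this sketch.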
Implementing this feature required supporting
relocations-to-offsets-in-functions which was not previously supported
by Wasmtime. This required Cranelift refactorings such as bytecodealliance#11570, bytecodealliance#11585,
and bytecodealliance#11576. This then additionally required some more refactoring in
this commit which was difficult to split out as it otherwise wouldn't be
tested.
Apart from the relocation-related business much of this change is about
updating the platform signal handlers to use exceptions instead of
longjmp to return. For example on Unix this means updating the
`ucontext_t` with register values that the handler specifies. Windows
involves updating similar contexts, and macOS mach ports ended up not
needing too many changes.
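The "update the saved context and return" idea can be modeled without any platform APIs. The sketch below is a simplified model, not the real signal handler: `SavedRegisters` is a hypothetical stand-in for the register portion of `ucontext_t`, and `handle_fault`, `SignalOutcome`, and the `is_wasm_pc` check are names and logic assumed for illustration only.

```rust
/// Hypothetical sp/fp/pc triple describing where to resume after a trap.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Handler {
    pub sp: usize,
    pub fp: usize,
    pub pc: usize,
}

/// Hypothetical, simplified model of the register state the OS saves
/// when delivering a signal (the role `ucontext_t` plays on Unix).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SavedRegisters {
    pub pc: usize,
    pub sp: usize,
    pub fp: usize,
}

/// What the modeled signal handler decided to do.
#[derive(Debug, PartialEq, Eq)]
pub enum SignalOutcome {
    /// The saved registers were rewritten; returning from the signal
    /// handler resumes execution at the trap handler.
    ResumeAtHandler,
    /// The fault did not come from wasm code; leave the context alone.
    NotOurs,
}

/// Instead of calling `longjmp` from inside the signal handler, rewrite
/// the saved registers and return normally. `is_wasm_pc` is an assumed
/// predicate for "did this fault happen in compiled wasm code".
pub fn handle_fault(
    saved: &mut SavedRegisters,
    is_wasm_pc: impl Fn(usize) -> bool,
    handler: Option<Handler>,
) -> SignalOutcome {
    match handler {
        Some(h) if is_wasm_pc(saved.pc) => {
            saved.pc = h.pc;
            saved.sp = h.sp;
            saved.fp = h.fp;
            SignalOutcome::ResumeAtHandler
        }
        _ => SignalOutcome::NotOurs,
    }
}
```

In this model the handler never transfers control itself; it only edits the state that will be restored when it returns, which is what makes the plain-return path possible.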
In terms of overall performance the relevant benchmark from this
repository, compared to before this commit, is:
sync/no-hook/core - host-to-wasm - typed - nop
time: [10.552 ns 10.561 ns 10.571 ns]
change: [−7.5238% −7.4011% −7.2786%] (p = 0.00 < 0.05)
Performance has improved.
Closes bytecodealliance#3927
cc bytecodealliance#10923
prtest:full
* aarch64: Add support for "near" in `LoadExtName`
The current aarch64 backend does not support `symbol_value` to get the
value of a function, for example, with a "near" relocation using a
relative relocation. Currently it uses an `Abs8` relocation which means
that it's not suitable in Wasmtime, for example.
This commit refactors relocation/external name support in the aarch64
backend to support this mode of relocation. The previous `LoadExtName`
was split into `LoadExtName{Got,Near,Far}` where the "near" bit is
what's new to the backend. The preexisting `symbol-value.clif`-style
tests were updated to match the x64 backend which has a more
comprehensive suite of examples of what it looks like to refer to
various symbols.
The goal of this commit is to enable Wasmtime to generate code which
refers to a relative point elsewhere in the code (e.g. an exception
handler) and load the value into a register. This part isn't filled out
yet, but it seemed good to at least in the meantime fill out these
missing relocations in the backend.
* Fix clippy warning
* Add support for new relocations to cranelift-jit
Needed for filetests
* riscv64: Implement near relocations
This is the same as bytecodealliance#11570 but for the riscv64 backend. The intention
is to support "near" relocations which don't require `Abs8` relocations
for upcoming use in Wasmtime. The same design as bytecodealliance#11570 is used here.
* Review comments
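To make the difference between an absolute 8-byte relocation and a "near" (relative) one concrete, here is a small sketch in Rust. It is a deliberately simplified model: `apply_abs8`, `apply_pcrel4`, and `link_at` are hypothetical helpers written for this example, and the relative form is stored as a raw 32-bit displacement rather than in the instruction encodings that real aarch64 or riscv64 relocations patch. It is not Cranelift's or cranelift-jit's implementation.

```rust
use std::convert::TryFrom;

/// Absolute relocation: write the full 64-bit address of `target`.
/// The patched bytes depend on where the code ends up in memory.
pub fn apply_abs8(code: &mut [u8], offset: usize, target: u64) {
    code[offset..offset + 8].copy_from_slice(&target.to_le_bytes());
}

/// Simplified relative relocation: write `target - site` as a signed
/// 32-bit displacement, where `site` is the address being patched.
/// Fails if the target is too far away to be "near".
pub fn apply_pcrel4(
    code: &mut [u8],
    offset: usize,
    load_base: u64,
    target: u64,
) -> Result<(), std::num::TryFromIntError> {
    let site = load_base + offset as u64;
    let delta = i32::try_from(target.wrapping_sub(site) as i64)?;
    code[offset..offset + 4].copy_from_slice(&delta.to_le_bytes());
    Ok(())
}

/// Patch both styles into a fresh buffer loaded at `load_base`, with
/// the target at a fixed distance (0x40) from the start of the code.
/// Returns the absolute bytes and the relative bytes.
pub fn link_at(load_base: u64) -> ([u8; 8], [u8; 4]) {
    let mut code = [0u8; 16];
    let target = load_base + 0x40;
    apply_abs8(&mut code, 0, target);
    apply_pcrel4(&mut code, 8, load_base, target).expect("displacement fits in 32 bits");
    let mut abs = [0u8; 8];
    abs.copy_from_slice(&code[0..8]);
    let mut rel = [0u8; 4];
    rel.copy_from_slice(&code[8..12]);
    (abs, rel)
}
```

Calling `link_at` with two different load addresses yields different absolute bytes but identical relative bytes, since the displacement only depends on the distance between the patch site and the target. That position independence is presumably the property that makes the relative form usable where `Abs8` is not.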