Skip to content

Conversation

@athei
Copy link
Member

@athei athei commented Feb 29, 2024

This PR adds experimental support for the virtualization host functions. Those allow the runtime to spawn and run PolkaVM instances. It is experimental because the behaviour is subject to change until PolkaVM and the host functions have a spec. However, we need to merge the code to go on with development. Docs and tests are all there and hence I argue it is good enough to be merged. I added a note that users should not use those functions in production.

This PR adds or changes the following components:

  • sc-executor-wasmtime: Implementation of the new host functions. Those are executor specific because they need to call back into the runtime to handle the host function calls (syscalls) the PolkaVM program performs. Luckily, wasmtime is our only executor.
  • sp-virtualization: New crate that abstracts away the host functions. Meaning that a user (like pallet-contracts) will interface only with this crate and not with the host functions directly. This is necessary so that the natively running test code still works. The host functions also depend on this crate. Those also contain all the tests. Everything PolkaVM is neatly organized into one crate. It also contains the definition of the new host functions.
  • sp-wasm-interface: We added an interface mirroring the host functions here. This is necessary in order for the host functions to be able to call into the executor.

The tests are guarded behind the riscv feature because they require a custom toolchain to be compiled. This toolchaion is installed on the CI machine and hence we set the feature there (we are already setting it from before). The riscv feature will be removed as soon as the target is available in the upstream compiler.

@athei athei added the T5-host_functions This PR/Issue is related to host functions. label Feb 29, 2024
@athei athei requested a review from pgherveou February 29, 2024 08:11
@athei athei requested a review from koute as a code owner February 29, 2024 08:11
@paritytech paritytech deleted a comment from paritytech-cicd-pr Feb 29, 2024
Copy link
Contributor

@pgherveou pgherveou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dumping a few nit comments, will make a second pass

@athei
Copy link
Member Author

athei commented Mar 1, 2024

CI fails with this error when creating the Engine:

failed to initialize sandbox process: exited due to signal (signal = SIGILL):
fatal error while spawning child: execveat (errno = 2 (ENOENT)):
failed to create a worker process (1 out of 2)

Engine is created here:

/// This is the single PolkaVM engine we use for everything.
///
/// By using a common engine we allow PolkaVM to use caching. This caching is important
/// to reduce startup costs. This is even the case when instances use different code.
static ENGINE: OnceLock<Engine> = OnceLock::new();
/// Engine wide configuration.
fn engine() -> &'static Engine {
ENGINE.get_or_init(|| {
let config = Config::new();
Engine::new(&config).expect("Default config is always valid; qed")
})
}

Tests can be run like this:

# run tests natively
cargo test -p sp-virtualization
# run tests via host function
cargo test -p sc-executor test_virtualization

@athei athei requested a review from a team as a code owner March 1, 2024 08:13
@athei
Copy link
Member Author

athei commented Mar 2, 2024

I set the PolkaVM backend to interpreter on CI for now so we can go on. We will debug this later.

// See the License for the specific language governing permissions and
// limitations under the License.

//! This crate is intended for use by runtime code (e.g pallet-contracts) to spawn PolkaVM instances
Copy link
Contributor

@yjhmelody yjhmelody Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will you also consider to spawn wasmVM instances in this abstraction design ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. This will be specifically targeted at smart contract usecases and PolkaVM. We have no plans for spawning WASM VMs; we already tried that in the past, and it was a bad idea/it didn't work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember that wasmi was integrated into sc-executor before, but it seems that it has never been used? I'm curious why risc-v works as expected in this way but wasm doesn't.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, historically we had three features which are now removed:

  1. wasmi as an executor; this was meant to be used to run full runtimes as a sort of a backup executor to wasmtime, but since wasmtime got rock solid we removed it,
  2. we wanted to use the executor infrastructure for running smart contracts on wasmtime/wasmer/whatever in an accelerated manner, but it ended up being a bad idea (because the VMs weren't really designed for this use case) so we removed it,
  3. we had host calls to spawn new executors from within the runtime and run something in them, but the implementation for this was horribly broken and no one used it, so we removed it.

So right now essentially we're only looking into (2), but this time with a VM that is designed to support this use case.

(The (3) might actually be possible with Polkajam, but it's still too early to talk about that as we're still designing/prototyping it.)

Copy link
Member Author

@athei athei Apr 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also a 2a) where we actually had a predecessor of this API where we spawned wasmi instances for smart contracts. But it turned out to not be worth the hassle and we moved wasmi into the runtime.

However, PolkaVM can't live inside because it compiles to native instructions.

Wasm smart contracts will be run with an in-runtime wasmi. There is no plan to support anything else than PolkaVM with this virtualization API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, PolkaVM can't live inside because it compiles to native instructions.

If runtime itself is RISC-V it should be able to, also interpreted mode is still available as a fallback in polkavm, isn't it?

Comment on lines +370 to +374

fn test_virtualization(test_fixture: Vec<u8>) {
#[cfg(feature = "riscv")]
sp_virtualization::run_tests(test_fixture.as_ref());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is that called 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The macro where this is located in defines wasm exported functions. So this whole thing is compiled to Wasm and then called from this test here:

#[cfg(feature = "riscv")]
test_wasm_execution!(test_virtualization);
#[cfg(feature = "riscv")]
fn test_virtualization(wasm_method: WasmExecutionMethod) {
let mut ext = TestExternalities::default();
let mut ext = ext.ext();
let fixture = sp_virtualization_test_fixture::binary().encode();
call_in_wasm("test_virtualization", fixture.as_ref(), wasm_method, &mut ext).unwrap();
}

Comment on lines +94 to +96
loop {
increment_counter(1);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about an infinite loop that wouldn't call a host fn?

Comment on lines +224 to +232
let mut instance = Virt::instantiate(program).unwrap();
let mut state = SharedState {
gas_left: gas_limit_1,
exit: false,
user: State { counter: 0, memory: Some(instance.memory()), ..Default::default() },
};
let ret = instance.execute("counter", syscall_handler, &mut state);
assert_eq!(ret, Ok(()));
assert_eq!(gas_consumed, gas_limit_1 - state.gas_left);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it make sense to instead test gas_left = gas_consumed, and then gas_left = gas_consumed - 1 and assert on the expected results

exit: false,
user: State { counter: 0, memory: Some(instance.memory()), ..Default::default() },
};
let ret = instance.execute("counter", syscall_handler, &mut state);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rename "counter" to "add_8" so it make more sense to the reader of this test

Comment on lines +281 to +282
// sub call should not affect parent state
assert_eq!(state.user.counter, 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add a bool flag in the user state to at least verify that it indeed did execute the subcall

#[derive(Default)]
struct State {
counter: u64,
memory: Option<Memory>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason you make it optional?

@paritytech-cicd-pr
Copy link

The CI pipeline was cancelled due to failure one of the required jobs.
Job name: cargo-clippy
Logs: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/5827070

@athei
Copy link
Member Author

athei commented Apr 8, 2024

After talking to @koute we agreed that this PR needs to be reworked. Right now it does recursively call into the runtime. We don't want to allow this. The API should be changed so that the execute function returns in case a host function is called. Instead of the call back that it is using now.

From a user perspective it would look like this:

virt.instantiate(...);
loop {
    result = virt.resume(...);
    match result {
        ExecutionResult::Finished => break,
        ExecutionResult::Hostcall(...) => {
            // handle the hostcall and continue
        }
    }
}

I will do that once this style is supported in PolkaVM itself. Problem is that the contracts pallet will need to adapt to this larger change. This will probably mean that we transition to the wasmi resumable functions at the same time in the contracts pallet. Otherwise we would need to support two completely different styles at the same time.

It doesn't even build against anymore after merging master because the functionality to do the recursive calls was removed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

T5-host_functions This PR/Issue is related to host functions.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants