SIMD-0377: eBPF ISA compatibility #377
Merged
Commits
- fca89da Create proposal (LucasSte)
- e83f625 Add sign extended loads (LucasSte)
- cbb1f73 Update smod and sdiv (LucasSte)
- 4177ad6 Fix indirect jump (LucasSte)
- 803512a Remove eBPFv4 parts (LucasSte)
- 302732f Rewrite dynamic stack frames section (LucasSte)
- ec7e4e9 Remove parts that relied on sbpfv2 (LucasSte)
- d3f0c2a Update proposals/0377-ebpf-isa-compatibility.md (LucasSte)
- 5876676 Update proposals/0377-ebpf-isa-compatibility.md (LucasSte)
- a37a307 Mention stack gaps (LucasSte)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,259 @@ | ||
| --- | ||
| simd: '0377' | ||
| title: eBPF ISA compatibility | ||
| authors: | ||
| - Lucas Steuernagel (Anza) | ||
| - Alexander Meißner (Anza) | ||
| category: Standard | ||
| type: Core | ||
| status: Review | ||
| created: 2025-10-09 | ||
| feature: (fill in with feature key and github tracking issues once accepted) | ||
| --- | ||
|
|
||
| ## Summary | ||
|
|
||
| This SIMD introduces instruction set architecture (ISA) changes to make the | ||
| sBPF virtual machine compatible with the latest existing version of eBPF ISA | ||
| generated by its LLVM backend. | ||
|
|
||
| It reverts past ISA changes, modifies the encoding of existing instructions | ||
| and brings new instructions to the Solana virtual machine. | ||
|
|
||
| ## Motivation | ||
|
|
||
| The eBPF target on the Rust compiler emits code by default for eBPFv1, whose | ||
| only incompatibility with the Solana virtual machine is the `callx` | ||
| instruction. To prioritize Solana programs and decrease their CU | ||
| consumption, we want to be compatible with at least the current eBPF version | ||
| (v3), which brings in new instructions. For that to be possible, we must | ||
| modify our virtual machine to support eBPF in full. | ||
|
|
||
| Additionally, we are going to anticipate some eBPF v4 instructions that are | ||
| beneficial for Solana and would decrease the update burden in case v4 becomes | ||
| the new default configuration for the eBPF upstream LLVM target. | ||
|
|
||
| ## New Terminology | ||
|
|
||
| Programs that use the instruction set defined in this SIMD are called sBPFv3 | ||
| programs. | ||
|
|
||
| ## Detailed Design | ||
|
|
||
| ### ELF Identification | ||
|
|
||
| Programs containing the instructions mentioned in this SIMD must have the | ||
| `0x03` value in the `e_flags` field of their header. | ||
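As an illustration of the identification rule, the check can be sketched in Rust. This is a minimal sketch, not the loader's actual code: the helper name `is_sbpf_v3` is hypothetical, and it assumes a little-endian 64-bit ELF header, where `e_flags` is the 4-byte field at byte offset 48.

```rust
/// Byte offset of `e_flags` in a 64-bit ELF header (assumption: ELF64 layout).
const E_FLAGS_OFFSET: usize = 48;
/// `e_flags` value that this SIMD assigns to sBPFv3 programs.
const SBPF_V3_FLAGS: u32 = 0x03;

/// Hypothetical helper: returns true when the header bytes carry the sBPFv3
/// `e_flags` value. Inputs too short to hold the field are not sBPFv3.
fn is_sbpf_v3(elf: &[u8]) -> bool {
    match elf.get(E_FLAGS_OFFSET..E_FLAGS_OFFSET + 4) {
        Some(b) => u32::from_le_bytes([b[0], b[1], b[2], b[3]]) == SBPF_V3_FLAGS,
        None => false,
    }
}
```

A real loader would also validate the rest of the header before trusting this field.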
|
|
||
| ### Revert SIMD-0166 | ||
|
|
||
| SIMD-0166 must be reverted beginning with sBPFv3, since we will introduce a | ||
| new design for dynamic stack frames that is closer to the eBPF code generation. | ||
|
|
||
| ### Revert SIMD-0173 | ||
|
|
||
| All changes proposed in SIMD-0173 will no longer take effect in sBPFv3. | ||
| Consequently, the verifier must accept the following opcodes: | ||
|
|
||
| - `0x18`, `0x00` (`LDDW`) | ||
| - `0x72`, `0x71`, `0x73` (`STB`, `LDXB`, `STXB`) | ||
| - `0x6A`, `0x69`, `0x6B` (`STH`, `LDXH`, `STXH`) | ||
| - `0x62`, `0x61`, `0x63` (`STW`, `LDXW`, `STXW`) | ||
| - `0x7A`, `0x79`, `0x7B` (`STDW`, `LDXDW`, `STXDW`) | ||
| - `0xD4` (`LE`) | ||
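The restored load and store opcodes follow the standard eBPF field layout: instruction class in the low three bits, access size in bits 3 and 4, and addressing mode in the top three bits. The sketch below decodes the access width from those fields. The function and constant names are illustrative, not taken from the verifier.

```rust
// Standard eBPF opcode fields for load/store instructions.
const CLASS_LDX: u8 = 0x01;
const CLASS_ST: u8 = 0x02;
const CLASS_STX: u8 = 0x03;
const SIZE_W: u8 = 0x00;
const SIZE_H: u8 = 0x08;
const SIZE_B: u8 = 0x10;
const SIZE_DW: u8 = 0x18;
const MODE_MEM: u8 = 0x60;

/// Width in bytes accessed by a MEM-mode LDX/ST/STX opcode, or `None` when
/// the opcode is not one of those (hypothetical helper for illustration).
fn mem_access_width(opcode: u8) -> Option<u8> {
    let class = opcode & 0x07;
    let size = opcode & 0x18;
    let mode = opcode & 0xe0;
    if mode != MODE_MEM || !matches!(class, CLASS_LDX | CLASS_ST | CLASS_STX) {
        return None;
    }
    Some(match size {
        SIZE_W => 4,
        SIZE_H => 2,
        SIZE_B => 1,
        SIZE_DW => 8,
        _ => return None, // not reachable: `size` only has four possible values
    })
}
```

For example, `0x71` decomposes into `MODE_MEM | SIZE_B | CLASS_LDX`, which is why it names `LDXB`.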
|
|
||
| The new opcodes introduced in SIMD-0173 must be rejected in the verifier with `VerifierError::UnknownOpCode`: | ||
|
|
||
| - the `HOR64` instruction (opcode `0xF7`) | ||
| - the moved opcodes: | ||
| - `0x27`, `0x2C`, `0x2F` (`STB`, `LDXB`, `STXB`) | ||
| - `0x37`, `0x3C`, `0x3F` (`STH`, `LDXH`, `STXH`) | ||
| - `0x87`, `0x8C`, `0x8F` (`STW`, `LDXW`, `STXW`) | ||
| - `0x97`, `0x9C`, `0x9F` (`STDW`, `LDXDW`, `STXDW`) | ||
|
|
||
| ### Revert SIMD-0174 | ||
|
|
||
| All changes proposed in SIMD-0174 will no longer take effect in sBPFv3. | ||
| Consequently, the verifier must accept the following opcodes: | ||
|
|
||
| - the `MUL` instruction (opcodes `0x24`, `0x2C`, `0x27` and `0x2F`) | ||
| - the `DIV` instruction (opcodes `0x34`, `0x3C`, `0x37` and `0x3F`) | ||
| - the `MOD` instruction (opcodes `0x94`, `0x9C`, `0x97` and `0x9F`) | ||
| - the `NEG` instruction (opcodes `0x84` and `0x87`) | ||
|
|
||
| The verifier must reject any program that contains one of the following | ||
| opcodes and throw `VerifierError::UnknownOpCode`: | ||
|
|
||
| - the `UHMUL64` instruction (opcodes `0x36` and `0x3E`) | ||
| - the `UDIV32` instruction (opcodes `0x46` and `0x4E`) | ||
| - the `UDIV64` instruction (opcodes `0x56` and `0x5E`) | ||
| - the `UREM32` instruction (opcodes `0x66` and `0x6E`) | ||
| - the `UREM64` instruction (opcodes `0x76` and `0x7E`) | ||
| - the `LMUL32` instruction (opcodes `0x86` and `0x8E`) | ||
| - the `LMUL64` instruction (opcodes `0x96` and `0x9E`) | ||
| - the `SHMUL64` instruction (opcodes `0xB6` and `0xBE`) | ||
| - the `SDIV32` instruction (opcodes `0xC6` and `0xCE`) | ||
| - the `SDIV64` instruction (opcodes `0xD6` and `0xDE`) | ||
| - the `SREM32` instruction (opcodes `0xE6` and `0xEE`) | ||
| - the `SREM64` instruction (opcodes `0xF6` and `0xFE`) | ||
|
|
||
| ### Execution changes | ||
|
|
||
| MOV32_REG (opcode `0xbc`) must NOT perform sign extension. | ||
|
|
||
| SUB32_IMM and SUB64_IMM must perform the operation `dst = dst - imm`. | ||
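The two execution changes can be sketched as plain functions over 64-bit register values. This is a minimal sketch with hypothetical function names. It assumes wrapping arithmetic, that 32-bit results are zero-extended into the 64-bit register as in upstream eBPF, and that the subtraction computes the register operand minus the immediate.

```rust
/// MOV32_REG: copy the low 32 bits of `src` and zero-extend them
/// (no sign extension).
fn mov32_reg(src: u64) -> u64 {
    src as u32 as u64
}

/// SUB32_IMM: register minus immediate on 32-bit operands.
/// Assumption: the result wraps and is zero-extended to 64 bits.
fn sub32_imm(reg: u64, imm: i32) -> u64 {
    (reg as u32).wrapping_sub(imm as u32) as u64
}

/// SUB64_IMM: register minus immediate, with the 32-bit immediate
/// sign-extended to 64 bits first. Assumption: the result wraps.
fn sub64_imm(reg: u64, imm: i32) -> u64 {
    reg.wrapping_sub(imm as i64 as u64)
}
```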
|
|
||
| ### Dynamic stack frames | ||
|
|
||
| To be more compatible with eBPF, the implementation of dynamic stack | ||
| frames is going to change. | ||
|
|
||
| The R10 register must continue to be the frame pointer, i.e. pointing to the | ||
| highest address accessible in a function. As such, the stack will grow upwards. | ||
|
||
|
|
||
| At the prologue of each function, there may be one ADD64 (opcode 0x07) | ||
| instruction to adjust the frame pointer to its new position with a positive | ||
| offset (`add64 R10, +imm`). When a function returns, the virtual machine will | ||
| automatically restore the frame pointer register with the value used in the | ||
| caller, so programs do not need to emit any instruction to adjust the frame | ||
| pointer in the epilogue of each function. | ||
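The save and restore rule can be sketched with a small model of the virtual machine. Everything here is hypothetical (the `Vm` struct, its fields and its method names), and the sketch assumes the VM leaves R10 untouched on a call, so that only the callee's prologue moves it. It does not model stack gaps or bounds checks.

```rust
/// Hypothetical VM state holding only what the frame-pointer rule needs.
struct Vm {
    r10: u64,
    saved_frame_pointers: Vec<u64>,
}

impl Vm {
    /// On a call, remember the caller's frame pointer.
    fn call(&mut self) {
        self.saved_frame_pointers.push(self.r10);
    }

    /// Callee prologue: `add64 r10, +imm` with a positive offset.
    fn add64_r10(&mut self, imm: u32) {
        self.r10 = self.r10.wrapping_add(imm as u64);
    }

    /// On return, restore the caller's frame pointer automatically.
    /// Returns false when there is no caller frame left to return to.
    fn ret(&mut self) -> bool {
        match self.saved_frame_pointers.pop() {
            Some(fp) => {
                self.r10 = fp;
                true
            }
            None => false,
        }
    }
}
```

Because the restore happens in `ret`, the callee needs no epilogue instruction to undo its prologue adjustment.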
|
|
||
| ### JMP32 instruction class | ||
|
|
||
| The JMP32 instruction class uses 32-bit-wide operands for the same | ||
| operations as the JMP class. | ||
|
|
||
| The following opcodes must be allowed in the verifier and the virtual machine | ||
| must implement the behavior described below for each one of them. | ||
|
|
||
| - `JEQ32_IMM` -> opcode = `0x16` -> `pc += offset if dst as u32 == IMM as u32` | ||
| - `JGT32_IMM` -> opcode = `0x26` -> `pc += offset if dst as u32 > IMM as u32` | ||
| - `JGE32_IMM` -> opcode = `0x36` -> `pc += offset if dst as u32 >= IMM as u32` | ||
| - `JSET32_IMM` -> opcode = `0x46` -> | ||
| `pc += offset if (dst as u32 & IMM as u32) != 0` | ||
| - `JNE32_IMM` -> opcode = `0x56` -> `pc += offset if dst as u32 != IMM as u32` | ||
| - `JSGT32_IMM` -> opcode = `0x66` -> `pc += offset if dst as i32 > IMM as i32` | ||
| - `JSGE32_IMM` -> opcode = `0x76` -> `pc += offset if dst as i32 >= IMM as i32` | ||
| - `JLT32_IMM` -> opcode = `0xa6` -> `pc += offset if dst as u32 < IMM as u32` | ||
| - `JLE32_IMM` -> opcode = `0xb6` -> `pc += offset if dst as u32 <= IMM as u32` | ||
| - `JSLT32_IMM` -> opcode = `0xc6` -> `pc += offset if dst as i32 < IMM as i32` | ||
| - `JSLE32_IMM` -> opcode = `0xd6` -> `pc += offset if dst as i32 <= IMM as i32` | ||
|
|
||
| - `JEQ32_REG` -> opcode = `0x1e` -> `pc += offset if dst as u32 == src as u32` | ||
| - `JGT32_REG` -> opcode = `0x2e` -> `pc += offset if dst as u32 > src as u32` | ||
| - `JGE32_REG` -> opcode = `0x3e` -> `pc += offset if dst as u32 >= src as u32` | ||
| - `JSET32_REG` -> opcode = `0x4e` -> | ||
| `pc += offset if (dst as u32 & src as u32) != 0` | ||
| - `JNE32_REG` -> opcode = `0x5e` -> `pc += offset if dst as u32 != src as u32` | ||
| - `JSGT32_REG` -> opcode = `0x6e` -> `pc += offset if dst as i32 > src as i32` | ||
| - `JSGE32_REG` -> opcode = `0x7e` -> `pc += offset if dst as i32 >= src as i32` | ||
| - `JLT32_REG` -> opcode = `0xae` -> `pc += offset if dst as u32 < src as u32` | ||
| - `JLE32_REG` -> opcode = `0xbe` -> `pc += offset if dst as u32 <= src as u32` | ||
| - `JSLT32_REG` -> opcode = `0xce` -> `pc += offset if dst as i32 < src as i32` | ||
| - `JSLE32_REG` -> opcode = `0xde` -> `pc += offset if dst as i32 <= src as i32` | ||
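The comparison semantics of the JMP32 class can be sketched as a single function. This is an illustrative sketch, not the interpreter's code. It assumes the upstream eBPF convention that a register-source opcode is the immediate opcode with bit `0x08` set, and it treats the `JSGE32` pair as signed greater-or-equal. The name `jmp32_taken` is hypothetical.

```rust
/// Bit that selects a register source operand in eBPF opcodes.
const SRC_REG_BIT: u8 = 0x08;

/// Evaluates a JMP32 condition on the low 32 bits of both operands.
/// `operand` is the immediate for `*_IMM` opcodes or the source register
/// value for `*_REG` opcodes. Returns `None` for non-JMP32 opcodes.
fn jmp32_taken(opcode: u8, dst: u64, operand: u64) -> Option<bool> {
    let (d, o) = (dst as u32, operand as u32);
    let (ds, os) = (d as i32, o as i32);
    // Clearing the source bit maps every `*_REG` opcode onto its `*_IMM` twin.
    Some(match opcode & !SRC_REG_BIT {
        0x16 => d == o,
        0x26 => d > o,
        0x36 => d >= o,
        0x46 => (d & o) != 0,
        0x56 => d != o,
        0x66 => ds > os,
        0x76 => ds >= os,
        0xa6 => d < o,
        0xb6 => d <= o,
        0xc6 => ds < os,
        0xd6 => ds <= os,
        _ => return None,
    })
}
```

The upper 32 bits of either operand never influence the result, which is the difference from the JMP class.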
|
|
||
| ### SMOD and SDIV instructions | ||
|
|
||
| The following opcodes must be allowed in the verifier for an sBPFv3 program and | ||
| the following behavior must occur in the virtual machine. | ||
|
|
||
| - `SMOD64_IMM` -> opcode = `0x97` -> `dst = dst as i64 % imm as i64` | ||
| - `SMOD64_REG` -> opcode = `0x9f` -> `dst = dst as i64 % src as i64` | ||
| - `SMOD32_IMM` -> opcode = `0x94` -> `dst = dst as i32 % imm as i32` | ||
| - `SMOD32_REG` -> opcode = `0x9c` -> `dst = dst as i32 % src as i32` | ||
| - `SDIV64_IMM` -> opcode = `0x37` -> `dst = dst as i64 / imm as i64` | ||
| - `SDIV64_REG` -> opcode = `0x3f` -> `dst = dst as i64 / src as i64` | ||
| - `SDIV32_IMM` -> opcode = `0x34` -> `dst = dst as i32 / imm as i32` | ||
| - `SDIV32_REG` -> opcode = `0x3c` -> `dst = dst as i32 / src as i32` | ||
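The signed arithmetic can be sketched in Rust, whose `/` and `%` on signed integers truncate toward zero. The text above does not say how division by zero or the `MIN / -1` overflow case is handled, so the sketch returns `None` for both and leaves the policy to the caller. It also assumes 32-bit results are zero-extended to 64 bits, and it does not model how a decoder tells these instructions apart from the unsigned `DIV` and `MOD` forms listed earlier with the same opcode bytes. All function names are hypothetical.

```rust
/// SDIV64: signed 64-bit division; `None` on zero divisor or overflow.
fn sdiv64(dst: u64, operand: u64) -> Option<u64> {
    (dst as i64).checked_div(operand as i64).map(|q| q as u64)
}

/// SMOD64: signed 64-bit remainder; `None` on zero divisor or overflow.
fn smod64(dst: u64, operand: u64) -> Option<u64> {
    (dst as i64).checked_rem(operand as i64).map(|r| r as u64)
}

/// SDIV32: signed division on the low 32 bits.
/// Assumption: the 32-bit result is zero-extended to 64 bits.
fn sdiv32(dst: u64, operand: u64) -> Option<u64> {
    (dst as i32).checked_div(operand as i32).map(|q| q as u32 as u64)
}

/// SMOD32: signed remainder on the low 32 bits, same assumption as SDIV32.
fn smod32(dst: u64, operand: u64) -> Option<u64> {
    (dst as i32).checked_rem(operand as i32).map(|r| r as u32 as u64)
}
```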
|
|
||
| ### Sign extended mov and sign extended load | ||
|
|
||
| The verifier must accept the following instruction encodings for sign extended | ||
| `mov` operations, and the virtual machine must implement the behavior detailed | ||
| below for them. | ||
|
|
||
| The existing `MOV64` and `MOV32` instructions were included in the list below | ||
| only for comparison. | ||
|
|
||
| - `MOV64` -> opcode = `0xbf`, offset = `0` -> `dst = src as i64` | ||
| (existing instruction - only here for comparison) | ||
| - `MOV64S8` -> opcode = `0xbf`, offset = `8` -> `dst = src as i8 as i64` | ||
| - `MOV64S16` -> opcode = `0xbf`, offset = `16` -> `dst = src as i16 as i64` | ||
| - `MOV64S32` -> opcode = `0xbf`, offset = `32` -> `dst = src as i32 as i64` | ||
|
|
||
| - `MOV32` -> opcode = `0xbc`, offset = `0` -> `dst = src as u32` | ||
| (existing instruction - only here for comparison) | ||
| - `MOV32S8` -> opcode = `0xbc`, offset = `8` -> `dst = src as i8 as i32` | ||
| - `MOV32S16` -> opcode = `0xbc`, offset = `16` -> `dst = src as i16 as i32` | ||
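The way the offset field selects the sign-extension width can be sketched as one dispatch function. The name `mov_reg` is hypothetical, and the sketch assumes that the 32-bit variants zero-extend their result into the upper half of the 64-bit register, as upstream eBPF does for 32-bit ALU operations.

```rust
/// Register-to-register MOV, dispatched on the opcode and the offset field.
/// Returns `None` for combinations that are not in the list above.
fn mov_reg(opcode: u8, offset: i16, src: u64) -> Option<u64> {
    Some(match (opcode, offset) {
        (0xbf, 0) => src,                              // MOV64
        (0xbf, 8) => src as i8 as i64 as u64,          // MOV64S8
        (0xbf, 16) => src as i16 as i64 as u64,        // MOV64S16
        (0xbf, 32) => src as i32 as i64 as u64,        // MOV64S32
        (0xbc, 0) => src as u32 as u64,                // MOV32
        (0xbc, 8) => src as i8 as i32 as u32 as u64,   // MOV32S8
        (0xbc, 16) => src as i16 as i32 as u32 as u64, // MOV32S16
        _ => return None,
    })
}
```

For instance, moving `0x80` with `MOV64S8` yields `0xFFFF_FFFF_FFFF_FF80`, while `MOV32S8` under the zero-extension assumption yields `0xFFFF_FF80`.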
|
|
||
|
|
||
| ### Indirect jump | ||
|
|
||
| The indirect jump instruction `jx` jumps to the instruction at the address | ||
| held in the source register. The verifier must allow the following opcode | ||
| and the runtime must implement the following behavior. | ||
|
|
||
| - `jx` -> opcode = `0x0d` -> pc = `src` | ||
|
|
||
| ### callx encoding | ||
|
|
||
| The encoding of `callx` must change so that the register containing the | ||
| address to jump to is encoded in the destination register field. | ||
|
|
||
| - `callx` -> opcode = `0x8d` -> pc = `dst` | ||
|
|
||
|
|
||
| ### Reinterpretation of LDDW as MOV and HOR | ||
|
|
||
| The LDDW instruction consists of two 8-byte instruction frames, but consumes | ||
| only one CU in the virtual machine. | ||
|
|
||
| The virtual machine must now interpret the first half of LDDW as the opcode | ||
| `0x18`, with the same behavior as `mov32 reg, imm`, zero-extending the | ||
| immediate value. | ||
|
|
||
| Likewise, the second half of LDDW must be interpreted as the opcode `0x00`, | ||
| which performs a bitwise OR of the immediate into the 32 most significant | ||
| bits of the destination register. This instruction must be called | ||
| `hor64 reg, imm`. | ||
|
|
||
| The HOR instruction, however, must encode a destination register on which to | ||
| operate. The encoding is as follows. | ||
|
|
||
| - `HOR64` -> opcode = `0x00` -> `dst = dst | (imm << 32)` | ||
|
|
||
| Consequently, we must charge two CUs for an LDDW execution. | ||
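The split of LDDW into two independently executed halves can be sketched as follows. The function names and the returned CU count are illustrative only. The sketch assumes each half carries a 32-bit immediate and that each executed slot costs one CU.

```rust
/// First slot (opcode 0x18): behaves like `mov32 reg, imm`, zero-extending
/// the immediate into the 64-bit destination register.
fn lddw_low(imm: u32) -> u64 {
    imm as u64
}

/// Second slot, HOR64 (opcode 0x00): `dst = dst | (imm << 32)`.
fn hor64(dst: u64, imm: u32) -> u64 {
    dst | ((imm as u64) << 32)
}

/// Runs both slots in order and returns the loaded value together with the
/// number of CUs charged (one per slot, by assumption).
fn lddw(imm_low: u32, imm_high: u32) -> (u64, u64) {
    let dst = lddw_low(imm_low);
    let dst = hor64(dst, imm_high);
    (dst, 2)
}
```

Because the first slot clears the upper 32 bits, the OR in the second slot produces the full 64-bit constant regardless of the register's previous contents.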
|
|
||
| ## Alternatives Considered | ||
|
|
||
| We have considered diverging from the eBPF standard by introducing new opcodes | ||
| and creating instructions specific to the Solana environment. We discarded | ||
| such an approach to be compatible with the existing LLVM eBPF code generation. | ||
|
|
||
| We are not adding `may_goto`, since it has an implicit condition implemented | ||
| in the kernel, and does not yet have any path for code generation. We are also | ||
| not supporting any of the eBPF atomic instructions, since the Solana virtual | ||
| machine is single threaded. | ||
|
|
||
| In eBPFv4, there are two other instructions that could be part of the Solana | ||
| vendored virtual machine, but are not included in the proposal: `gotol` | ||
| (opcode `0x06`) and `bswap` (opcode `0xd7`). | ||
|
|
||
| `gotol` is an unconditional jump with a 32-bit offset. It does not replace the | ||
| existing JA instruction (opcode `0x05`), and is used only when the 16-bit | ||
| offset of JA is too small for a jump. This situation appears only in very | ||
| large functions, or in environments with aggressive inlining, and has so far | ||
| not been a problem for smart contracts. | ||
|
|
||
| `bswap` supersedes LE (opcode `0xd4`) and BE (opcode `0xdc`) in eBPFv4, but | ||
| otherwise behaves similarly. Byte swap is rarely used as an instruction in | ||
| smart contracts, and so far the existing opcodes cover all needs. | ||
|
|
||
| ## Impact | ||
|
|
||
| With these changes, and a patch to the aya bpf-linker, developers will be able | ||
| to install the bpf-linker and use the existing rustup/cargo/rustc | ||
| infrastructure to build their programs. | ||
|
|
||
| ## Security Considerations | ||
|
|
||
| None | ||
|
|
||