259 changes: 259 additions & 0 deletions proposals/0377-ebpf-isa-compatibility.md
@@ -0,0 +1,259 @@
---
simd: '0377'
title: eBPF ISA compatibility
authors:
- Lucas Steuernagel (Anza)
- Alexander Meißner (Anza)
category: Standard
type: Core
status: Review
created: 2025-10-09
feature: (fill in with feature key and github tracking issues once accepted)
---

## Summary

This SIMD introduces instruction set architecture (ISA) changes to make the
sBPF virtual machine compatible with the latest existing version of eBPF ISA
generated by its LLVM backend.

It reverts past ISA changes, modifies the encoding of existing instructions
and brings new instructions to the Solana virtual machine.

## Motivation

By default, the eBPF target of the Rust compiler emits code for eBPFv1, whose
only incompatibility with the Solana virtual machine is the `callx`
instruction. To let Solana programs use newer instructions and decrease their
CU consumption, we want to be compatible with at least the current eBPF version
(v3), which brings in new instructions. For that to be possible, we must modify
our virtual machine to fully support eBPF.

Additionally, we adopt ahead of time some eBPF v4 instructions that are
beneficial for Solana. Doing so decreases the update burden if v4 becomes the
new default configuration for the upstream LLVM eBPF target.

## New Terminology

The set containing these new instructions will form an sBPFv3 program.

## Detailed Design

### ELF Identification

Programs containing the instructions mentioned in this SIMD must have the
`0x03` value in the `e_flags` field of their header.
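
As a minimal sketch of the check described above, the function below reads `e_flags` from a raw header. It assumes a little-endian 64-bit ELF file, where `e_flags` is a 4-byte field at byte offset 48. The name `is_sbpf_v3` and both constants are hypothetical and chosen only for illustration.

```rust
/// Byte offset of the 32-bit `e_flags` field in a 64-bit ELF header.
const E_FLAGS_OFFSET: usize = 48;
/// `e_flags` value that identifies an sBPFv3 program (from this SIMD).
const SBPF_V3_FLAGS: u32 = 0x03;

/// Hypothetical helper: returns `Some(true)` when the header carries the
/// sBPFv3 value, `Some(false)` for any other value, and `None` when the
/// buffer is too short to contain `e_flags`.
/// Assumption: the file is a little-endian ELF64 image.
fn is_sbpf_v3(elf: &[u8]) -> Option<bool> {
    let bytes = elf.get(E_FLAGS_OFFSET..E_FLAGS_OFFSET + 4)?;
    let e_flags = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    Some(e_flags == SBPF_V3_FLAGS)
}
```

A real loader would also validate the ELF magic, class and endianness before trusting this field.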

### Revert SIMD-0166

SIMD-0166 must be reverted beginning with sBPFv3, since we will introduce a
new design for dynamic stack frames that is closer to the eBPF code generation.

### Revert SIMD-0173

All changes proposed in SIMD-0173 will no longer take effect in sBPFv3.
Consequently, the verifier must accept the following opcodes:

- `0x18`, `0x00` (`LDDW`)
- `0x72`, `0x71`, `0x73` (`STB`, `LDXB`, `STXB`)
- `0x6A`, `0x69`, `0x6B` (`STH`, `LDXH`, `STXH`)
- `0x62`, `0x61`, `0x63` (`STW`, `LDXW`, `STXW`)
- `0x7A`, `0x79`, `0x7B` (`STDW`, `LDXDW`, `STXDW`)
- `0xD4` (`LE`)

The new opcodes introduced in SIMD-0173 must be rejected in the verifier with
`VerifierError::UnknownOpCode`:

- the `HOR64` instruction (opcode `0xF7`)
- the moved opcodes:
  - `0x27`, `0x2C`, `0x2F` (`STB`, `LDXB`, `STXB`)
  - `0x37`, `0x3C`, `0x3F` (`STH`, `LDXH`, `STXH`)
  - `0x87`, `0x8C`, `0x8F` (`STW`, `LDXW`, `STXW`)
  - `0x97`, `0x9C`, `0x9F` (`STDW`, `LDXDW`, `STXDW`)
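
The restored load/store opcodes follow the classic eBPF layout: the low three bits select the instruction class, bits 3 and 4 select the access width, and the top three bits select the addressing mode. The sketch below decodes that layout. `MemOp` and `decode_mem_opcode` are hypothetical names, and the function is not the verifier itself; it only illustrates why `0x72` decodes as a store while the moved opcode `0x27` does not.

```rust
/// Memory instruction kinds in the classic eBPF encoding.
#[derive(Debug, PartialEq)]
enum MemOp {
    Ldx, // load from memory into a register
    St,  // store an immediate to memory
    Stx, // store a register to memory
}

/// Hypothetical helper: decodes a classic load/store opcode into its kind and
/// access width in bytes. Returns `None` for any opcode that is not a
/// `MEM`-mode `LDX`, `ST` or `STX` instruction.
fn decode_mem_opcode(opcode: u8) -> Option<(MemOp, usize)> {
    // Bits 5..=7 hold the addressing mode; 0x60 is the plain memory mode.
    if opcode & 0xe0 != 0x60 {
        return None;
    }
    // Bits 0..=2 hold the instruction class.
    let op = match opcode & 0x07 {
        0x01 => MemOp::Ldx,
        0x02 => MemOp::St,
        0x03 => MemOp::Stx,
        _ => return None,
    };
    // Bits 3..=4 hold the access width.
    let width = match opcode & 0x18 {
        0x00 => 4, // word
        0x08 => 2, // half word
        0x10 => 1, // byte
        _ => 8,    // 0x18: double word
    };
    Some((op, width))
}
```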

### Revert SIMD-0174

All changes proposed in SIMD-0174 will no longer take effect in sBPFv3.
Consequently, the verifier must accept the following opcodes:

- the `MUL` instruction (opcodes `0x24`, `0x2C`, `0x27` and `0x2F`)
- the `DIV` instruction (opcodes `0x34`, `0x3C`, `0x37` and `0x3F`)
- the `MOD` instruction (opcodes `0x94`, `0x9C`, `0x97` and `0x9F`)
- the `NEG` instruction (opcodes `0x84` and `0x87`)

The verifier must reject, with `VerifierError::UnknownOpCode`, programs that
contain any of the following instructions:

- the `UHMUL64` instruction (opcodes `0x36` and `0x3E`)
- the `UDIV32` instruction (opcodes `0x46` and `0x4E`)
- the `UDIV64` instruction (opcodes `0x56` and `0x5E`)
- the `UREM32` instruction (opcodes `0x66` and `0x6E`)
- the `UREM64` instruction (opcodes `0x76` and `0x7E`)
- the `LMUL32` instruction (opcodes `0x86` and `0x8E`)
- the `LMUL64` instruction (opcodes `0x96` and `0x9E`)
- the `SHMUL64` instruction (opcodes `0xB6` and `0xBE`)
- the `SDIV32` instruction (opcodes `0xC6` and `0xCE`)
- the `SDIV64` instruction (opcodes `0xD6` and `0xDE`)
- the `SREM32` instruction (opcodes `0xE6` and `0xEE`)
- the `SREM64` instruction (opcodes `0xF6` and `0xFE`)

### Execution changes

MOV32_REG (opcode `0xBC`) must NOT perform sign extension.

SUB32_IMM (opcode `0x14`) and SUB64_IMM (opcode `0x17`) must perform the
operation `dst = dst - imm`.
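
The two rules above can be sketched as plain functions over 64-bit register values. The function names are hypothetical. Wrapping arithmetic, zeroing of the upper 32 bits for 32-bit results, and sign extension of the immediate in the 64-bit case are assumptions taken from upstream eBPF behavior, not statements from this proposal.

```rust
/// MOV32_REG without sign extension: keep the low 32 bits of `src` and
/// clear the upper 32 bits of the result.
fn mov32_reg(src: u64) -> u64 {
    src as u32 as u64
}

/// SUB32_IMM: the register operand minus the immediate, in 32-bit
/// wrapping arithmetic. Assumption: the upper 32 bits of the result are zero.
fn sub32_imm(dst: u64, imm: i32) -> u64 {
    (dst as u32).wrapping_sub(imm as u32) as u64
}

/// SUB64_IMM: the register operand minus the immediate, in 64-bit
/// wrapping arithmetic. Assumption: the immediate is sign extended to 64 bits.
fn sub64_imm(dst: u64, imm: i32) -> u64 {
    dst.wrapping_sub(imm as i64 as u64)
}
```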

### Dynamic stack frames

To be more compatible with eBPF, the implementation of dynamic stack frames is
going to change.

The R10 register must continue to be the frame pointer, i.e. pointing to the
highest address accessible in a function. As such, the stack will grow upwards.

Isn't it pointing one byte past the highest addressable byte?


At the prologue of each function, there may be one ADD64 (opcode `0x07`)
instruction to adjust the frame pointer to its new position with a positive
offset (`add64 R10, +imm`). When a function returns, the virtual machine will
automatically restore the frame pointer register with the value used in the
caller, so programs do not need to emit any instruction to adjust the frame
pointer in the epilogue of each function.
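
The call and return behavior described above can be sketched with a small model of the virtual machine state. The `Vm` type and its methods are hypothetical; only the rule that the frame pointer is saved on call and restored automatically on return comes from the text.

```rust
/// Hypothetical model of the state relevant to dynamic stack frames.
struct Vm {
    /// Frame pointer register.
    r10: u64,
    /// Frame pointers of the callers, saved by the virtual machine.
    saved_frame_pointers: Vec<u64>,
}

impl Vm {
    fn new(initial_frame_pointer: u64) -> Self {
        Vm { r10: initial_frame_pointer, saved_frame_pointers: Vec::new() }
    }

    /// On a call, the virtual machine remembers the caller's frame pointer.
    fn call(&mut self) {
        self.saved_frame_pointers.push(self.r10);
    }

    /// `add64 R10, +imm` in the callee's prologue moves the frame pointer.
    fn add64_r10(&mut self, imm: i32) {
        self.r10 = self.r10.wrapping_add(imm as i64 as u64);
    }

    /// On return, the caller's frame pointer is restored automatically.
    /// Returns `false` if there is no caller frame left.
    fn ret(&mut self) -> bool {
        match self.saved_frame_pointers.pop() {
            Some(frame_pointer) => {
                self.r10 = frame_pointer;
                true
            }
            None => false,
        }
    }
}
```

Because the restore happens inside `ret`, the callee needs no matching epilogue instruction to undo its prologue adjustment.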

### JMP32 instruction class

The JMP32 instruction class uses 32-bit-wide operands for the same operations
as the JMP class.

The following opcodes must be allowed in the verifier and the virtual machine
must implement the behavior described below for each one of them.

- `JEQ32_IMM` -> opcode = `0x16` -> `pc += offset if dst as u32 == IMM as u32`
- `JGT32_IMM` -> opcode = `0x26` -> `pc += offset if dst as u32 > IMM as u32`
- `JGE32_IMM` -> opcode = `0x36` -> `pc += offset if dst as u32 >= IMM as u32`
- `JSET32_IMM` -> opcode = `0x46` ->
  `pc += offset if (dst as u32 & IMM as u32) != 0`
- `JNE32_IMM` -> opcode = `0x56` -> `pc += offset if dst as u32 != IMM as u32`
- `JSGT32_IMM` -> opcode = `0x66` -> `pc += offset if dst as i32 > IMM as i32`
- `JSGE32_IMM` -> opcode = `0x76` -> `pc += offset if dst as i32 >= IMM as i32`
- `JLT32_IMM` -> opcode = `0xa6` -> `pc += offset if dst as u32 < IMM as u32`
- `JLE32_IMM` -> opcode = `0xb6` -> `pc += offset if dst as u32 <= IMM as u32`
- `JSLT32_IMM` -> opcode = `0xc6` -> `pc += offset if dst as i32 < IMM as i32`
- `JSLE32_IMM` -> opcode = `0xd6` -> `pc += offset if dst as i32 <= IMM as i32`

- `JEQ32_REG` -> opcode = `0x1e` -> `pc += offset if dst as u32 == src as u32`
- `JGT32_REG` -> opcode = `0x2e` -> `pc += offset if dst as u32 > src as u32`
- `JGE32_REG` -> opcode = `0x3e` -> `pc += offset if dst as u32 >= src as u32`
- `JSET32_REG` -> opcode = `0x4e` ->
  `pc += offset if (dst as u32 & src as u32) != 0`
- `JNE32_REG` -> opcode = `0x5e` -> `pc += offset if dst as u32 != src as u32`
- `JSGT32_REG` -> opcode = `0x6e` -> `pc += offset if dst as i32 > src as i32`
- `JSGE32_REG` -> opcode = `0x7e` -> `pc += offset if dst as i32 >= src as i32`
- `JLT32_REG` -> opcode = `0xae` -> `pc += offset if dst as u32 < src as u32`
- `JLE32_REG` -> opcode = `0xbe` -> `pc += offset if dst as u32 <= src as u32`
- `JSLT32_REG` -> opcode = `0xce` -> `pc += offset if dst as i32 < src as i32`
- `JSLE32_REG` -> opcode = `0xde` -> `pc += offset if dst as i32 <= src as i32`
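
In the upstream eBPF encoding, a JMP32 opcode is composed of the class (`0x06` in the low three bits), a source bit (`0x08` selects the register operand instead of the immediate) and the operation in the upper nibble. The sketch below evaluates a JMP32 condition from those three fields. The function name `jmp32_taken` is hypothetical, and the code only decides whether the branch is taken; applying `pc += offset` is left to the caller.

```rust
/// Hypothetical helper: returns `Some(true)` if the JMP32 branch is taken,
/// `Some(false)` if it falls through, and `None` if `opcode` is not a
/// conditional JMP32 opcode.
fn jmp32_taken(opcode: u8, dst: u64, src: u64, imm: i32) -> Option<bool> {
    // The low three bits hold the instruction class; 0x06 is JMP32.
    if opcode & 0x07 != 0x06 {
        return None;
    }
    // Only the low 32 bits of each operand take part in the comparison.
    let lhs = dst as u32;
    // Bit 0x08 selects the source register instead of the immediate.
    let rhs = if opcode & 0x08 != 0 { src as u32 } else { imm as u32 };
    let (signed_lhs, signed_rhs) = (lhs as i32, rhs as i32);
    // The upper nibble selects the comparison.
    let taken = match opcode & 0xf0 {
        0x10 => lhs == rhs,               // JEQ32
        0x20 => lhs > rhs,                // JGT32
        0x30 => lhs >= rhs,               // JGE32
        0x40 => (lhs & rhs) != 0,         // JSET32
        0x50 => lhs != rhs,               // JNE32
        0x60 => signed_lhs > signed_rhs,  // JSGT32
        0x70 => signed_lhs >= signed_rhs, // JSGE32
        0xa0 => lhs < rhs,                // JLT32
        0xb0 => lhs <= rhs,               // JLE32
        0xc0 => signed_lhs < signed_rhs,  // JSLT32
        0xd0 => signed_lhs <= signed_rhs, // JSLE32
        _ => return None,
    };
    Some(taken)
}
```

For example, `jmp32_taken(0x16, 0x1_0000_0005, 0, 5)` compares only the low 32 bits of `dst`, so the branch is taken even though the full 64-bit value differs from 5.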

### SMOD and SDIV instructions

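The following opcodes must be allowed in the verifier for an sBPFv3 program and
the following behavior must occur in the virtual machine. These instructions
reuse the opcodes of the unsigned `DIV` and `MOD` instructions; in the upstream
eBPF ISA, the signed variants are distinguished by an offset field of `1`.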

- `SMOD64_IMM` -> opcode = `0x97` -> `dst = dst as i64 % imm as i64`
- `SMOD64_REG` -> opcode = `0x9f` -> `dst = dst as i64 % src as i64`
- `SMOD32_IMM` -> opcode = `0x94` -> `dst = dst as i32 % imm as i32`
- `SMOD32_REG` -> opcode = `0x9c` -> `dst = dst as i32 % src as i32`
- `SDIV64_IMM` -> opcode = `0x37` -> `dst = dst as i64 / imm as i64`
- `SDIV64_REG` -> opcode = `0x3f` -> `dst = dst as i64 / src as i64`
- `SDIV32_IMM` -> opcode = `0x34` -> `dst = dst as i32 / imm as i32`
- `SDIV32_REG` -> opcode = `0x3c` -> `dst = dst as i32 / src as i32`
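
A minimal sketch of the signed arithmetic above follows. The function names are hypothetical. The list does not say what happens on division by zero or on overflow (`MIN / -1`), so this sketch returns `None` in those cases and leaves the policy to the caller. Zeroing the upper 32 bits of 32-bit results is an assumption taken from upstream eBPF.

```rust
/// SDIV64: signed 64-bit division. `None` on division by zero or overflow.
fn sdiv64(dst: u64, rhs: i64) -> Option<u64> {
    (dst as i64).checked_div(rhs).map(|v| v as u64)
}

/// SMOD64: signed 64-bit remainder. `None` on division by zero or overflow.
fn smod64(dst: u64, rhs: i64) -> Option<u64> {
    (dst as i64).checked_rem(rhs).map(|v| v as u64)
}

/// SDIV32: signed 32-bit division on the low half of `dst`.
/// Assumption: the upper 32 bits of the result are zero.
fn sdiv32(dst: u64, rhs: i32) -> Option<u64> {
    (dst as i32).checked_div(rhs).map(|v| v as u32 as u64)
}

/// SMOD32: signed 32-bit remainder on the low half of `dst`.
/// Assumption: the upper 32 bits of the result are zero.
fn smod32(dst: u64, rhs: i32) -> Option<u64> {
    (dst as i32).checked_rem(rhs).map(|v| v as u32 as u64)
}
```

Rust's `/` and `%` on signed integers truncate toward zero, so the remainder takes the sign of the dividend.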

### Sign extended mov and sign extended load

The verifier must accept the following instruction encodings for sign extended
`mov` operations, and the virtual machine must implement the behavior detailed
below for them.

The existing `MOV64` and `MOV32` instructions were included in the list below
only for comparison.

- `MOV64` -> opcode = `0xbf`, offset = `0` -> `dst = src as i64`
  (existing instruction - only here for comparison)
- `MOV64S8` -> opcode = `0xbf`, offset = `8` -> `dst = src as i8 as i64`
- `MOV64S16` -> opcode = `0xbf`, offset = `16` -> `dst = src as i16 as i64`
- `MOV64S32` -> opcode = `0xbf`, offset = `32` -> `dst = src as i32 as i64`

- `MOV32` -> opcode = `0xbc`, offset = `0` -> `dst = src as u32`
  (existing instruction - only here for comparison)
- `MOV32S8` -> opcode = `0xbc`, offset = `8` -> `dst = src as i8 as i32`
- `MOV32S16` -> opcode = `0xbc`, offset = `16` -> `dst = src as i16 as i32`
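
The offset field selects the width that is sign extended, which can be sketched as a match on the offset. The function names are hypothetical. For the 32-bit variants, this sketch assumes the upper 32 bits of the destination are cleared, as with the plain `MOV32`. Offsets outside the list yield `None`.

```rust
/// MOV64 family (opcode 0xbf): sign extend the low 8, 16 or 32 bits of `src`
/// to 64 bits, or copy `src` unchanged when the offset is 0.
fn mov64_sext(src: u64, offset: i16) -> Option<u64> {
    match offset {
        0 => Some(src),
        8 => Some(src as i8 as i64 as u64),
        16 => Some(src as i16 as i64 as u64),
        32 => Some(src as i32 as i64 as u64),
        _ => None,
    }
}

/// MOV32 family (opcode 0xbc): sign extend the low 8 or 16 bits of `src`
/// to 32 bits. Assumption: the upper 32 bits of the result are zero.
fn mov32_sext(src: u64, offset: i16) -> Option<u64> {
    match offset {
        0 => Some(src as u32 as u64),
        8 => Some(src as i8 as i32 as u32 as u64),
        16 => Some(src as i16 as i32 as u32 as u64),
        _ => None,
    }
}
```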


### Indirect jump

The indirect jump instruction `jx` jumps to the instruction pointed to by the
address in the source register. The verifier must allow the following opcode
and the runtime must implement the following behavior.

- `jx` -> opcode = `0x0d` -> `pc = src`

### callx encoding

The encoding of `callx` must change so that the address to jump to is read
from the destination register.

- `callx` -> opcode = `0x8d` -> `pc = dst`
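
To show where the destination register sits in an instruction, the sketch below decodes the standard 8-byte eBPF instruction frame in its little-endian layout: one opcode byte, one register byte with the destination in the low nibble and the source in the high nibble, a 16-bit offset and a 32-bit immediate. `Insn`, `decode` and `indirect_target` are hypothetical names, and the register file is modeled as a plain array of eleven values.

```rust
/// Fields of one 8-byte instruction frame.
#[derive(Debug, PartialEq)]
struct Insn {
    opcode: u8,
    dst: u8,
    src: u8,
    offset: i16,
    imm: i32,
}

/// Decodes one little-endian instruction frame.
fn decode(bytes: [u8; 8]) -> Insn {
    Insn {
        opcode: bytes[0],
        // Low nibble: destination register. High nibble: source register.
        dst: bytes[1] & 0x0f,
        src: bytes[1] >> 4,
        offset: i16::from_le_bytes([bytes[2], bytes[3]]),
        imm: i32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]),
    }
}

/// Hypothetical helper: reads the target of an indirect call from the
/// destination register. `None` if the register index is out of range.
fn indirect_target(insn: &Insn, registers: &[u64; 11]) -> Option<u64> {
    registers.get(insn.dst as usize).copied()
}
```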


### Reinterpretation of LDDW as MOV and HOR

The LDDW instruction consists of two 8-byte instruction frames, but currently
consumes only one CU in the virtual machine.

The virtual machine must now interpret the first half of LDDW as the opcode
`0x18`, with the same behavior as `mov32 reg, imm`, zero extending the
immediate value.

Likewise, the second half of LDDW must be interpreted as the opcode `0x00`,
which performs a bitwise OR of the immediate into the upper 32 bits of the
destination register. This instruction must be called `hor64 reg, imm`.

The HOR instruction, however, must encode a destination register on which to
operate. The encoding is as follows.

- `HOR64` -> opcode = `0x00` -> `dst = dst | (imm << 32)`

Consequently, we must charge two CUs for an LDDW execution.
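
The split can be sketched as two small functions whose composition reproduces the 64-bit constant that LDDW used to load in one step. The function names are hypothetical; the behavior of each half follows the description above.

```rust
/// First half (opcode 0x18): same behavior as `mov32 reg, imm`, zero
/// extending the immediate into the 64-bit register.
fn mov32_imm(imm: i32) -> u64 {
    imm as u32 as u64
}

/// Second half (opcode 0x00): `dst = dst | (imm << 32)`, with the immediate
/// treated as an unsigned 32-bit value before shifting.
fn hor64(dst: u64, imm: i32) -> u64 {
    dst | ((imm as u32 as u64) << 32)
}

/// The two halves executed back to back, as the virtual machine would run
/// them for one LDDW with low word `lo` and high word `hi`.
fn lddw(lo: i32, hi: i32) -> u64 {
    hor64(mov32_imm(lo), hi)
}
```

Since each half is now an ordinary instruction, running both naturally costs two CUs.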

## Alternatives Considered

We have considered diverging from the eBPF standard by introducing new opcodes
and creating instructions specific to the Solana environment. We discarded
that approach in order to stay compatible with the existing LLVM eBPF code
generation.

We are not adding `may_goto`, since it has an implicit condition implemented
in the kernel, and does not yet have any path for code generation. We are also
not supporting any of the eBPF atomic instructions, since the Solana virtual
machine is single threaded.

In eBPFv4, there are two other instructions that could be part of the Solana
vendored virtual machine, but are not included in the proposal: `gotol`
(opcode `0x06`) and `bswap` (opcode `0xd7`).

`gotol` is an unconditional jump with a 32-bit offset. It does not replace the
existing JA instruction (opcode `0x05`), and is used only when the 16-bit
offset of JA is too small for a jump. This situation appears only in very
large functions, or in environments with aggressive inlining, and has not so
far been a problem for smart contracts.

`bswap` supersedes LE (opcode `0xd4`) and BE (opcode `0xdc`) in eBPFv4, but
otherwise behaves similarly. Byte swap is rarely used as an instruction in
smart contracts, and so far the existing opcodes cover every need.

## Impact

With these changes, and a patch to the aya bpf-linker, developers will be able
to install the bpf-linker and use the existing rustup/cargo/rustc
infrastructure to build their programs.

## Security Considerations

None
