-
Notifications
You must be signed in to change notification settings - Fork 250
SIMD-0173: SBPF instruction encoding improvements #173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,123 @@ | ||
| --- | ||
| simd: '0173' | ||
| title: SBPF instruction encoding improvements | ||
| authors: | ||
| - Alexander Meißner | ||
| category: Standard | ||
| type: Core | ||
| status: Review | ||
| created: 2024-09-05 | ||
| feature: F6UVKh1ujTEFK3en2SyAL3cdVnqko1FVEXWhmdLRu6WP | ||
| extends: SIMD-0161 | ||
| --- | ||
|
|
||
| ## Summary | ||
|
|
||
| There are some instructions with questionable encodings, that when slightly | ||
| adjusted, could significantly simplify verification and execution of programs. | ||
|
|
||
| ## Motivation | ||
|
|
||
| The instruction `lddw dst, imm` is currently the only instruction which takes | ||
| two instruction slots. This proposal splits it into a two one-slot instruction | ||
| sequence: `mov32 dst, imm` and an introduced `hor64 dst, imm`. This way all | ||
| instructions will be exactly one slot long which will simplify: | ||
|
|
||
| - Calculating the number of instructons in a program will no longer require a | ||
| full linear scan. A division of the length of the text section by the | ||
| instruction slot size will suffice. | ||
| - The instruction meter will no longer have to skip one instruction slot when | ||
| counting a `LDDW` instruction. | ||
| - Jump and call instructions will no longer have to verify that the desination | ||
| is not the second half of a `LDDW` instruction. | ||
| - The verifier will no longer have to check that `LDDW` instructions are | ||
| complete and its first or second half does not occur without the other on its | ||
| own. | ||
|
|
||
| The `LE` instruction is essentially useless as only `BE` performs a byte-swap. | ||
| Its runtime behavior is close to no-op and can be replicated by other | ||
| instructions: | ||
|
|
||
| - `le dst, 16` behaves the same as `and32 dst, 0xFFFF` | ||
| - `le dst, 32` behaves the same as `and32 dst, 0xFFFFFFFF` | ||
| - `le dst, 64` behaves the same as `mov64 dst, src` | ||
|
|
||
| The `CALLX` instruction encodes its source register in the immediate field. | ||
| This is makes the instruction decoder more complex because it is the only case | ||
| in which a register is encoded in the immediate field, for no reason. | ||
|
|
||
| With all of the above changes and the ones defined in SIMD-0174, the memory | ||
| related instructions can be moved into the ALU instruction classes. Doing so | ||
| would free up 8 instruction classes completely, giving us back three bits of | ||
| instruction encoding. | ||
|
|
||
| ## Alternatives Considered | ||
|
|
||
| None. | ||
|
|
||
| ## New Terminology | ||
|
|
||
| None. | ||
|
|
||
| ## Detailed Design | ||
|
|
||
| The following must go into effect if and only if a program indicates the | ||
| SBPF-version v2 or higher in its program header (see SIMD-0161). Some now | ||
| unreachable verification and execution checks around `LDDW` can be safely | ||
| removed (see motivation). | ||
|
|
||
| ### Changes to the Bytecode Verifier | ||
|
|
||
| A program containing one of the following instructions must throw | ||
| `VerifierError::UnknownOpCode` during verification: | ||
|
|
||
| - the `LDDW` instruction (opcodes `0x18` and `0x00`) | ||
| - the `LE` instruction (opcode `0xD4`) | ||
| - the moved opcodes: | ||
| - `0x72`, `0x71`, `0x73` (`STB`, `LDXB`, `STXB`) | ||
| - `0x6A`, `0x69`, `0x6B` (`STH`, `LDXH`, `STXH`) | ||
| - `0x62`, `0x61`, `0x63` (`STW`, `LDXW`, `STXW`) | ||
| - `0x7A`, `0x79`, `0x7B` (`STDW`, `LDXDW`, `STXDW`) | ||
|
|
||
| A program containing one of the following instructions must **not** throw | ||
| `VerifierError::UnknownOpCode` during verification anymore: | ||
|
|
||
| - the `HOR64` instruction (opcode `0xF7`) | ||
| - the moved opcodes: | ||
| - `0x27`, `0x2C`, `0x2F` (`STB`, `LDXB`, `STXB`) | ||
| - `0x37`, `0x3C`, `0x3F` (`STH`, `LDXH`, `STXH`) | ||
| - `0x87`, `0x8C`, `0x8F` (`STW`, `LDXW`, `STXW`) | ||
| - `0x97`, `0x9C`, `0x9F` (`STDW`, `LDXDW`, `STXDW`) | ||
|
Comment on lines
+87
to
+90
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Probably a little late for my comment, but the critique is still valid. Repurposing instruction classes brings no immediate benefit for either the validator or developers. We can only reap potential positive effects of this change if coupled with a future SIMD. Having said that, modifications like this could be postponed to when the gains are clear and well documented. |
||
|
|
||
| When a `CALLX` instruction (opcode `0x8D`) is encountered during verification, | ||
| the `src` register field must be verified instead of the `imm` immediate field. | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For information, (upstream) LLVM recently changed the encoding of Would modifying the proposal to use
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We could but it would require backports to v2.2. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Thanks for the information. I was wondering whether SIMD-0173 was already frozen/not subject to change, as this Pull Request is not merged. It seems SBPF v2 (and v3 ?) are already frozen standards and a version bump would be needed to make other changes. It feels like much effort for only few improvements, to change the register used by
I don't know the rationale behind the decision of choosing
While the Linux kernel does not support indirect calls ( There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For information (for future readers stumbling on this thread), modifying the encoding of callx to use the destination register (and to revert SIMD-0173) is now being discussed in SIMD-0377: eBPF ISA compatibility. The sBPF implementation was already modified: anza-xyz/sbpf#89 |
||
| Otherwise, the verification rule stays the same: The src register must be in | ||
| the inclusive range from R0 to R9. | ||
|
|
||
| ### Changes to Execution | ||
|
|
||
| The introduced `HOR64` instruction (opcode `0xF7`) must take its immediate | ||
| value, shift it 32 bit towards the MSBs (multiplication-like left shift) and | ||
| then bitwise OR it into the given `dst` register. | ||
Lichtso marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| For the `CALLX` instruction (opcode `0x8D`) the jump destination must be read | ||
| from the `src` register field instead of the `imm` immediate field. | ||
|
|
||
| The execution behavior of the moved instructions is transferred to their new | ||
| opcodes: | ||
|
|
||
| - `0x72` => `0x27`, `0x71` => `0x2C`, `0x73` => `0x2F` | ||
| - `0x6A` => `0x37`, `0x69` => `0x3C`, `0x6B` => `0x3F` | ||
| - `0x62` => `0x87`, `0x61` => `0x8C`, `0x63` => `0x8F` | ||
| - `0x7A` => `0x97`, `0x79` => `0x9C`, `0x7B` => `0x9F` | ||
|
|
||
| ## Impact | ||
|
|
||
| The toolchain will emit machinecode according to the selected SBPF version. | ||
| As most proposed changes affect the encoding only, and not the functionallity, | ||
| we expect to see no impact on dApp developers. The only exception is that | ||
| 64-bit immediate loads will now cost 2 CU instead of 1 CU. | ||
|
|
||
| ## Security Considerations | ||
|
|
||
| None. | ||
Uh oh!
There was an error while loading. Please reload this page.