Skip to content

Conversation

@Amxx
Copy link
Collaborator

@Amxx Amxx commented Dec 2, 2025

Followup on a discution started by #5826

I was looking at the _getNodeId function in TrieProof, and I was wondering about something. What if the nodeId is a keccak256 hash, but the hash starts with (one or multiple) zeros ? If there is only one zero byte (1:256 chances) then the RLP encoding removes the prefix zero, we end up with an encoded node bytes32 that is 32 bytes long, that the system will see that as a hash (id of a long node) ... but what if the hash contains 2 zero bytes (1:65535 chances). Then the encoded value would be only 31 bytes long, and we'll have an issue with it being recognised as a short node when its not !!!

The issue is not on the decoding side though (the TrieProof library works AFAIK), but on the encoding side. When encoding such a hash, the leading zeros should NOT be compressed to make sure the decoder parses it correstly. We are in a case where "leading zero are information" (as @ernest once mentionned it).

When I use ethers.js and do ethers.encodeRlp(ethers.ZeroHash) I get 0xa00000000000000000000000000000000000000000000000000000000000000000

However, our RLP library redirects the calls from encode(bytes32 input) to encode(uint256 input) by "just" casting the input. So using encode(bytes32 input) will compress the zeros. Is that OK ? Should we change the RLP.sol library to not remove the prefix zeros when the input is a bytes32 (just like we do for addresses) ?

[...]

RLP doesn't define the solidity types, only "array" and "scalar" values. The question is "how should we interpret the solidity type". I don't think there is a definitive answer. At least there is nothing "standard".

It is a no-brainer that the solidity bytes is an array when it comes to RLP.

We also agree that uint256 represents a scalar, and the zeros should be removed.

What we discussed is the special case of address, and now bytes32. I believe that their length is essential to the type (unlike uint256), and that they should be considered as array of fixed size (20 bytes for address, 32 bytes for bytes32)

PR Checklist

  • Tests
  • Documentation
  • Changeset entry (run npx changeset add)

@changeset-bot
Copy link

changeset-bot bot commented Dec 2, 2025

🦋 Changeset detected

Latest commit: e1bbdf0

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
openzeppelin-solidity Minor

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@Amxx Amxx added this to the 5.6 milestone Dec 2, 2025
@Amxx Amxx requested review from ernestognw and luiz-lvj December 2, 2025 16:11
@Amxx Amxx marked this pull request as ready for review December 3, 2025 14:01
@Amxx Amxx requested a review from a team as a code owner December 3, 2025 14:01
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 3, 2025

Walkthrough

The pull request modifies the RLP utility contract and its test suite. The encode(bytes32) function is reimplemented as a dedicated assembly-based encoding that preserves prefix zeros and uses a fixed-size array representation, instead of delegating to encode(uint256). Docstrings for encode(address), encode(uint256), and encode(bytes32) are updated to clarify their encoding behaviors. The test file is refactored to compute expected RLP encodings using ethers.encodeRlp() and adds explicit test cases for non-canonical encodings and zero-value handling.

Possibly related PRs

Suggested labels

bug, tests, ignore-changeset

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: preventing removal of prefix zeros when RLP encoding bytes32 values, which is the core fix addressed in this pull request.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description check ✅ Passed The pull request description clearly relates to the changeset, explaining the RLP encoding issue for bytes32 values with leading zeros and the fix being implemented.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
test/utils/RLP.test.js (1)

117-120: Minor: Clarify the inline comments for zero-value tests.

The comment on line 117 ("Encoding for the value") is unclear about which value it refers to. Consider making these comments more descriptive.

-    // Encoding for the value
+    // Canonical empty string encoding → decodes to zero
     await expect(this.mock.$decodeBytes32('0x80')).to.eventually.equal(ethers.ZeroHash);
-    // Encoding for two zeros (and nothing else)
+    // Non-canonical encoding with two zero bytes → decodes to zero
     await expect(this.mock.$decodeBytes32('0x820000')).to.eventually.equal(ethers.ZeroHash);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6271317 and b81112d.

📒 Files selected for processing (2)
  • contracts/utils/RLP.sol (3 hunks)
  • test/utils/RLP.test.js (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
test/utils/RLP.test.js (1)
test/helpers/random.js (1)
  • generators (3-10)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: tests
  • GitHub Check: slither
  • GitHub Check: tests-upgradeable
  • GitHub Check: tests-foundry
  • GitHub Check: coverage
  • GitHub Check: halmos
🔇 Additional comments (4)
contracts/utils/RLP.sol (3)

154-158: LGTM!

The updated documentation accurately describes the fixed-size encoding behavior for addresses and provides clear guidance for users who want scalar encoding via casting.


168-172: LGTM!

The documentation clearly distinguishes scalar encoding (uint256) from fixed-size encoding (bytes32), which helps users choose the appropriate encoding for their use case.


193-206: LGTM!

The assembly implementation is correct. Memory layout and arithmetic are verified:

  • Prefix 0xa0 = SHORT_OFFSET (0x80) + 32 correctly indicates a 32-byte string
  • Full 32-byte input is preserved, including leading zeros
  • Free memory pointer is properly updated to result + 0x41

This aligns with the PR objective of matching ethers.js behavior for bytes32 encoding.

test/utils/RLP.test.js (1)

94-103: LGTM!

The test inputs provide good coverage of edge cases:

  • Zero hash (all leading zeros)
  • Trailing non-zero byte with leading zeros
  • Leading non-zero byte
  • Random value for broader coverage

Using ethers.encodeRlp(input) as the expected value correctly validates that the Solidity implementation matches the ethers.js reference behavior.

luiz-lvj
luiz-lvj previously approved these changes Dec 3, 2025
Co-authored-by: Luiz Vasconcelos Júnior <[email protected]>
luiz-lvj
luiz-lvj previously approved these changes Dec 3, 2025
luiz-lvj
luiz-lvj previously approved these changes Dec 4, 2025
ernestognw
ernestognw previously approved these changes Dec 4, 2025
@ernestognw ernestognw dismissed stale reviews from luiz-lvj and themself via b1db63e December 4, 2025 16:40
Copy link
Member

@ernestognw ernestognw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a small call to action in the changelog to make it clear how to preserve the previous behavior if someone is currently relying on encode(bytes32)

@Amxx Amxx merged commit 66049ed into OpenZeppelin:master Dec 4, 2025
20 of 21 checks passed
@Amxx Amxx deleted the refactor/rlp-encode-full-bytes32 branch December 4, 2025 23:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants