
Conversation

@MegaRedHand MegaRedHand commented Nov 14, 2025

Motivation

Our current implementation allocates an intermediate buffer every time it hashes a trie node. This is wasteful, since a single buffer could be allocated once and reused.

We also allocate several temporary buffers in Node::decode_unfinished; these can be replaced by a single stack allocation.

Description

This PR adds a compute_hash_no_alloc function that receives a buffer and reuses it instead of allocating. It also replaces the temporary buffers used in Node::decode_unfinished with a stack-allocated array and references into the original buffer.
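The buffer-reuse pattern described above can be sketched as follows. Everything here is illustrative: the function names mirror the PR's `compute_hash_no_alloc`, but the encoding and the hash are trivial stand-ins (the real code RLP-encodes the node and hashes it with Keccak-256).

```rust
// Illustrative sketch of the buffer-reuse pattern; names beyond
// `compute_hash_no_alloc` are hypothetical stand-ins.

// Stand-in for RLP-encoding a node into `buf`, reusing its allocation.
fn encode_node(payload: &[u8], buf: &mut Vec<u8>) {
    buf.clear(); // keeps the capacity, drops the old contents
    buf.extend_from_slice(payload);
}

// Stand-in "hash": a byte sum. Real code would run Keccak-256 over `buf`.
fn hash_bytes(bytes: &[u8]) -> u64 {
    bytes.iter().map(|&b| b as u64).sum()
}

// Allocating version: one fresh Vec per node hashed.
fn compute_hash(payload: &[u8]) -> u64 {
    let mut buf = Vec::new();
    encode_node(payload, &mut buf);
    hash_bytes(&buf)
}

// Buffer-reusing version: the caller owns the buffer, so hashing N nodes
// costs at most one allocation (plus growth) instead of N.
fn compute_hash_no_alloc(payload: &[u8], buf: &mut Vec<u8>) -> u64 {
    encode_node(payload, buf);
    hash_bytes(buf)
}

fn main() {
    let nodes = [&b"leaf"[..], &b"extension"[..], &b"branch"[..]];
    let mut buf = Vec::with_capacity(512);
    for node in nodes {
        // Both variants produce the same hash; only the allocation
        // behavior differs.
        assert_eq!(compute_hash(node), compute_hash_no_alloc(node, &mut buf));
    }
    println!("hashes match");
}
```

The key design point is that the buffer lives in the caller, so recursive traversals (like memoize_hashes in this PR) can thread one buffer through every node they visit.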

Flamegraph before:

![Screenshot 2025-11-14 at 20 48 14](https://github.com/user-attachments/assets/4c31ba88-5eba-4ddf-9192-78fb07358265)

Flamegraph after:

![Screenshot 2025-11-14 at 20 48 51](https://github.com/user-attachments/assets/ec6fbc93-d5dc-4560-831a-c4f9715dc78e)

@github-actions github-actions bot added L1 Ethereum client performance Block execution throughput and performance in general labels Nov 14, 2025

github-actions bot commented Nov 14, 2025

Lines of code report

Total lines added: 56
Total lines removed: 0
Total lines changed: 56

Detailed view
| File | Lines | Diff |
|---|---|---|
| ethrex/crates/common/rlp/structs.rs | 164 | +1 |
| ethrex/crates/common/trie/node.rs | 352 | +16 |
| ethrex/crates/common/trie/node/branch.rs | 571 | +7 |
| ethrex/crates/common/trie/node/extension.rs | 515 | +7 |
| ethrex/crates/common/trie/node/leaf.rs | 302 | +7 |
| ethrex/crates/common/trie/rlp.rs | 136 | +10 |
| ethrex/crates/common/trie/trie.rs | 978 | +1 |
| ethrex/crates/common/trie/trie_sorted.rs | 447 | +7 |

@github-actions

Benchmark for fdcc01e

| Test | Base | PR | % |
|---|---|---|---|
| Trie/cita-trie insert 10k | 27.9±0.65ms | 28.3±1.58ms | +1.43% |
| Trie/cita-trie insert 1k | 2.8±0.01ms | 2.9±0.09ms | +3.57% |
| Trie/ethrex-trie insert 10k | 24.7±0.71ms | 24.5±0.63ms | -0.81% |
| Trie/ethrex-trie insert 1k | 2.2±0.01ms | 2.2±0.01ms | 0.00% |

@github-actions

Benchmark for 632c51a

| Test | Base | PR | % |
|---|---|---|---|
| Trie/cita-trie insert 10k | 27.7±0.71ms | 27.4±0.31ms | -1.08% |
| Trie/cita-trie insert 1k | 2.9±0.01ms | 2.9±0.20ms | 0.00% |
| Trie/ethrex-trie insert 10k | 24.2±0.46ms | 24.1±0.55ms | -0.41% |
| Trie/ethrex-trie insert 1k | 2.2±0.01ms | 2.2±0.01ms | 0.00% |

@github-actions

Benchmark for aa0588b

| Test | Base | PR | % |
|---|---|---|---|
| Trie/cita-trie insert 10k | 27.9±0.35ms | 28.1±1.20ms | +0.72% |
| Trie/cita-trie insert 1k | 2.8±0.01ms | 2.9±0.18ms | +3.57% |
| Trie/ethrex-trie insert 10k | 24.9±0.52ms | 24.9±0.97ms | 0.00% |
| Trie/ethrex-trie insert 1k | 2.2±0.04ms | 2.2±0.01ms | 0.00% |

@github-actions

github-actions bot commented Nov 14, 2025

Benchmark Block Execution Results Comparison Against Main

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| base | 59.849 ± 0.309 | 59.472 | 60.303 | 1.00 ± 0.01 |
| head | 59.747 ± 0.308 | 59.265 | 60.306 | 1.00 |

@github-actions

Benchmark for 116c57f

| Test | Base | PR | % |
|---|---|---|---|
| Trie/cita-trie insert 10k | 28.7±1.32ms | 28.5±0.82ms | -0.70% |
| Trie/cita-trie insert 1k | 2.8±0.01ms | 2.9±0.12ms | +3.57% |
| Trie/ethrex-trie insert 10k | 25.4±0.97ms | 25.1±0.84ms | -1.18% |
| Trie/ethrex-trie insert 1k | 2.2±0.01ms | 2.2±0.03ms | 0.00% |

@MegaRedHand MegaRedHand changed the title perf(l1): avoid intermediate allocations when computing trie node hashes perf(l1): avoid intermediate allocations when decoding and hashing nodes Nov 14, 2025
@MegaRedHand MegaRedHand changed the title perf(l1): avoid intermediate allocations when decoding and hashing nodes perf(l1): avoid intermediate allocations when decoding and hashing trie nodes Nov 14, 2025
@MegaRedHand MegaRedHand changed the title perf(l1): avoid intermediate allocations when decoding and hashing trie nodes perf(l1): avoid temporary allocations when decoding and hashing trie nodes Nov 14, 2025
@github-actions

Benchmark for 58823f7

| Test | Base | PR | % |
|---|---|---|---|
| Trie/cita-trie insert 10k | 28.1±0.56ms | 28.4±0.70ms | +1.07% |
| Trie/cita-trie insert 1k | 2.9±0.01ms | 2.8±0.09ms | -3.45% |
| Trie/ethrex-trie insert 10k | 24.9±0.83ms | 24.8±0.36ms | -0.40% |
| Trie/ethrex-trie insert 1k | 2.2±0.01ms | 2.2±0.05ms | 0.00% |

@MegaRedHand MegaRedHand marked this pull request as ready for review November 14, 2025 23:40
@MegaRedHand MegaRedHand requested a review from a team as a code owner November 14, 2025 23:40
Copilot AI review requested due to automatic review settings November 14, 2025 23:40
@ethrex-project-sync ethrex-project-sync bot moved this to In Review in ethrex_l1 Nov 14, 2025
Copilot AI left a comment


Pull Request Overview

This PR optimizes trie node hashing and decoding by eliminating temporary allocations. The changes introduce buffer reuse patterns for hash computation and replace heap-allocated vectors with stack-allocated arrays in the RLP decoder.

  • Added compute_hash_no_alloc functions that accept a reusable buffer parameter
  • Modified memoize_hashes to accept and reuse a buffer throughout recursive traversals
  • Replaced dynamic vector allocation in RLP decoder with a stack-allocated array of references
  • Added get_encoded_item_ref in RLP decoder to avoid unnecessary Vec allocations
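The stack-allocation change in the bullets above can be sketched with a toy decoder. This is not real RLP: the format (a 1-byte length prefix per item) and all names are hypothetical. The point is borrowing slices into a fixed-size array instead of allocating a Vec per item.

```rust
// Hypothetical sketch of replacing per-item Vec allocations with a
// stack-allocated array of borrowed slices. The toy wire format here
// ("1-byte length, then payload") stands in for real RLP.

const MAX_ITEMS: usize = 17; // a branch node has 16 children + 1 value

// Decodes up to MAX_ITEMS length-prefixed items, borrowing each item
// from `input` instead of copying it into its own Vec.
fn decode_items(input: &[u8]) -> Option<([&[u8]; MAX_ITEMS], usize)> {
    let mut items: [&[u8]; MAX_ITEMS] = [&[]; MAX_ITEMS];
    let mut count = 0;
    let mut rest = input;
    while !rest.is_empty() {
        if count == MAX_ITEMS {
            return None; // too many items for a trie node
        }
        let len = rest[0] as usize;
        let payload = rest.get(1..1 + len)?; // None on truncated input
        items[count] = payload; // a reference into `input`, no allocation
        count += 1;
        rest = &rest[1 + len..];
    }
    Some((items, count))
}

fn main() {
    // Two items: "abc" and "x".
    let encoded = [3, b'a', b'b', b'c', 1, b'x'];
    let (items, count) = decode_items(&encoded).unwrap();
    assert_eq!(count, 2);
    assert_eq!(items[0], &b"abc"[..]);
    assert_eq!(items[1], &b"x"[..]);
    println!("ok");
}
```

This mirrors why a `get_encoded_item_ref`-style accessor pays off: once items are slices into the original buffer, the decoder needs no heap allocation at all.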

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 7 comments.

| File | Description |
|---|---|
| crates/common/trie/node.rs | Adds compute_hash_no_alloc method and modifies memoize_hashes to accept a buffer parameter for reuse |
| crates/common/trie/node/branch.rs | Implements compute_hash_no_alloc for BranchNode with buffer reuse pattern |
| crates/common/trie/node/extension.rs | Implements compute_hash_no_alloc for ExtensionNode with buffer reuse pattern |
| crates/common/trie/node/leaf.rs | Implements compute_hash_no_alloc for LeafNode with buffer reuse pattern |
| crates/common/trie/rlp.rs | Replaces heap-allocated Vec with stack-allocated array for RLP items, uses reference-based decoding |
| crates/common/rlp/structs.rs | Adds get_encoded_item_ref to return references instead of allocating new vectors |
| crates/common/trie/trie_sorted.rs | Updates hash computation calls to use new buffer-accepting methods with 512-byte capacity |
| crates/common/trie/trie.rs | Updates root hash computation to use buffer reuse pattern |
| CHANGELOG.md | Documents the performance optimization |


}
}

/// Computes the node's hash

Copilot AI Nov 14, 2025


The doc comment is identical to the one for compute_hash above. Consider clarifying that this function uses a provided buffer to avoid allocations, e.g., "Computes the node's hash using the provided buffer to avoid allocations."

Suggested change
/// Computes the node's hash
/// Computes the node's hash using the provided buffer to avoid allocations

self.compute_hash_no_alloc(&mut vec![])
}

/// Computes the node's hash, using the provided buffer

Copilot AI Nov 14, 2025


The doc comment is identical to the one for compute_hash above. Consider clarifying that this function uses a provided buffer to avoid allocations, e.g., "Computes the node's hash using the provided buffer to avoid allocations."

Suggested change
/// Computes the node's hash, using the provided buffer
/// Computes the node's hash using the provided buffer to avoid allocations.

self.compute_hash_no_alloc(&mut vec![])
}

/// Computes the node's hash, using the provided buffer

Copilot AI Nov 14, 2025


The doc comment is identical to the one for compute_hash above. Consider clarifying that this function uses a provided buffer to avoid allocations, e.g., "Computes the node's hash using the provided buffer to avoid allocations."

Suggested change
/// Computes the node's hash, using the provided buffer
/// Computes the node's hash using the provided buffer to avoid allocations.

self.compute_hash_no_alloc(&mut vec![])
}

/// Computes the node's hash, using the provided buffer

Copilot AI Nov 14, 2025


The doc comment is identical to the one for compute_hash above. Consider clarifying that this function uses a provided buffer to avoid allocations, e.g., "Computes the node's hash using the provided buffer to avoid allocations."

Suggested change
/// Computes the node's hash, using the provided buffer
/// Computes the node's hash using the provided buffer to avoid allocations.
/// This method reuses the given buffer to minimize heap allocations when encoding the node.

) -> Result<(), TrieGenerationError> {
debug!("{:x?}", center_side.path);
debug!("{:x?}", parent_element.path);
let mut nodehash_buffer = Vec::with_capacity(512);

Copilot AI Nov 14, 2025


[nitpick] The initial capacity of 512 bytes may be insufficient for some branch nodes. A branch node can have 16 children (each encoded as up to 33 bytes) plus a value field, potentially exceeding 512 bytes with RLP overhead. While Vec will automatically grow, this may cause reallocations. Consider using a larger initial capacity (e.g., 600-700 bytes) or document this trade-off.

Suggested change
let mut nodehash_buffer = Vec::with_capacity(512);
// Increased initial capacity to 700 bytes to avoid reallocations for large branch nodes.
let mut nodehash_buffer = Vec::with_capacity(700);

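The arithmetic behind this capacity nitpick can be checked directly; the per-field sizes below follow the reviewer's estimate (32-byte child hashes with 1-byte RLP string headers, a value field assumed to be at most a 32-byte hash, and a short list header) and are illustrative, not measured.

```rust
// Rough upper bound for an RLP-encoded branch node, using the reviewer's
// assumptions. These constants are illustrative figures, not values taken
// from the ethrex codebase.

const CHILDREN: usize = 16;
const CHILD_MAX: usize = 33; // 0xa0 string header + 32-byte child hash
const VALUE_MAX: usize = 33; // assumption: value is at most a 32-byte hash
const LIST_HEADER_MAX: usize = 3; // long-list header for payloads < 64 KiB

fn main() {
    let branch_max = CHILDREN * CHILD_MAX + VALUE_MAX + LIST_HEADER_MAX;
    // 16 * 33 + 33 + 3 = 564 bytes, which exceeds a 512-byte initial
    // capacity, so at most one reallocation can occur per buffer lifetime.
    println!("worst-case branch node: {} bytes", branch_max);
    assert!(branch_max > 512);
}
```

Since the buffer is reused across the whole traversal, that single growth to the worst-case size is amortized over every node hashed, which is why the trade-off is minor either way.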
let mut left_side = StackElement::default();
let mut center_side: CenterSide = CenterSide::from_value(initial_value);
let mut right_side_opt: Option<(H256, Vec<u8>)> = data_iter.next();
let mut nodehash_buffer = Vec::with_capacity(512);

Copilot AI Nov 14, 2025


[nitpick] The initial capacity of 512 bytes may be insufficient for some branch nodes. A branch node can have 16 children (each encoded as up to 33 bytes) plus a value field, potentially exceeding 512 bytes with RLP overhead. While Vec will automatically grow, this may cause reallocations. Consider using a larger initial capacity (e.g., 600-700 bytes) or document this trade-off.

Suggested change
let mut nodehash_buffer = Vec::with_capacity(512);
// Increased initial capacity to 700 bytes to avoid reallocations for large branch nodes (16 children * 33 bytes + value + RLP overhead)
let mut nodehash_buffer = Vec::with_capacity(700);

pub fn hash_no_commit(&self) -> H256 {
if self.root.is_valid() {
self.root.compute_hash().finalize()
// 512 is the maximum size of an encoded node

Copilot AI Nov 14, 2025


The comment states "512 is the maximum size of an encoded node", but this may not be accurate for all cases. A branch node with 16 children can exceed this size. While Vec will automatically grow, the comment should be updated to reflect that 512 is an estimated typical size rather than a strict maximum.

Suggested change
// 512 is the maximum size of an encoded node
// 512 is an estimated typical size for an encoded node; some nodes (e.g., branch nodes with many children) may exceed this size

@jrchatruc jrchatruc added this pull request to the merge queue Nov 17, 2025
Merged via the queue into main with commit 06dc722 Nov 17, 2025
56 checks passed
@jrchatruc jrchatruc deleted the trie-hash-avoid-intermediate-allocations branch November 17, 2025 15:43
@github-project-automation github-project-automation bot moved this from In Review to Done in ethrex_l1 Nov 17, 2025
@github-project-automation github-project-automation bot moved this from Todo to Done in ethrex_performance Nov 17, 2025
lakshya-sky pushed a commit to lakshya-sky/ethrex that referenced this pull request Nov 17, 2025
…nodes (lambdaclass#5353)
