feat: forced inclusion #2797
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: forced inclusion #2797
Conversation
The latest Buf updates on your PR. Results from workflow CI and Release / buf-check (pull_request).
Claude finished @julienrbrt's task.

Comprehensive Code Review: Forced Inclusion Implementation

I've completed a thorough review of this PR implementing the forced inclusion mechanism. Overall, this is a well-designed and carefully implemented feature with good test coverage. Below are my findings organized by focus area.

1. Code Quality & Architecture ⭐

Strengths

Issues Found

🔴 Critical: Race condition in BasedSequencer.txQueue
type BasedSequencer struct {
// ... existing fields ...
mu sync.Mutex
txQueue [][]byte
}
func (s *BasedSequencer) GetNextBatch(...) {
s.mu.Lock()
defer s.mu.Unlock()
// ... rest of implementation
}

🟡 Medium: Inconsistent error handling in syncer verification
if err := s.verifyForcedInclusionTxs(currentState, data); err != nil {
s.logger.Error().Err(err).Uint64("height", nextHeight).Msg("forced inclusion verification failed")
if errors.Is(err, errMaliciousProposer) {
s.cache.RemoveHeaderDAIncluded(headerHash)
return err
}
// Non-malicious errors don't return - block is applied anyway!
}

Suggested fix:
if err := s.verifyForcedInclusionTxs(currentState, data); err != nil {
s.logger.Error().Err(err).Uint64("height", nextHeight).Msg("forced inclusion verification failed")
s.cache.RemoveHeaderDAIncluded(headerHash)
return err
}

🟡 Medium: Configuration validation disabled
if len(c.DA.GetForcedInclusionNamespace()) > 0 {
// if err := validateNamespace(c.DA.GetForcedInclusionNamespace()); err != nil {
// return fmt.Errorf("could not validate forced inclusion namespace (%s): %w", c.DA.GetForcedInclusionNamespace(), err)
// }
return fmt.Errorf("forced inclusion is not yet live")
}
🟢 Minor: Potential unbounded slice growth
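One possible mitigation, as a minimal sketch: cap the queue and surface an error instead of growing without bound. `maxPendingForcedTxs` and `appendBounded` are illustrative names, not code from this PR.

```go
import "fmt"

// maxPendingForcedTxs is an illustrative cap, not an existing constant in this repo.
const maxPendingForcedTxs = 10_000

// appendBounded refuses to grow the queue past the cap so a burst of
// forced inclusion blobs cannot exhaust memory; callers decide whether to
// retry later or surface the error.
func appendBounded(queue [][]byte, txs ...[]byte) ([][]byte, error) {
	if len(queue)+len(txs) > maxPendingForcedTxs {
		return queue, fmt.Errorf("pending forced inclusion queue full (%d entries)", len(queue))
	}
	return append(queue, txs...), nil
}
```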
🟢 Minor: Missing godoc on exported types
// ForcedInclusionEvent contains forced inclusion transactions retrieved from DA
// along with the DA height range from which they were fetched.
type ForcedInclusionEvent struct {
StartDaHeight uint64 // First DA height in the epoch range
EndDaHeight uint64 // Last DA height processed (may be less than epoch end if future height encountered)
Txs [][]byte // Raw transaction bytes from DA forced inclusion namespace
}

2. Error Handling & Resilience ⭐⭐

Strengths

Issues Found

🟡 Medium: Missing context cancellation check in epoch loop
for epochHeight := epochStart + 1; epochHeight < epochEnd; epochHeight++ {
// No context check here - could block shutdown
result := r.client.RetrieveForcedInclusion(ctx, epochHeight)
// ...
}

Suggested fix:
for epochHeight := epochStart + 1; epochHeight < epochEnd; epochHeight++ {
select {
case <-ctx.Done():
return nil, ctx.Err()
default:
}
result := r.client.RetrieveForcedInclusion(ctx, epochHeight)
// ...
}

🟢 Minor: Logging of potentially large data

3. Security 🔒

Strengths

Issues Found

🟡 Medium: Hash collision vulnerability in verification
blockTxMap := make(map[string]struct{})
for _, tx := range data.Txs {
blockTxMap[hashTx(tx)] = struct{}{} // What is hashTx?
}
import (
	"crypto/sha256"
	"encoding/hex"
)
func hashTx(tx []byte) string {
h := sha256.Sum256(tx)
return hex.EncodeToString(h[:])
}

🟢 Minor: Absolute blob size limit discrepancy
// AbsoluteMaxBlobSize is the absolute maximum size for a single blob (DA layer limit).
AbsoluteMaxBlobSize = 2 * 1024 * 1024 // 2MB
🟢 Minor: No rate limiting on forced inclusion submissions
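If rate limiting is added later, a minimal sketch using golang.org/x/time/rate could look like the following; the limit values and the `submitForcedTx` wrapper are assumptions for illustration, not part of this PR.

```go
import (
	"context"
	"errors"

	"golang.org/x/time/rate"
)

// forcedTxLimiter allows 5 submissions per second with a burst of 10;
// both numbers are placeholders.
var forcedTxLimiter = rate.NewLimiter(rate.Limit(5), 10)

// submitForcedTx is a hypothetical wrapper around the real DA submission
// path; it rejects callers once the limit is exceeded instead of queueing.
func submitForcedTx(ctx context.Context, submit func(context.Context, []byte) error, tx []byte) error {
	if !forcedTxLimiter.Allow() {
		return errors.New("forced inclusion submission rate limit exceeded")
	}
	return submit(ctx, tx)
}
```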
4. Performance & Resource Efficiency ⚡

Strengths

Issues Found

🟡 Medium: Potential memory inefficiency in epoch fetching
🟡 Medium: Sequential DA fetching in epoch loop
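A possible follow-up, sketched under the assumption that `RetrieveForcedInclusion` has the shape shown in the snippets above: fetch the epoch heights concurrently with golang.org/x/sync/errgroup while keeping results ordered by DA height. The `daClient` and `retrieveResult` types here are stand-ins for the PR's actual types.

```go
import (
	"context"

	"golang.org/x/sync/errgroup"
)

// retrieveResult and daClient are stand-ins based on the snippets above.
type retrieveResult struct {
	Data [][]byte
}

type daClient interface {
	RetrieveForcedInclusion(ctx context.Context, height uint64) retrieveResult
}

// fetchEpochConcurrently fetches each height in (epochStart, epochEnd)
// in parallel while keeping results ordered by DA height.
func fetchEpochConcurrently(ctx context.Context, client daClient, epochStart, epochEnd uint64) ([]retrieveResult, error) {
	if epochEnd <= epochStart+1 {
		return nil, nil
	}
	n := int(epochEnd - epochStart - 1)
	results := make([]retrieveResult, n)

	g, gctx := errgroup.WithContext(ctx)
	g.SetLimit(8) // bound concurrency so the DA endpoint is not overloaded
	for i := 0; i < n; i++ {
		i := i
		height := epochStart + 1 + uint64(i)
		g.Go(func() error {
			if err := gctx.Err(); err != nil {
				return err
			}
			results[i] = client.RetrieveForcedInclusion(gctx, height)
			return nil
		})
	}
	if err := g.Wait(); err != nil {
		return nil, err
	}
	return results, nil
}
```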
🟢 Minor: Unbuffered map allocation
blockTxMap := make(map[string]struct{})
blockTxMap := make(map[string]struct{}, len(data.Txs))

🟢 Minor: Slice append in hot path
batch.Transactions = append(forcedTxs, trimmedBatchTxs...)
combined := make([][]byte, 0, len(forcedTxs)+len(trimmedBatchTxs))
combined = append(combined, forcedTxs...)
combined = append(combined, trimmedBatchTxs...)
batch.Transactions = combined

5. Testing & Reliability ✅

Strengths

Issues Found

🔴 Critical: Missing concurrency tests for BasedSequencer
func TestBasedSequencer_ConcurrentGetNextBatch(t *testing.T) {
// Run with: go test -race
// Spawn multiple goroutines calling GetNextBatch
}

🟡 Medium: Missing forced inclusion verification tests
🟡 Medium: Epoch boundary edge cases
🟢 Minor: Test coverage for pending queue overflow
6. Documentation & Developer Experience 📚

Strengths

Issues Found

🟡 Medium: Missing migration guide
🟢 Minor: Inconsistent terminology
🟢 Minor: Missing operational metrics
🟢 Minor: TODO comments in production code
// TODO(@julienrbrt): technically we may need to check for block gas as well
7. Concurrency & Race Conditions 🔄

Issues Found

🔴 Critical: BasedSequencer data race (detailed above in the Architecture section)

🟡 Medium: Single sequencer pending queue race

8. Additional Observations

Breaking Changes

Follow-up Work Identified

Good capture of follow-up items in comments:

Summary & Recommendations

Must Fix Before Merge
Should Fix Before Enabling Feature
Nice to Have
Overall Assessment

This is a well-designed and carefully implemented feature. The architecture is sound, the code quality is high, and the ADR provides excellent context. The main concerns are:

With the critical issues addressed, this will be a solid foundation for censorship resistance in the rollup.

Estimated Remaining Work: 2-3 days to address critical and high-priority issues.
alpe left a comment:
Very nice start!
Can you elaborate on why you decided on a separate namespace for the forced inclusion TXs? The consumers have to read both namespaces anyway to stay up to date.
event.StartDaHeight = epochHeight
event.Txs = append(event.Txs, result.Data...)
}
We need to prepare for malicious content. Let's exit the loop early when a tx size threshold is reached. This can be a multiple of common.DefaultMaxBlobSize used by the executor.
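A minimal sketch of that early exit. ForcedInclusionEvent is the type from the review above; the `fetch` callback and the `maxForcedBytes` parameter (e.g. a multiple of common.DefaultMaxBlobSize) are stand-ins for the PR's actual code.

```go
import "context"

// collectForcedTxs sketches the epoch loop with the suggested size guard.
func collectForcedTxs(
	ctx context.Context,
	fetch func(ctx context.Context, daHeight uint64) [][]byte,
	epochStart, epochEnd, maxForcedBytes uint64,
) (ForcedInclusionEvent, error) {
	event := ForcedInclusionEvent{StartDaHeight: epochStart + 1}

	var totalBytes uint64
	for epochHeight := epochStart + 1; epochHeight < epochEnd; epochHeight++ {
		if err := ctx.Err(); err != nil {
			return event, err
		}
		for _, blob := range fetch(ctx, epochHeight) {
			totalBytes += uint64(len(blob))
			if totalBytes > maxForcedBytes {
				// Stop early; blobs beyond the threshold stay in the
				// namespace and are picked up in a later epoch.
				return event, nil
			}
			event.Txs = append(event.Txs, blob)
		}
		event.EndDaHeight = epochHeight
	}
	return event, nil
}
```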
Makes sense for the height check, yes! However, I was thinking of doing no other checks and letting the execution client deal with gibberish data (this is why I added that as a requirement in the execution interface description).
If we want to keep raw TX data in the namespace, there is not much we can do here to validate, indeed. A size check is an easy win but more would require extending the executor interface for a checkTX.
I agree, and this may actually be required to avoid congestion issues and losing txs.
This was a suggestion. Personally I think it makes sense, as we are filtering what comes into that namespace at the fetching level, directly in ev-node. What is posted in the forced inclusion namespace is handled directly by the execution client; ev-node only passes down bytes.
Codecov Report

❌ Patch coverage is

Additional details and impacted files

@@ Coverage Diff @@
## main #2797 +/- ##
==========================================
+ Coverage 64.76% 65.45% +0.68%
==========================================
Files 81 85 +4
Lines 7328 7755 +427
==========================================
+ Hits 4746 5076 +330
- Misses 2041 2114 +73
- Partials 541 565 +24
Flags with carried forward coverage won't be shown.
The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).
List of improvements to do in follow-ups:
We discussed the above in the standup (#2797 (comment)), and a few ideas came up. 1-2: when making the call async, we need to make sure the executor and full node stay in sync with an epoch. This can be done easily by keeping an epoch a few blocks behind the actual DA height.
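A rough sketch of that lag idea; the `epochSize` and `safetyLag` parameters are illustrative, not existing config fields.

```go
// currentEpochEnd returns the last fully completed epoch boundary that is
// at least safetyLag blocks behind the DA head; the bool result reports
// whether a full epoch is available yet.
func currentEpochEnd(daHead, epochSize, safetyLag uint64) (uint64, bool) {
	if epochSize == 0 || daHead < safetyLag {
		return 0, false
	}
	safeHeight := daHead - safetyLag
	if safeHeight < epochSize {
		return 0, false
	}
	// Snap down to the epoch boundary at or below the safe height, so the
	// executor and full nodes agree on the range.
	return safeHeight - (safeHeight % epochSize), true
}
```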
alpe left a comment:
Thanks for answering all my questions and comments.
There is still the TODO in the code to store unprocessed direct TXs when the max block size is reached.
We decided to remove the sequencer go.mod, as ev-node can provide the sequencer implementation directly (sequencers/single was already depending on ev-node anyway). This means no go.mod needs to be added for the new based sequencer in #2797.
Once this PR is merged, we should directly after:
In the meantime, I have disabled the feature so it can be merged (0d790ef).
FYI the upgrade test will fail until tastora is updated.
Users can submit transactions in two ways:

### Systems Affected

1. **Normal Path**: Submit to sequencer's mempool/RPC (fast, low cost)
Is the mempool not used app-side for ABCI? Does ev-node have a mempool? Or does "sequencer's mempool/RPC" here refer to the sequencer node as a single entity, even if it's running the app out-of-process as with the EVM stack?
From what I understand, the reth/EVM mempool is used for the EVM stack, and the sequencer queries the pending txs pool/queue in GetTxs.
### Full Node Verification Flow

```
1. Receive block from DA or P2P
2. Before applying block:
   a. Fetch forced inclusion txs from DA at block's DA height
   b. Build map of transactions in block
   c. Verify all forced txs are in block
   d. If missing: reject block, flag malicious proposer
3. Apply block if verification passes
```
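A minimal Go sketch of step 2, assuming SHA-256 hashing of raw tx bytes as suggested earlier in the review; the function name and signature are illustrative, not the PR's actual code.

```go
import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// verifyForcedInclusion checks that every forced tx fetched from the DA
// namespace appears in the block, compared by SHA-256 of the raw bytes.
func verifyForcedInclusion(forcedTxs, blockTxs [][]byte) error {
	blockTxSet := make(map[string]struct{}, len(blockTxs))
	for _, tx := range blockTxs {
		h := sha256.Sum256(tx)
		blockTxSet[hex.EncodeToString(h[:])] = struct{}{}
	}
	for _, tx := range forcedTxs {
		h := sha256.Sum256(tx)
		if _, ok := blockTxSet[hex.EncodeToString(h[:])]; !ok {
			// Missing forced tx: the block is rejected and the proposer flagged.
			return fmt.Errorf("forced inclusion tx %x missing from block", h[:8])
		}
	}
	return nil
}
```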
This makes sense! I think my mental model was assuming that ev-node did not need to be run with ev-reth for full nodes. But on reflection I think I was incorrect or misunderstood.
I assume ev-node must always be run, even for EVM-stack full nodes, but with --evnode.node.aggregator=false.
Yes, a full node runs the whole stack. Light nodes, on the other hand, just fetch headers.
   - Only at epoch boundaries
   - Scan epoch range for forced transactions
3. Get batch from mempool queue
4. Prepend forced txs to batch
So if we wanted to ZK-prove forced inclusion txs, we could query the forced inclusion namespace at each epoch and prepend them to the tx list that we compare against the execution client's state transition function 🤔
Rename `evm-single` to `evm` and `grpc-single` to `evgrpc` for clarity. ref: #2797 (comment)
Extract some logic from #2797. Those refactors were done to ease forced inclusion integration, but they can be extracted to be merged sooner.
Force-pushed from af054de to a18e75f.
ref: #1914
A choice has been made to put this logic in the executor and avoid extending the reaper and the sequencer.
This is because updating the reaper means passing down the last fetched DA height across all components, which adds a lot of complexity. Adding it in the sequencer may be preferable, but this makes the inclusion in a sync node less straightforward. This is what is being investigated.
Compared to the previous implementation, a forced transaction does not have any structure: it should be the raw transaction bytes from the execution client. This keeps ev-node knowing nothing about the transaction. No signature checks, no validation of correctness. The execution client must make sure to reject gibberish transactions.
---- for later, won't be included in this PR (ref #2797 (comment))