Skip to content

Replace O(n²) sync lookups with HashMap for O(1) access#245

Merged
schjonhaug merged 5 commits intomasterfrom
perf/hashmap-sync-lookups
Mar 5, 2026
Merged

Replace O(n²) sync lookups with HashMap for O(1) access#245
schjonhaug merged 5 commits intomasterfrom
perf/hashmap-sync-lookups

Conversation

@schjonhaug
Copy link
Owner

Summary

  • Replace all 8 .find() calls in sync.rs that linearly scanned transaction collections inside loops, creating O(n²) complexity
  • Build bdk_tx_map and existing_tx_map HashMaps for O(1) lookups in RBF detection, CPFP detection, and address-based confirmation tracking

Benchmark results

Tested with a 673-transaction stress wallet (48 unconfirmed, active CPFP chains) on regtest:

Metric Before (O(n²) .find()) After (O(1) HashMap) Speedup
CPFP detection 72.56ms, 74.02ms 3.04ms, 3.05ms, 3.10ms ~24x
End-to-end tx processing 120.39ms, 116.21ms 36.13ms, 36.16ms, 36.58ms ~3.3x

The improvement grows quadratically with transaction count — at 50K+ transactions the O(n²) pattern would take minutes.

Changes

Descriptor wallet sync (RBF/CPFP detection):

  • Build bdk_tx_map HashMap after collecting BDK transactions
  • Replace 4 .find() calls in RBF conflict and CPFP child/parent lookups

Address-based wallet sync (confirmation tracking):

  • Build existing_tx_map HashMap from existing transactions
  • Replace 2 .find() calls in single-address sync
  • Replace 2 .find() calls in grouped address watch sync

Test plan

  • cargo build — compiles cleanly
  • cargo test -- --test-threads=1 — all 147 tests pass
  • Stress tested with ./dev.sh create-stress-wallet 500 (624 transactions)
  • Benchmarked BEFORE vs AFTER with unconfirmed CPFP chains

Closes #236

Replace all 8 .find() calls in sync.rs that linearly scanned
transaction collections inside loops, creating O(n²) complexity.

Descriptor wallet sync (RBF/CPFP detection):
- Build bdk_tx_map HashMap after collecting BDK transactions
- Replace 4 .find() calls in RBF conflict and CPFP detection

Address-based wallet sync (confirmation tracking):
- Build existing_tx_map HashMap from existing transactions
- Replace 2 .find() calls in single-address sync
- Replace 2 .find() calls in grouped address watch sync

Benchmarked with 673 transactions (48 unconfirmed, active CPFP chains):
- CPFP detection: 74ms → 3ms (24x faster)
- End-to-end tx processing: 116ms → 36ms (3.3x faster)

Closes #236
@github-actions
Copy link

github-actions bot commented Mar 5, 2026

🤖 Gemini Code Review

Here is a review of the proposed changes to backend/src/sync.rs:

Code Quality and Best Practices

  • Excellent Refactoring: Replacing numerous O(N) slice iterations (.iter().find(...) and .contains(...)) with O(1) HashMap and HashSet lookups is a major win for readability and long-term scalability.
  • Idiomatic Rust: The implementation of existing_tx_map leveraging HashMap<&str, &_> is a great example of zero-copy design. It borrows string slices directly from the existing transaction objects, avoiding expensive and unnecessary memory allocations.
  • Error Handling: Using .inspect_err(...) when parsing the Txid strings is a robust way to emit non-blocking warnings while cleanly moving through the .ok().and_then(...) chain.

Potential Bugs or Issues

  • Safe State Transitions: The conditions evaluating state updates for transaction history (e.g., checking && existing.block_height.is_none()) flawlessly mirror the nested if statements from the original implementation. There are no logical regressions introduced here.
  • Casing Consistency: By parsing transaction IDs into native Txid types to query bdk_tx_map (rather than comparing plain strings), the code is strictly safer and immune to any potential uppercase/lowercase hex string bugs.

Performance Considerations

  • Massive Algorithmic Improvements: The previous implementation traversed existing_transactions for every single entry in the transaction history array. The new HashMap strategy converts an O(H * E) bottleneck (where H is history entries and E is existing transactions) into an O(H) linear pass. This will substantially speed up background synchronization for highly active wallets.
  • Minor Optimization Opportunity: At the beginning of the sync pipeline, canonical_transactions_data immediately casts tx_item.tx_node.txid to a String. Downstream, during RBF conflict resolution and unconfirmed transaction mapping, these strings are repeatedly parsed back into Txid types (using Txid::from_str). In fact, during conflict detection, Txid::from_str executes inside a nested loop for every conflicted transaction.
    • Suggestion: Consider storing the raw Txid object inside the canonical_transactions_data tuple and only calling .to_string() at the boundaries where database inserts or specific event metrics require a String. This would eliminate all of the redundant string parsing within your inner loops.

Security Concerns

  • None. This PR strictly targets the internal runtime efficiency of the wallet synchronizer. There are no changes to API perimeters, database sanitization, or cryptographic material handling. Utilizing native BDK objects over raw strings actually leans toward a safer, type-checked execution path.

Test Coverage

  • Because this refactor modifies internal performance heuristics rather than external behavior, the existing integration and system tests (like balance_alerts_system_tests.rs and advanced_transactions.rs) should be entirely sufficient to prove correctness. As long as the CI pipelines remain green, no additional specific tests are necessary for these hash map substitutions.

@github-actions
Copy link

github-actions bot commented Mar 5, 2026

🤖 Codex Code Review

  1. No blocking bugs found in the modified diff. The refactors in sync.rs mostly improve lookup complexity (moving from repeated linear scans to hash maps/sets), and I don’t see a behavioral regression in the changed paths.

  2. Test coverage appears insufficient for the new logic branches introduced in:

Suggested tests to add:

  1. RBF detection where conflicted tx is pending and canonical replacement shares inputs.
  2. CPFP detection across multiple unconfirmed txs.
  3. Address-watch confirmation transition (pending -> confirmed) with map-based lookup.

Security concerns specific to this diff: none identified.

@github-actions
Copy link

github-actions bot commented Mar 5, 2026

🤖 Claude Code Review

Now I have a thorough picture of the PR. Let me provide the review.


Pull Request Review

This PR bundles two independent changes: a backend performance optimization in sync.rs and a frontend donation flow improvement.


Backend: O(n²) → O(1) HashMap Optimization (sync.rs)

What it does

Replaces linear .find() scans inside loops with HashMap lookups, eliminating O(n²) complexity in RBF conflict detection and CPFP detection. The commit message cites concrete benchmarks: 24x faster CPFP detection (74ms → 3ms), 3.3x faster end-to-end processing for a 673-transaction wallet.

Positive

  • The optimization is well-motivated and the benchmark numbers are convincing.
  • Using native Txid as keys in bdk_tx_map is a clear improvement over string-based keys.
  • inspect_err logging is consistent and correct — errors are logged before being silently dropped with .ok().
  • canonical_txids correctly uses HashSet<Txid> for O(1) membership testing.

Issues

1. Redundant Txid → String → Txid round-trip for canonical_txids (minor)

canonical_transactions_data is built from BDK with .to_string() on the Txid at sync.rs:443:

let txid = tx_item.tx_node.txid.to_string();

Then at line 570–577, those string txids are re-parsed back into HashSet<Txid>. Since the strings originated from native Txids, the parsing cannot fail in practice — but the filter_map/inspect_err error path exists to handle it anyway. This is a code smell: the round-trip happens because the 6-tuple in canonical_transactions_data carries String txids that are also fed into DB calls downstream.

A deeper fix would carry both the Txid and String forms (or only Txid and stringify on DB insertion), but that's a larger refactor. Worth noting as a follow-up.

2. Inline Txid::from_str inside a nested loop (minor)

At sync.rs:761–764, inside the for conflicted_txid in &conflicted_txids loop, and then inside the inner for (canonical_txid, ...) in &all_transactions loop, Txid::from_str(canonical_txid) is called for every canonical txid per conflicted tx:

if let Some(canonical_tx) = Txid::from_str(canonical_txid)
    .inspect_err(...)
    .ok()
    .and_then(|txid| bdk_tx_map.get(&txid))

Since bdk_tx_map is already keyed by Txid, and the canonical txids come from canonical_transactions_data (String form), this parse happens O(conflicted × canonical) times. For the common case where !found_replacement exits early via break, this is fine, but it could be avoided by keying all_transactions by Txid or by doing a single pre-pass to build a canonical_txid_str → &bdk_tx map with String keys (matching the type of canonical_transactions_data). Low impact in practice but worth flagging.

3. Mixed key type inconsistency for existing_tx_map (minor)

In the descriptor wallet path (around line 414), the map is:

HashMap<String, &TransactionWithWallet>

In the address-watch paths (lines 1552, 1836), it's:

HashMap<&str, &_>

The &str variant avoids the .clone() of each txid, which is correct. The descriptor path could also use &str keys (map(|tx| (tx.txid.as_str(), tx))), saving allocations. Inconsistency also makes the code harder to reason about.

4. conflicted_txid requires string conversion to query existing_tx_map (cosmetic)

let conflicted_txid_str = conflicted_txid.to_string();
if let Some(pending_tx) = existing_tx_map.get(&conflicted_txid_str) {

This is a consequence of existing_tx_map storing String keys while conflicted_txids is now Vec<Txid>. It's not wrong, but it highlights the type impedance mismatch mentioned above. If existing_tx_map used &str keys, this would still require an owned String intermediary since .as_str() requires a live binding — so the pattern is acceptable as-is.


Frontend: Donation Flow Improvements

What it does

  • Deletes donations/layout.tsx, moves AppHeader/AppFooter directly into donations/page.tsx
  • thank-you/page.tsx becomes a standalone page (no header/footer), with useSearchParams detecting ?checkoutPlanId= to distinguish recurring vs one-time completions
  • All 9 i18n files updated with oneTime/recurring nested keys and closeTab replacing backToHome

Positive

  • Suspense boundary around useSearchParams is required in Next.js App Router — omitting it causes a build error. Correct.
  • Added target="_blank" and rel="noopener noreferrer" on the BTCPay Server links — correct security practice.
  • The checkoutPlanId param is a reliable BTCPay Server signal for recurring checkout completion; using it as the discriminant is sensible.
  • All 9 locales are updated consistently.

Issues

5. <Suspense> without a fallback (minor)

<Suspense>
  <ThankYouContent />
</Suspense>

React renders nothing during suspension when no fallback is provided. Since useSearchParams is client-side, the suspend window is imperceptible, but explicitly providing fallback={null} or a minimal spinner communicates intent clearly.

6. Thank-you page has no header/footer (design consideration, not a bug)

Deleting layout.tsx means the /donations/thank-you route also loses its layout. The current implementation intentionally renders a centered full-screen confirmation (min-h-screen). If the thank-you page is opened in a new tab (as the target="_blank" links imply), this is the right UX. Just confirming this is intentional.

7. backToHome key removed but may be cached in old builds (low risk)

If a user has a cached version of the thank-you page that still references t("backToHome"), next-intl will fail gracefully with a missing key warning. Not a runtime bug, but the key is fully removed with no deprecation. Fine for a small OSS project.


Test Coverage

No new tests were added. The sync.rs changes are refactors of existing logic paths with no behavioral change, so existing integration tests should cover correctness. However, given that RBF detection is complex and the type of conflicted_txids changed from Vec<String> to Vec<Txid>, a unit test specifically exercising the RBF detection path (mock wallet with a conflicted and canonical tx) would add confidence here.


Summary

Area Verdict
Performance optimization logic ✅ Correct and well-measured
Type improvements (Txid keys) ✅ Better than String keys
canonical_txids round-trip ⚠️ Minor: Txid→String→Txid is unnecessary
Nested loop from_str ⚠️ Minor: sub-optimal but low impact
existing_tx_map key type ⚠️ Inconsistent across paths
Frontend Suspense boundary ✅ Correct
External link security ✅ Good
i18n completeness ✅ All 9 locales
Test coverage ⚠️ No new tests for RBF path

The optimization is sound and the benchmark data justifies it. The remaining issues are all minor; none are blockers.

- Remove redundant `existing_txids` HashSet in both address-watch
  functions; `existing_tx_map.contains_key()` replaces it, eliminating
  an extra allocation and iteration per sync cycle

- Key `bdk_tx_map` by native `Txid` instead of `String` to avoid
  heap-allocating a hex string for every BDK transaction; only the
  fewer lookup keys are parsed via `Txid::from_str()`
- Collect bdk_tx_map directly from iterator, skip intermediate Vec
- Replace contains_key + get double lookup with single if-let in
  both address-watch functions
- Add warn! logging on Txid::from_str parse failures for debugging
…pect_err

- Replace canonical_txids Vec with HashSet<Txid> so the conflict
  filter is O(1) per element instead of O(n)
- Build conflicted_txids as Vec<Txid> directly from BDK graph,
  eliminating the Txid → String → Txid round-trip for conflict lookups
- Use inspect_err instead of map_err for logging side effects,
  which is the idiomatic Rust pattern (stable since 1.76)
@schjonhaug schjonhaug merged commit 108db37 into master Mar 5, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Performance: Add pagination and virtual scrolling for high-transaction wallets

1 participant