perf(sdk): add optional foldhash for ValueMap HashMaps in metrics hot path by bryantbiggs · Pull Request #3388 · open-telemetry/opentelemetry-rust

bryantbiggs · 2026-02-24T20:01:01Z

Summary

Adds an opt-in metrics-use-foldhash feature flag that replaces the default SipHash-1-3 hasher with foldhash for the HashMap used in ValueMap::trackers — the metrics hot path.

SipHash's HashDoS resistance is unnecessary here since ValueMap is pub(crate) and keys (Vec<KeyValue>) are not attacker-controlled. When the feature is not enabled, the standard library HashMap (SipHash) is used — no mandatory dependency is added.

Why foldhash?

I benchmarked four hashers — std SipHash, ahash, foldhash, and rapidhash — on the actual Vec<KeyValue> key type used by ValueMap, with 1600 time series and 2/4/8 attributes per entry matching real-world metrics cardinality.
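As a rough illustration of what a `hash_only` style measurement involves, here is a minimal sketch using only the standard library. It is not the benchmark harness used for the numbers below: `Attr`, `make_keys`, and `hash_all` are hypothetical names, the key type is a simplified stand-in for `Vec<KeyValue>`, and a single `Instant` timing is far noisier than a proper benchmark framework.

```rust
use std::collections::hash_map::RandomState;
use std::hash::BuildHasher;
use std::hint::black_box;
use std::time::Instant;

// Hypothetical stand-in for opentelemetry's KeyValue: a (key, value) string pair.
type Attr = (String, String);

/// Builds `series` distinct attribute sets with `attrs_per_entry` attributes each.
fn make_keys(series: usize, attrs_per_entry: usize) -> Vec<Vec<Attr>> {
    (0..series)
        .map(|s| {
            (0..attrs_per_entry)
                .map(|a| (format!("attr_{a}"), format!("value_{s}_{a}")))
                .collect()
        })
        .collect()
}

/// Hashes every key once with the given hasher and XORs the results together
/// so the work cannot be optimized away.
fn hash_all<S: BuildHasher>(build: &S, keys: &[Vec<Attr>]) -> u64 {
    keys.iter().fold(0u64, |acc, k| acc ^ build.hash_one(k))
}

fn main() {
    // 1600 time series with 4 attributes each, as in the description above.
    let keys = make_keys(1600, 4);
    let build = RandomState::new(); // std SipHash with a random seed

    let start = Instant::now();
    let digest = black_box(hash_all(&build, black_box(&keys)));
    println!("hash_only (std SipHash): {:?}, digest {digest:x}", start.elapsed());
}
```

Because `hash_all` is generic over `BuildHasher`, the same function could be timed with any other hasher that implements the trait.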

Benchmark results (Apple Silicon M4 Max, `aarch64-apple-darwin`)

2 attributes per entry:

| Benchmark | std SipHash | ahash | foldhash | rapidhash |
|---|---|---|---|---|
| hash_only | 30.34 µs | 14.36 µs | 14.24 µs | 14.14 µs |
| lookup_hit | 56.67 µs | 40.59 µs | 37.60 µs | 35.84 µs |
| lookup_miss | 2.34 µs | 1.25 µs | 0.955 µs | 1.04 µs |
| insert | 133.36 µs | 115.79 µs | 111.70 µs | 113.77 µs |
| mixed_rw (ValueMap hot path) | 119.29 µs | 88.65 µs | 86.01 µs | 83.68 µs |

4 attributes per entry (most common OTel workload):

| Benchmark | std SipHash | ahash | foldhash | rapidhash |
|---|---|---|---|---|
| hash_only | 58.75 µs | 31.70 µs | 26.38 µs | 30.47 µs |
| lookup_hit | 105.94 µs | 83.71 µs | 71.30 µs | 71.81 µs |
| lookup_miss | 4.14 µs | 2.35 µs | 1.79 µs | 2.15 µs |
| insert | 239.58 µs | 209.55 µs | 196.81 µs | 195.33 µs |
| mixed_rw (ValueMap hot path) | 220.08 µs | 173.29 µs | 155.30 µs | 153.91 µs |

8 attributes per entry:

| Benchmark | std SipHash | ahash | foldhash | rapidhash |
|---|---|---|---|---|
| hash_only | 121.79 µs | 76.55 µs | 49.74 µs | 66.36 µs |
| lookup_hit | 211.04 µs | 185.81 µs | 138.23 µs | 147.51 µs |
| lookup_miss | 7.93 µs | 5.01 µs | 3.29 µs | 4.37 µs |
| insert | 447.65 µs | 422.34 µs | 368.78 µs | 393.28 µs |
| mixed_rw (ValueMap hot path) | 432.21 µs | 380.27 µs | 300.27 µs | 317.48 µs |

Summary: mixed_rw (models the ValueMap hot path)

| Attributes | std SipHash | ahash | foldhash | rapidhash |
|---|---|---|---|---|
| 2 | 119.29 µs | 88.65 µs (-26%) | 86.01 µs (-28%) | 83.68 µs (-30%) |
| 4 | 220.08 µs | 173.29 µs (-21%) | 155.30 µs (-29%) | 153.91 µs (-30%) |
| 8 | 432.21 µs | 380.27 µs (-12%) | 300.27 µs (-31%) | 317.48 µs (-27%) |
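The percentages in the summary table are relative changes against the std SipHash column. A small sketch of that arithmetic, using the foldhash mixed_rw timings from the table (the function name `relative_change_pct` is illustrative only):

```rust
/// Relative change of `candidate_us` versus `baseline_us`, in percent.
/// Negative values mean the candidate took less time than the baseline.
fn relative_change_pct(baseline_us: f64, candidate_us: f64) -> f64 {
    (candidate_us - baseline_us) / baseline_us * 100.0
}

fn main() {
    // mixed_rw timings in µs from the summary table: (attributes, SipHash, foldhash).
    let rows = [(2, 119.29, 86.01), (4, 220.08, 155.30), (8, 432.21, 300.27)];
    for (attrs, siphash, foldhash) in rows {
        println!("{attrs} attrs: {:.0}%", relative_change_pct(siphash, foldhash));
    }
}
```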

foldhash was chosen over rapidhash because:

Faster raw hashing at 4+ attributes: 13% faster at 4 attrs, 25% faster at 8 attrs
Wins decisively at 8 attributes across all benchmarks (5-25% faster than rapidhash)
Essentially tied at 2-4 attrs on the mixed_rw hot path (within 1-3%)
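Better lookup_miss performance across all sizes (8-25% faster)
hashbrown's default hasher: foldhash replaced ahash as the default hasher in hashbrown, the hash table implementation behind the standard library's HashMap (rust-lang/hashbrown#563), so it has been heavily vetted for mixed read/write workloads. The standard library's HashMap itself still defaults to SipHash.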
Scales better with key size — the advantage grows as attribute count increases
Zero dependencies, pure Rust

Ecosystem context: ahash is no longer best-in-class

The non-cryptographic hashing landscape has shifted significantly:

foldhash replaced ahash as the default hasher in hashbrown (rust-lang/hashbrown#563)
ahash has known unresolved performance regressions: ~40% on Apple M1 (#194) and 73-151% on AMD Zen with target-cpu=native (#190)
ahash maintenance has slowed: no throughput-focused optimization merged in 2024-2025; three VAES PRs (#144, #186, #187) stalled for 2+ years despite the Rust AVX-512 intrinsics blocker being resolved in Rust 1.89

Design decisions

Feature-gated (metrics-use-foldhash): Avoids adding a mandatory dependency. Users who want maximum metrics throughput can opt in.
Default is std SipHash: Safe, zero-dependency default. The performance difference only matters in high-cardinality metrics workloads.

Changes

opentelemetry-sdk/Cargo.toml: Add foldhash as optional dependency, add metrics-use-foldhash feature flag
opentelemetry-sdk/src/metrics/internal/mod.rs: Conditionally use foldhash::fast::RandomState when feature is enabled
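The conditional hasher selection described above could look roughly like the sketch below. This is an illustration under assumptions, not the PR's actual diff: `TrackerHasher`, `TrackerMap`, and `new_tracker_map` are hypothetical names, and the key type is a simplified stand-in for `Vec<KeyValue>`. Only `foldhash::fast::RandomState` and the `metrics-use-foldhash` feature name come from the description.

```rust
use std::collections::HashMap;

// Only compiled when the feature is enabled, so the foldhash crate is not
// required on the default path.
#[cfg(feature = "metrics-use-foldhash")]
type TrackerHasher = foldhash::fast::RandomState;

// Default path: the standard library's randomly seeded SipHash.
#[cfg(not(feature = "metrics-use-foldhash"))]
type TrackerHasher = std::collections::hash_map::RandomState;

// Hypothetical alias; the real field type inside ValueMap may differ.
type TrackerMap<K, V> = HashMap<K, V, TrackerHasher>;

fn new_tracker_map<K, V>() -> TrackerMap<K, V> {
    HashMap::with_hasher(TrackerHasher::default())
}

fn main() {
    let mut trackers: TrackerMap<Vec<(String, String)>, u64> = new_tracker_map();
    *trackers
        .entry(vec![("method".into(), "GET".into())])
        .or_insert(0) += 1;
    println!("{} tracked attribute set(s)", trackers.len());
}
```

Because both hasher types implement `Default` and `BuildHasher`, the rest of the code does not need to know which one was selected at compile time.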

Test plan

cargo check --features metrics (default path, std HashMap)
cargo check --features metrics-use-foldhash (foldhash path)
CI passes on all platforms

Refs: #3371

cijothomas · 2026-02-24T20:04:57Z

opentelemetry-sdk/Cargo.toml

 autobenches = false

 [dependencies]
+ahash = "0.8"


We try to keep dependencies to an absolute minimum, so if we choose to do it, it must be via an opt-in feature flag, so users can knowingly opt in to this.

Perfectly reasonable! If we do go forward, any thoughts on the feature name? alt_hasher? ahash? (naming is hard 😅)

Tagging @utpilla for thoughts as he might have explored this before.

Any prior art in similar crates to steal feature names from? metrics-use-ahash (to indicate this is specifically for metrics)

cijothomas · 2026-02-24T20:06:39Z

> ahash is already a transitive dependency via indexmap

We don't use indexmap now. We used to, but removed it long ago.

> unnecessary here since ValueMap is pub(crate) and keys are not attacker-controlled.

A lot of users provide keys and values from incoming requests, etc., so we should be safe by default. If a user explicitly opts in to ahash, then we can make this change.

cijothomas

Thanks for striving to improve metrics perf. As noted in my comments, I am okay with this change if we feature gate it, so users are explicitly opting into this.

@utpilla might have considered/tried this before, but we didn't quite end up adding it - Would want to get his thoughts too.

bryantbiggs · 2026-02-24T21:15:26Z

Taking a deeper look: ahash is not as well maintained anymore, and there are better, more modern alternatives that are actively maintained. It's somewhat of a toss-up between foldhash and rapidhash, with foldhash performing better with more attributes. I am less familiar with how many attributes are commonly used, so I will defer to you all on which one we should add behind a feature flag.


codecov · 2026-02-24T22:48:24Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.3%. Comparing base (dba1820) to head (d58b454).

Additional details and impacted files

@@          Coverage Diff          @@
##            main   #3388   +/-   ##
=====================================
  Coverage   82.3%   82.3%           
=====================================
  Files        128     128           
  Lines      24612   24617    +5     
=====================================
+ Hits       20257   20262    +5     
  Misses      4355    4355


utpilla · 2026-02-24T23:33:49Z

> SipHash's HashDoS resistance is unnecessary here since ValueMap is pub(crate) and keys (Vec<KeyValue>) are not attacker-controlled.

That's not entirely true. While ValueMap is pub(crate), the measurement recording APIs are public, and the SDK simply stores whatever dimensions the calling application provides. If an application passes end-user input directly as metric attributes, those values flow straight into our hash map. Whether that's a realistic threat depends entirely on how the application is instrumented, and as an SDK we can't know that. Given the SDK is consumed across a wide range of unknown environments and use cases, I think we should be conservative here and not dismiss the HashDoS risk outright.

That said, I do acknowledge that most deployments are probably low-risk, dimension values typically come from controlled sources, and our cardinality capping provides an additional layer of protection. I'm supportive of giving users a performance escape hatch, I just want us to be thoughtful about how we expose it.
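To make the cardinality-capping point concrete, here is a minimal sketch of how a cap bounds the number of distinct attribute sets that ever reach the hash map. It is an assumption-laden illustration, not the SDK's implementation: `CappedCounter` and `overflow_attrs` are hypothetical, the key type is simplified, and the overflow attribute is stored as a string here although the OpenTelemetry specification defines `otel.metric.overflow` as a boolean attribute.

```rust
use std::collections::HashMap;

// Simplified stand-in for Vec<KeyValue>.
type Attrs = Vec<(String, String)>;

/// Attribute set for the overflow bucket, modeled on the
/// `otel.metric.overflow=true` convention.
fn overflow_attrs() -> Attrs {
    vec![("otel.metric.overflow".to_string(), "true".to_string())]
}

struct CappedCounter {
    limit: usize,
    series: HashMap<Attrs, u64>,
}

impl CappedCounter {
    fn new(limit: usize) -> Self {
        Self { limit, series: HashMap::new() }
    }

    /// Records a measurement. Attribute sets already tracked keep their own
    /// series; new sets beyond the limit are folded into the overflow bucket.
    fn add(&mut self, attrs: Attrs, value: u64) {
        let key = if self.series.contains_key(&attrs) || self.series.len() < self.limit {
            attrs
        } else {
            overflow_attrs()
        };
        *self.series.entry(key).or_insert(0) += value;
    }
}

fn main() {
    let mut counter = CappedCounter::new(2000);
    // Simulate an application that passes end-user input straight through.
    for user_id in 0..5000 {
        counter.add(vec![("user.id".to_string(), user_id.to_string())], 1);
    }
    // At most `limit` regular series plus one overflow series are stored.
    println!("stored series: {}", counter.series.len());
}
```

The cap bounds how many keys an attacker can place in the map, but it does not change how expensive a collision chain among those keys can be, which is why it is described above as an additional layer and not a replacement for a seeded hasher.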

On the implementation approach: my concern with a crate-specific feature flag is maintenance. If we add foldhash today, we'll get requests for some other hasher tomorrow, and whatever comes after that. Over time, we could end up with a growing list of feature flags to maintain and test, each one a new dependency to vet.

A cleaner approach would be to make the internal storage generic over BuildHasher, similar to how HashMap itself works. The default stays as RandomState (SipHash), keeping the safe behavior for users who don't opt in, but users who want to bring their own hasher: foldhash, ahash, or anything else can do so without us needing to take on additional dependencies or feature flags. This does mean the generic parameter would need to thread through to MeterProvider and related types, which is a non-trivial change, but I think it's worth exploring. @bryantbiggs would you be interested in checking the feasibility of this approach?
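A minimal sketch of what "generic over BuildHasher" could mean for the internal storage, assuming a heavily simplified `ValueMap`. The struct layout mirrors the `RwLock<HashMap<Vec<KeyValue>, Arc<A>>>` shape mentioned elsewhere in this thread, but `Attrs`, `tracker`, and the toy `Fnv1a` hasher are hypothetical and stand in for whatever hasher a user might bring. How the parameter would thread through `MeterProvider` is not shown.

```rust
use std::collections::hash_map::RandomState;
use std::collections::HashMap;
use std::hash::{BuildHasher, BuildHasherDefault, Hasher};
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, RwLock};

// Simplified stand-in for Vec<KeyValue>.
type Attrs = Vec<(String, String)>;

/// Storage generic over the hasher; defaults to std's seeded SipHash.
struct ValueMap<A, S = RandomState> {
    trackers: RwLock<HashMap<Attrs, Arc<A>, S>>,
}

impl<A: Default, S: BuildHasher + Default> ValueMap<A, S> {
    fn new() -> Self {
        Self { trackers: RwLock::new(HashMap::with_hasher(S::default())) }
    }

    /// Returns the tracker for `attrs`, creating it on first use.
    fn tracker(&self, attrs: &Attrs) -> Arc<A> {
        {
            // Fast path: shared read lock, released at the end of this block.
            let map = self.trackers.read().unwrap();
            if let Some(t) = map.get(attrs) {
                return Arc::clone(t);
            }
        }
        // Slow path: exclusive lock; entry() handles a racing insert.
        let mut map = self.trackers.write().unwrap();
        map.entry(attrs.clone()).or_default().clone()
    }
}

/// Toy FNV-1a hasher standing in for a user-supplied fast hasher.
/// It is unseeded and therefore NOT HashDoS resistant.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for b in bytes {
            self.0 ^= u64::from(*b);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV 64-bit prime
        }
    }
    fn finish(&self) -> u64 {
        self.0
    }
}

fn main() {
    let attrs: Attrs = vec![("method".to_string(), "GET".to_string())];

    // Default: callers who name no hasher keep the safe behavior.
    let safe: ValueMap<AtomicU64> = ValueMap::new();
    safe.tracker(&attrs).fetch_add(5, Ordering::Relaxed);

    // Opt-in: the caller supplies the hasher type explicitly.
    let fast: ValueMap<AtomicU64, BuildHasherDefault<Fnv1a>> = ValueMap::new();
    fast.tracker(&attrs).fetch_add(5, Ordering::Relaxed);

    println!("safe = {}", safe.tracker(&attrs).load(Ordering::Relaxed));
    println!("fast = {}", fast.tracker(&attrs).load(Ordering::Relaxed));
}
```

The default type parameter keeps existing call sites unchanged, which is the same trick `std::collections::HashMap` uses for its own hasher parameter.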

If we do decide to go with a feature flag approach in the meantime, I'd strongly prefer the feature name include an unsafe prefix, something like unsafe-metrics-foldhash and that the docs clearly explain what the unsafe part is (no HashDoS protection), so users understand they're making a deliberate security tradeoff and not just flipping an easy performance switch.

bryantbiggs · 2026-02-24T23:45:38Z

> A cleaner approach would be to make the internal storage generic over BuildHasher, similar to how HashMap itself works. The default stays as RandomState (SipHash), keeping the safe behavior for users who don't opt in, but users who want to bring their own hasher: foldhash, ahash, or anything else can do so without us needing to take on additional dependencies or feature flags. This does mean the generic parameter would need to thread through to MeterProvider and related types, which is a non-trivial change, but I think it's worth exploring. @bryantbiggs would you be interested in checking the feasibility of this approach?

Yes, I can take a look and propose something.

However, it does feel like a piece of functionality that most users will not know about. If most users don't know they can improve performance by bringing a different hash algorithm, then I would argue the change delivers little value.

bryantbiggs · 2026-02-25T00:52:25Z

I dug into how every other major OTel SDK handles hashing for this same data structure: the per-instrument map that maps attribute sets to aggregation buckets on every counter.add() / histogram.record() call:

| Language | Data Structure | Hash Algorithm | HashDoS Resistant? | Seed |
|---|---|---|---|---|
| Go | `sync.Map` keyed by `Distinct{uint64}` in `limitedSyncMap` | xxHash64 (`cespare/xxhash/v2`) | No | Fixed zero |
| Java | `ConcurrentHashMap<Attributes, AggregatorHandle>` in `DefaultSynchronousMetricStorage` | `Arrays.hashCode()` (31×x polynomial) | No | Deterministic |
| C++ | `unordered_map<MetricAttributes, unique_ptr<Aggregation>>` in `AttributesHashMap` | Boost `hash_combine` over `std::hash` | No | Fixed zero |
| Python | `dict` with `frozenset` keys in `_ViewInstrumentMatch` | Python's built-in SipHash | Yes | Per-process |
| .NET 6+ | `ConcurrentDictionary<Tags, int>` in `AggregatorStore` | `System.HashCode` (xxHash32) + Marvin32 | Yes | Per-process |
| Rust (current) | `RwLock<HashMap<Vec<KeyValue>, Arc<A>>>` in `ValueMap` | SipHash-1-3 | Yes | Per-instance |
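The "Seed" column is what separates the resistant rows from the others. A small standard-library sketch of the difference, assuming nothing about any SDK's internals (`hash_with` is a hypothetical helper): an unseeded hasher gives every instance the same hash for a given key, so colliding keys can be computed offline, while `RandomState` draws fresh keys per instance.

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::hash::{BuildHasher, BuildHasherDefault};

/// Hashes an attribute set with the given hasher builder.
fn hash_with<S: BuildHasher>(build: &S, key: &[(&str, &str)]) -> u64 {
    build.hash_one(key)
}

fn main() {
    let key = [("http.method", "GET"), ("http.route", "/users")];

    // Fixed seed: DefaultHasher::default() always starts from the same state,
    // so two independent builders agree on every hash.
    let fixed_a = BuildHasherDefault::<DefaultHasher>::default();
    let fixed_b = BuildHasherDefault::<DefaultHasher>::default();
    println!(
        "fixed seed:    {:016x} vs {:016x}",
        hash_with(&fixed_a, &key),
        hash_with(&fixed_b, &key)
    );

    // Per-instance seed: each RandomState carries its own random keys, so the
    // same input is expected to hash differently in each map.
    let seeded_a = RandomState::new();
    let seeded_b = RandomState::new();
    println!(
        "per-instance:  {:016x} vs {:016x}",
        hash_with(&seeded_a, &key),
        hash_with(&seeded_b, &key)
    );
}
```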

Key takeaways:

3 of 5 other SDKs use non-DoS-resistant hashing as their only option. Python and .NET only get protection because their runtimes provide it, not from deliberate OTel decisions.
Go actively chose xxHash64 with zero seed for this path 2 months ago (PR #7497, v1.39.0), with Prometheus maintainers in the review. They even rejected collision detection because it doubled the cost of measure operations.
Go goes further than this PR — it uses only the hash as the map key (Distinct{uint64}), accepting silent data loss on collision. With foldhash, Rust's HashMap still does full Vec<KeyValue> equality comparison, so correctness is always preserved.
Zero HashDoS issues exist across the entire open-telemetry org. No spec or SIG Security guidance on hash algorithm selection. No documented real-world HashDoS attacks on observability systems.
All SDKs rely on cardinality limits (default 2000) as the primary defense.
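The correctness point about hash-only keys versus full equality comparison can be demonstrated with a deliberately broken hasher. This is a sketch for illustration only: `AlwaysZero`, `count_by_key`, and `count_by_hash` are hypothetical, and `count_by_hash` merely mimics the idea of keying a map by the hash value; it is not Go's implementation.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasher, BuildHasherDefault, Hasher};

/// Worst-case hasher: every key hashes to 0, so all keys collide.
#[derive(Default)]
struct AlwaysZero;

impl Hasher for AlwaysZero {
    fn write(&mut self, _bytes: &[u8]) {}
    fn finish(&self) -> u64 {
        0
    }
}

type Colliding = BuildHasherDefault<AlwaysZero>;
type Attrs = Vec<(&'static str, &'static str)>;

/// Keyed by the full attribute set: collisions cost lookup time, but the
/// map still compares keys with `Eq`, so distinct sets stay separate.
fn count_by_key(measurements: &[Attrs]) -> HashMap<Attrs, u64, Colliding> {
    let mut map = HashMap::with_hasher(Colliding::default());
    for attrs in measurements {
        *map.entry(attrs.clone()).or_insert(0) += 1;
    }
    map
}

/// Keyed by the hash value only: colliding attribute sets are silently merged.
fn count_by_hash(measurements: &[Attrs]) -> HashMap<u64, u64> {
    let build = Colliding::default();
    let mut map = HashMap::new();
    for attrs in measurements {
        *map.entry(build.hash_one(attrs)).or_insert(0) += 1;
    }
    map
}

fn main() {
    let measurements: Vec<Attrs> = vec![
        vec![("method", "GET")],
        vec![("method", "POST")],
        vec![("method", "GET")],
    ];
    println!("full-key series:  {}", count_by_key(&measurements).len());
    println!("hash-only series: {}", count_by_hash(&measurements).len());
}
```

With a real hasher collisions are rare, but the structural difference is the same: a faster hasher under `HashMap` changes lookup cost in the worst case, not which series a measurement lands in.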

On the BuildHasher generic approach — happy to explore as a follow-up, but I'd rather not block this on a larger API change that threads a type parameter through MeterProvider. No other OTel SDK offers configurable hashing, and the complexity may not be justified for a risk that hasn't materialized anywhere.

On naming — I'd push back on the unsafe- prefix. Go/Java/C++ ship non-resistant hashing as their only option without such labeling. metrics-foldhash with clear docs explaining the tradeoff feels more appropriate than implying a known exploitable vulnerability.

cijothomas · 2026-02-25T01:39:11Z

@bryantbiggs @utpilla The "unsafe" prefix feels a bit aggressive and can incorrectly imply unsafe blocks. I'm okay without the unsafe prefix, as long as the docs cover the risks.

(Btw, I don't think we can just follow what Go or C++ is doing, or go easy just because no documented exploit has occurred. As a foundation library, we need to be safe by default; if a particular user knows their scenario won't result in exploitation, they can opt in.)

bryantbiggs requested a review from a team as a code owner February 24, 2026 20:01

bryantbiggs force-pushed the worktree-sorted-only-valuemap branch from 159dab5 to f80ee5d Compare February 24, 2026 20:03

cijothomas reviewed Feb 24, 2026

View reviewed changes

cijothomas requested changes Feb 24, 2026

View reviewed changes

bryantbiggs force-pushed the worktree-sorted-only-valuemap branch from f80ee5d to 5e4684b Compare February 24, 2026 21:01

bryantbiggs changed the title ~~perf(sdk): use ahash for ValueMap HashMaps in metrics hot path~~ perf(sdk): add optional rapidhash for ValueMap HashMaps in metrics hot path Feb 24, 2026

bryantbiggs force-pushed the worktree-sorted-only-valuemap branch 2 times, most recently from 92730a0 to 916519b Compare February 24, 2026 21:12

perf(sdk): add optional foldhash for ValueMap HashMaps in metrics hot path (d58b454)

bryantbiggs force-pushed the worktree-sorted-only-valuemap branch from 916519b to d58b454 Compare February 24, 2026 21:21

bryantbiggs changed the title ~~perf(sdk): add optional rapidhash for ValueMap HashMaps in metrics hot path~~ perf(sdk): add optional foldhash for ValueMap HashMaps in metrics hot path Feb 24, 2026

bryantbiggs wants to merge 1 commit into open-telemetry:main from bryantbiggs:worktree-sorted-only-valuemap
