perf(sdk): add optional foldhash for ValueMap HashMaps in metrics hot path#3388
perf(sdk): add optional foldhash for ValueMap HashMaps in metrics hot path#3388bryantbiggs wants to merge 1 commit intoopen-telemetry:mainfrom
Conversation
159dab5 to
f80ee5d
Compare
opentelemetry-sdk/Cargo.toml
Outdated
| autobenches = false | ||
|
|
||
| [dependencies] | ||
| ahash = "0.8" |
There was a problem hiding this comment.
We try to keep dependencies absolute minimum, so if we chose to do it, it must be via a opt-in feature flag, so users can knowingly opt-in to this.
There was a problem hiding this comment.
perfectly reasonable! if we do go forward, any thoughts on feature name? alt_hasher? ahash? (naming is hard 😅 )
There was a problem hiding this comment.
Tagging @utpilla for thoughts as he might have explored this before.
Any prior art in similar crates to steal feature names from ? metrics-use-ahash (to indicate this is specifically for metrics)
We don't use
A lot of users provide keys and values from incoming request etc, so we should be safe by default. If a user explicitly opts-in to |
cijothomas
left a comment
There was a problem hiding this comment.
Thanks for striving to improve metrics perf. As noted in my comments, I am okay with this change if we feature gate it, so users are explicitly opting into this.
@utpilla might have considered/tried this before, but we didn't quite end up adding it - Would want to get his thoughts too.
f80ee5d to
5e4684b
Compare
92730a0 to
916519b
Compare
|
taking a deeper look - |
916519b to
d58b454
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3388 +/- ##
=====================================
Coverage 82.3% 82.3%
=====================================
Files 128 128
Lines 24612 24617 +5
=====================================
+ Hits 20257 20262 +5
Misses 4355 4355 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
That's not entirely true. While That said, I do acknowledge that most deployments are probably low-risk, dimension values typically come from controlled sources, and our cardinality capping provides an additional layer of protection. I'm supportive of giving users a performance escape hatch, I just want us to be thoughtful about how we expose it. On the implementation approach: my concern with a crate-specific feature flag is maintenance. If we add A cleaner approach would be to make the internal storage generic over If we do decide to go with a feature flag approach in the meantime, I'd strongly prefer the feature name include an |
Yes, I can take a look and propose something. However, it does feel like a piece of functionality that will be commonly unknown to users. If most users don't know they can improve performance by bringing a different hash algorithm in their implementation then I would argue it's a change that is not valuable. |
|
I dug into how every other major OTel SDK handles hashing for this exact same data structure — the per-instrument map that looks up attribute sets to aggregation buckets on every
Key takeaways:
On the On naming — I'd push back on the |
|
@bryantbiggs @utpilla "unsafe" prefix feels a bit aggressive and can incorrectly imply (Btw, I don't think we can just follow what Go or C++ is doing, or go easy just because no documented exploit has occurred. As a foundation library, we need to be safe-by-default; if a particular user knows their scenario won't result in an exploitation, then can opt-in to things.) |
Summary
Adds an opt-in
metrics-use-foldhashfeature flag that replaces the defaultSipHash-1-3hasher with foldhash for theHashMapused inValueMap::trackers— the metrics hot path.SipHash's HashDoS resistance is unnecessary here since
ValueMapispub(crate)and keys (Vec<KeyValue>) are not attacker-controlled. When the feature is not enabled, the standard libraryHashMap(SipHash) is used — no mandatory dependency is added.Why foldhash?
I benchmarked four hashers — std SipHash, ahash, foldhash, and rapidhash — on the actual
Vec<KeyValue>key type used by ValueMap, with 1600 time series and 2/4/8 attributes per entry matching real-world metrics cardinality.Benchmark results (Apple Silicon M4 Max,
aarch64-apple-darwin)2 attributes per entry:
4 attributes per entry (most common OTel workload):
8 attributes per entry:
Summary: mixed_rw (models the ValueMap hot path)
foldhash was chosen over rapidhash because:
HashMapbacking (rust-lang/hashbrown#563), meaning it's been heavily vetted for mixed read/write workloadsEcosystem context: ahash is no longer best-in-class
The non-cryptographic hashing landscape has shifted significantly:
hashbrown(rust-lang/hashbrown#563)target-cpu=native(#190)Design decisions
metrics-use-foldhash): Avoids adding a mandatory dependency. Users who want maximum metrics throughput can opt in.Changes
opentelemetry-sdk/Cargo.toml: Addfoldhashas optional dependency, addmetrics-use-foldhashfeature flagopentelemetry-sdk/src/metrics/internal/mod.rs: Conditionally usefoldhash::fast::RandomStatewhen feature is enabledTest plan
cargo check --features metrics(default path, std HashMap)cargo check --features metrics-use-foldhash(foldhash path)Refs: #3371