PoC: Add x.WithUnsafeAttributes for zero-allocation metrics API usage #8179
dashpole wants to merge 9 commits into open-telemetry:main
Codecov Report
@@ Coverage Diff @@
## main #8179 +/- ##
=======================================
- Coverage 82.3% 82.3% -0.1%
=======================================
Files 310 310
Lines 24258 24426 +168
=======================================
+ Hits 19979 20107 +128
- Misses 3902 3937 +35
- Partials 377 382 +5
	return s
}

// SortAndDedup sorts and de-duplicates the passed attributes in-place.
Is this called in the hot path when using the WithUnsafeAttributes API?
Yes. Because we never copy the slice, I have to sort and dedup within the WithUnsafeAttributes function to avoid concurrent modification of the slice later when it is used or re-used.
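To make the in-place step concrete, here is a minimal sketch of sorting and de-duplicating a slice without allocating a second slice. It is not the PR's implementation: `KV` is a simplified stand-in for `attribute.KeyValue`, `sortAndDedup` is a hypothetical name, and the "last value wins" rule for duplicate keys is an assumption chosen to match how attribute sets are commonly described.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// KV is a simplified, hypothetical stand-in for attribute.KeyValue.
type KV struct{ Key, Value string }

// sortAndDedup sorts kvs by key in place and collapses duplicate keys,
// keeping the value that appeared last in the caller's order (assumption).
// The returned slice shares the backing array of kvs.
func sortAndDedup(kvs []KV) []KV {
	// A stable sort keeps equal keys in their original relative order,
	// so the last duplicate in each run is the last one the caller wrote.
	slices.SortStableFunc(kvs, func(a, b KV) int {
		return strings.Compare(a.Key, b.Key)
	})
	if len(kvs) < 2 {
		return kvs
	}
	out := kvs[:1]
	for _, kv := range kvs[1:] {
		if kv.Key == out[len(out)-1].Key {
			out[len(out)-1] = kv // a later duplicate overwrites the earlier one
			continue
		}
		out = append(out, kv) // writes stay within the original backing array
	}
	return out
}

func main() {
	attrs := []KV{{"b", "1"}, {"a", "1"}, {"b", "2"}}
	fmt.Println(sortAndDedup(attrs))
}
```

Because the function mutates the caller's slice, it has to run before the slice is shared with any other goroutine, which is the constraint described in the reply above.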
Given that sort+dedup still has to run on every call, does WithUnsafeAttributes provide enough of a performance boost over the existing API to warrant new public API surface (especially one with "Unsafe" semantics)?
For reference, both the .NET and Rust OTel SDKs achieve zero-allocation, zero-sort hot paths without any new API; the optimization is entirely internal to the SDK. The approach: store each attribute combination in the hashmap under two keys, one in the caller-provided order and one in sorted+deduped order. Since a given callsite almost always passes attributes in the same order, the unsorted lookup hits, skipping sort+dedup entirely. The sorted lookup only kicks in on a miss (i.e., a new KV slice never seen before), and the extra map entry per unique combination is likely negligible overhead.
A similar approach in Go could deliver the same (or better) performance gains through the existing WithAttributes path itself.
(I am not very familiar with the Go implementation, so feel free to discard if this is not applicable/feasible)
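The dual-key idea described in this comment can be sketched roughly as follows. This is an illustration only, not code from the .NET, Rust, or Go SDKs: `KV`, `aggregator`, `hashKVs`, and `sortAndDedup` are hypothetical names, the hash is a hand-rolled FNV-1a, hash collisions are ignored, and the map is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// KV is a simplified, hypothetical stand-in for attribute.KeyValue.
type KV struct{ Key, Value string }

const (
	fnvOffset64 uint64 = 14695981039346656037
	fnvPrime64  uint64 = 1099511628211
)

// mix folds s into an FNV-1a hash, followed by a separator byte.
func mix(h uint64, s string) uint64 {
	for i := 0; i < len(s); i++ {
		h ^= uint64(s[i])
		h *= fnvPrime64
	}
	h ^= 0xff
	return h * fnvPrime64
}

// hashKVs hashes the slice in the order given; it does not sort or copy.
func hashKVs(kvs []KV) uint64 {
	h := fnvOffset64
	for _, kv := range kvs {
		h = mix(mix(h, kv.Key), kv.Value)
	}
	return h
}

// sortAndDedup sorts by key in place; the last duplicate wins (assumption).
func sortAndDedup(kvs []KV) []KV {
	slices.SortStableFunc(kvs, func(a, b KV) int { return strings.Compare(a.Key, b.Key) })
	out := kvs[:0]
	for _, kv := range kvs {
		if n := len(out); n > 0 && out[n-1].Key == kv.Key {
			out[n-1] = kv
			continue
		}
		out = append(out, kv)
	}
	return out
}

// aggregator keeps one sum per attribute combination, reachable under both
// the caller-order hash and the canonical (sorted, deduped) hash.
type aggregator struct {
	sums      map[uint64]*float64
	slowPaths int // number of calls that had to copy and sort
}

func newAggregator() *aggregator { return &aggregator{sums: map[uint64]*float64{}} }

func (a *aggregator) add(kvs []KV, v float64) {
	raw := hashKVs(kvs)
	if s, ok := a.sums[raw]; ok { // fast path: this exact ordering was seen before
		*s += v
		return
	}
	a.slowPaths++
	canon := sortAndDedup(slices.Clone(kvs)) // the copy keeps the caller's slice untouched
	key := hashKVs(canon)
	s, ok := a.sums[key]
	if !ok {
		s = new(float64)
		a.sums[key] = s
	}
	a.sums[raw] = s // remember the caller's ordering for next time
	*s += v
}

func main() {
	agg := newAggregator()
	agg.add([]KV{{"b", "2"}, {"a", "1"}}, 1)
	agg.add([]KV{{"b", "2"}, {"a", "1"}}, 2)
	fmt.Println(agg.slowPaths, len(agg.sums))
}
```

In this sketch only the first call for a given ordering pays for the copy and sort; later calls with the same ordering resolve with a single hash and map lookup. A real implementation would also need to compare the stored attributes on a hit to rule out collisions.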
Right now, WithAttributes makes a copy of the provided attributes for safety. If a user provides a slice of attributes today using WithAttributes, it is safe for them to modify the slice afterwards without issue. I've opted to call it WithUnsafeAttributes to indicate that it is no longer safe for users to modify the slice of attributes passed to the option. We could have adopted that stance from the beginning, and we wouldn't need to introduce a second option.
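The safety difference described here comes down to slice aliasing. The sketch below uses hypothetical stand-ins (`KV`, `option`, `withAttributes`, `withUnsafeAttributes`) rather than the real metric API, and only illustrates why retaining the caller's slice makes later modification unsafe.

```go
package main

import (
	"fmt"
	"slices"
)

// KV is a simplified, hypothetical stand-in for attribute.KeyValue.
type KV struct{ Key, Value string }

// option is a hypothetical stand-in for a measurement option.
type option struct{ attrs []KV }

// withAttributes copies its input, so later changes by the caller are not
// visible through the option. The copy is where the allocation comes from.
func withAttributes(kvs ...KV) option {
	return option{attrs: slices.Clone(kvs)}
}

// withUnsafeAttributes retains the caller's slice without copying, so the
// option and the caller share one backing array.
func withUnsafeAttributes(kvs []KV) option {
	return option{attrs: kvs}
}

func main() {
	kvs := []KV{{"route", "/a"}}
	copied := withAttributes(kvs...)
	shared := withUnsafeAttributes(kvs)

	kvs[0].Value = "/b" // caller modifies the slice after building the options

	fmt.Println(copied.attrs[0].Value, shared.attrs[0].Value)
}
```

The copied option still sees the original value, while the shared one observes the caller's write. If that write raced with the SDK reading the slice, the result would be a data race, which is the hazard the "Unsafe" name is meant to flag.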
Your point about being able to skip sort + dedup is a good one. We could definitely implement that to provide a further speed-up. The cost of sorting + deduping doesn't seem to be too substantial (sorting + deduping + hashing seems to take ~30ns at 10 attributes). I've focused this PoC primarily on avoiding allocations.
The dual-insertion approach would also avoid allocations on the hot path — the SDK can hash the incoming slice as-is (no copy, no sort) and look up in the sync.Map. Only on a miss does it need to sort, dedup, and copy. So it gives you both: zero allocation and zero sort, using the existing WithAttributes API.
Because of how our options pattern works, we can't do that without changing our safety guarantees around WithAttributes.
Ah! Got it. (Thanks for explaining this!)
Fixes #7743
x.WithUnsafeAttributes allows a user to provide a slice of attributes without copying the slice. It includes Unsafe in the name to warn that modifying the attributes after passing them to the option is not safe. When combined with Resettable, it allows making dynamic or precomputed Add calls, with or without a filter, with zero allocations and significantly better performance in many scenarios.
This PR currently adds additional API surface to the attribute package: NewDistinct, SortAndDedup, NewDistinctFromSorted, and NewDistinctFromSortedWithFilter. Alternatively, we can use templating to copy the hashing code into the metrics SDK. But providing a hashing function to users for attribute slices seems generally useful.
Optimizations
This PR makes the following optimizations to eliminate an allocation:
Providing a slice of attributes instead of an attribute set allows the SDK to compute the attribute.Distinct without ever computing the full attribute.Set. This avoids an allocation caused by attribute.NewSet in cases where the attribute set is already present in the SDK, and is significantly more performant in that scenario.
Benchmark Results
In Precomputed cases, WithUnsafeAttributes has slightly worse performance (73ns vs 53ns with 10 attributes) than WithAttributes or WithAttributeSet because it computes the Distinct of the provided attributes on each call, rather than once when the WithUnsafeAttributes option is constructed. It still has zero allocations on this path. The advantage of not computing Distinct up front is that it makes the Dynamic Filtered case (which always needs to compute a Distinct anyway) closer in performance to the Dynamic NoFilter case.
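The precomputed-versus-per-call tradeoff can be illustrated with a small sketch that counts how often the key is hashed. All names here (`KV`, `hashKVs`, `precomputedOption`, `perCallOption`, `hashCalls`) are hypothetical, and the FNV-1a hash stands in for whatever the SDK uses to compute a Distinct.

```go
package main

import "fmt"

// KV is a simplified, hypothetical stand-in for attribute.KeyValue.
type KV struct{ Key, Value string }

// hashCalls counts hash computations so the two strategies can be compared.
var hashCalls int

// hashKVs is a hand-rolled FNV-1a over keys and values, standing in for
// the computation of a Distinct.
func hashKVs(kvs []KV) uint64 {
	hashCalls++
	h := uint64(14695981039346656037)
	for _, kv := range kvs {
		for _, s := range [2]string{kv.Key, kv.Value} {
			for i := 0; i < len(s); i++ {
				h ^= uint64(s[i])
				h *= 1099511628211
			}
			h ^= 0xff // separator between fields
			h *= 1099511628211
		}
	}
	return h
}

// precomputedOption hashes once, when the option is constructed.
type precomputedOption struct{ key uint64 }

func newPrecomputed(kvs []KV) precomputedOption {
	return precomputedOption{key: hashKVs(kvs)}
}

func (o precomputedOption) distinct() uint64 { return o.key }

// perCallOption keeps the slice and hashes on every use.
type perCallOption struct{ kvs []KV }

func newPerCall(kvs []KV) perCallOption { return perCallOption{kvs: kvs} }

func (o perCallOption) distinct() uint64 { return hashKVs(o.kvs) }

func main() {
	kvs := []KV{{"a", "1"}, {"b", "2"}}
	pre, per := newPrecomputed(kvs), newPerCall(kvs)
	for i := 0; i < 3; i++ {
		_ = pre.distinct()
		_ = per.distinct()
	}
	fmt.Println("hash computations:", hashCalls)
}
```

Hashing per call costs more when the same option is reused many times, but it means a dynamic caller never pays for a key that a filter would force the SDK to recompute anyway.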
In all other cases (especially filtered cases), this is a significant improvement (7-12x) in runtime.
Benchstat results:
Code written with Gemini's help. The ideas and PR description are my own.