PoC: Add x.WithUnsafeAttributes for zero-allocation metrics API usage #8179

Closed
dashpole wants to merge 9 commits into open-telemetry:main from dashpole:fast_attributes

@dashpole (Contributor) commented Apr 11, 2026

Fixes #7743

x.WithUnsafeAttributes allows a user to provide a slice of attributes without copying the slice. It includes Unsafe in the name to warn that modifying the attributes after passing it to the option is not safe. When combined with Resettable, it allows making dynamic or precomputed Add calls with or without a filter with zero allocations, and significantly better performance in many scenarios.
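
To make the copy-versus-alias trade-off concrete, here is a minimal stdlib-only sketch. `KeyValue`, `option`, `withCopiedAttributes`, and `withAliasedAttributes` are hypothetical stand-ins invented for illustration; they are not the OpenTelemetry API and not the code in this PR.

```go
package main

// KeyValue is a hypothetical stand-in for attribute.KeyValue.
type KeyValue struct {
	Key, Value string
}

// option is a hypothetical stand-in for a measurement option carrying attributes.
type option struct {
	attrs []KeyValue
}

// withCopiedAttributes mimics the safe behavior: it copies the caller's
// slice, so later caller-side writes are invisible to the option.
// The price is one slice allocation per call.
func withCopiedAttributes(attrs []KeyValue) option {
	cp := make([]KeyValue, len(attrs))
	copy(cp, attrs)
	return option{attrs: cp}
}

// withAliasedAttributes mimics the "unsafe" behavior: it keeps the caller's
// slice as-is. Nothing is allocated, but the option and the caller now share
// one backing array, so the caller must not modify the slice afterwards.
func withAliasedAttributes(attrs []KeyValue) option {
	return option{attrs: attrs}
}
```

If the caller writes to the slice after building both options, the copied option keeps the old value while the aliased option observes the new one, which is the hazard the `Unsafe` prefix is meant to flag.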

This PR currently adds additional API surface to the attribute package: NewDistinct, SortAndDedup, NewDistinctFromSorted, and NewDistinctFromSortedWithFilter. Alternatively, we can use templating to copy the hashing code into the metrics SDK. But providing a hashing function to users for attribute slices seems generally useful.

Optimizations

This PR makes the following optimizations to eliminate an allocation:

  • Use Resettable to re-use options.
  • Do not make a copy of attributes passed to WithUnsafeAttributes.
  • Only compute attribute.NewSet when the attribute.Distinct is not found in the sync.Map.
  • [Filtered] Only compute a filtered attribute.NewSet when the filtered attribute.Distinct is not found in the sync.Map.
  • [Filtered] Only compute the dropped attribute slice when the exemplar reservoir samples the Offer call.

Providing a slice of attributes instead of an attribute set allows the SDK to compute the attribute.Distinct without ever computing the full attribute.Set. This avoids an allocation caused by attribute.NewSet in cases where the attribute set is already present in the SDK, and is significantly more performant in that scenario.
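
The "hash first, build the set only on a miss" idea can be sketched roughly as follows. This is an assumption-laden illustration, not the SDK's implementation: `KeyValue`, `series`, `store`, and `hashSorted` are hypothetical names, the hash is a hand-rolled FNV-1a standing in for `attribute.Distinct`, the input is assumed to be already sorted and de-duplicated, and hash collisions are ignored for brevity.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// KeyValue is a hypothetical stand-in for attribute.KeyValue.
type KeyValue struct{ Key, Value string }

// FNV-1a 64-bit parameters.
const (
	offset64 = 14695981039346656037
	prime64  = 1099511628211
)

func hashString(h uint64, s string) uint64 {
	for i := 0; i < len(s); i++ {
		h ^= uint64(s[i])
		h *= prime64
	}
	return h
}

// hashSorted hashes an already sorted, de-duplicated slice without building
// any intermediate set. It plays the role of computing a Distinct.
func hashSorted(attrs []KeyValue) uint64 {
	h := uint64(offset64)
	for _, kv := range attrs {
		h = hashString(h, kv.Key)
		h = hashString(h, "\x00") // separator between key and value
		h = hashString(h, kv.Value)
		h = hashString(h, "\x00") // separator between pairs
	}
	return h
}

// series is one aggregated time series.
type series struct {
	attrs []KeyValue // owned copy, built only the first time the key is seen
	sum   atomic.Int64
}

type store struct {
	m      sync.Map     // uint64 hash -> *series
	builds atomic.Int64 // number of times the slow path ran
}

// add records v for the given attributes. The owned copy (the analogue of
// attribute.NewSet) is created only when the hash is not yet in the map.
func (s *store) add(sortedAttrs []KeyValue, v int64) {
	key := hashSorted(sortedAttrs)
	if e, ok := s.m.Load(key); ok {
		e.(*series).sum.Add(v) // fast path: no copy of the attributes
		return
	}
	owned := make([]KeyValue, len(sortedAttrs))
	copy(owned, sortedAttrs)
	s.builds.Add(1)
	e, _ := s.m.LoadOrStore(key, &series{attrs: owned})
	e.(*series).sum.Add(v)
}
```

Repeated calls with the same attributes take the fast path after the first one, so the copy happens once per distinct attribute combination rather than once per measurement.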

Benchmark Results

In Precomputed cases, WithUnsafeAttributes is slightly slower (73ns vs 53ns with 10 attributes) than WithAttributes or WithAttributeSet because it computes the Distinct of the provided attributes on each call, rather than once when the WithUnsafeAttributes option is constructed. It still has zero allocations on this path. The advantage of not computing the Distinct at construction time is that it brings the Dynamic Filtered case (which always needs to compute a Distinct anyway) closer in performance to the Dynamic NoFilter case.

In all other cases (especially filtered cases), this is a significant improvement (7-12x) in runtime.

Benchstat results:

=== sec/op ===
EndToEndCounterAdd/Filtered/Attributes/10/Dynamic/WithAttributeSet-24                               2.078µ ± ∞ ¹   2.167µ ±  3%        ~ (p=0.247 n=5+10)
EndToEndCounterAdd/Filtered/Attributes/10/Dynamic/WithUnsafeAttributes-24                                         166.8n ±  3%
EndToEndCounterAdd/Filtered/Attributes/10/Precomputed/WithAttributeSet-24                           1.077µ ± ∞ ¹   1.168µ ±  5%   +8.45% (p=0.003 n=5+10)
EndToEndCounterAdd/Filtered/Attributes/10/Precomputed/WithUnsafeAttributes-24                                     144.1n ±  2%
EndToEndCounterAdd/NoFilter/Attributes/10/Dynamic/WithAttributeSet-24                               863.9n ± ∞ ¹   967.8n ±  7%  +12.02% (p=0.001 n=5+10)
EndToEndCounterAdd/NoFilter/Attributes/10/Dynamic/WithUnsafeAttributes-24                                         84.01n ±  5%
EndToEndCounterAdd/NoFilter/Attributes/10/Precomputed/WithAttributeSet-24                           53.53n ± ∞ ¹   63.18n ±  5%  +18.03% (p=0.001 n=5+10)
EndToEndCounterAdd/NoFilter/Attributes/10/Precomputed/WithUnsafeAttributes-24                                     72.57n ±  2%

=== B/op ===
EndToEndCounterAdd/Filtered/Attributes/10/Dynamic/WithAttributeSet-24                               2.027Ki ± ∞ ¹ 2.006Ki ± 0%   -1.04% (p=0.001 n=5+10)
EndToEndCounterAdd/Filtered/Attributes/10/Dynamic/WithUnsafeAttributes-24                                         0.000 ± 0%
EndToEndCounterAdd/Filtered/Attributes/10/Precomputed/WithAttributeSet-24                           1.312Ki ± ∞ ¹ 1.312Ki ± 0%        ~ (p=1.000 n=5+10) ²
EndToEndCounterAdd/Filtered/Attributes/10/Precomputed/WithUnsafeAttributes-24                                     0.000 ± 0%
EndToEndCounterAdd/NoFilter/Attributes/10/Dynamic/WithAttributeSet-24                               730.0 ± ∞ ¹   706.0 ± 0%   -3.29% (n=5+10)
EndToEndCounterAdd/NoFilter/Attributes/10/Dynamic/WithUnsafeAttributes-24                                         0.000 ± 0%
EndToEndCounterAdd/NoFilter/Attributes/10/Precomputed/WithAttributeSet-24                           0.000 ± ∞ ¹   0.000 ± 0%        ~ (p=1.000 n=5+10) ²
EndToEndCounterAdd/NoFilter/Attributes/10/Precomputed/WithUnsafeAttributes-24                                     0.000 ± 0%

=== allocs/op ===
EndToEndCounterAdd/Filtered/Attributes/10/Dynamic/WithAttributeSet-24                               4.000 ± ∞ ¹   3.000 ± 0%  -25.00% (n=5+10)
EndToEndCounterAdd/Filtered/Attributes/10/Dynamic/WithUnsafeAttributes-24                                         0.000 ± 0%
EndToEndCounterAdd/Filtered/Attributes/10/Precomputed/WithAttributeSet-24                           2.000 ± ∞ ¹   2.000 ± 0%        ~ (p=1.000 n=5+10) ²
EndToEndCounterAdd/Filtered/Attributes/10/Precomputed/WithUnsafeAttributes-24                                     0.000 ± 0%
EndToEndCounterAdd/NoFilter/Attributes/10/Dynamic/WithAttributeSet-24                               2.000 ± ∞ ¹   1.000 ± 0%  -50.00% (n=5+10)
EndToEndCounterAdd/NoFilter/Attributes/10/Dynamic/WithUnsafeAttributes-24                                         0.000 ± 0%
EndToEndCounterAdd/NoFilter/Attributes/10/Precomputed/WithAttributeSet-24                           0.000 ± ∞ ¹   0.000 ± 0%        ~ (p=1.000 n=5+10) ²
EndToEndCounterAdd/NoFilter/Attributes/10/Precomputed/WithUnsafeAttributes-24                                     0.000 ± 0%

Code written with Gemini's help. The ideas and PR description are my own.

@codecov (bot) commented Apr 11, 2026

Codecov Report

❌ Patch coverage is 81.15942% with 39 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.3%. Comparing base (fe35606) to head (e73ecda).
⚠️ Report is 42 commits behind head on main.

Files with missing lines Patch % Lines
sdk/metric/internal/aggregate/aggregate.go 64.5% 14 Missing and 3 partials ⚠️
attribute/hash.go 0.0% 6 Missing ⚠️
attribute/set.go 84.0% 4 Missing ⚠️
metric/x/options.go 62.5% 3 Missing ⚠️
...dk/metric/internal/aggregate/filtered_reservoir.go 57.1% 1 Missing and 2 partials ⚠️
sdk/metric/instrument.go 95.8% 2 Missing ⚠️
sdk/metric/internal/aggregate/atomic.go 83.3% 2 Missing ⚠️
...metric/internal/aggregate/exponential_histogram.go 80.0% 2 Missing ⚠️

@@           Coverage Diff           @@
##            main   #8179     +/-   ##
=======================================
- Coverage   82.3%   82.3%   -0.1%     
=======================================
  Files        310     310             
  Lines      24258   24426    +168     
=======================================
+ Hits       19979   20107    +128     
- Misses      3902    3937     +35     
- Partials     377     382      +5     
Files with missing lines Coverage Δ
metric/instrument.go 100.0% <100.0%> (ø)
sdk/metric/internal/aggregate/drop.go 100.0% <100.0%> (ø)
sdk/metric/internal/aggregate/histogram.go 100.0% <100.0%> (ø)
sdk/metric/internal/aggregate/lastvalue.go 100.0% <100.0%> (ø)
sdk/metric/internal/aggregate/sum.go 100.0% <100.0%> (ø)
sdk/metric/meter.go 92.3% <100.0%> (+<0.1%) ⬆️
sdk/metric/instrument.go 96.7% <95.8%> (-0.9%) ⬇️
sdk/metric/internal/aggregate/atomic.go 89.0% <83.3%> (+0.8%) ⬆️
...metric/internal/aggregate/exponential_histogram.go 99.1% <80.0%> (-0.9%) ⬇️
metric/x/options.go 38.4% <62.5%> (+38.4%) ⬆️
... and 4 more

... and 2 files with indirect coverage changes

@dashpole changed the title from "PoC: Add x.WithUnsafeAttributes to improve metrics API performance" to "PoC: Add x.WithUnsafeAttributes for zero-allocation metrics API usage" on Apr 13, 2026
Comment thread attribute/set.go
return s
}

// SortAndDedup sorts and de-duplicates the passed attributes in-place.
Member

Is this called in the hot path when using the WithUnsafeAttributes API?

Contributor Author

Yes. Because we don't ever copy the slice, I have to sort and dedup within the WithUnsafeAttributes function to avoid concurrent modification of the slice later when it is used or re-used.
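
For readers unfamiliar with the operation, an in-place sort and de-duplication can be sketched like this. It is a hedged illustration only: `KeyValue` and `sortAndDedup` are hypothetical names, not the `SortAndDedup` function added in this PR, and the "last value wins" rule for duplicate keys is an assumption of the sketch. It requires Go 1.21 or later for the `slices` package.

```go
package main

import (
	"slices"
	"strings"
)

// KeyValue is a hypothetical stand-in for attribute.KeyValue.
type KeyValue struct{ Key, Value string }

// sortAndDedup sorts attrs by key in place and drops duplicate keys, keeping
// the last value supplied for each key. The returned slice shares its backing
// array with attrs, so no new slice is created.
func sortAndDedup(attrs []KeyValue) []KeyValue {
	// A stable sort keeps duplicates of one key in their original order,
	// so the last element of each run is the last value the caller supplied.
	slices.SortStableFunc(attrs, func(a, b KeyValue) int {
		return strings.Compare(a.Key, b.Key)
	})
	out := attrs[:0]
	for i, kv := range attrs {
		if i+1 < len(attrs) && attrs[i+1].Key == kv.Key {
			continue // a later duplicate of this key wins
		}
		// len(out) <= i here, so this write never clobbers an unread element.
		out = append(out, kv)
	}
	return out
}
```

Because the caller's slice is reordered and shortened in place, doing this once inside the option constructor means later concurrent readers see a slice that is no longer being written.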

Member

Given that sort+dedup still has to run on every call, does WithUnsafeAttributes provide enough of a performance boost over the existing API to warrant new public API surface (especially one with "Unsafe" semantics)?

For reference, both the .NET and Rust OTel SDKs achieve zero-allocation, zero-sort hot paths without any new API; the optimization is entirely internal to the SDK. The approach: store each attribute combination in the hashmap under two keys, one in the caller-provided order and one in sorted+deduped order. Since a given callsite almost always passes attributes in the same order, the unsorted lookup hits, skipping sort+dedup entirely. The sorted lookup only kicks in on a miss (i.e., a new KV slice never seen before), and the extra map entry per unique combination is likely negligible overhead.

A similar approach in Go could deliver the same (or better) performance gains through the existing WithAttributes path itself.

(I am not very familiar with Go implementation, so feel free to discard if this is not applicable/feasible)
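
As a rough illustration of the dual-key idea described above, the following stdlib-only sketch stores each series under both the caller-order hash and the canonical (sorted, de-duplicated) hash. Everything here is hypothetical: `KeyValue`, `series`, `dualStore`, `hashInOrder`, and `canonicalize` are invented names, the FNV-1a hash is a stand-in, hash collisions are ignored, and it does not reflect how the .NET, Rust, or Go SDKs are actually written.

```go
package main

import (
	"slices"
	"strings"
	"sync"
	"sync/atomic"
)

// KeyValue is a hypothetical stand-in for attribute.KeyValue.
type KeyValue struct{ Key, Value string }

type series struct{ sum atomic.Int64 }

type dualStore struct {
	m    sync.Map     // uint64 hash -> *series
	slow atomic.Int64 // number of times the sort+dedup path ran
}

// hashInOrder is an order-sensitive FNV-1a hash over the slice as given.
func hashInOrder(attrs []KeyValue) uint64 {
	h := uint64(14695981039346656037)
	mix := func(s string) {
		for i := 0; i < len(s); i++ {
			h ^= uint64(s[i])
			h *= 1099511628211
		}
		h *= 1099511628211 // separator step between fields
	}
	for _, kv := range attrs {
		mix(kv.Key)
		mix(kv.Value)
	}
	return h
}

// canonicalize returns a sorted, de-duplicated copy (last value wins).
func canonicalize(attrs []KeyValue) []KeyValue {
	c := slices.Clone(attrs)
	slices.SortStableFunc(c, func(a, b KeyValue) int {
		return strings.Compare(a.Key, b.Key)
	})
	out := c[:0]
	for i, kv := range c {
		if i+1 < len(c) && c[i+1].Key == kv.Key {
			continue
		}
		out = append(out, kv)
	}
	return out
}

// add looks up the caller-order hash first. Only on a miss does it copy,
// sort, and dedup, then registers the series under both keys.
func (s *dualStore) add(attrs []KeyValue, v int64) {
	rawKey := hashInOrder(attrs)
	if e, ok := s.m.Load(rawKey); ok {
		e.(*series).sum.Add(v) // hot path: no copy, no sort
		return
	}
	s.slow.Add(1)
	canonKey := hashInOrder(canonicalize(attrs))
	e, _ := s.m.LoadOrStore(canonKey, &series{})
	s.m.Store(rawKey, e) // alias the caller's ordering to the same series
	e.(*series).sum.Add(v)
}
```

In this sketch, a callsite that always passes attributes in the same order pays the sort and copy once; a different ordering of the same attributes pays it once more and then resolves to the same series.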

Contributor Author

Right now, WithAttributes makes a copy of the provided attributes for safety. If a user provides a slice of attributes today using WithAttributes, it is safe for them to modify the slice afterwards without issue. I've opted to call it WithUnsafeAttributes to indicate that it is no longer safe for users to modify the slice of attributes passed to the option. We could have adopted that stance from the beginning, and we wouldn't need to introduce a second option.

Your point about being able to skip sort + dedup is a good one. We could definitely implement that to provide a further speed-up. The cost of sorting + deduping doesn't seem to be too substantial (sorting + deduping + hashing seems to take ~30ns at 10 attributes). I've focused this PoC primarily on avoiding allocations.

Member

The dual-insertion approach would also avoid allocations on the hot path — the SDK can hash the incoming slice as-is (no copy, no sort) and look up in the sync.Map. Only on a miss does it need to sort, dedup, and copy. So it gives you both: zero allocation and zero sort, using the existing WithAttributes API.

Contributor Author

Because of how our options pattern works, we can't do that without changing our safety guarantees around WithAttributes.

Member

Ah! Got it. (Thanks for explaining this!)

Development

Successfully merging this pull request may close these issues.

Improve performance of attribute.NewSet