Lazily compute filtered and dropped attributes only when required#8230
Lazily compute filtered and dropped attributes only when required#8230dashpole wants to merge 11 commits into
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #8230 +/- ##
=======================================
+ Coverage 82.9% 83.1% +0.1%
=======================================
Files 314 315 +1
Lines 24985 25126 +141
=======================================
+ Hits 20730 20882 +152
+ Misses 3882 3870 -12
- Partials 373 374 +1
🚀 New features to boost your workflow:
|
56e479f to
18523eb
Compare
077b835 to
a23ccc6
Compare
There was a problem hiding this comment.
Pull request overview
This PR optimizes the metrics SDK’s attribute-filtered measurement path by deferring creation of filtered sets and dropped-attribute slices until they’re actually needed (notably, only when an exemplar is sampled). It also introduces small public API additions to support hashing and lazy exemplar offering without forcing intermediate allocations.
Changes:
- Add
attribute.Set.NewDistinctWithFilterto compute a Distinct hash for a filtered view of a set without constructing a new filteredattribute.Set. - Thread the original
attribute.Set+attribute.Filterdeeper into aggregators and exemplar offering, deferring filtered-set/dropped-slice construction. - Add
FixedSizeReservoir.OfferLazyand wireFilteredExemplarReservoirto use it when supported, plus add a new benchmark scenario for filtered exemplars.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
attribute/hash.go |
Adds Set.NewDistinctWithFilter to hash filtered attributes without building a filtered set. |
attribute/hash_test.go |
Adds unit tests for NewDistinctWithFilter behavior. |
sdk/metric/exemplar/fixed_size_reservoir.go |
Adds OfferLazy and refactors sampling logic to compute dropped attrs only on sampling. |
sdk/metric/internal/aggregate/aggregate.go |
Changes the filter plumbing to pass (attribute.Set, attribute.Filter) instead of eagerly filtering. |
sdk/metric/internal/aggregate/atomic.go |
Updates limitedSyncMap.LoadOrStoreAttr to key by filtered Distinct without constructing filtered sets on hot path. |
sdk/metric/internal/aggregate/atomic_test.go |
Adds a regression test for overflow behavior in limitedSyncMap. |
sdk/metric/internal/aggregate/sum.go |
Updates sum aggregation to use lazy filtered-attr handling and new reservoir Offer signature. |
sdk/metric/internal/aggregate/lastvalue.go |
Updates lastvalue aggregation to use lazy filtered-attr handling and new reservoir Offer signature. |
sdk/metric/internal/aggregate/histogram.go |
Updates explicit bucket histogram aggregation to use lazy filtered-attr handling and new reservoir Offer signature. |
sdk/metric/internal/aggregate/exponential_histogram.go |
Uses NewDistinctWithFilter for fast map lookup before allocating filtered sets; updates exemplar offering signature. |
sdk/metric/internal/aggregate/drop.go |
Updates drop reservoir to new FilteredExemplarReservoir.Offer signature. |
sdk/metric/internal/aggregate/filtered_reservoir.go |
Changes FilteredExemplarReservoir.Offer to accept (Set, Filter) and adds lazy-reservoir dispatch + dropped attr computation. |
sdk/metric/internal/aggregate/filtered_reservoir_test.go |
Extends tests to validate that reservoirs implementing OfferLazy use the lazy path. |
sdk/metric/internal/aggregate/aggregate_test.go |
Updates filter tests to validate filtered/dropped attrs derived from (Set, Filter) instead of eager filtering. |
sdk/metric/benchmark_test.go |
Adds FilteredWithExemplars benchmark scenario. |
CHANGELOG.md |
Notes the performance improvement in the unreleased changelog. |
|
side note: the branch name is on point 👨🍳 |
MrAlias
left a comment
There was a problem hiding this comment.
This optimization introduces a correctness problem in the first-seen series path.
The implementation now evaluates attribute.Filter more than once for a single measurement: first to derive the filtered identity, and later to materialize the filtered attributes on the miss path. That is not safe under the current API contract. attribute.Filter can wrap mutable closure state, so those evaluations can disagree for the same measurement.
Once that happens, the aggregation key and the stored or reported attributes can diverge. That breaks the invariant that metric state is keyed by the post-filtered attribute set. Given that, I do not think this lazy strategy is viable unless the filter contract is narrowed or the filtering result is computed once per measurement.
|
Good points. The alternative I had been considering was to try and avoid the copy within Set itself. E.g. func (s *Set) WithFilter(f Filter) *Set {
// return a set that uses a reference to the old set's storage instead of making a copy.
// But the filter is still only evaluated when the set is created.
}
func (s *Set) Dropped() []KeyValue {
// returns the set of labels that were dropped if the set was created using WithFilter. Otherwise, returns the empty set.
}This will be an issue (although I think easier to solve) for the new WithUnsafeAttributes or Bind functions as well, but will need a different solution. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
21c08e2 to
c1596b7
Compare
46fa4c2 to
f8af860
Compare
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
|
I think this is ready now! I looked at a few ways to try and lazily compute attribute sets, but the cleanest way was to make it a separate thing in the attribute package. It can be done without doing that, but it is very complex (you pass around the mask, a bool, the filter, the original set, etc). I would've kept it in the sdk/metric package, but the sdk/metric/exemplar package needs to accept it. |
Part of #7743
This takes a slightly different approach from #8179, and is first just trying to apply some of the optimizations to the current WithAttributeSet / WithAttributes path.
The basic optimization pattern is that we pass the original
attribute.Setand theattribute.Filteras far as possible to defer creating the filtered and dropped attributes until absolutely necessary. We use a new type,attribute.LazyFilteredSet, to encapsulate this state and guarantee that the filter is evaluated exactly once.There are two places where exemplars can be discarded: By the ExemplarFilter (e.g. AlwaysOn vs AlwaysOff vs TraceBased), and by the ExemplarReservoir itself (the time-weighted algorithm discards Offer calls). The current Filtered benchmark only exercises the first filter by passing an AlwaysOff filter. I've added a FilteredWithExemplars case to demonstrate improvements to the case where exemplars are Offered to the reservoir.
Public API Additions
attribute:LazyFilteredSettype.sdk/metric/exemplar:FixedSizeReservoir.OfferLazytakes anattribute.LazyFilteredSetinstead of anattribute.Setandattribute.Filter.LazyFilteredSet.Considerations and Alternatives
The above public API additions are specific to the path where attributes are provided using
attribute.Set. For a futurex.WithUnsafeAttributes, where the attributes are passed as a slice, we would need to add equivalent lazy evaluation mechanisms for slices. Unfortunately, you lose all of the performance gains in this PR if we have to convert theattribute.Setto an[]attribute.KeyValuebefore getting the Distinct or before Offering the attributes to the reservoir.Alternative considered: We could skip this PR, and just focus on optimizing the
[]attribute.KeyValuepath, since that will eventually be more performant, but it seems valuable to me to get this performance win without the new option.Alternative considered: We originally considered adding
set.NewDistinctWithFilter(filter attribute.Filter)to compute the hash of a filtered attribute set without recomputing the full set. However, this resulted in multiple evaluations of the filter function. We replaced it withLazyFilteredSetto ensure the filter is evaluated exactly once and to cleanly encapsulate all state needed for lazy evaluation.Benchmarks
This PR removes one allocation from the Filtered path across the board (from computing the new attribute.Set after a filter is applied), and removes a second allocation from the FilteredWithExemplars (new) path by not computing the dropped attribute set unless the reservoir samples the observation.
For Precomputed cases, this eliminates most of the performance difference between the NoFilter and Filtered or FilteredWithExemplars cases -- meaning applying a filter to a metric no longer imposes an ~order-of-magnitude performance penalty, and is now just slightly slower
For dynamic cases, this is a substantial improvement, but the cost of attribute.NewSet accounts for most of the difference now (as was already the case in the
NoFiltercases).Gemini helped implement this.