Skip to content

Conversation

@wmdanor
Copy link
Contributor

@wmdanor wmdanor commented Mar 28, 2025

Towards: #6382

This PR optimizes Logs SDK BatchProcessor.

Why?

Batch processor does not check if buffer exporter is full, instead it clones records to buffer, then tries to push them to buffer exporter, and then buffer exporter will return false from EnqueueExport if it is full.

It does not cause any performance issues when using NOOP exporter, because it "exports" instantly. -> Buffer exporter input channel will almost never be full.

But actual implementations of exporters are not going to act that way, they need some time for export operation. And so buffer exporter will get full fairly frequently. Each export attempt to buffer exporter costs us copying entire queue inside of batch processor.

In order to catch this or similar performance bottleneck in the future, new benchmark (BatchSimulateExport) was added which will use exporter that will simulate non-instant export by sleeping for 5ms.

Benchmarks

goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/sdk/log
cpu: AMD Ryzen 9 7900X 12-Core Processor
                                 │   base.txt   │               new.txt               │
                                 │    sec/op    │   sec/op     vs base                │
Processor/Simple-24                 340.9n ± 2%   343.5n ± 5%        ~ (p=0.971 n=10)
Processor/Batch-24                  667.6n ± 2%   666.3n ± 4%        ~ (p=0.896 n=10)
Processor/BatchSimulateExport-24   4441.5n ± 5%   526.4n ± 4%  -88.15% (p=0.000 n=10)
Processor/SetTimestampSimple-24     360.0n ± 4%   351.0n ± 1%        ~ (p=0.247 n=10)
Processor/SetTimestampBatch-24      669.8n ± 3%   647.0n ± 5%        ~ (p=0.052 n=10)
Processor/AddAttributesSimple-24    383.9n ± 1%   386.6n ± 5%        ~ (p=0.138 n=10)
Processor/AddAttributesBatch-24     759.9n ± 6%   734.5n ± 5%        ~ (p=0.481 n=10)
Processor/SetAttributesSimple-24    369.9n ± 2%   372.4n ± 3%        ~ (p=0.643 n=10)
Processor/SetAttributesBatch-24     684.2n ± 4%   674.0n ± 8%        ~ (p=0.529 n=10)
geomean                             639.9n        499.9n       -21.88%

                                 │   base.txt   │                new.txt                 │
                                 │     B/op     │     B/op      vs base                  │
Processor/Simple-24                  450.0 ± 0%     450.0 ± 0%        ~ (p=1.000 n=10) ¹
Processor/Batch-24                 1.153Ki ± 2%   1.125Ki ± 1%   -2.41% (p=0.001 n=10)
Processor/BatchSimulateExport-24     649.0 ± 1%     473.0 ± 0%  -27.12% (p=0.000 n=10)
Processor/SetTimestampSimple-24      450.0 ± 0%     450.0 ± 0%        ~ (p=1.000 n=10) ¹
Processor/SetTimestampBatch-24     1.145Ki ± 2%   1.122Ki ± 2%   -2.01% (p=0.003 n=10)
Processor/AddAttributesSimple-24     450.0 ± 0%     450.0 ± 0%        ~ (p=1.000 n=10) ¹
Processor/AddAttributesBatch-24    1.169Ki ± 2%   1.148Ki ± 1%   -1.80% (p=0.000 n=10)
Processor/SetAttributesSimple-24     450.0 ± 0%     450.0 ± 0%        ~ (p=1.000 n=10) ¹
Processor/SetAttributesBatch-24    1.141Ki ± 2%   1.128Ki ± 2%   -1.11% (p=0.030 n=10)
geomean                              719.3          688.7        -4.24%
¹ all samples are equal

                                 │  base.txt  │               new.txt               │
                                 │ allocs/op  │ allocs/op   vs base                 │
Processor/Simple-24                1.000 ± 0%   1.000 ± 0%       ~ (p=1.000 n=10) ¹
Processor/Batch-24                 1.000 ± 0%   1.000 ± 0%       ~ (p=1.000 n=10) ¹
Processor/BatchSimulateExport-24   1.000 ± 0%   1.000 ± 0%       ~ (p=1.000 n=10) ¹
Processor/SetTimestampSimple-24    1.000 ± 0%   1.000 ± 0%       ~ (p=1.000 n=10) ¹
Processor/SetTimestampBatch-24     1.000 ± 0%   1.000 ± 0%       ~ (p=1.000 n=10) ¹
Processor/AddAttributesSimple-24   1.000 ± 0%   1.000 ± 0%       ~ (p=1.000 n=10) ¹
Processor/AddAttributesBatch-24    1.000 ± 0%   1.000 ± 0%       ~ (p=1.000 n=10) ¹
Processor/SetAttributesSimple-24   1.000 ± 0%   1.000 ± 0%       ~ (p=1.000 n=10) ¹
Processor/SetAttributesBatch-24    1.000 ± 0%   1.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                            1.000        1.000       +0.00%
¹ all samples are equal

Benchmarks from issue (external)

goos: linux
goarch: amd64
pkg: github.com/pellared/spanevents-vs-logs
cpu: AMD Ryzen 9 7900X 12-Core Processor
           │ spanevents.txt │              logs.base.txt              │             logs.new.txt             │
           │     sec/op     │    sec/op      vs base                  │   sec/op     vs base                 │
/OTLP-24        41.40µ ± 3%   2935.15µ ± 9%  +6989.05% (p=0.000 n=10)   84.19µ ± 4%  +103.34% (p=0.000 n=10)
/STDOUT-24      39.55µ ± 4%    623.45µ ± 1%  +1476.21% (p=0.000 n=10)   89.51µ ± 3%  +126.29% (p=0.000 n=10)
/NOOP-24       13.179µ ± 2%     4.950µ ± 1%    -62.44% (p=0.000 n=10)   4.932µ ± 1%   -62.58% (p=0.000 n=10)
geomean         27.84µ          208.4µ        +648.68%                  33.37µ        +19.86%

           │ spanevents.txt │             logs.base.txt              │             logs.new.txt             │
           │      B/op      │     B/op       vs base                 │     B/op      vs base                │
/OTLP-24       98.90Ki ± 8%   210.44Ki ± 0%  +112.79% (p=0.000 n=10)   52.01Ki ± 0%  -47.41% (p=0.000 n=10)
/STDOUT-24     91.01Ki ± 0%   286.62Ki ± 0%  +214.93% (p=0.000 n=10)   81.67Ki ± 1%  -10.27% (p=0.000 n=10)
/NOOP-24       29840.0 ± 0%      240.0 ± 0%   -99.20% (p=0.000 n=10)     240.0 ± 0%  -99.20% (p=0.000 n=10)
geomean        64.01Ki         24.18Ki        -62.23%                  9.985Ki       -84.40%

           │ spanevents.txt │            logs.base.txt             │            logs.new.txt            │
           │   allocs/op    │  allocs/op   vs base                 │ allocs/op   vs base                │
/OTLP-24        699.0 ± 20%   2174.5 ± 1%  +211.09% (p=0.000 n=10)   249.5 ± 1%  -64.31% (p=0.000 n=10)
/STDOUT-24      646.5 ±  1%   4912.5 ± 0%  +659.86% (p=0.000 n=10)   872.5 ± 2%  +34.96% (p=0.000 n=10)
/NOOP-24      303.000 ±  0%    3.000 ± 0%   -99.01% (p=0.000 n=10)   3.000 ± 0%  -99.01% (p=0.000 n=10)
geomean         515.4          317.6        -38.37%                  86.76       -83.17%
goos: linux
goarch: amd64
pkg: github.com/pellared/spanevents-vs-logs
cpu: AMD Ryzen 9 7900X 12-Core Processor
           │ logs.base.txt │            logs.new.txt             │
           │    sec/op     │   sec/op     vs base                │
/OTLP-24     2935.15µ ± 9%   84.19µ ± 4%  -97.13% (p=0.000 n=10)
/STDOUT-24    623.45µ ± 1%   89.51µ ± 3%  -85.64% (p=0.000 n=10)
/NOOP-24       4.950µ ± 1%   4.932µ ± 1%        ~ (p=0.342 n=10)
geomean        208.4µ        33.37µ       -83.99%

           │ logs.base.txt │              logs.new.txt              │
           │     B/op      │     B/op      vs base                  │
/OTLP-24     210.44Ki ± 0%   52.01Ki ± 0%  -75.28% (p=0.000 n=10)
/STDOUT-24   286.62Ki ± 0%   81.67Ki ± 1%  -71.51% (p=0.000 n=10)
/NOOP-24        240.0 ± 0%     240.0 ± 0%        ~ (p=1.000 n=10) ¹
geomean       24.18Ki        9.985Ki       -58.70%
¹ all samples are equal

           │ logs.base.txt │             logs.new.txt             │
           │   allocs/op   │ allocs/op   vs base                  │
/OTLP-24       2174.5 ± 1%   249.5 ± 1%  -88.53% (p=0.000 n=10)
/STDOUT-24     4912.5 ± 0%   872.5 ± 2%  -82.24% (p=0.000 n=10)
/NOOP-24        3.000 ± 0%   3.000 ± 0%        ~ (p=1.000 n=10) ¹
geomean         317.6        86.76       -72.69%
¹ all samples are equal

@wmdanor wmdanor marked this pull request as ready for review March 28, 2025 02:19
@wmdanor wmdanor marked this pull request as draft March 28, 2025 07:54
Copy link
Member

@pellared pellared left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Batch processor does not check if buffer exporter is full, instead it clones records to buffer, then tries to push them to buffer exporter, and then buffer exporter will return false from EnqueueExport if it is full.
[...]
But actual implementations of exporters are not going to act that way, they need some time for export operation. And so buffer exporter will get full fairly frequently. Each export attempt to buffer exporter costs us copying entire queue inside of batch processor.

This a very good finding 👍

PS. I am happy that improving the logs batching processor does not seem to require making changes in the exported API surface 🎉

@pellared
Copy link
Member

@wmdanor, do you think it is ready for review?

@pellared pellared self-requested a review March 28, 2025 10:55
@wmdanor
Copy link
Contributor Author

wmdanor commented Mar 28, 2025

@wmdanor, do you think it is ready for review?

It have noticed that CI failed this morning, so I put it back to draft as I did not have enough time to fix it before going to work.

I saw that you have helped me with resolving those issues, thank you for that.

I have addressed your comment about b.exporter.IsQueueFull() { continue }, so it should be ready now.

Other than that, there is race condition comment you left, should I be trying to add some mutex locks for additional security for this in this PR or do we merge this one as good enough?

@wmdanor wmdanor marked this pull request as ready for review March 28, 2025 17:50
@wmdanor wmdanor requested a review from pellared March 28, 2025 17:50
@pellared
Copy link
Member

I think after this is addressed: #6569 (comment) it can be reviewed by others.

@wmdanor
Copy link
Contributor Author

wmdanor commented Mar 28, 2025

I think after this is addressed: #6569 (comment) it can be reviewed by others.

@pellared pushed the revert.

Copy link
Member

@pellared pellared left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MrAlias, PTAL

@dmathieu dmathieu merged commit a5abd88 into open-telemetry:main Apr 9, 2025
28 of 29 checks passed
}
return ok
})
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
}
} else {
qLen = b.q.Len()
}

pellared added a commit that referenced this pull request Apr 11, 2025
@pellared pellared mentioned this pull request Apr 11, 2025
@MrAlias MrAlias added this to the v1.36.0 milestone Apr 11, 2025
dmathieu added a commit that referenced this pull request May 21, 2025
# Overview

Closes #6786

### Added

- Add exponential histogram support in
`go.opentelemetry.io/otel/exporters/prometheus`. (#6421)
- The `go.opentelemetry.io/otel/semconv/v1.31.0` package.
The package contains semantic conventions from the `v1.31.0` version of
the OpenTelemetry Semantic Conventions.
See the [migration documentation](./semconv/v1.31.0/MIGRATION.md) for
information on how to upgrade from
`go.opentelemetry.io/otel/semconv/v1.30.0`. (#6479)
- Add `Recording`, `Scope`, and `Record` types in
`go.opentelemetry.io/otel/log/logtest`. (#6507)
- Add `WithHTTPClient` option to configure the `http.Client` used by
`go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp`.
(#6751)
- Add `WithHTTPClient` option to configure the `http.Client` used by
`go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp`.
(#6752)
- Add `WithHTTPClient` option to configure the `http.Client` used by
`go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp`. (#6688)
- Add `ValuesGetter` in `go.opentelemetry.io/otel/propagation`, a
`TextMapCarrier` that supports retrieving multiple values for a single
key. (#5973)
- Add `Values` method to `HeaderCarrier` to implement the new
`ValuesGetter` interface in `go.opentelemetry.io/otel/propagation`.
(#5973)
- Update `Baggage` in `go.opentelemetry.io/otel/propagation` to retrieve
multiple values for a key when the carrier implements `ValuesGetter`.
(#5973)
- Add `AssertEqual` function in `go.opentelemetry.io/otel/log/logtest`.
(#6662)
- The `go.opentelemetry.io/otel/semconv/v1.32.0` package.
The package contains semantic conventions from the `v1.32.0` version of
the OpenTelemetry Semantic Conventions.
See the [migration documentation](./semconv/v1.32.0/MIGRATION.md) for
information on how to upgrade from
`go.opentelemetry.io/otel/semconv/v1.31.0`(#6782)
- Add `Transform` option in `go.opentelemetry.io/otel/log/logtest`.
(#6794)
- Add `Desc` option in `go.opentelemetry.io/otel/log/logtest`. (#6796)

### Removed

- Drop support for [Go 1.22]. (#6381, #6418)
- Remove `Resource` field from `EnabledParameters` in
`go.opentelemetry.io/otel/sdk/log`. (#6494)
- Remove `RecordFactory` type from
`go.opentelemetry.io/otel/log/logtest`. (#6492)
- Remove `ScopeRecords`, `EmittedRecord`, and `RecordFactory` types from
`go.opentelemetry.io/otel/log/logtest`. (#6507)
- Remove `AssertRecordEqual` function in
`go.opentelemetry.io/otel/log/logtest`, use `AssertEqual` instead.
(#6662)

### Changed

- ⚠️ Update `github.com/prometheus/client_golang` to `v1.21.1`, which
changes the `NameValidationScheme` to `UTF8Validation`.
This allows metrics names to keep original delimiters (e.g. `.`), rather
than replacing with underscores.
This can be reverted by setting
`github.com/prometheus/common/model.NameValidationScheme` to
`LegacyValidation` in `github.com/prometheus/common/model`. (#6433)
- Initialize map with `len(keys)` in `NewAllowKeysFilter` and
`NewDenyKeysFilter` to avoid unnecessary allocations in
`go.opentelemetry.io/otel/attribute`. (#6455)
- `go.opentelemetry.io/otel/log/logtest` is now a separate Go module.
(#6465)
- `go.opentelemetry.io/otel/sdk/log/logtest` is now a separate Go
module. (#6466)
- `Recorder` in `go.opentelemetry.io/otel/log/logtest` no longer
separately stores records emitted by loggers with the same
instrumentation scope. (#6507)
- Improve performance of `BatchProcessor` in
`go.opentelemetry.io/otel/sdk/log` by not exporting when exporter cannot
accept more. (#6569, #6641)

### Deprecated

- Deprecate support for `model.LegacyValidation` for
`go.opentelemetry.io/otel/exporters/prometheus`. (#6449)

### Fixes

- Stop percent encoding header environment variables in
`go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc` and
`go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp`. (#6392)
- Ensure the `noopSpan.tracerProvider` method is not inlined in
`go.opentelemetry.io/otel/trace` so the `go.opentelemetry.io/auto`
instrumentation can instrument non-recording spans. (#6456)
- Use a `sync.Pool` instead of allocating `metricdata.ResourceMetrics`
in `go.opentelemetry.io/otel/exporters/prometheus`. (#6472)

---------

Co-authored-by: Tyler Yahn <[email protected]>
Co-authored-by: Robert Pająk <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants