Optimize Logs SDK BatchProcessor #6569
The batch processor does not check whether the buffer exporter is full. Instead, it clones the records into a buffer and tries to push them to the buffer exporter, which returns false from EnqueueExport if it is full.
[...]
Actual exporter implementations do not behave that way: they need some time for each export operation, so the buffer exporter fills up fairly frequently. Each export attempt to the buffer exporter costs us a copy of the entire queue inside the batch processor.
This is a very good finding 👍
PS. I am happy that improving the logs batching processor does not seem to require making changes in the exported API surface 🎉
@wmdanor, do you think it is ready for review?
I noticed that CI failed this morning, so I put it back to draft as I did not have enough time to fix it before going to work. I saw that you helped me resolve those issues, thank you for that. I have addressed your comment about [...]. Other than that, there is the race condition comment you left: should I try adding some mutex locks for additional safety in this PR, or do we merge this one as good enough?
I think after this is addressed: #6569 (comment) it can be reviewed by others.
@pellared pushed the revert.
pellared
left a comment
@MrAlias, PTAL
    }
    return ok
    })
    }
    }
    } else {
    qLen = b.q.Len()
    }
# Overview

Closes #6786

### Added

- Add exponential histogram support in `go.opentelemetry.io/otel/exporters/prometheus`. (#6421)
- The `go.opentelemetry.io/otel/semconv/v1.31.0` package. The package contains semantic conventions from the `v1.31.0` version of the OpenTelemetry Semantic Conventions. See the [migration documentation](./semconv/v1.31.0/MIGRATION.md) for information on how to upgrade from `go.opentelemetry.io/otel/semconv/v1.30.0`. (#6479)
- Add `Recording`, `Scope`, and `Record` types in `go.opentelemetry.io/otel/log/logtest`. (#6507)
- Add `WithHTTPClient` option to configure the `http.Client` used by `go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp`. (#6751)
- Add `WithHTTPClient` option to configure the `http.Client` used by `go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp`. (#6752)
- Add `WithHTTPClient` option to configure the `http.Client` used by `go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp`. (#6688)
- Add `ValuesGetter` in `go.opentelemetry.io/otel/propagation`, a `TextMapCarrier` that supports retrieving multiple values for a single key. (#5973)
- Add `Values` method to `HeaderCarrier` to implement the new `ValuesGetter` interface in `go.opentelemetry.io/otel/propagation`. (#5973)
- Update `Baggage` in `go.opentelemetry.io/otel/propagation` to retrieve multiple values for a key when the carrier implements `ValuesGetter`. (#5973)
- Add `AssertEqual` function in `go.opentelemetry.io/otel/log/logtest`. (#6662)
- The `go.opentelemetry.io/otel/semconv/v1.32.0` package. The package contains semantic conventions from the `v1.32.0` version of the OpenTelemetry Semantic Conventions. See the [migration documentation](./semconv/v1.32.0/MIGRATION.md) for information on how to upgrade from `go.opentelemetry.io/otel/semconv/v1.31.0`. (#6782)
- Add `Transform` option in `go.opentelemetry.io/otel/log/logtest`. (#6794)
- Add `Desc` option in `go.opentelemetry.io/otel/log/logtest`. (#6796)

### Removed

- Drop support for [Go 1.22]. (#6381, #6418)
- Remove `Resource` field from `EnabledParameters` in `go.opentelemetry.io/otel/sdk/log`. (#6494)
- Remove `RecordFactory` type from `go.opentelemetry.io/otel/log/logtest`. (#6492)
- Remove `ScopeRecords`, `EmittedRecord`, and `RecordFactory` types from `go.opentelemetry.io/otel/log/logtest`. (#6507)
- Remove `AssertRecordEqual` function in `go.opentelemetry.io/otel/log/logtest`, use `AssertEqual` instead. (#6662)

### Changed

- ⚠️ Update `github.com/prometheus/client_golang` to `v1.21.1`, which changes the `NameValidationScheme` to `UTF8Validation`. This allows metrics names to keep original delimiters (e.g. `.`), rather than replacing with underscores. This can be reverted by setting `github.com/prometheus/common/model.NameValidationScheme` to `LegacyValidation` in `github.com/prometheus/common/model`. (#6433)
- Initialize map with `len(keys)` in `NewAllowKeysFilter` and `NewDenyKeysFilter` to avoid unnecessary allocations in `go.opentelemetry.io/otel/attribute`. (#6455)
- `go.opentelemetry.io/otel/log/logtest` is now a separate Go module. (#6465)
- `go.opentelemetry.io/otel/sdk/log/logtest` is now a separate Go module. (#6466)
- `Recorder` in `go.opentelemetry.io/otel/log/logtest` no longer separately stores records emitted by loggers with the same instrumentation scope. (#6507)
- Improve performance of `BatchProcessor` in `go.opentelemetry.io/otel/sdk/log` by not exporting when exporter cannot accept more. (#6569, #6641)

### Deprecated

- Deprecate support for `model.LegacyValidation` for `go.opentelemetry.io/otel/exporters/prometheus`. (#6449)

### Fixes

- Stop percent encoding header environment variables in `go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc` and `go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp`. (#6392)
- Ensure the `noopSpan.tracerProvider` method is not inlined in `go.opentelemetry.io/otel/trace` so the `go.opentelemetry.io/auto` instrumentation can instrument non-recording spans. (#6456)
- Use a `sync.Pool` instead of allocating `metricdata.ResourceMetrics` in `go.opentelemetry.io/otel/exporters/prometheus`. (#6472)

---------

Co-authored-by: Tyler Yahn
Co-authored-by: Robert Pająk
Towards: #6382
This PR optimizes Logs SDK BatchProcessor.
Why?
The batch processor does not check whether the buffer exporter is full. Instead, it clones the records into a buffer and tries to push them to the buffer exporter, which returns false from EnqueueExport if it is full.
This does not cause any performance issues with a NOOP exporter, because it "exports" instantly, so the buffer exporter's input channel is almost never full.
Actual exporter implementations do not behave that way: they need some time for each export operation, so the buffer exporter fills up fairly frequently. Each export attempt to the buffer exporter costs us a copy of the entire queue inside the batch processor.
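The idea of checking for room before cloning can be illustrated with a minimal, self-contained sketch. It is not the SDK's implementation: `record`, `bufferExporter`, `processor`, `Ready`, `tryExport`, and the `copies` counter are hypothetical names chosen for illustration, and the model is single-goroutine.

```go
package main

import "fmt"

// record is a hypothetical stand-in for the SDK's log record type.
type record struct{ body string }

// bufferExporter is a simplified, hypothetical model of a buffered exporter:
// batches wait in a bounded channel until something drains them.
type bufferExporter struct {
	input chan []record
}

// Ready reports whether the input channel currently has room for a batch.
func (e *bufferExporter) Ready() bool {
	return len(e.input) < cap(e.input)
}

// EnqueueExport hands a batch over without blocking and reports success.
func (e *bufferExporter) EnqueueExport(batch []record) bool {
	select {
	case e.input <- batch:
		return true
	default:
		return false
	}
}

// processor holds queued records and counts how often the queue is copied.
type processor struct {
	queue    []record
	exporter *bufferExporter
	copies   int
}

// tryExport copies the queue only when the exporter reports free capacity.
func (p *processor) tryExport() bool {
	if !p.exporter.Ready() {
		return false // exporter is full: skip the copy entirely
	}
	batch := make([]record, len(p.queue))
	copy(batch, p.queue)
	p.copies++
	if !p.exporter.EnqueueExport(batch) {
		return false // still possible if another producer filled the channel
	}
	p.queue = p.queue[:0]
	return true
}

func main() {
	exp := &bufferExporter{input: make(chan []record, 1)}
	p := &processor{exporter: exp}

	p.queue = append(p.queue, record{"a"}, record{"b"})
	fmt.Println(p.tryExport(), p.copies) // the first batch fits

	p.queue = append(p.queue, record{"c"})
	fmt.Println(p.tryExport(), p.copies) // exporter is full, no copy is made
}
```

In this sketch the `Ready` check is only advisory: with concurrent producers the channel can fill up between the check and the enqueue, which is why `EnqueueExport` still reports failure.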
To catch this or similar performance bottlenecks in the future, a new benchmark (BatchSimulateExport) was added. It uses an exporter that simulates a non-instant export by sleeping for 5ms.
Benchmarks
goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/sdk/log
cpu: AMD Ryzen 9 7900X 12-Core Processor

| Benchmark | base.txt sec/op | new.txt sec/op | vs base |
|---|---|---|---|
| Processor/Simple-24 | 340.9n ± 2% | 343.5n ± 5% | ~ (p=0.971 n=10) |
| Processor/Batch-24 | 667.6n ± 2% | 666.3n ± 4% | ~ (p=0.896 n=10) |
| Processor/BatchSimulateExport-24 | 4441.5n ± 5% | 526.4n ± 4% | -88.15% (p=0.000 n=10) |
| Processor/SetTimestampSimple-24 | 360.0n ± 4% | 351.0n ± 1% | ~ (p=0.247 n=10) |
| Processor/SetTimestampBatch-24 | 669.8n ± 3% | 647.0n ± 5% | ~ (p=0.052 n=10) |
| Processor/AddAttributesSimple-24 | 383.9n ± 1% | 386.6n ± 5% | ~ (p=0.138 n=10) |
| Processor/AddAttributesBatch-24 | 759.9n ± 6% | 734.5n ± 5% | ~ (p=0.481 n=10) |
| Processor/SetAttributesSimple-24 | 369.9n ± 2% | 372.4n ± 3% | ~ (p=0.643 n=10) |
| Processor/SetAttributesBatch-24 | 684.2n ± 4% | 674.0n ± 8% | ~ (p=0.529 n=10) |
| geomean | 639.9n | 499.9n | -21.88% |

| Benchmark | base.txt B/op | new.txt B/op | vs base |
|---|---|---|---|
| Processor/Simple-24 | 450.0 ± 0% | 450.0 ± 0% | ~ (p=1.000 n=10) ¹ |
| Processor/Batch-24 | 1.153Ki ± 2% | 1.125Ki ± 1% | -2.41% (p=0.001 n=10) |
| Processor/BatchSimulateExport-24 | 649.0 ± 1% | 473.0 ± 0% | -27.12% (p=0.000 n=10) |
| Processor/SetTimestampSimple-24 | 450.0 ± 0% | 450.0 ± 0% | ~ (p=1.000 n=10) ¹ |
| Processor/SetTimestampBatch-24 | 1.145Ki ± 2% | 1.122Ki ± 2% | -2.01% (p=0.003 n=10) |
| Processor/AddAttributesSimple-24 | 450.0 ± 0% | 450.0 ± 0% | ~ (p=1.000 n=10) ¹ |
| Processor/AddAttributesBatch-24 | 1.169Ki ± 2% | 1.148Ki ± 1% | -1.80% (p=0.000 n=10) |
| Processor/SetAttributesSimple-24 | 450.0 ± 0% | 450.0 ± 0% | ~ (p=1.000 n=10) ¹ |
| Processor/SetAttributesBatch-24 | 1.141Ki ± 2% | 1.128Ki ± 2% | -1.11% (p=0.030 n=10) |
| geomean | 719.3 | 688.7 | -4.24% |

¹ all samples are equal

| Benchmark | base.txt allocs/op | new.txt allocs/op | vs base |
|---|---|---|---|
| Processor/Simple-24 | 1.000 ± 0% | 1.000 ± 0% | ~ (p=1.000 n=10) ¹ |
| Processor/Batch-24 | 1.000 ± 0% | 1.000 ± 0% | ~ (p=1.000 n=10) ¹ |
| Processor/BatchSimulateExport-24 | 1.000 ± 0% | 1.000 ± 0% | ~ (p=1.000 n=10) ¹ |
| Processor/SetTimestampSimple-24 | 1.000 ± 0% | 1.000 ± 0% | ~ (p=1.000 n=10) ¹ |
| Processor/SetTimestampBatch-24 | 1.000 ± 0% | 1.000 ± 0% | ~ (p=1.000 n=10) ¹ |
| Processor/AddAttributesSimple-24 | 1.000 ± 0% | 1.000 ± 0% | ~ (p=1.000 n=10) ¹ |
| Processor/AddAttributesBatch-24 | 1.000 ± 0% | 1.000 ± 0% | ~ (p=1.000 n=10) ¹ |
| Processor/SetAttributesSimple-24 | 1.000 ± 0% | 1.000 ± 0% | ~ (p=1.000 n=10) ¹ |
| Processor/SetAttributesBatch-24 | 1.000 ± 0% | 1.000 ± 0% | ~ (p=1.000 n=10) ¹ |
| geomean | 1.000 | 1.000 | +0.00% |

¹ all samples are equal

Benchmarks from issue (external)
goos: linux
goarch: amd64
pkg: github.com/pellared/spanevents-vs-logs
cpu: AMD Ryzen 9 7900X 12-Core Processor

| Benchmark | spanevents.txt sec/op | logs.base.txt sec/op | vs base | logs.new.txt sec/op | vs base |
|---|---|---|---|---|---|
| /OTLP-24 | 41.40µ ± 3% | 2935.15µ ± 9% | +6989.05% (p=0.000 n=10) | 84.19µ ± 4% | +103.34% (p=0.000 n=10) |
| /STDOUT-24 | 39.55µ ± 4% | 623.45µ ± 1% | +1476.21% (p=0.000 n=10) | 89.51µ ± 3% | +126.29% (p=0.000 n=10) |
| /NOOP-24 | 13.179µ ± 2% | 4.950µ ± 1% | -62.44% (p=0.000 n=10) | 4.932µ ± 1% | -62.58% (p=0.000 n=10) |
| geomean | 27.84µ | 208.4µ | +648.68% | 33.37µ | +19.86% |

| Benchmark | spanevents.txt B/op | logs.base.txt B/op | vs base | logs.new.txt B/op | vs base |
|---|---|---|---|---|---|
| /OTLP-24 | 98.90Ki ± 8% | 210.44Ki ± 0% | +112.79% (p=0.000 n=10) | 52.01Ki ± 0% | -47.41% (p=0.000 n=10) |
| /STDOUT-24 | 91.01Ki ± 0% | 286.62Ki ± 0% | +214.93% (p=0.000 n=10) | 81.67Ki ± 1% | -10.27% (p=0.000 n=10) |
| /NOOP-24 | 29840.0 ± 0% | 240.0 ± 0% | -99.20% (p=0.000 n=10) | 240.0 ± 0% | -99.20% (p=0.000 n=10) |
| geomean | 64.01Ki | 24.18Ki | -62.23% | 9.985Ki | -84.40% |

| Benchmark | spanevents.txt allocs/op | logs.base.txt allocs/op | vs base | logs.new.txt allocs/op | vs base |
|---|---|---|---|---|---|
| /OTLP-24 | 699.0 ± 20% | 2174.5 ± 1% | +211.09% (p=0.000 n=10) | 249.5 ± 1% | -64.31% (p=0.000 n=10) |
| /STDOUT-24 | 646.5 ± 1% | 4912.5 ± 0% | +659.86% (p=0.000 n=10) | 872.5 ± 2% | +34.96% (p=0.000 n=10) |
| /NOOP-24 | 303.000 ± 0% | 3.000 ± 0% | -99.01% (p=0.000 n=10) | 3.000 ± 0% | -99.01% (p=0.000 n=10) |
| geomean | 515.4 | 317.6 | -38.37% | 86.76 | -83.17% |

goos: linux
goarch: amd64
pkg: github.com/pellared/spanevents-vs-logs
cpu: AMD Ryzen 9 7900X 12-Core Processor

| Benchmark | logs.base.txt sec/op | logs.new.txt sec/op | vs base |
|---|---|---|---|
| /OTLP-24 | 2935.15µ ± 9% | 84.19µ ± 4% | -97.13% (p=0.000 n=10) |
| /STDOUT-24 | 623.45µ ± 1% | 89.51µ ± 3% | -85.64% (p=0.000 n=10) |
| /NOOP-24 | 4.950µ ± 1% | 4.932µ ± 1% | ~ (p=0.342 n=10) |
| geomean | 208.4µ | 33.37µ | -83.99% |

| Benchmark | logs.base.txt B/op | logs.new.txt B/op | vs base |
|---|---|---|---|
| /OTLP-24 | 210.44Ki ± 0% | 52.01Ki ± 0% | -75.28% (p=0.000 n=10) |
| /STDOUT-24 | 286.62Ki ± 0% | 81.67Ki ± 1% | -71.51% (p=0.000 n=10) |
| /NOOP-24 | 240.0 ± 0% | 240.0 ± 0% | ~ (p=1.000 n=10) ¹ |
| geomean | 24.18Ki | 9.985Ki | -58.70% |

¹ all samples are equal

| Benchmark | logs.base.txt allocs/op | logs.new.txt allocs/op | vs base |
|---|---|---|---|
| /OTLP-24 | 2174.5 ± 1% | 249.5 ± 1% | -88.53% (p=0.000 n=10) |
| /STDOUT-24 | 4912.5 ± 0% | 872.5 ± 2% | -82.24% (p=0.000 n=10) |
| /NOOP-24 | 3.000 ± 0% | 3.000 ± 0% | ~ (p=1.000 n=10) ¹ |
| geomean | 317.6 | 86.76 | -72.69% |

¹ all samples are equal