test(sdk): add processor test suite coverage for shutdown and timeout scenarios #3382

Open
bryantbiggs wants to merge 4 commits into open-telemetry:main from bryantbiggs:test/processor-test-suite-gaps

@bryantbiggs bryantbiggs commented Feb 21, 2026

Summary

Addresses test coverage gaps identified in #3381:

  • Shutdown regression tests: Added shutdown() tests with TokioSpawn*Exporter mocks for all three processors (BatchSpanProcessor, BatchLogProcessor, PeriodicReader) on multi_thread(worker_threads = 1) runtime — verifying the BlockingStrategy fix for force_flush() deadlocks (OTLP MetricExporter deadlock issue #2802) also works for shutdown(), which goes through the same export code path.

  • Timeout behavior tests: Added HangingExporter mocks for BatchSpanProcessor and BatchLogProcessor that block forever in export(), verifying that shutdown_with_timeout() returns Err(Timeout) as expected. The PeriodicReader timeout test is omitted, with a comment documenting why: its hardcoded 5-second timeout makes it too slow for the regular test suite.

  • SimpleLogProcessor documentation: Improved comments on the two #[ignore]d async exporter deadlock tests, explaining that they demonstrate inherent design limitations of SimpleLogProcessor (not bugs), and linking to #3381 (Processor test suite gaps: no coverage for tokio-dependent exporters or runtime deadlock scenarios) and #2802 (OTLP MetricExporter deadlock issue). These deadlocks cannot be fixed without changing SimpleLogProcessor's synchronous block_on design, so users should use BatchLogProcessor for production async exporters.

  • current_thread limitation documentation: Documented across all three processor test files that current_thread runtime with tokio-dependent exporters is a fundamental limitation, while multi_thread(1) (the 1-vCPU k8s pod scenario) is supported via BlockingStrategy.
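The timeout scenario in the bullets above can be illustrated with a std-only sketch. This is not the SDK's code: `WorkerProcessor`, `Exporter`, `NoopExporter` and `ShutdownError` are hypothetical stand-ins, the real exporter traits are async, and the real tests run on a tokio runtime. It only shows the shape of the check, where a caller waits on a dedicated worker thread with a deadline and gets a timeout error when the exporter never returns.

```rust
use std::sync::mpsc::{self, RecvTimeoutError, Sender};
use std::thread;
use std::time::Duration;

/// Hypothetical error type standing in for the SDK's timeout error.
#[derive(Debug, PartialEq)]
pub enum ShutdownError {
    Timeout,
    WorkerGone,
}

/// Hypothetical synchronous exporter interface (the real SDK traits are async).
pub trait Exporter: Send + 'static {
    fn export(&mut self);
}

/// Exporter whose export() never returns.
pub struct HangingExporter;

impl Exporter for HangingExporter {
    fn export(&mut self) {
        loop {
            // park() may wake spuriously, so keep parking forever.
            thread::park();
        }
    }
}

/// Exporter that returns immediately.
pub struct NoopExporter;

impl Exporter for NoopExporter {
    fn export(&mut self) {}
}

/// Processor that hands all export work to a dedicated worker thread.
pub struct WorkerProcessor {
    control: Sender<Sender<()>>,
}

impl WorkerProcessor {
    pub fn new<E: Exporter>(mut exporter: E) -> Self {
        let (control, requests) = mpsc::channel::<Sender<()>>();
        thread::spawn(move || {
            // Each request carries a reply channel: export, then acknowledge.
            for reply in requests {
                exporter.export();
                let _ = reply.send(());
            }
        });
        WorkerProcessor { control }
    }

    /// Asks the worker to export and waits at most `timeout` for the reply.
    pub fn shutdown_with_timeout(&self, timeout: Duration) -> Result<(), ShutdownError> {
        let (reply_tx, reply_rx) = mpsc::channel();
        self.control
            .send(reply_tx)
            .map_err(|_| ShutdownError::WorkerGone)?;
        match reply_rx.recv_timeout(timeout) {
            Ok(()) => Ok(()),
            Err(RecvTimeoutError::Timeout) => Err(ShutdownError::Timeout),
            Err(RecvTimeoutError::Disconnected) => Err(ShutdownError::WorkerGone),
        }
    }
}
```

With `HangingExporter` the worker never sends the acknowledgement, so the caller's deadline is the only thing that ends the wait. With `NoopExporter` the same call returns `Ok(())`.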

Design notes

Why shutdown needs separate tests from force_flush: Both shutdown() and force_flush() funnel through the same export function (get_spans_and_export / get_logs_and_export / collect_and_export) which uses BlockingStrategy to properly enter the tokio runtime context. The shutdown path then additionally calls exporter.shutdown(). Without separate tests, a regression in the shutdown-specific code path could go undetected.
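A minimal sketch of that structure, using hypothetical names (`SketchProcessor`, `RecordingExporter`, `drain_and_export`) rather than the SDK's types, shows why a force_flush() test alone cannot cover the shutdown-only step:

```rust
/// Hypothetical exporter mock that records the calls it receives.
#[derive(Default)]
pub struct RecordingExporter {
    pub calls: Vec<&'static str>,
}

impl RecordingExporter {
    fn export(&mut self, _batch: Vec<u32>) {
        self.calls.push("export");
    }

    fn shutdown(&mut self) {
        self.calls.push("shutdown");
    }
}

/// Hypothetical processor with a queue of pending items.
pub struct SketchProcessor {
    pub queue: Vec<u32>,
    pub exporter: RecordingExporter,
}

impl SketchProcessor {
    /// Shared path, standing in for get_spans_and_export and its siblings.
    fn drain_and_export(&mut self) {
        let batch = std::mem::take(&mut self.queue);
        self.exporter.export(batch);
    }

    pub fn force_flush(&mut self) {
        self.drain_and_export();
    }

    pub fn shutdown(&mut self) {
        self.drain_and_export();
        // Shutdown-only step: a force_flush() test never reaches this line.
        self.exporter.shutdown();
    }
}
```

Calling `force_flush()` records only `"export"`, while `shutdown()` records `"export"` followed by `"shutdown"`, so the second call is the only one that exercises the extra step.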

Why pending() instead of Notify: The hanging exporter mocks use futures_util::future::pending() rather than tokio::sync::Notify. pending() is simpler — it creates a future that never resolves, which is all we need to simulate a permanently hanging exporter. Since we're testing the caller's timeout behavior (not the exporter's ability to be cancelled), there's no need for the test to control when the hang resolves.
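The never-resolving behavior can be checked by hand with the standard library alone. The sketch below uses `std::future::pending()` as a stand-in for `futures_util::future::pending()`, on the assumption that the two behave the same for this purpose. `completes_within` and `hanging_export` are hypothetical helpers, not part of the PR.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for polling a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `fut` up to `attempts` times and reports whether it ever completed.
pub fn completes_within<F: Future>(fut: F, attempts: usize) -> bool {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    for _ in 0..attempts {
        if let Poll::Ready(_) = fut.as_mut().poll(&mut cx) {
            return true;
        }
    }
    false
}

/// Hypothetical hanging export body: awaits a future that is never ready.
pub async fn hanging_export() -> Result<(), ()> {
    std::future::pending::<()>().await;
    Ok(())
}
```

However often `hanging_export()` is polled it stays pending, which is all a caller-side timeout test needs from the mock.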

PeriodicReader's timeout gap: PeriodicReader::shutdown() has a hardcoded 5-second timeout that ignores the _timeout parameter in shutdown_with_timeout() (marked with a TODO in the source). Until that's made configurable, a hanging exporter timeout test would take 5 seconds per run — too slow for CI. A comment documents this gap with an issue link.

Dependencies

This PR is based on #3380 (BlockingStrategy fix) and should be merged after it.

Test plan

  • All new shutdown regression tests pass on multi_thread(worker_threads = 1)
  • Timeout tests verify Err(Timeout) is returned within expected duration (~500ms)
  • All 59 existing processor tests continue to pass (no regressions)
  • SimpleLogProcessor's 2 ignored tests remain ignored (intentional deadlock demos)

The default thread-based processors (BatchSpanProcessor, BatchLogProcessor,
PeriodicReader) call futures_executor::block_on() on their dedicated worker
threads. When the exporter uses tonic/gRPC, the export future depends on
tokio tasks (e.g. tonic's Buffer worker) that can only be polled by tokio
worker threads. If all tokio worker threads are blocked (single-threaded
runtime, or multi-thread with 1 worker), this creates a circular wait.
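The circular wait can be reproduced in miniature without tokio. The sketch below is only an analogy under stated assumptions: a hand-rolled pool of std threads stands in for the tokio workers, and `start_pool` and `nested_job_completes` are hypothetical names. A job that blocks while waiting for a second job on the same pool can never be satisfied when the pool has a single worker.

```rust
use std::sync::mpsc::{self, Sender};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

type Job = Box<dyn FnOnce() + Send + 'static>;

/// Starts `workers` threads that pull jobs from one shared queue.
fn start_pool(workers: usize) -> Sender<Job> {
    let (tx, rx) = mpsc::channel::<Job>();
    let rx = Arc::new(Mutex::new(rx));
    for _ in 0..workers {
        let rx = Arc::clone(&rx);
        thread::spawn(move || loop {
            // The queue lock is released at the end of this statement.
            let next = rx.lock().unwrap().recv();
            match next {
                Ok(job) => job(),
                Err(_) => break, // all senders dropped
            }
        });
    }
    tx
}

/// Runs a job that blocks until a second job, queued on the same pool, replies.
/// Returns false if the reply did not arrive within `wait`.
pub fn nested_job_completes(workers: usize, wait: Duration) -> bool {
    let pool = start_pool(workers);
    let inner_pool = pool.clone();
    let (done_tx, done_rx) = mpsc::channel();
    let outer: Job = Box::new(move || {
        let (reply_tx, reply_rx) = mpsc::channel();
        let inner: Job = Box::new(move || {
            let _ = reply_tx.send(());
        });
        inner_pool.send(inner).unwrap();
        // Blocking here occupies this worker, so with one worker the inner
        // job cannot start until the wait below gives up.
        let finished = reply_rx.recv_timeout(wait).is_ok();
        let _ = done_tx.send(finished);
    });
    pool.send(outer).unwrap();
    done_rx.recv().unwrap()
}
```

With one worker the wait expires and the function returns `false`; with two workers the second thread runs the inner job and the function returns `true`. The timeout exists only so the sketch terminates instead of hanging.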

Add BlockingStrategy that captures the tokio runtime handle at construction
time and enters the runtime context via Handle::enter() before calling
futures_executor::block_on(). This makes tokio types available on the
dedicated background threads without taking ownership of the reactor.
Falls back to plain futures_executor::block_on() without tokio.

Fixes: open-telemetry#2802

Add tests with TokioSpawn*Exporter mocks that call tokio::spawn()
inside export(), simulating tonic/gRPC exporters. These prove that
BlockingStrategy correctly provides tokio runtime context on the
processor's dedicated OS thread, preventing deadlocks on constrained
multi_thread(1) runtimes (open-telemetry#2802, open-telemetry#3356).
…st suite

Add coverage for processor test suite gaps identified in open-telemetry#3381:

- Add shutdown() regression tests with TokioSpawn*Exporter mocks for
  BatchSpanProcessor, BatchLogProcessor, and PeriodicReader on
  multi_thread(1) runtime, verifying the same BlockingStrategy fix
  that resolved force_flush() deadlocks also works for shutdown().

- Add timeout behavior tests with HangingExporter mocks for
  BatchSpanProcessor and BatchLogProcessor, verifying that
  shutdown_with_timeout returns Err(Timeout) when exporters hang.

- Improve documentation on SimpleLogProcessor's ignored deadlock tests,
  explaining they demonstrate inherent design limitations (not bugs)
  and linking to relevant issues (open-telemetry#2802, open-telemetry#3381).

- Document current_thread runtime limitation across all processors
  and explain why PeriodicReader timeout test is omitted (hardcoded
  5s timeout makes it too slow for regular test suite).

Closes open-telemetry#3381
@bryantbiggs bryantbiggs requested a review from a team as a code owner February 21, 2026 01:39

codecov bot commented Feb 21, 2026

Codecov Report

❌ Patch coverage is 91.66667% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.3%. Comparing base (09b85b5) to head (1293251).

Files with missing lines                             Patch %   Lines
opentelemetry-sdk/src/logs/batch_log_processor.rs    90.6%     6 Missing ⚠️
opentelemetry-sdk/src/metrics/periodic_reader.rs     87.7%     6 Missing ⚠️
opentelemetry-sdk/src/trace/span_processor.rs        93.5%     5 Missing ⚠️
Additional details and impacted files
@@          Coverage Diff           @@
##            main   #3382    +/-   ##
======================================
  Coverage   82.2%   82.3%            
======================================
  Files        128     128            
  Lines      24626   24825   +199     
======================================
+ Hits       20267   20453   +186     
- Misses      4359    4372    +13     
