fix: OTLP logging improvements#3343

Open
cijothomas wants to merge 10 commits into open-telemetry:main from cijothomas:cijothomas/fixotlplogging

Conversation

@cijothomas
Member

Fixes #3021
Supersedes #3108

@cijothomas cijothomas requested a review from a team as a code owner February 4, 2026 19:29
@codecov

codecov bot commented Feb 4, 2026

Codecov Report

❌ Patch coverage is 0% with 58 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.0%. Comparing base (8cfa981) to head (ee681f4).
⚠️ Report is 1 commit behind head on main.

Files with missing lines Patch % Lines
opentelemetry-otlp/src/exporter/tonic/logs.rs 0.0% 19 Missing ⚠️
opentelemetry-otlp/src/exporter/tonic/metrics.rs 0.0% 15 Missing ⚠️
opentelemetry-otlp/src/exporter/tonic/trace.rs 0.0% 13 Missing ⚠️
opentelemetry-otlp/src/exporter/http/mod.rs 0.0% 11 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##            main   #3343     +/-   ##
=======================================
- Coverage   82.2%   82.0%   -0.2%     
=======================================
  Files        128     128             
  Lines      24506   24553     +47     
=======================================
+ Hits       20147   20155      +8     
- Misses      4359    4398     +39     


@cijothomas cijothomas changed the title fix: OTLP logging improvmenets fix: OTLP logging improvements Feb 4, 2026
    let response = client.send_bytes(request).await.map_err(|e| {
        HttpExportError::new(0, format!("Network error: {e:?}")) // Network error
    otel_warn!(
        name: "HttpClient.ExportFailed",
Contributor


We are using the same event name HttpClient.ExportFailed for two different kinds of errors:

  • Network error from client.send_bytes
  • HTTP status error when response.status() is not successful.

Maybe we could use different event names for better diagnosability: HttpClient.NetworkError and HttpClient.StatusError.
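The event-name split suggested above could be sketched roughly like this (a minimal illustration, not the PR's actual code; `classify_event` and the success event name are hypothetical):

```rust
// Distinguish the two failure modes the review describes:
// no response at all vs. a response with a non-success status.
fn classify_event(status: Option<u16>) -> &'static str {
    match status {
        // send failed before any response arrived
        None => "HttpClient.NetworkError",
        // got a response, but the status code indicates failure
        Some(s) if !(200..300).contains(&s) => "HttpClient.StatusError",
        Some(_) => "HttpClient.ExportSucceeded",
    }
}

fn main() {
    assert_eq!(classify_event(None), "HttpClient.NetworkError");
    assert_eq!(classify_event(Some(503)), "HttpClient.StatusError");
    println!("ok");
}
```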

Member Author


I did consider that and ended up with the same one. I am not 100% happy with this either.

I tried to distinguish between failures where we don't get a response at all and ones where we get a response whose status code indicates failure, but then realized that the request client doesn't expose a way to obtain the StatusCode from its Error. Without firmer control over the client, this is a decent tradeoff.
(The otel-http crate could offer additional logging, since it knows the internals of the client it uses, but the OTLP exporter does not know that much.)

Let me know if we should try to make this better, or whether this is an acceptable tradeoff.

Contributor

Copilot AI left a comment


Pull request overview

This PR addresses a security vulnerability where authentication tokens and other sensitive information could be logged at ERROR/WARN levels during OTLP export failures. The fix implements a two-tier logging approach: non-sensitive error indicators (status codes, URLs) are logged at WARN level for operational visibility, while potentially sensitive details (error messages, response bodies) are logged only at DEBUG level.

Changes:

  • Modified gRPC exporters (traces, metrics, logs) to split error logging into WARN-level status codes and DEBUG-level detailed messages
  • Updated HTTP exporter to separate status codes/URLs (WARN) from error details/response bodies (DEBUG)
  • Simplified error messages returned to callers to prevent sensitive data propagation upstream
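The two-tier approach described in the overview can be sketched as a pure function (a stand-in for the crate's `otel_warn!`/`otel_debug!` macros; the function, `Level` enum, and event names here are illustrative assumptions, not the PR's actual code):

```rust
// Illustrative two-tier logging: non-sensitive fields at WARN,
// potentially sensitive details only at DEBUG.

#[derive(Debug, PartialEq)]
enum Level { Warn, Debug }

/// Split an HTTP export failure into (level, message) pairs.
/// The status code and URL are safe to surface; the response body
/// may echo auth tokens, so it is demoted to DEBUG.
fn log_export_failure(status: u16, url: &str, body: &str) -> Vec<(Level, String)> {
    vec![
        (Level::Warn, format!("HttpClient.ExportFailed status={status} url={url}")),
        (Level::Debug, format!("HttpClient.ExportFailed.Details body={body}")),
    ]
}

fn main() {
    let events = log_export_failure(401, "http://collector:4318/v1/logs", "Bearer secret-token rejected");
    // Only the DEBUG entry carries the sensitive response body.
    assert!(events[0].1.contains("401"));
    assert!(!events[0].1.contains("secret-token"));
    assert_eq!(events[1].0, Level::Debug);
    println!("ok");
}
```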

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
File Description
opentelemetry-otlp/src/exporter/tonic/trace.rs Adds two-tier logging for trace export failures: WARN for gRPC codes, DEBUG for sensitive message/details
opentelemetry-otlp/src/exporter/tonic/metrics.rs Adds two-tier logging for metrics export failures: WARN for gRPC codes, DEBUG for sensitive message/details
opentelemetry-otlp/src/exporter/tonic/logs.rs Adds two-tier logging for logs export failures: WARN for gRPC codes, DEBUG for sensitive message/details
opentelemetry-otlp/src/exporter/http/mod.rs Adds two-tier logging for HTTP export failures: WARN for status/URL, DEBUG for error details/response body
opentelemetry-otlp/CHANGELOG.md Documents the security improvement and references issue #3021


Member

@krisztianfekete krisztianfekete left a comment


I like this direction, but as discussed in #3331, I think we can improve the UX: one currently has to enable DEBUG-level logging to see that the problem is a user-side misconfiguration that is easily fixable. I am proposing surfacing the ConnectError from the above example at an elevated log level. In this case, further sanitization isn't even required, as that message doesn't contain the actual connection string.
cc. @cijothomas

cijothomas and others added 4 commits February 10, 2026 09:53
Co-authored-by: Utkarsh Umesan Pillai <[email protected]>
Co-authored-by: Utkarsh Umesan Pillai <[email protected]>
Co-authored-by: Utkarsh Umesan Pillai <[email protected]>
    // For auth errors (Unauthenticated, PermissionDenied), the message may contain
    // sensitive information, so we only log the code at WARN level.
    let code = tonic_status.code();
    let is_connection_error = matches!(
Member


nit: these ~35-40 lines of code seem to be repeated for all the signals; they could be extracted into a shared helper method.
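The deduplication this comment suggests could look roughly like the following. This is a hedged sketch, not the crate's actual code: `GrpcCode` is a stand-in for `tonic::Code`, and `export_failure_summary` is a hypothetical helper name.

```rust
// Hypothetical shared helper so the trace/metrics/logs exporters
// don't each repeat the WARN/DEBUG classification logic.

#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
enum GrpcCode { Unavailable, DeadlineExceeded, Unauthenticated, PermissionDenied, Internal }

/// True when the failure is a connection-level problem whose message
/// is safe to surface at WARN (no credentials in it).
fn is_connection_error(code: GrpcCode) -> bool {
    matches!(code, GrpcCode::Unavailable | GrpcCode::DeadlineExceeded)
}

/// One shared entry point for all signals: surfaces the code at WARN,
/// and for auth-related codes keeps the message at DEBUG only.
fn export_failure_summary(signal: &str, code: GrpcCode, message: &str) -> String {
    if is_connection_error(code) {
        format!("WARN {signal} export failed: {code:?}: {message}")
    } else {
        // Auth errors may echo the offending token back; hide the message.
        format!("WARN {signal} export failed: {code:?} (details at DEBUG)")
    }
}

fn main() {
    let s = export_failure_summary("logs", GrpcCode::Unauthenticated, "token abc123 invalid");
    assert!(!s.contains("abc123"));
    let c = export_failure_summary("traces", GrpcCode::Unavailable, "connection refused");
    assert!(c.contains("connection refused"));
    println!("ok");
}
```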

        );
    }
    Err(OTelSdkError::InternalFailure(
        "Metrics export failed".into(),
Member


grpc_code should also be added here?

    );
    // Convert interceptor errors to tonic::Status for retry classification
    tonic::Status::internal(format!("interceptor error: {e:?}"))
    tonic::Status::internal("Traces export failed in interceptor")
Member


grpc_code here too.

@cijothomas
Member Author

I like this direction, but as discussed in #3331, I think we can improve the UX: one currently has to enable DEBUG-level logging to see that the problem is a user-side misconfiguration that is easily fixable. I am proposing surfacing the ConnectError from the above example at an elevated log level. In this case, further sanitization isn't even required, as that message doesn't contain the actual connection string. cc. @cijothomas

can you recheck now? I made some changes to help.

@krisztianfekete
Member

krisztianfekete commented Feb 12, 2026

I like this direction, but as discussed in #3331, I think we can improve the UX: one currently has to enable DEBUG-level logging to see that the problem is a user-side misconfiguration that is easily fixable. I am proposing surfacing the ConnectError from the above example at an elevated log level. In this case, further sanitization isn't even required, as that message doesn't contain the actual connection string. cc. @cijothomas

can you recheck now? I made some changes to help.

@cijothomas It looks like the source error is still not logged at WARN/ERROR level, although we have some additional gRPC context for some codes. Do you think we can surface the source error message at a higher level without losing it?



Development

Successfully merging this pull request may close these issues.

[Bug]: OTEL logs raw Authentication token to std-out on auth error

5 participants