-
Notifications
You must be signed in to change notification settings - Fork 2
[LFXv2-610] Add OpenTelemetry tracing support #40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Implement OTel infrastructure with support for traces, metrics, and logs - Support both gRPC and HTTP OTLP protocols with configurable exporters - Add Jaeger propagator alongside W3C tracecontext and baggage - Integrate slog-otel to include trace_id and span_id in structured logs - Add Helm chart configuration for OTel settings (endpoint, protocol, sample ratio, exporters) - Include comprehensive unit tests for OTel configuration and setup Configuration is environment-driven via OTEL_* variables with sensible defaults. Exporters are disabled by default (set to "none") and can be enabled per-signal (traces, metrics, logs). Issue: LFXV2-610 Co-Authored-By: Claude Opus 4.5 <[email protected]> Signed-off-by: Trevor Bramwell <[email protected]>
WalkthroughThis PR adds comprehensive OpenTelemetry (OTEL) instrumentation support to the LFX Indexer Service. Changes include OTEL SDK initialization, configurable trace/metric/log exporters, HTTP handler instrumentation, Helm chart updates with OTEL environment variables, new logging utilities for trace context propagation, and updated dependencies. Changes
Sequence DiagramssequenceDiagram
participant App as Application
participant OTel as OTEL SDK
participant Exporter as OTLP Exporter
participant Agent as Agent/Backend
App->>OTel: SetupOTelSDKWithConfig(ctx, config)
activate OTel
OTel->>OTel: newResource(serviceName, version)
OTel->>OTel: newPropagator(propagators)
OTel->>OTel: newTraceProvider(endpoint, protocol)
OTel->>OTel: newMetricsProvider(endpoint, protocol)
OTel->>OTel: newLoggerProvider(endpoint, protocol)
OTel->>Exporter: Configure OTLP exporter
deactivate OTel
OTel-->>App: shutdown function
App->>App: HTTP Server Start
App->>OTel: Wrap Handler with otelhttp
activate OTel
OTel-->>App: Instrumented Handler
deactivate OTel
App->>App: Request arrives
activate App
App->>OTel: Extract/Inject span context
OTel->>OTel: Create span with trace_id
App->>App: Add trace context to logs
OTel->>Exporter: Batch send traces/metrics
Exporter->>Agent: Send to Agent/Backend
App-->>App: Response
deactivate App
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request adds comprehensive OpenTelemetry (OTel) tracing support to the lfx-v2-indexer-service, enabling distributed tracing and improved observability across the service. The implementation includes environment-based configuration, HTTP instrumentation, logging enhancements with trace context propagation, and comprehensive test coverage for the OTel utilities.
Changes:
- Introduced OpenTelemetry SDK initialization with configurable exporters for traces, metrics, and logs via environment variables
- Added HTTP request instrumentation using otelhttp middleware for automatic trace generation
- Enhanced structured logging to automatically include trace_id and span_id from context using slog-otel handler
- Added comprehensive test suite for OpenTelemetry configuration and utilities
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/utils/otel.go | Core OpenTelemetry configuration and SDK setup with support for OTLP exporters over gRPC and HTTP protocols |
| pkg/utils/otel_test.go | Comprehensive test suite for OTel configuration parsing, resource creation, propagator setup, and SDK initialization |
| pkg/logging/context.go | Utility functions to extract and inject trace/span IDs from OpenTelemetry context into log messages |
| pkg/logging/logger.go | Updated logger initialization to wrap handlers with slog-otel for automatic trace context propagation |
| cmd/lfx-indexer/main.go | Added OTel SDK initialization at startup and graceful shutdown on application exit |
| cmd/lfx-indexer/server.go | Wrapped HTTP handler with OpenTelemetry instrumentation for automatic request tracing |
| charts/lfx-v2-indexer-service/values.yaml | Added OTel configuration options with documented defaults for deployment customization |
| charts/lfx-v2-indexer-service/templates/deployment.yaml | Environment variable injection for OTel configuration from Helm values |
| charts/lfx-v2-indexer-service/Chart.yaml | Incremented chart version to 0.4.14 to reflect new features |
| go.mod | Added OpenTelemetry dependencies and updated Go version declaration |
| go.sum | Dependency checksums for new OpenTelemetry packages |
Comments suppressed due to low confidence (1)
cmd/lfx-indexer/main.go:65
- A context variable
ctxis created twice in the main function: once on line 31 for OTel setup and again on line 65 with cancellation support. The first context on line 31 is shadowed and becomes unused after line 33. Consider reusing the same context variable or giving them distinct names to avoid confusion. For example, name the first oneotelCtxor remove it by directly passingcontext.Background()toSetupOTelSDKWithConfig.
ctx := context.Background()
otelConfig := utils.OTelConfigFromEnv()
otelShutdown, err := utils.SetupOTelSDKWithConfig(ctx, otelConfig)
if err != nil {
logger.Error("error setting up OpenTelemetry SDK", "error", err)
os.Exit(1)
}
defer func() {
if shutdownErr := otelShutdown(context.Background()); shutdownErr != nil {
logger.Error("error shutting down OpenTelemetry SDK", "error", shutdownErr)
}
}()
// Log configuration with sources for transparency
logger.Info("Configuration loaded",
"port", flags.Port,
"debug", flags.Debug,
"bind", flags.Bind,
"no_janitor", flags.NoJanitor,
"simple_health", flags.SimpleHealth)
logger.Info("LFX Indexer Service startup initiated")
// Initialize dependency injection container
logger.Info("Initializing dependency injection container...")
// CLI config is already the right type - no conversion needed
container, err := container.NewContainer(logger, flags)
if err != nil {
logger.Error("Failed to initialize container", "error", err.Error())
os.Exit(1)
}
// Create context with cancellation
ctx, cancel := context.WithCancel(context.Background())
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| func LogAttrsFromContext(ctx context.Context) []slog.Attr { | ||
| spanCtx := trace.SpanContextFromContext(ctx) | ||
| if !spanCtx.IsValid() { | ||
| return nil | ||
| } | ||
| return []slog.Attr{ | ||
| slog.String("trace_id", spanCtx.TraceID().String()), | ||
| slog.String("span_id", spanCtx.SpanID().String()), | ||
| } | ||
| } | ||
|
|
||
| // LogWithContext returns a logger with trace context attributes added. | ||
| // If the context contains a valid span context, trace_id and span_id will be added to the logger. | ||
| // Usage: logging.LogWithContext(ctx, slog.Default()).Info("message", "key", "value") | ||
| func LogWithContext(ctx context.Context, logger *slog.Logger) *slog.Logger { | ||
| spanCtx := trace.SpanContextFromContext(ctx) | ||
| if !spanCtx.IsValid() { | ||
| return logger | ||
| } | ||
| return logger.With( | ||
| slog.String("trace_id", spanCtx.TraceID().String()), | ||
| slog.String("span_id", spanCtx.SpanID().String()), | ||
| ) | ||
| } |
Copilot
AI
Jan 14, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new functions LogAttrsFromContext and LogWithContext in context.go lack test coverage. These functions extract and inject trace/span IDs from context and should be tested to ensure they handle both valid and invalid span contexts correctly. Add tests covering cases where the context contains a valid span context and where it doesn't.
| {{- if .Values.app.otel.serviceName }} | ||
| - name: OTEL_SERVICE_NAME | ||
| value: {{ .Values.app.otel.serviceName | quote }} | ||
| {{- end }} | ||
| {{- if .Values.app.otel.serviceVersion }} | ||
| - name: OTEL_SERVICE_VERSION | ||
| value: {{ .Values.app.otel.serviceVersion | quote }} | ||
| {{- end }} | ||
| {{- if .Values.app.otel.endpoint }} | ||
| - name: OTEL_EXPORTER_OTLP_ENDPOINT | ||
| value: {{ .Values.app.otel.endpoint | quote }} | ||
| {{- end }} | ||
| {{- if .Values.app.otel.protocol }} | ||
| - name: OTEL_EXPORTER_OTLP_PROTOCOL | ||
| value: {{ .Values.app.otel.protocol | quote }} | ||
| {{- end }} | ||
| {{- if .Values.app.otel.insecure }} | ||
| - name: OTEL_EXPORTER_OTLP_INSECURE | ||
| value: {{ .Values.app.otel.insecure | quote }} | ||
| {{- end }} | ||
| {{- if .Values.app.otel.tracesExporter }} | ||
| - name: OTEL_TRACES_EXPORTER | ||
| value: {{ .Values.app.otel.tracesExporter | quote }} | ||
| {{- end }} | ||
| {{- if .Values.app.otel.tracesSampleRatio }} | ||
| - name: OTEL_TRACES_SAMPLE_RATIO | ||
| value: {{ .Values.app.otel.tracesSampleRatio | quote }} | ||
| {{- end }} | ||
| {{- if .Values.app.otel.metricsExporter }} | ||
| - name: OTEL_METRICS_EXPORTER | ||
| value: {{ .Values.app.otel.metricsExporter | quote }} | ||
| {{- end }} | ||
| {{- if .Values.app.otel.logsExporter }} | ||
| - name: OTEL_LOGS_EXPORTER | ||
| value: {{ .Values.app.otel.logsExporter | quote }} | ||
| {{- end }} | ||
| {{- if .Values.app.otel.propagators }} | ||
| - name: OTEL_PROPAGATORS | ||
| value: {{ .Values.app.otel.propagators | quote }} | ||
| {{- end }} |
Copilot
AI
Jan 14, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The deployment template unconditionally sets OTEL environment variables when corresponding values are defined in values.yaml, even if they are empty strings. The if conditionals check for the presence of the field but don't verify it's not an empty string. For example, if serviceName: "" is set, the template will still inject OTEL_SERVICE_NAME="" as an environment variable, which may not be the intended behavior. Consider using if and .Values.app.otel.serviceName (ne .Values.app.otel.serviceName "") or similar to ensure only non-empty values are injected.
| defer func() { | ||
| if shutdownErr := otelShutdown(context.Background()); shutdownErr != nil { | ||
| logger.Error("error shutting down OpenTelemetry SDK", "error", shutdownErr) | ||
| } | ||
| }() |
Copilot
AI
Jan 14, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The OpenTelemetry SDK is set up and shutdown using separate contexts. During setup in main.go, ctx := context.Background() is used (line 31), but the shutdown is called with a fresh context.Background() (line 39). While this may work, it's more conventional to use a timeout context for shutdown operations to ensure they complete within a reasonable time frame. Consider using context.WithTimeout(context.Background(), 5*time.Second) or similar for the shutdown context to prevent hanging during shutdown.
| protocol: "grpc" | ||
| # endpoint is the OTLP collector endpoint | ||
| # For gRPC: typically "host:4317", for HTTP: typically "host:4318" | ||
| endpoint: "" | ||
| # insecure disables TLS for the OTLP connection | ||
| # Set to "true" for in-cluster communication without TLS | ||
| insecure: "false" | ||
| # tracesExporter specifies the traces exporter: "otlp" or "none" | ||
| # (default: "none") | ||
| tracesExporter: "none" | ||
| # tracesSampleRatio specifies the sampling ratio for traces (0.0 to 1.0) | ||
| # A value of 1.0 means all traces are sampled, 0.5 means 50% are sampled | ||
| # (default: "1.0") | ||
| tracesSampleRatio: "1.0" | ||
| # metricsExporter specifies the metrics exporter: "otlp" or "none" | ||
| # (default: "none") | ||
| metricsExporter: "none" | ||
| # logsExporter specifies the logs exporter: "otlp" or "none" | ||
| # (default: "none") | ||
| logsExporter: "none" | ||
| # propagators specifies the propagators to use, comma-separated | ||
| # Supported values: "tracecontext", "baggage", "jaeger" | ||
| # (default: "tracecontext,baggage") | ||
| propagators: "tracecontext,baggage,jaeger" |
Copilot
AI
Jan 14, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default values in values.yaml use explicit values for fields that have documented defaults in the code. For example, protocol: "grpc", insecure: "false", tracesExporter: "none", etc. are set to their default values. This creates redundancy and potential for drift between the documented defaults in otel.go and the Helm chart defaults. Consider setting these to empty strings (like serviceName and endpoint) to allow the Go code defaults to take effect, or add clear comments indicating these values override code defaults.
| # serviceVersion is the service version for OpenTelemetry resource identification | ||
| # (default: "1.0.0") | ||
| serviceVersion: "" |
Copilot
AI
Jan 14, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The values.yaml documentation claims the default for serviceVersion is "1.0.0", but the actual code in otel.go (line 90) doesn't provide any default value for serviceVersion - it uses whatever is in the environment variable or an empty string. This creates a discrepancy between the documented behavior and actual behavior. Either update the code to provide "1.0.0" as the default, or update the documentation to clarify that the default is an empty string.
| # propagators specifies the propagators to use, comma-separated | ||
| # Supported values: "tracecontext", "baggage", "jaeger" | ||
| # (default: "tracecontext,baggage") | ||
| propagators: "tracecontext,baggage,jaeger" |
Copilot
AI
Jan 14, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default propagators value in values.yaml is "tracecontext,baggage,jaeger" but the code default in otel.go line 118 is "tracecontext,baggage" (without jaeger). This inconsistency means that deployments using default values.yaml will have different behavior than running the application with default environment variables. Consider aligning these defaults or documenting why they differ.
| // TestOTelConfigFromEnv_CustomValues verifies that OTelConfigFromEnv correctly | ||
| // reads and parses all supported OTEL_* environment variables. | ||
| func TestOTelConfigFromEnv_CustomValues(t *testing.T) { | ||
| t.Setenv("OTEL_SERVICE_NAME", "test-service") | ||
| t.Setenv("OTEL_SERVICE_VERSION", "1.2.3") | ||
| t.Setenv("OTEL_EXPORTER_OTLP_PROTOCOL", "http") | ||
| t.Setenv("OTEL_EXPORTER_OTLP_ENDPOINT", "localhost:4318") | ||
| t.Setenv("OTEL_EXPORTER_OTLP_INSECURE", "true") | ||
| t.Setenv("OTEL_TRACES_EXPORTER", "otlp") | ||
| t.Setenv("OTEL_TRACES_SAMPLE_RATIO", "0.5") | ||
| t.Setenv("OTEL_METRICS_EXPORTER", "otlp") | ||
| t.Setenv("OTEL_LOGS_EXPORTER", "otlp") | ||
|
|
||
| cfg := OTelConfigFromEnv() | ||
|
|
||
| if cfg.ServiceName != "test-service" { | ||
| t.Errorf("expected ServiceName 'test-service', got %q", cfg.ServiceName) | ||
| } | ||
| if cfg.ServiceVersion != "1.2.3" { | ||
| t.Errorf("expected ServiceVersion '1.2.3', got %q", cfg.ServiceVersion) | ||
| } | ||
| if cfg.Protocol != OTelProtocolHTTP { | ||
| t.Errorf("expected Protocol %q, got %q", OTelProtocolHTTP, cfg.Protocol) | ||
| } | ||
| if cfg.Endpoint != "localhost:4318" { | ||
| t.Errorf("expected Endpoint 'localhost:4318', got %q", cfg.Endpoint) | ||
| } | ||
| if cfg.Insecure != true { | ||
| t.Errorf("expected Insecure true, got %t", cfg.Insecure) | ||
| } | ||
| if cfg.TracesExporter != OTelExporterOTLP { | ||
| t.Errorf("expected TracesExporter %q, got %q", OTelExporterOTLP, cfg.TracesExporter) | ||
| } | ||
| if cfg.TracesSampleRatio != 0.5 { | ||
| t.Errorf("expected TracesSampleRatio 0.5, got %f", cfg.TracesSampleRatio) | ||
| } | ||
| if cfg.MetricsExporter != OTelExporterOTLP { | ||
| t.Errorf("expected MetricsExporter %q, got %q", OTelExporterOTLP, cfg.MetricsExporter) | ||
| } | ||
| if cfg.LogsExporter != OTelExporterOTLP { | ||
| t.Errorf("expected LogsExporter %q, got %q", OTelExporterOTLP, cfg.LogsExporter) | ||
| } | ||
| } |
Copilot
AI
Jan 14, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests for OTelConfigFromEnv don't cover edge cases for OTEL_TRACES_SAMPLE_RATIO validation. The code in otel.go (lines 122-134) handles invalid values (out of range or parse errors) by logging a warning and using the default value of 1.0. Add test cases that verify this error handling works correctly, such as testing with values like "invalid", "-0.5", "1.5", etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@charts/lfx-v2-indexer-service/templates/deployment.yaml`:
- Around line 103-106: The template condition skips a valid value of 0 because
`{{- if .Values.app.otel.tracesSampleRatio }}` treats 0/ "0" as false; change
the conditional to detect presence instead (e.g., `{{- if ne
.Values.app.otel.tracesSampleRatio nil }}` or `{{- if hasKey .Values.app.otel
"tracesSampleRatio" }}`) or unconditionally emit the OTEL_TRACES_SAMPLE_RATIO
with a sensible default using `default` so that a configured 0 is rendered;
update the block that creates the OTEL_TRACES_SAMPLE_RATIO environment variable
accordingly.
🧹 Nitpick comments (4)
pkg/logging/context.go (1)
16-39: Consider documenting the relationship with OtelHandler.These utilities work correctly. However, since
pkg/logging/logger.goalready wraps the handler withslogotel.OtelHandler(which auto-injects trace_id/span_id), usingLogWithContext()with the default logger could result in duplicate trace attributes.Consider adding a doc comment clarifying when to use these functions (e.g., for loggers not wrapped with OtelHandler, or for explicit attribute access).
pkg/utils/otel.go (1)
85-90: Consider using constants frompkg/constants/app.go.The default service name
"lfx-v2-indexer-service"is hardcoded here, butpkg/constants/app.goalready definesServiceName = "lfx-v2-indexer-service"andServiceVersion = "2.0.0". Using these constants would ensure consistency and reduce duplication.♻️ Proposed fix
+import ( + ... + "github.com/linuxfoundation/lfx-v2-indexer-service/pkg/constants" +) func OTelConfigFromEnv() OTelConfig { serviceName := os.Getenv("OTEL_SERVICE_NAME") if serviceName == "" { - serviceName = "lfx-v2-indexer-service" + serviceName = constants.ServiceName } serviceVersion := os.Getenv("OTEL_SERVICE_VERSION") + if serviceVersion == "" { + serviceVersion = constants.ServiceVersion + }cmd/lfx-indexer/main.go (1)
30-42: Consider adding a timeout for OTEL shutdown.The OTEL SDK setup and deferred shutdown are correctly placed. However,
otelShutdown(context.Background())on line 39 has no timeout, which could delay application exit if the OTEL collector is unresponsive.♻️ Suggested improvement
defer func() { - if shutdownErr := otelShutdown(context.Background()); shutdownErr != nil { + shutdownCtx, shutdownCancel := context.WithTimeout(context.Background(), 5*time.Second) + defer shutdownCancel() + if shutdownErr := otelShutdown(shutdownCtx); shutdownErr != nil { logger.Error("error shutting down OpenTelemetry SDK", "error", shutdownErr) } }()pkg/utils/otel_test.go (1)
28-30: Optional: Simplify boolean assertions.The boolean comparisons can be simplified for readability.
♻️ Suggested simplification
- if cfg.Insecure != false { + if cfg.Insecure { t.Errorf("expected Insecure false, got %t", cfg.Insecure) }And for the custom values test:
- if cfg.Insecure != true { + if !cfg.Insecure { t.Errorf("expected Insecure true, got %t", cfg.Insecure) }Also applies to: 72-74
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Jira integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (10)
charts/lfx-v2-indexer-service/Chart.yamlcharts/lfx-v2-indexer-service/templates/deployment.yamlcharts/lfx-v2-indexer-service/values.yamlcmd/lfx-indexer/main.gocmd/lfx-indexer/server.gogo.modpkg/logging/context.gopkg/logging/logger.gopkg/utils/otel.gopkg/utils/otel_test.go
🧰 Additional context used
📓 Path-based instructions (3)
**/*.go
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.go: Run golangci-lint before committing code
Run fmt, vet, lint, and test together before committing
Files:
pkg/logging/logger.gocmd/lfx-indexer/server.gopkg/utils/otel_test.gocmd/lfx-indexer/main.gopkg/logging/context.gopkg/utils/otel.go
**/*_test.go
📄 CodeRabbit inference engine (CLAUDE.md)
**/*_test.go: Run all tests with race detection and coverage
Mock external dependencies using interfaces in internal/mocks/
Include race detection in all test runs
Files:
pkg/utils/otel_test.go
cmd/lfx-indexer/main.go
📄 CodeRabbit inference engine (CLAUDE.md)
Dependency injection only in cmd/lfx-indexer/main.go
Files:
cmd/lfx-indexer/main.go
🧬 Code graph analysis (2)
cmd/lfx-indexer/main.go (1)
pkg/utils/otel.go (2)
OTelConfigFromEnv(84-161)SetupOTelSDKWithConfig(171-239)
pkg/utils/otel.go (1)
pkg/constants/app.go (2)
ServiceName(11-11)ServiceVersion(12-12)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Agent
- GitHub Check: MegaLinter
🔇 Additional comments (12)
charts/lfx-v2-indexer-service/Chart.yaml (1)
9-9: LGTM!Version bump to 0.4.14 appropriately reflects the new OpenTelemetry configuration additions.
pkg/utils/otel.go (1)
171-239: LGTM!The SDK setup follows the recommended OpenTelemetry Go patterns correctly:
- Proper error aggregation with
errors.Join- Conditional provider initialization based on exporter configuration
- Cleanup via deferred shutdown functions
- Propagator setup before providers
cmd/lfx-indexer/main.go (1)
31-33: LGTM!OTEL SDK initialization is correctly placed early in the startup sequence, before container initialization, ensuring tracing is available for all subsequent operations. The dependency injection pattern follows the coding guidelines for
cmd/lfx-indexer/main.go.pkg/logging/logger.go (1)
58-61: Correct slog-otel handler implementation. The OtelHandler wraps the base handler properly and matches the v1.3.4 usage pattern. Automatic trace_id/span_id injection will work when using context-aware logging methods (slog.InfoContext(),slog.ErrorContext(), etc.).Note:
LogWithContext()in context.go provides manual trace extraction as an alternative. Both approaches serve different purposes, but using them together on the same log call would create duplicate trace fields—verify that callers use one approach consistently.cmd/lfx-indexer/server.go (1)
24-27: LGTM! Clean OpenTelemetry HTTP instrumentation.The handler wrapping pattern is correct and follows the recommended approach for instrumenting HTTP servers with OpenTelemetry. The instrumentation will automatically create spans for incoming requests with proper context propagation.
pkg/utils/otel_test.go (1)
45-87: Good test coverage for environment variable parsing.The test correctly uses
t.Setenvwhich handles cleanup automatically. Consider adding a test case for invalidOTEL_TRACES_SAMPLE_RATIOvalues (e.g., "invalid", "-1", "2.0") to verify the parsing fallback behavior.go.mod (2)
17-18: Note: OTel logs packages are pre-1.0 (experimental).The log exporter packages (
otlploggrpc,otlploghttp,otel/log,otel/sdk/log) are at v0.15.0, indicating they are still in experimental status. Their APIs may change in future releases. This is expected as OTel logs support is relatively new.Also applies to: 23-23, 25-25
6-6: Go 1.24.0 is a stable release (February 2025) and is widely available. No action required.charts/lfx-v2-indexer-service/templates/deployment.yaml (2)
76-78: Good pattern for extra environment variables.The
with/toYamlpattern provides flexibility for injecting additional environment variables without modifying the template.
79-118: OTEL environment variable injection looks good.The conditional blocks properly guard each environment variable, and values are correctly quoted. The template integrates well with the values.yaml configuration.
charts/lfx-v2-indexer-service/values.yaml (2)
69-71: Good addition ofextraEnvfor flexibility.This provides an escape hatch for injecting additional environment variables without template changes.
72-105: Well-documented OTel configuration block.The configuration options are comprehensive and well-documented. Good defaults with exporters disabled by default (safe for production). Note that
propagatorsincludesjaegeralongside W3C propagators, which is useful for mixed environments.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| {{- if .Values.app.otel.tracesSampleRatio }} | ||
| - name: OTEL_TRACES_SAMPLE_RATIO | ||
| value: {{ .Values.app.otel.tracesSampleRatio | quote }} | ||
| {{- end }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential issue: tracesSampleRatio of "0" won't be rendered.
The {{- if .Values.app.otel.tracesSampleRatio }} condition evaluates to false when the value is "0" or 0, which is a valid sample ratio meaning "sample no traces". If disabling sampling via 0 is a valid use case, consider using a different conditional approach.
🔧 Suggested fix using `ne` comparison
- {{- if .Values.app.otel.tracesSampleRatio }}
+ {{- if ne .Values.app.otel.tracesSampleRatio "" }}
- name: OTEL_TRACES_SAMPLE_RATIO
value: {{ .Values.app.otel.tracesSampleRatio | quote }}
{{- end }}Alternatively, always set the value since there's a sensible default:
- {{- if .Values.app.otel.tracesSampleRatio }}
- name: OTEL_TRACES_SAMPLE_RATIO
value: {{ .Values.app.otel.tracesSampleRatio | default "1.0" | quote }}
- {{- end }}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| {{- if .Values.app.otel.tracesSampleRatio }} | |
| - name: OTEL_TRACES_SAMPLE_RATIO | |
| value: {{ .Values.app.otel.tracesSampleRatio | quote }} | |
| {{- end }} | |
| {{- if ne .Values.app.otel.tracesSampleRatio "" }} | |
| - name: OTEL_TRACES_SAMPLE_RATIO | |
| value: {{ .Values.app.otel.tracesSampleRatio | quote }} | |
| {{- end }} |
🤖 Prompt for AI Agents
In `@charts/lfx-v2-indexer-service/templates/deployment.yaml` around lines 103 -
106, The template condition skips a valid value of 0 because `{{- if
.Values.app.otel.tracesSampleRatio }}` treats 0/ "0" as false; change the
conditional to detect presence instead (e.g., `{{- if ne
.Values.app.otel.tracesSampleRatio nil }}` or `{{- if hasKey .Values.app.otel
"tracesSampleRatio" }}`) or unconditionally emit the OTEL_TRACES_SAMPLE_RATIO
with a sensible default using `default` so that a configured 0 is rendered;
update the block that creates the OTEL_TRACES_SAMPLE_RATIO environment variable
accordingly.
| if cfg.Protocol == OTelProtocolHTTP { | ||
| opts := []otlptracehttp.Option{} | ||
| if cfg.Endpoint != "" { | ||
| opts = append(opts, otlptracehttp.WithEndpoint(cfg.Endpoint)) | ||
| } | ||
| if cfg.Insecure { | ||
| opts = append(opts, otlptracehttp.WithInsecure()) | ||
| } | ||
| exporter, err = otlptracehttp.New(ctx, opts...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🌐 Web query:
otlptracehttp.WithEndpoint vs WithEndpointURL OpenTelemetry Go SDK
💡 Result:
Short answer: WithEndpoint(endpoint string) sets only the host[:port] (e.g. "collector.example.com:4318") and is intended when you want to supply just the host/port (no scheme or path). WithEndpointURL(u string) accepts a full endpoint URL (scheme, host, optional port and path — e.g. "https://collector.example.com:4318/v1/traces"). Both override the OTEL env vars when provided, the last-called option wins if both are used, and neither has effect when WithGRPCConn is used. WithEndpointURL was added in v1.23.0; if you pass an invalid URL WithEndpointURL will keep the default. [1]
Sources:
- otlptracehttp package docs (WithEndpoint, WithEndpointURL). [1]
🌐 Web query:
OpenTelemetry OTEL_EXPORTER_OTLP_ENDPOINT environment variable format specification
💡 Result:
Summary — OTEL_EXPORTER_OTLP_ENDPOINT format
- Value is a URL (scheme, host, optional port, optional path). Scheme MUST be http or https; https means secure. [1]
- For OTLP/HTTP the variable is treated as a base URL: SDKs append the signal paths v1/traces, v1/metrics, v1/logs when per-signal endpoints are not set. If the base URL has no path, the root path / is used. [1]
- For OTLP/gRPC SDKs SHOULD accept forms allowed by the gRPC client; URLs with http or https are supported and scheme semantics (secure/insecure) take precedence. [1]
- Default values: gRPC default http://localhost:4317, HTTP default http://localhost:4318 (and HTTP signal defaults append /v1/…). Examples: export OTEL_EXPORTER_OTLP_ENDPOINT="https://my-api-endpoint:443" or "http://my-api-endpoint/". [2][1]
Sources:
[1] OpenTelemetry OTLP Exporter specification (Endpoint rules).
[2] OpenTelemetry SDK configuration — OTLP exporter (examples, defaults).
🏁 Script executed:
cd pkg/utils && sed -n '1,350p' otel.go | head -100Repository: linuxfoundation/lfx-v2-indexer-service
Length of output: 3632
🏁 Script executed:
cd pkg/utils && sed -n '275,310p' otel.goRepository: linuxfoundation/lfx-v2-indexer-service
Length of output: 1170
🏁 Script executed:
cd pkg/utils && grep -n "otlptrace\|otlpmetric\|otlplog" go.mod 2>/dev/null || cat ../../go.mod | grep -E "(otlptrace|otlpmetric|otlplog|go\.opentelemetry)"Repository: linuxfoundation/lfx-v2-indexer-service
Length of output: 1103
Use WithEndpointURL() instead of WithEndpoint() for HTTP protocol to handle full URLs correctly.
The OTEL_EXPORTER_OTLP_ENDPOINT environment variable per OpenTelemetry specification is a full URL (e.g., http://localhost:4318 or https://collector.example.com:4318/path), but WithEndpoint() expects only host[:port] format (e.g., localhost:4318). This causes a mismatch when users provide the endpoint according to the OTel spec standard.
Switch to WithEndpointURL() (available since v1.23.0; your project uses v1.39.0) which correctly handles full URLs. Apply the same fix to newMetricsProvider and newLoggerProvider functions which have the same issue.
This pull request introduces OpenTelemetry (OTel) tracing support to the lfx-v2-indexer-service, enabling distributed tracing and improved observability. The changes include configuration options for OTel in the Helm chart, environment variable injection for OTel settings, HTTP handler instrumentation, and logging enhancements to include trace and span IDs. Several new dependencies are added to support OTel features.
Key changes:
OpenTelemetry Tracing Integration:
values.yamland enabled their injection as environment variables in the deployment manifest (deployment.yaml). [1] [2]main.go). [1] [2]server.go). [1] [2] [3]Logging Enhancements:
slog-otelhandler for automatic trace context propagation (context.go,logger.go). [1] [2] [3]Dependency Updates:
go.modto support tracing, metrics, and logging exporters.Helm Chart Version Bump:
0.4.14to reflect these significant changes.