fix(base-cluster/monitoring)!: grafana-tempo-distributed would need s3#1875
fix(base-cluster/monitoring)!: grafana-tempo-distributed would need s3#1875
Conversation
Summary of ChangesHello @cwrau, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical issue where the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughThis PR replaces the Grafana Tempo distributed chart with the standalone tempo chart, restructures tempo-related Helm templates/values (removes ingester block, promotes its fields), and updates the OTLP exporter endpoint in the Alloy gateway from Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20–30 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (5)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
🧰 Additional context used🧠 Learnings (4)📓 Common learnings📚 Learning: 2025-07-24T09:41:28.072ZApplied to files:
📚 Learning: 2025-07-24T09:55:53.655ZApplied to files:
📚 Learning: 2025-07-24T09:56:41.380ZApplied to files:
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
🔇 Additional comments (8)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request addresses an important issue with trace cleanup by switching from the grafana-tempo-distributed Helm chart to the simpler tempo single-binary chart. This is a good move as it removes the requirement for S3-compatible object storage, which was causing the compactor to fail. The configuration changes across the files are mostly correct and reflect the new chart's structure. However, I've found a critical issue in the resource configuration for Tempo that will prevent it from getting the correct resource limits and requests, as detailed in the specific comment.
There was a problem hiding this comment.
Pull request overview
This PR migrates Grafana Tempo from the distributed chart (tempo-distributed) to the single binary chart (tempo) to address an issue where the compactor component doesn't work without S3 storage, preventing trace cleanup.
Key Changes
- Chart migration from
tempo-distributed(v1.57.0) totempo(v1.24.1) - Reconfiguration of deployment from distributed microservices to single binary architecture
- Updated service endpoint from
grafana-tempo-distributor:4317tografana-tempo:4317for trace ingestion
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| charts/base-cluster/values.yaml | Updated chart reference from tempo-distributed to tempo with new version |
| charts/base-cluster/templates/monitoring/tracing/grafana-tempo.yaml | Restructured configuration for single binary deployment including OTLP receivers, persistence, and metrics generator |
| charts/base-cluster/templates/monitoring/alloy-gateway.yaml | Updated trace export endpoint to match new single binary service name |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
charts/base-cluster/templates/monitoring/alloy-gateway.yaml(1 hunks)charts/base-cluster/templates/monitoring/tracing/grafana-tempo.yaml(1 hunks)charts/base-cluster/values.yaml(1 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-07-24T09:27:02.830Z
Learnt from: cwrau
Repo: teutonet/teutonet-helm-charts PR: 1605
File: charts/base-cluster/templates/monitoring/tracing/grafana-tempo.yaml:41-46
Timestamp: 2025-07-24T09:27:02.830Z
Learning: The grafana/tempo-distributed Helm chart uses port 3200 (not 3100) for the query-frontend HTTP service by default. The URL "http://grafana-tempo-query-frontend.monitoring:3200" is correct when configuring Grafana datasources for this chart.
Applied to files:
charts/base-cluster/templates/monitoring/alloy-gateway.yaml
📚 Learning: 2025-07-24T09:41:28.072Z
Learnt from: cwrau
Repo: teutonet/teutonet-helm-charts PR: 1604
File: charts/base-cluster/templates/monitoring/metrics-server/metrics-server.yaml:20-21
Timestamp: 2025-07-24T09:41:28.072Z
Learning: The official kubernetes-sigs/metrics-server Helm chart pins the image tag to a specific version in its values.yaml by default, ensuring reproducible deployments without requiring explicit tag specification in consuming charts.
Applied to files:
charts/base-cluster/templates/monitoring/tracing/grafana-tempo.yaml
📚 Learning: 2025-07-24T09:55:53.655Z
Learnt from: cwrau
Repo: teutonet/teutonet-helm-charts PR: 1601
File: charts/base-cluster/templates/dns/external-dns.yaml:30-32
Timestamp: 2025-07-24T09:55:53.655Z
Learning: In charts/base-cluster/templates/dns/external-dns.yaml, the dns.provider field in values.yaml has always been expected to be a map format (e.g., `{ cloudflare: {} }`), never a string format. The template correctly uses `{{ .Values.dns.provider | keys | first }}` to extract the provider name from the map keys.
Applied to files:
charts/base-cluster/values.yaml
📚 Learning: 2025-07-24T09:56:41.380Z
Learnt from: cwrau
Repo: teutonet/teutonet-helm-charts PR: 1601
File: charts/base-cluster/templates/dns/external-dns.yaml:33-39
Timestamp: 2025-07-24T09:56:41.380Z
Learning: In the teutonet-helm-charts base-cluster chart, secret names like "external-dns" for Cloudflare provider are intentionally hard-coded. Users who need custom secret names should use Helm's `valuesFrom` feature to override values rather than expecting dedicated fields in values.yaml. This design keeps the values.yaml clean while still allowing full customization flexibility.
Applied to files:
charts/base-cluster/values.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: check licenses
- GitHub Check: Agent
- GitHub Check: lint helm chart (base-cluster)
🔇 Additional comments (5)
charts/base-cluster/templates/monitoring/alloy-gateway.yaml (1)
99-107: Endpoint update aligns with chart migration.The OTLP exporter endpoint correctly targets the new tempo deployment service. Port 4317 is the appropriate OTLP gRPC receiver port.
charts/base-cluster/values.yaml (1)
114-114: Chart version update is appropriate.The migration to tempo 1.24.1 is consistent with the PR objective to switch from tempo-distributed to the single binary tempo chart.
charts/base-cluster/templates/monitoring/tracing/grafana-tempo.yaml (3)
30-35: OTLP receiver endpoint configuration is correct.The OTLP gRPC endpoint
0.0.0.0:4317matches the grafana/tempo chart's default receiver configuration and is properly exposed as containerPort 4317 in the deployment.
52-52: remoteWriteUrl is correctly configured for kube-prometheus-stack 79.11.0.The endpoint
http://kube-prometheus-stack-prometheus:9090/api/v1/writeon line 52 is accurate. The service name follows the standard Helm release naming pattern ({release}-{component}), port 9090 is the correct Prometheus default, and/api/v1/writeis the correct remote write API path. This configuration is consistent with other components in the codebase (e.g., alloy-collector.yaml uses the same endpoint).
75-75: The Grafana datasource endpoint is correct. The tempo Helm chart (v1.24.1) deployed here uses distributed architecture with a query-frontend component, and port 3200 is the standard HTTP port for the query-frontend service. The service namegrafana-tempo-query-frontendand namespacemonitoringfollow the chart's default naming conventions.Likely an incorrect or invalid review comment.
186e11a to
7e3185a
Compare
Therefore we switch to the singleBinary chart. Without this change the compactor doesn't work; meaning the traces are never cleaned up...
7e3185a to
4a3566a
Compare
🤖 I have created a release *beep* *boop* --- ## [11.0.0](base-cluster-v10.1.2...base-cluster-v11.0.0) (2025-12-10) ### ⚠ BREAKING CHANGES * **base-cluster/monitoring:** grafana-tempo-distributed would need s3 ([#1875](#1875)) ### Features * **base-cluster/flux:** use centralised helmRepositories template ([#1845](#1845)) ([60658ed](60658ed)) ### Bug Fixes * **base-cluster/monitoring:** grafana-tempo-distributed would need s3 ([#1875](#1875)) ([df5c372](df5c372)) * **base-cluster/monitoring:** increase limit of grafana sidecar ([#1878](#1878)) ([fb1ebc6](fb1ebc6)) * **base-cluster:** correctly set image repository ([#1848](#1848)) ([0e108dc](0e108dc)) ### Miscellaneous Chores * **base-cluster/dependencies:** update common docker tag to v1.7.0 ([#1872](#1872)) ([b50ea98](b50ea98)) * **base-cluster/dependencies:** update helm release reflector to v9.1.44 ([#1861](#1861)) ([2cfe4e5](2cfe4e5)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Breaking Changes** * grafana-tempo-distributed now requires S3 configuration * **New Features** * Centralized helmRepositories template management * **Bug Fixes** * Increased Grafana sidecar limit * Corrected image repository reference * **Chores** * Updated dependencies and component versions <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Therefore we switch to the singleBinary chart.
Without this change the compactor doesn't work; meaning the traces are
never cleaned up...
Summary by CodeRabbit
Chores
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.