Feature Request: Prometheus Remote Write v1 Source for Data Prepper
Is your feature request related to a problem?
Currently, there is no native way to ingest metrics from Prometheus servers directly into OpenSearch. Organizations that collect metrics with Prometheus must build and maintain custom exporters to convert the Prometheus Remote Write format, which is time-consuming and error-prone.
What solution would you like?
Implement a Prometheus Remote Write v1.0 source plugin for Data Prepper that:
- Receives Prometheus Remote Write requests on a configurable HTTP endpoint (default: http://data-prepper:9090/api/v1/write)
- Parses the protocol (Snappy decompression + Protocol Buffer parsing)
- Converts to multiple output formats (TSDB, OTEL, OpenSearch, Prometheus)
- Supports OTLP collectors via OpenTelemetry Collector's prometheusremotewrite exporter
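The parsing step in the list above can be sketched as follows. This is a minimal, illustrative pure-Python decoder for a WriteRequest body that has already been Snappy-decompressed; decompression is left out because it needs a third-party library, and the plugin itself would presumably rely on generated Protocol Buffer classes instead. The function names (`read_varint`, `iter_fields`, `parse_write_request`) are hypothetical. Field numbers follow the Remote Write v1 protobuf definitions (WriteRequest.timeseries = 1; TimeSeries.labels = 1, samples = 2; Label.name = 1, value = 2; Sample.value = 1, timestamp = 2).

```python
import struct

def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def iter_fields(buf):
    """Yield (field_number, wire_type, value) for each field of a message."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field, wire = key >> 3, key & 0x07
        if wire == 0:      # varint
            value, pos = read_varint(buf, pos)
        elif wire == 1:    # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire == 2:    # length-delimited
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire == 5:    # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire}")
        yield field, wire, value

def parse_label(buf):
    """Label: name = 1, value = 2 (both strings)."""
    name = value = ""
    for field, wire, raw in iter_fields(buf):
        if field == 1 and wire == 2:
            name = raw.decode("utf-8")
        elif field == 2 and wire == 2:
            value = raw.decode("utf-8")
    return name, value

def parse_sample(buf):
    """Sample: value = 1 (double), timestamp = 2 (int64, epoch milliseconds)."""
    value, timestamp = 0.0, 0
    for field, wire, raw in iter_fields(buf):
        if field == 1 and wire == 1:
            value = struct.unpack("<d", raw)[0]
        elif field == 2 and wire == 0:
            # int64 is carried as a two's-complement varint
            timestamp = raw - (1 << 64) if raw >= (1 << 63) else raw
    return value, timestamp

def parse_write_request(body):
    """Return a list of {"labels": {...}, "samples": [(value, ts_ms), ...]}."""
    series = []
    for field, wire, raw in iter_fields(body):
        if field != 1 or wire != 2:
            continue  # ignore metadata and unknown fields in this sketch
        labels, samples = {}, []
        for sub_field, sub_wire, sub in iter_fields(raw):
            if sub_field == 1 and sub_wire == 2:
                name, value = parse_label(sub)
                labels[name] = value
            elif sub_field == 2 and sub_wire == 2:
                samples.append(parse_sample(sub))
        series.append({"labels": labels, "samples": samples})
    return series
```

Exemplars, histograms, and metadata are skipped here; a real source would also need request size limits and error handling for truncated input.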
Configuration:
Prometheus side (2-line change):
# prometheus.yml
remote_write:
  - url: "http://data-prepper:9090/api/v1/write"
Data Prepper pipeline:
prometheus-pipeline:
  source:
    prometheus:
      port: 9090
      path: "/api/v1/write"
  # Source emits Metric events (Data Prepper internal model)
  sink:
    - opensearch:
        hosts: ["https://opensearch:9200"]
        index_type: tsdb  # Sink converts Metric → TSDB format
        index: metrics
Format Specifications
TSDB Format
OpenSearch TSDB expects documents whose labels are encoded as a single space-separated string:
Format:
{
  "labels": "__name__ http_requests_total method POST handler /api/items status 200",
  "timestamp": 1633072800000,
  "value": 1.1
}
Specification:
- https://github.com/opensearch-project/time-series-db
- Labels are space-separated key-value pairs: key1 value1 key2 value2
- See https://github.com/opensearch-project/time-series-db#index-some-metrics
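As a rough illustration of the label encoding described above, the following Python sketch flattens ordered label pairs into the space-separated string. The helper name `to_tsdb_document` is hypothetical, and the sketch assumes label names and values contain no spaces; how such values should be escaped is not covered here.

```python
def to_tsdb_document(labels, value, timestamp_ms):
    """Build a TSDB-style document from (name, value) label pairs and one sample.

    Assumption: neither label names nor label values contain spaces.
    """
    parts = []
    for name, label_value in labels:
        parts.extend([name, label_value])
    return {
        "labels": " ".join(parts),
        "timestamp": timestamp_ms,
        "value": value,
    }


doc = to_tsdb_document(
    [("__name__", "http_requests_total"), ("method", "POST"),
     ("handler", "/api/items"), ("status", "200")],
    1.1,
    1633072800000,
)
print(doc["labels"])
```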
Sink Configuration:
sink:
  - opensearch:
      hosts: ["https://opensearch:9200"]
      index_type: tsdb
      index: metrics
OTEL Format
OpenTelemetry Metrics format following the OTEL specification:
Format:
{
  "kind": "gauge",
  "name": "http_requests",
  "value": 100,
  "attributes": {"method": "GET"},
  "time": "2024-02-12T15:30:00.000Z"
}
Specification:
- https://opentelemetry.io/docs/specs/otel/metrics/data-model/
- Data Prepper's internal Metric interface aligns with OTEL model
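One possible mapping from a Remote Write sample to the document shape above is sketched below in Python. The helper name `to_otel_document` is hypothetical. Remote Write v1 samples carry no metric type, so emitting `"gauge"` for every series is an assumption taken from the example document, not something the protocol guarantees.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_otel_document(labels, value, timestamp_ms):
    """Map (name, value) label pairs and one sample to the OTEL-style document."""
    attributes = dict(labels)
    # __name__ becomes the metric name; the remaining labels become attributes.
    name = attributes.pop("__name__", "")
    # timedelta arithmetic keeps millisecond precision without float rounding.
    time = EPOCH + timedelta(milliseconds=timestamp_ms)
    return {
        "kind": "gauge",  # assumption: Remote Write v1 has no type information
        "name": name,
        "value": value,
        "attributes": attributes,
        "time": time.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
    }
```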
Sink Configuration:
sink:
  - opensearch:
      hosts: ["https://opensearch:9200"]
      index_type: otel_metrics  # If supported
      index: metrics
OpenSearch Format
Standard OpenSearch document format with conventional field names:
Format:
{
  "@timestamp": "2024-02-12T15:30:00Z",
  "metric_name": "http_requests",
  "metric_value": 100,
  "labels": {"method": "GET"}
}
Specification:
- Standard OpenSearch document structure
- Compatible with existing dashboards and queries
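A conversion to this document shape could look like the Python sketch below. The helper name `to_opensearch_document` is hypothetical, and formatting `@timestamp` at second precision simply mirrors the example document; a real sink may prefer to keep milliseconds.

```python
from datetime import datetime, timedelta, timezone

def to_opensearch_document(labels, value, timestamp_ms):
    """Map (name, value) label pairs and one sample to a plain OpenSearch document."""
    remaining = dict(labels)
    metric_name = remaining.pop("__name__", "")
    moment = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(milliseconds=timestamp_ms)
    return {
        # Second precision, as in the example; sub-second digits are dropped.
        "@timestamp": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "metric_name": metric_name,
        "metric_value": value,
        "labels": remaining,
    }
```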
Sink Configuration:
sink:
  - opensearch:
      hosts: ["https://opensearch:9200"]
      index: metrics  # Regular index
Prometheus Format
Preserve original Prometheus Remote Write structure for compatibility:
Format:
{
  "name": "http_requests",
  "labels": {
    "__name__": "http_requests",
    "method": "GET"
  },
  "value": 100,
  "timestamp": 1707523200000
}
Specification:
- https://prometheus.io/docs/specs/remote_write_spec/
- Used for debugging or forwarding to other Prometheus systems
Sink Configuration:
sink:
  - prometheus:
      url: "https://prometheus:9200"  # e.g. Amazon Managed Service for Prometheus
Examples
Basic Example
Input: Prometheus Remote Write
TimeSeries {
  labels: [{name: "__name__", value: "http_requests"}, {name: "method", value: "GET"}]
  samples: [{value: 100, timestamp: 1707523200000}]
}
Output (TSDB format):
{"labels": "__name__ http_requests method GET", "timestamp": 1707523200000, "value": 100}
Output (OTEL format):
{"kind": "gauge", "name": "http_requests", "value": 100, "attributes": {"method": "GET"}, "time": "2024-02-10T00:00:00.000Z"}
Output (OpenSearch format):
{"@timestamp": "2024-02-10T00:00:00Z", "metric_name": "http_requests", "metric_value": 100, "labels": {"method": "GET"}}
Output (Prometheus format):
{"name": "http_requests", "labels": {"__name__": "http_requests", "method": "GET"}, "value": 100, "timestamp": 1707523200000}
## Multi-Sink Example
Send to both Prometheus and OpenSearch TSDB simultaneously:
```yaml
prometheus-pipeline:
  source:
    prometheus:
      port: 9090
      path: "/api/v1/write"
  sink:
    - prometheus:
        url: "https://aps-workspaces.us-west-2.amazonaws.com/workspaces/ws-xxx"
    - opensearch:
        hosts: ["https://opensearch:9200"]
        index_type: tsdb
        index: metrics
```