Feature Request: OpenSearch Sink index_type: tsdb for TSDB Index Support
Is your feature request related to a problem?
With the Prometheus Remote Write source (#6533, PR #6627), Data Prepper will be able to ingest Prometheus metrics as Metric events. But there's no way to write these to OpenSearch TSDB indices. The existing index_type options like metric-analytics produce OTel-format documents. TSDB expects a completely different structure: {labels, timestamp, value} with space-separated label strings.
Anyone wanting to use the OpenSearch TSDB plugin for metrics storage and M3QL queries has no path through Data Prepper today.
Proposed solution
A new index_type: tsdb in the OpenSearch sink. The sink converts Metric events to TSDB document format directly, with no processor in between.
This keeps the pipeline simple and also allows multi-sink, such as sending to both Prometheus and TSDB in one pipeline.
TSDB Document Format
TSDB expects three fields per document:

    {
      "labels": "__name__ http_requests_total method POST handler /api/items status 200",
      "timestamp": 1633072800000,
      "value": 1.1
    }

labels: space-separated key-value pairs, sorted by key. __name__ is the metric name.
timestamp: epoch milliseconds.
value: single numeric value. One value per document.
How metric types map to TSDB documents
The sink takes standard Data Prepper Metric events (JacksonGauge, JacksonSum, JacksonHistogram, JacksonSummary) and converts them. TSDB stores one value per document, so complex types need expansion.
Gauge --> 1 document:
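A minimal sketch of the gauge case, assuming the labels rules stated in this issue (key-value pairs sorted by key, __name__ carrying the metric name, spaces in label values replaced with underscores). The class and method names (TsdbGaugeSketch, buildLabels, gaugeToDocument) are hypothetical and are not the proposed TSDBDocumentBuilder API. Attributes are simplified to string values here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper for illustration; not the proposed TSDBDocumentBuilder. */
final class TsdbGaugeSketch {

    /** Builds the space-separated labels string: keys sorted, __name__ holds the metric name. */
    static String buildLabels(String metricName, Map<String, String> attributes) {
        Map<String, String> sorted = new TreeMap<>(attributes);
        sorted.put("__name__", metricName);
        StringBuilder labels = new StringBuilder();
        for (Map.Entry<String, String> entry : sorted.entrySet()) {
            if (labels.length() > 0) {
                labels.append(' ');
            }
            // Space is the delimiter, so spaces inside values become underscores.
            labels.append(entry.getKey()).append(' ')
                  .append(entry.getValue().replace(' ', '_'));
        }
        return labels.toString();
    }

    /** A gauge data point maps to exactly one {labels, timestamp, value} document. */
    static Map<String, Object> gaugeToDocument(String metricName, Map<String, String> attributes,
                                               long timestampMillis, double value) {
        Map<String, Object> document = new LinkedHashMap<>();
        document.put("labels", buildLabels(metricName, attributes));
        document.put("timestamp", timestampMillis);
        document.put("value", value);
        return document;
    }
}
```

Because the sketch sorts keys with a TreeMap, __name__ comes first and the remaining labels follow in plain string order, so handler precedes method.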
Sum (Counter) --> 1 document:
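A sum converts to a single document in the same way as a gauge, with one extra step for the metric name. The sketch below illustrates the naming rule for monotonic sums described in this section. The class and method names are hypothetical, and the isMonotonic flag is assumed to come from the Sum event.

```java
/** Hypothetical helper for illustration; not part of the proposed implementation. */
final class TsdbSumNameSketch {

    /** Appends _total for monotonic sums unless the name already ends with it. */
    static String counterName(String metricName, boolean isMonotonic) {
        if (isMonotonic && !metricName.endsWith("_total")) {
            return metricName + "_total";
        }
        return metricName;
    }
}
```

Non-monotonic sums keep their name unchanged in this sketch.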
For monotonic sums, the sink appends _total to the metric name if not already present (Prometheus convention for counters in TSDB).
Histogram --> N+2 documents:
The sink treats bucketCountsList as per-bucket counts and converts them to cumulative counts, which is what TSDB and M3QL expect (Prometheus convention). Each bucket becomes a separate document with a le label, plus _count and _sum documents.
Summary --> N+2 documents:
Each quantile becomes a separate document with a quantile label, plus _count and _sum documents.
Implementation approach
Following the same pattern as metric-analytics, log-analytics, trace-analytics-raw:
IndexType.java --> add TSDB("tsdb") enum value
IndexConstants.java --> template filename + default alias metrics-tsdb-v1
IndexConfiguration.java --> TSDB branch in readIndexTemplate()
IndexManagerFactory.java --> case TSDB: with NoIsmPolicyManagement since TSDB manages its own lifecycle (head chunks → blocks, no ISM rollover needed)
OpenSearchSink.java --> TSDB branch in doOutput() using a new TSDBDocumentBuilder
TSDBDocumentBuilder.java --> new class: Metric → TSDB document conversion with histogram/summary expansion
tsdb-index-template.json --> index template with TSDB mappings
Index template

    {
      "version": 1,
      "mappings": {
        "properties": {
          "series_ref": {"type": "long", "doc_values": false},
          "labels": {"type": "keyword", "ignore_above": 4096},
          "value": {"type": "double", "doc_values": false},
          "timestamp": {"type": "date", "format": "epoch_millis"},
          "timestamp_range": {"type": "long_range"}
        }
      },
      "settings": {
        "index.translog.durability": "async",
        "index.translog.sync_interval": "1s",
        "refresh_interval": "1s"
      }
    }

Notes on deviations from the TSDB README example:
value uses double instead of float --> Prometheus uses float64 internally, and OpenSearch float (32-bit) would silently lose precision on large counter values.
labels has ignore_above: 4096 --> guards against indexing failures if a metric has many labels. Typical Prometheus label strings are ~200 chars.
TSDB-engine settings (tsdb_engine.enabled, tsdb_store, labels.storage_type) are intentionally not in the template, since those are cluster-level settings the admin configures when installing the TSDB plugin. Keeping only standard OpenSearch mappings means the template works on any OpenSearch version.
Things worth noting
No processor needed --> histogram/summary expansion happens at the sink level in doOutput(), so the pipeline stays simple (source --> sink) and multi-sink works.
No routing --> TSDB's TSDBAutoRoutingActionFilter handles shard placement from labels automatically. The sink sends documents without routing.
No ISM --> TSDB manages lifecycle through head chunks and blocks. ISM rollover would conflict with that.
Label sanitization --> spaces in label values get replaced with underscores since space is TSDB's delimiter.
Histogram bucket assumption --> the sink treats bucketCountsList as per-bucket (delta) counts and converts to cumulative via running sum. This matches what the Prometheus Remote Write source (#6533) produces. If Metric events from other sources already have cumulative bucket counts, the conversion would need to be aware of that.
Compatibility
The TSDB plugin (built against OpenSearch 3.5.0-SNAPSHOT per its build.gradle) requires OpenSearch 3.5.0+. But our index template uses standard mappings only, so it works on any OpenSearch version. TSDB features become available when the plugin is installed.
All changes would be behind indexType == IndexType.TSDB. Existing index types are not affected.
The source (#6533) emits Data Prepper Metric events (JacksonGauge, JacksonSum, JacksonHistogram, JacksonSummary). The TSDB sink converts these to TSDB's {labels, timestamp, value} format, so no processor is needed between source and sink.
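For reference, the histogram expansion described under "How metric types map to TSDB documents" could look roughly like the sketch below: per-bucket counts are turned into cumulative counts with a running sum, each bucket gets a le label, and _count and _sum samples are appended. Several details are assumptions not stated in this issue: the _bucket name suffix and the +Inf label for the last bucket (both borrowed from Prometheus convention), the bucket-count list having one more entry than the bounds list, and every class and method name shown.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of histogram expansion; not the proposed TSDBDocumentBuilder. */
final class TsdbHistogramSketch {

    /** One output sample; le is null for the _count and _sum samples. */
    static final class Sample {
        final String name;
        final String le;
        final double value;

        Sample(String name, String le, double value) {
            this.name = name;
            this.le = le;
            this.value = value;
        }
    }

    /** Expands one histogram into N bucket samples plus _count and _sum (N+2 total). */
    static List<Sample> expand(String metricName, double[] explicitBounds,
                               long[] bucketCounts, long count, double sum) {
        List<Sample> samples = new ArrayList<>();
        long cumulative = 0;
        for (int i = 0; i < bucketCounts.length; i++) {
            // Running sum converts per-bucket (delta) counts to cumulative counts.
            cumulative += bucketCounts[i];
            // Assumption: a bucket without an upper bound is labelled +Inf.
            String le = i < explicitBounds.length ? String.valueOf(explicitBounds[i]) : "+Inf";
            samples.add(new Sample(metricName + "_bucket", le, cumulative));
        }
        samples.add(new Sample(metricName + "_count", null, count));
        samples.add(new Sample(metricName + "_sum", null, sum));
        return samples;
    }
}
```

If an upstream source already delivers cumulative bucket counts, the running sum in this sketch would double count, which is the caveat raised under "Histogram bucket assumption".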