@carsonip carsonip commented Mar 7, 2025

Description

Breaking change: change default mapping::mode config to otel for the best user experience and the most intuitive document structure in Elasticsearch. See README to learn more about otel mapping mode. To retain the old behavior, explicitly set mapping::mode to none.
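For readers who need the previous behavior, the opt-out described above might look like the following collector configuration fragment. This is a minimal sketch: the endpoint is a placeholder, and only the `mapping::mode` setting comes from this PR.

```yaml
exporters:
  elasticsearch:
    # Placeholder endpoint (assumption); replace with your own cluster URL.
    endpoints: ["https://localhost:9200"]
    mapping:
      # Explicitly keep the pre-change default instead of the new `otel` default.
      mode: none
```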

Should be released together with #38500

Link to tracking issue

Fixes #37241

Documentation

Updated README


@carsonip carsonip left a comment


README under otel mapping mode

data_stream.dataset will always be appended with .otel. It is recommended to use with
*_dynamic_index::enabled: true (e.g. logs_dynamic_index::enabled) to route documents to data stream
${data_stream.type}-${data_stream.dataset}-${data_stream.namespace}

TODO: This seems out of place. It is the responsibility of *_dynamic_index. fixed

@carsonip carsonip marked this pull request as ready for review March 11, 2025 14:39
@carsonip carsonip requested a review from a team as a code owner March 11, 2025 14:39
@carsonip carsonip requested a review from jpkrohling March 11, 2025 14:39
`data_stream.dataset` will always be appended with `.otel`. It is recommended to use with
`*_dynamic_index::enabled: true` (e.g. `logs_dynamic_index::enabled`) to route documents to data stream
`${data_stream.type}-${data_stream.dataset}-${data_stream.namespace}`.
`data_stream.dataset` will always be appended with `.otel` if [dynamic data stream routing mode](#elasticsearch-document-routing) is active.

Expecting a conflict with #38500. It can be resolved once either PR is merged.


@axw axw left a comment


❤️

Requires Elasticsearch 8.12 or above.
The default and recommended "OTel-native" mapping mode.

Requires Elasticsearch 8.12 or above[^1], works best with Elasticsearch 8.16 or above[^2].

Users who run the Elasticsearch exporter against Elasticsearch < 8.12 with the default mapping mode will stop seeing their data in Elasticsearch and will start seeing rejection logs in the collector. This is not ideal.

On the other hand, 8.12 is over a year old (January 2024) and the officially supported versions of Elasticsearch are currently 8.16 and 8.17.

I'm on board with this change as it makes new users' experience much better. Also the remediation is easy for affected users: add a configuration flag, as described in the changelog entry.

2. Otherwise, check your metrics pipeline setup for misconfiguration that causes an actual violation of the [single writer principle](https://opentelemetry.io/docs/specs/otel/metrics/data-model/#single-writer).
This means that the same metric with the same dimensions is sent from multiple sources, which is not allowed in the OTel metrics data model.

### flush failed (400) illegal_argument_exception

Awesome, thanks for adding this. It should make it easier for users affected by this change to fix their setup.

@andrzej-stencel andrzej-stencel added the ready to merge Code review completed; ready to merge by maintainers label Mar 12, 2025
@songy23 songy23 merged commit 1930d76 into open-telemetry:main Mar 12, 2025
171 checks passed
@github-actions github-actions bot added this to the next release milestone Mar 12, 2025
Requires Elasticsearch 8.12 or above[^1], works best with Elasticsearch 8.16 or above[^2].

[^1]: as it uses the undocumented `require_data_stream` bulk API parameter supported from Elasticsearch 8.12
[^2]: Elasticsearch 8.16 contains a built-in `otel-data` plugin

Hi, is there a link to the relevant introduction?


Here's a link to the initial PR: elastic/elasticsearch#111091

songy23 pushed a commit that referenced this pull request Mar 14, 2025
#### Description

Breaking change.

Overhaul in document routing. New document routing logic:
```
Documents are statically or dynamically routed to the target index / data stream in the following order. The first routing mode that applies will be used.
1. "Static mode": Route to `logs_index` for log records, `metrics_index` for data points and `traces_index` for spans, if these configs are not empty respectively. [^3]
2. "Dynamic - Index attribute mode": Route to index name specified in `elasticsearch.index` attribute (precedence: log record / data point / span attribute > scope attribute > resource attribute) if the attribute exists. [^3]
3. "Dynamic - Data stream routing mode": Route to data stream constructed from `${data_stream.type}-${data_stream.dataset}-${data_stream.namespace}`,
where `data_stream.type` is `logs` for log records, `metrics` for data points, and `traces` for spans, and is static. [^3]
In a special case with `mapping::mode: bodymap`, `data_stream.type` field (valid values: `logs`, `metrics`) can be dynamically set from attributes.
The resulting documents will contain the corresponding `data_stream.*` fields, see restrictions applied to [Data Stream Fields](https://www.elastic.co/guide/en/ecs/current/ecs-data_stream.html).
   1. `data_stream.dataset` or `data_stream.namespace` in attributes (precedence: log record / data point / span attribute > scope attribute > resource attribute)
   2. Otherwise, if scope name matches regex `/receiver/(\w*receiver)`, `data_stream.dataset` will be capture group #1
   3. Otherwise, `data_stream.dataset` falls back to `generic` and `data_stream.namespace` falls back to `default`. 
```
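As an illustration of rule 1 ("Static mode"), a configuration along these lines would send every log record, data point, and span to a fixed target regardless of attributes. The index names and the endpoint are hypothetical; only the `logs_index`, `metrics_index`, and `traces_index` option names come from the text above.

```yaml
exporters:
  elasticsearch:
    # Placeholder endpoint (assumption).
    endpoints: ["https://localhost:9200"]
    # Hypothetical target names. A non-empty value selects static mode
    # for that signal, so rules 2 and 3 are never consulted for it.
    logs_index: my-logs
    metrics_index: my-metrics
    traces_index: my-traces
```

Leaving these options unset lets the dynamic rules (2 and 3) apply instead.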

```
In OTel mapping mode (`mapping::mode: otel`), there is special handling in addition to the above document routing rules in [Elasticsearch document routing](#elasticsearch-document-routing).
The order to determine the routing mode is the same as [Elasticsearch document routing](#elasticsearch-document-routing).

1. "Static mode": Span events are separate documents routed to `logs_index` if non-empty.
2. "Dynamic - Index attribute mode": Span events are separate documents routed using attribute `elasticsearch.index` (precedence: span event attribute > scope attribute > resource attribute) if the attribute exists.
3. "Dynamic - Data stream routing mode":
  - For all documents, `data_stream.dataset` will always be appended with `.otel`.
  - A special case to (3)(1) in [Elasticsearch document routing](#elasticsearch-document-routing), span events are separate documents that have `data_stream.type: logs` and are routed using data stream attributes (precedence: span event attribute > scope attribute > resource attribute)

```
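A sketch of "Dynamic - Data stream routing mode" under the `otel` mapping mode follows. It assumes the transform processor is available in the collector build and uses hypothetical attribute values (`myapp`, `production`). The receiver definition is omitted, so this is a fragment rather than a complete configuration.

```yaml
processors:
  transform:
    log_statements:
      - context: log
        statements:
          # Set data stream attributes on each log record (hypothetical values).
          - set(attributes["data_stream.dataset"], "myapp")
          - set(attributes["data_stream.namespace"], "production")

exporters:
  elasticsearch:
    # Placeholder endpoint (assumption).
    endpoints: ["https://localhost:9200"]
    # logs_index is left unset so that static mode does not apply.
    mapping:
      mode: otel

service:
  pipelines:
    logs:
      receivers: [otlp]   # assumes an otlp receiver is defined elsewhere
      processors: [transform]
      exporters: [elasticsearch]
```

Following the rules quoted above, log records from this pipeline should be routed to the data stream `logs-myapp.otel-production`, because `.otel` is appended to `data_stream.dataset` in this mode.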

Effective changes:
- Deprecate and make `{logs,metrics,traces}_dynamic_index` config no-op
- Config validation now fails when `{logs,metrics,traces}_dynamic_index::enabled`
and `{logs,metrics,traces}_index` are set at the same time, as users who rely on
dynamic index should not set `{logs,metrics,traces}_index`.
- Remove `elasticsearch.index.{prefix,suffix}` handling. Replace it with
`elasticsearch.index` handling that uses the attribute value as the index
name directly. Users who rely on the previously supported
`elasticsearch.index.prefix` and `elasticsearch.index.suffix` should
migrate to a transform processor that sets `elasticsearch.index`.
- Fix a bug where receiver-based routing overwrites data_stream.dataset.
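The migration suggested above for `elasticsearch.index.prefix` and `elasticsearch.index.suffix` could be sketched with a transform processor statement like the one below. This is an assumption-laden example, not the exporter's prescribed recipe: it assumes both attributes are set on the log record itself, and it uses a hypothetical base name `logs` between them.

```yaml
processors:
  transform:
    log_statements:
      - context: log
        statements:
          # Build the full index name from the old prefix/suffix attributes.
          # "logs" is a hypothetical base name; adjust it to your previous index.
          - set(attributes["elasticsearch.index"], Concat([attributes["elasticsearch.index.prefix"], "logs", attributes["elasticsearch.index.suffix"]], "")) where attributes["elasticsearch.index.prefix"] != nil and attributes["elasticsearch.index.suffix"] != nil
```

If the prefix and suffix were set at the scope or resource level instead, the statement would need to read them from there. Equivalent statements under `metric_statements` and `trace_statements` would be needed for the other signals.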

Should be released together with
#38458

#### Link to tracking issue
Fixes #38361

#### Testing

#### Documentation


---------

Co-authored-by: Andrzej Stencel <[email protected]>
Labels

exporter/elasticsearch ready to merge Code review completed; ready to merge by maintainers

Development

Successfully merging this pull request may close these issues.

[exporter/elasticsearch] Change default mapping mode to otel

7 participants