[datadogexporter] Implement translation and export of traces to Datadog format #1203
commit 99129fb96e29e9c1a92da00b7e3f8efcae8a31e8
Author: Pablo Baeyens <[email protected]>
Date: Thu Sep 3 18:10:28 2020 +0200
Handle namespace at initialization time

commit babca25927926a60c0c416294af3aadf784d41b9
Author: Pablo Baeyens <[email protected]>
Date: Thu Sep 3 17:23:53 2020 +0200
Initialize on a separate function
This way the variables can be checked without worrying about the env

commit 24d0cb4cc566fa5313a8650c904a27bea68bf555
Author: Pablo Baeyens <[email protected]>
Date: Thu Sep 3 14:30:35 2020 +0200
Check environment variables for unified service tagging

commit 6695f8297ab8b1fcae71b05acb027c4a0992e3a0
Author: Pablo Baeyens <[email protected]>
Date: Wed Sep 2 14:57:37 2020 +0200
Add support for sending metrics through the API
- Use datadog.Metric type for simplicity
- Get host if unset

commit c366603
Author: Pablo Baeyens <[email protected]>
Date: Wed Sep 2 09:56:24 2020 +0200
Disable Queue and Retry settings (#72)
These are handled by the statsd package. OpenTelemetry docs are confusing and the default configuration (disabled) is different from the one returned by "GetDefault..." functions

commit a660b56
Author: Pablo Baeyens <[email protected]>
Date: Tue Sep 1 15:26:14 2020 +0200
Add support for summary and distribution metric types (#65)
* Add support for summary metric type
* Add support for distribution metrics
* Refactor metrics construction
  - Drop name in Metrics (now they act as Metric values)
  - Refactor constructor so that errors happen at compile-time
* Report Summary total sum and count values
  Snapshot values are not filled in by OpenTelemetry
* Report p00 and p100 as `.min` and `.max`
  This is more similar to what we do for our own non-additive type
* Keep hostname if it has not been overridden

commit c95adc4
Author: Pablo Baeyens <[email protected]>
Date: Thu Aug 27 13:00:02 2020 +0200
Update dependencies and `make gofmt`
The collector was updated to 0.9.0 upstream

commit 20afb0e
Author: Pablo Baeyens <[email protected]>
Date: Wed Aug 26 18:25:49 2020 +0200
Refactor configuration (#45)
* Refactor configuration
* Implement telemetry and tags configuration handling
* Update example configuration and README file
Co-authored-by: Kylian Serrania <[email protected]>

commit fdc98b5
Author: Pablo Baeyens <[email protected]>
Date: Fri Aug 21 11:09:08 2020 +0200
Initial DogStatsD implementation (#15)
Initial metrics exporter through DogStatsD with support for all metric types but summary and distribution

commit e848a60
Author: Pablo Baeyens <[email protected]>
Date: Fri Aug 21 10:42:45 2020 +0200
Bump collector version

commit 58be9a4
Author: Pablo Baeyens <[email protected]>
Date: Thu Aug 6 10:04:32 2020 +0200
Address linter

commit 695430c
Author: Pablo Baeyens <[email protected]>
Date: Tue Aug 4 13:28:01 2020 +0200
Fix field name error
MetricsEndpoint was renamed to MetricsURL

commit 168b319
Author: Pablo Baeyens <[email protected]>
Date: Mon Aug 3 11:05:01 2020 +0200
Create initial outline for Datadog exporter (#1)
* Add support for basic configuration options
* Documents configuration options
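The two commits about unified service tagging ("Check environment variables..." and "Initialize on a separate function") could be pictured roughly as follows. This is only a sketch: it assumes the exporter reads Datadog's documented `DD_ENV`, `DD_SERVICE` and `DD_VERSION` variables, and the function name `unifiedTags` and the injected lookup function are hypothetical, not taken from the PR.

```go
package main

import (
	"fmt"
	"os"
)

// unifiedTags builds "env:", "service:" and "version:" tags from the
// unified service tagging variables. The lookup function is injected so
// the logic can be checked without touching the real environment.
// Hypothetical helper; the exporter's actual code may differ.
func unifiedTags(getenv func(string) string) []string {
	var tags []string
	for _, p := range []struct{ envVar, tagKey string }{
		{"DD_ENV", "env"},
		{"DD_SERVICE", "service"},
		{"DD_VERSION", "version"},
	} {
		// Unset or empty variables contribute no tag.
		if v := getenv(p.envVar); v != "" {
			tags = append(tags, p.tagKey+":"+v)
		}
	}
	return tags
}

func main() {
	// In production the real environment is used.
	fmt.Println(unifiedTags(os.Getenv))
}
```

Passing the lookup as a parameter is one way to get the property the commit message mentions: the variables can be checked in tests without depending on the process environment.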
* Backport changes from upstream PR
  Remove `err` from MapMetrics
* Remove usage of pdatautil
* Fix tests
* Use TCPAddr
* Review which functions should be private
* Remove DogStatsD mode
* go mod tidy
* Remove mentions to DogStatSD
* Improve test coverage
  Added unit tests for
  - API key censoring
  - Hostname
  - Metrics exporter
  Renamed test and implementation files for consistency
* Add one additional test
The zorkian API does not validate the API key unless you also have an application key, even though the endpoint works without it. I am removing this validation until this gets fixed on the zorkian library
…ed serverless code
* Rewrite without using OpenCensus metrics
  We will need to add support for Summary metrics when they are added back to the spec
* Remove `report_percentiles` option
  It is no longer used
* Handle correctly the timestamps
  They are given in Unix Timestamps in nanoseconds, thus we need to transform them into Unix Timestamps in seconds
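The timestamp handling described in the last item amounts to an integer division by the number of nanoseconds in a second. A minimal sketch, assuming the input is an unsigned nanosecond count; the helper name `nanosToSeconds` is hypothetical and not taken from the exporter:

```go
package main

import (
	"fmt"
	"time"
)

// nanosToSeconds converts a Unix timestamp in nanoseconds into a Unix
// timestamp in whole seconds, discarding the sub-second part.
// Hypothetical helper name used for illustration only.
func nanosToSeconds(ns uint64) int64 {
	return int64(ns / uint64(time.Second))
}

func main() {
	ns := uint64(1599149428123456789)
	fmt.Println(nanosToSeconds(ns)) // prints 1599149428
}
```

Using `time.Second` as the divisor avoids a hand-written `1e9` literal and makes the unit conversion explicit.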
Codecov Report
@@ Coverage Diff @@
## master #1203 +/- ##
==========================================
- Coverage 89.82% 89.52% -0.30%
==========================================
Files 285 290 +5
Lines 13882 14252 +370
==========================================
+ Hits 12469 12759 +290
- Misses 1044 1100 +56
- Partials 369 393 +24
@ericmustin It would be great if you could split the PR and also make it clear which commit of @mx-psi's it builds on top of. Ideally the PR would contain 2 commits: the one that it builds on top of and the one with the new changes that need to be reviewed. Also, make sure the build passes.
@tigrannajaryan sounds good, I've opened the first of these PRs and will close this PR in favor of the smaller ones.
This PR adds flushing and export of traces and trace-related statistics to the `datadogexporter`, as well as some very minor changes to the translation of internal traces into Datadog format. It is the second of two PRs for the work contained in #1203. It builds on top of the current master branch and follows up on the work [done here](#1208). The final PR explicitly enabling the Datadog exporter will follow, and will allow users to export traces to Datadog's API Intake.
This PR split was requested by @tigrannajaryan and should hopefully make code review a bit less cumbersome. However, if there are any questions or changes to the PR format needed, please let me know.
**Testing:** There are unit tests for the different methods and helper methods within the export code.
**Documentation:** Appropriate usage, including best practices for which processors to also enable, has been documented in the README and in the `testdata/config.yaml` and `example/config.yaml` samples.
**Notes:** This PR includes a trace exporter for non-Windows environments only (metrics work on Windows; only traces are affected), for the reasons explained in PR #1274. In short, our trace export code for Windows would rely on CGO for now, which is not permitted in the collector.
* Add API key validation (#1216) Adds API key validation to the Datadog metrics exporter. When created, the Datadog metrics exporter now sends a requests to the `/api/v1/validate` endpoint of the Datadog backend to check that the configured API key is valid. If it's not, a warning log is emitted. Tests were amended to take into account that validation call. Test utils were added to mock an HTTP server that performs validation. * sapmexporter: make span source attribute and destination dimension names configurable (#1286) If dimension names are being translated in the signalfxexporter then the map values should be set to the signalfx names. Ideally we can sync to OT dimension names with translation being done on the backend (the default). * Update README (#1294) * Release v0.13.0 (#1295) * Remove duplicate definition of cloud providers with core conventions (#1288) * Remove duplicate definition of cloud providers Signed-off-by: Bogdan Drutu <[email protected]> * Fix more duplicate usage of the cloud providers semconv Signed-off-by: Bogdan Drutu <[email protected]> * Splunkhec receiver metrics (#1276) Adds the ability for Splunk HEC to ingest metrics. This is a follow up to #1268 which adds the ability to ingest logs. * Add jpkrohling as an approver (#1296) * Remove pjanotti from maintainers (#1300) * Auto assign approver and maintainers to PRs (#1301) Signed-off-by: Bogdan Drutu <[email protected]> * Add codeowners to ensure components are assigned to the appropriate reviewers (#1304) This is the initial list extracted from README. * Moved the groupbytrace processor to contrib (#1179) Signed-off-by: Juraci Paixão Kröhling <[email protected]> Co-authored-by: Bogdan Drutu <[email protected]> * Add codeowners for interanl components (#1307) Signed-off-by: Bogdan Drutu <[email protected]> * Small fixes to CODEOWNERS (#1312) Signed-off-by: Juraci Paixão Kröhling <[email protected]> **Description:** This PR changes the CODEOWNERS in a couple of aspects: 1. 
Fixed the order of the directories, so that 'internal' comes after 'extension' 1. Fixed the name of a few components 1. Added missing components and directories Verified with: ``` for component in exporter extension processor receiver; do ls ${component}/ -1 > /tmp/${component}.txt grep ${component} .github/CODEOWNERS | awk -F\/ '{print $2}' > /tmp/${component}-codeowners.txt diff /tmp/${component}.txt /tmp/${component}-codeowners.txt done ``` Result of the script before this PR: ```diff 11d10 < loadbalancingexporter 2c2 < jmxmetricsextension --- > jmxmetrics 1d0 < groupbytraceprocessor 5c4 < routingprocessor --- > routing ``` * Update collector version in groupbytraceprocessor (#1309) Signed-off-by: Bogdan Drutu <[email protected]> * Update dependabot to ensure all projects are added (#1303) * Update dependabot to ensure all projects are added Signed-off-by: Bogdan Drutu <[email protected]> * Update dependabot.yml * Do not run tests/lint/etc for all component tags (e.g. tag testbed/v0.13.0) (#1298) Signed-off-by: Bogdan Drutu <[email protected]> * tests: increase TestTrace10kSPS memory limits (#1314) * Bump k8s.io/client-go in /receiver/k8sclusterreceiver (#1323) Bumps [k8s.io/client-go](https://github.com/kubernetes/client-go) from 0.19.2 to 0.19.3. - [Release notes](https://github.com/kubernetes/client-go/releases) - [Changelog](https://github.com/kubernetes/client-go/blob/master/CHANGELOG.md) - [Commits](kubernetes/client-go@v0.19.2...v0.19.3) Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump github.com/aliyun/aliyun-log-go-sdk (#1321) Bumps [github.com/aliyun/aliyun-log-go-sdk](https://github.com/aliyun/aliyun-log-go-sdk) from 0.1.13 to 0.1.14. 
- [Release notes](https://github.com/aliyun/aliyun-log-go-sdk/releases) - [Commits](aliyun/aliyun-log-go-sdk@v0.1.13...v0.1.14) Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump go.opencensus.io in /processor/groupbytraceprocessor (#1320) Bumps [go.opencensus.io](https://github.com/census-instrumentation/opencensus-go) from 0.22.4 to 0.22.5. - [Release notes](https://github.com/census-instrumentation/opencensus-go/releases) - [Commits](census-instrumentation/opencensus-go@v0.22.4...v0.22.5) Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump k8s.io/client-go from 0.19.2 to 0.19.3 in /internal/k8sconfig (#1318) Bumps [k8s.io/client-go](https://github.com/kubernetes/client-go) from 0.19.2 to 0.19.3. - [Release notes](https://github.com/kubernetes/client-go/releases) - [Changelog](https://github.com/kubernetes/client-go/blob/master/CHANGELOG.md) - [Commits](kubernetes/client-go@v0.19.2...v0.19.3) Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Add more @elastic folks to codeowners (#1313) In answer to open-telemetry#1304 (comment) add two more codeowners for the Elastic exporter. * Add contrib approvers as owners to all the components. (#1325) Without this change if there is a listed owner with write permission in the component owners list, the contrib approvers will lose their power see #1316. Signed-off-by: Bogdan Drutu <[email protected]> * Bump github.com/aws/aws-sdk-go in /internal/awsxray (#1316) Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.35.9 to 1.35.10. 
- [Release notes](https://github.com/aws/aws-sdk-go/releases) - [Changelog](https://github.com/aws/aws-sdk-go/blob/master/CHANGELOG.md) - [Commits](aws/aws-sdk-go@v1.35.9...v1.35.10) Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump github.com/aws/aws-sdk-go in /internal/awsxray/testdata/sampleapp (#1317) Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.35.9 to 1.35.10. - [Release notes](https://github.com/aws/aws-sdk-go/releases) - [Changelog](https://github.com/aws/aws-sdk-go/blob/master/CHANGELOG.md) - [Commits](aws/aws-sdk-go@v1.35.9...v1.35.10) Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Clarify PR reviewing and facilitating (#1315) We recently introduced the automatic assignments of PRs to reviewers and to facilitators. This change explains the process. * Handle nil references from the kubelet API (#1326) * Update to latest collector, update deprecated calls (#1308) Signed-off-by: Bogdan Drutu <[email protected]> * signalfx Receiver: Better Pipeline Error Handling (#1329) If logs aren't configured and events are sent, return a clear error response instead of panicing. Vice-versa for metrics. * [datadogexporter] Improve hostname resolution (#1285) Improve system hostname detection for the Datadog exporter. This PR: - Moves config and host code to their own packages to avoid dependency cycles - Adds hostname validation - Adds fully qualified domain name hostname resolution on some platforms - Adds support for caching hostname Added unit tests, tested on an end to end test environment with the component activated. Documentation was added to all public functions. * Temporarily remove dmitryax from PR facilitators (#1330) dmitryax will be unavailable for a while, removing him from the list of PR facilitators. 
* Update otel collector, fix breaking change for renaming TracesConsumer (#1328) * Update otel collector, fix breaking change for renaming TracesConsumer Signed-off-by: Bogdan Drutu <[email protected]> * More fixes of usages Signed-off-by: Bogdan Drutu <[email protected]> * Add batchpertrace library (#1257) Signed-off-by: Juraci Paixão Kröhling <[email protected]> Adds a library that will split the incoming batch into several batches, one per trace. **Link to tracking Issue:** Closes #1235. * Fix the link to the release notes (#1327) * Datadog trace flushing/export (#1266) This PR adds flushing+export of traces and trace-related statistics to the `datadogexporter`, as well as some very minor changes to the translation of internal traces into Datadog format. It represents the second of two PRs for the work contained in open-telemetry#1203. It builds on top of current master branch, and follows up to the work [done here](open-telemetry#1208). The final PR explicitly enabling The Datadog exporter will follow, and will allow users to export traces to Datadog's API Intake. This PR Split was requested by @tigrannajaryan and hopefully should make code review a bit less cumbersome. However if there are any questions or changes to the PR format needed, please let me know. **Testing:** There are unit tests for the different methods and helper methods within the export code. **Documentation:** Appropriate usage, including best practices for which processors to also enable, has been documented in the README, `testdata/config.yaml` and `example/config.yaml` samples. **Notes**: This PR includes a trace exporter for non-windows environments only (metrics are fine in windows, just traces that are the issue), due to reasons explained in this pr open-telemetry#1274 . 
tl;dr is our trace export code for windows env would rely on CGO for now, which is not permitted in the collector * Logzio exporter impl (#1161) Added a logz.io traces exporter **Link to tracking Issue**: #686 **Testing**: Added test for each of the components in the new exporter **Documentation**: Added a readme specifying how to use the exporter and its parameters with an example. * Add the notion of unstable components and unstable executable (#1299) The list of experimental components is defined in unstable_components_enabled.go. These components are only enabled when enable_unstable build tag is defined. We define this tag and produce an executable named otelcontribcol_unstable_$(GOOS)_$(GOARCH)$(EXTENSION) when `make otelcontribcol-unstable` is invoked. For now the new executable is not used anywhere. Next I will look into modifying the testbed to call the new unstable executable for certain tests. To verify that the unstable build functionality is enabled I added stanzareceiver to the list of unstable components and manually verified that it is indeed enabled in the unstable executable but is not available in the regular otelcontribcol executable. Contributes to open-telemetry#873 * JMX Metric Extension: Initial implementation (#1182) * Add JMX Metric Extension implementation * rename package to jmxmetricextension * jmxmetricextension s/metrics/metric * jmx metrics: fix prometheus typo * jmx metrics: capitalize acronyms * jmx metrics: clarify interval * Enable stale PR action (#1341) To help reviewers and authors remember to make progress on PR this action will mark PRs as stale after inactivity of 7 days and will close the PR after 7 more days of inactivity. 
* [datadogexporter] Enable traces on Windows (#1340) * Re-enable traces code on Windows Use a custom-made version of the Datadog Agent repository that greatly reduces the number of dependencies needed and removes the osext one that depends on CGo * Address linter issue * Empty commit to retrigger CI * Build traces flush/export code on Windows * Add kind type to root span to fix the empty parentID problem (#1338) * Add kind type to root span to fix the empty parentID problem * Set kind type for root span in Xray receiver * Update receiver/awsxrayreceiver/internal/translator/translator.go Co-authored-by: Anuraag Agrawal <[email protected]> Co-authored-by: Bogdan Drutu <[email protected]> Co-authored-by: Anuraag Agrawal <[email protected]> * [awsecscontainermetrics] receiver- Update README (#1358) * [awsecscontainermetrics] receiver- Update README Signed-off-by: Rayhan Hossain <[email protected]> * Use full form of metric units Signed-off-by: Rayhan Hossain <[email protected]> * Add timer support for statsD receiver (#1335) * [datadogexporter] Add Datadog exporter to the otelcontribcol binary (#1352) * Add datadogexporter to the binary * Disable environment variables They don't work; we will revisit it in the future * [datadogexporter] Update go-datadog-api.v2 dependency to v2.30.0 (#1365) * [signalfx_correlation] Add signalfx_correlation exporter skeleton (#1332) * [signalfx_correlation] Add signalfx_correlation exporter skeleton This is for moving the correlation out of sapmexporter into a dedicated exporter so that the correlation can be used even when sapm isn't (for example, on an agent that is exporting in otlp to a gateway instead of sapm.) 
* fix readme * [awsemfexporter] Restructure Metric Translator Logic (#1353) * Restructure buildCWMetric logic (#1) * Restructure code to remove duplicated logic * Update format * Improve function and variable names * Extract logic for dimension creation and add test * Implement minor fixes * Remove changes to go.sum * Implement tests for getCWMetrics * Implement tests for buildCWMetric * Format metric_translator_test.go * Run with gofmt -s * Disregard ordering of dimensions in test case * Perform dimension equality checking as a helper function * Setting the tlsconfig InsecureSkipVerify using NoVerifySSL (#1350) Co-authored-by: Kylian Serrania <[email protected]> Co-authored-by: Jay Camp <[email protected]> Co-authored-by: Steve Flanders <[email protected]> Co-authored-by: Jeff Cheng <[email protected]> Co-authored-by: Bogdan Drutu <[email protected]> Co-authored-by: Antoine Toulme <[email protected]> Co-authored-by: Tigran Najaryan <[email protected]> Co-authored-by: Paulo Janotti <[email protected]> Co-authored-by: Juraci Paixão Kröhling <[email protected]> Co-authored-by: Eric Mustin <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Andrew Wilkins <[email protected]> Co-authored-by: Pablo Collins <[email protected]> Co-authored-by: Ben Keith <[email protected]> Co-authored-by: Pablo Baeyens <[email protected]> Co-authored-by: Yogev Mets <[email protected]> Co-authored-by: Ryan Fitzpatrick <[email protected]> Co-authored-by: John <[email protected]> Co-authored-by: Bogdan Drutu <[email protected]> Co-authored-by: Anuraag Agrawal <[email protected]> Co-authored-by: Rayhan Hossain (Mukla.C) <[email protected]> Co-authored-by: Gavin Zhang (Kunyuan Zhang) <[email protected]> Co-authored-by: shreyas Darwhatkar <[email protected]>
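The API key validation described in #1216 above (a request to `/api/v1/validate` when the exporter is created, a warning log if the key is rejected, and a mock HTTP server in the tests) can be sketched roughly as follows. This is not the exporter's code: the function names are hypothetical, and the `DD-API-KEY` header and the `{"valid": true}` response body are assumptions about how the key is sent and how the endpoint replies.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// validateAPIKey asks the backend at baseURL whether apiKey is valid.
// Assumption: the key travels in a DD-API-KEY header and a valid key
// yields HTTP 200 with a JSON body containing "valid": true.
func validateAPIKey(client *http.Client, baseURL, apiKey string) (bool, error) {
	req, err := http.NewRequest(http.MethodGet, baseURL+"/api/v1/validate", nil)
	if err != nil {
		return false, err
	}
	req.Header.Set("DD-API-KEY", apiKey)
	resp, err := client.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return false, nil
	}
	var body struct {
		Valid bool `json:"valid"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return false, err
	}
	return body.Valid, nil
}

// newMockValidationServer returns a local test server that accepts only
// wantKey, standing in for the real backend in tests.
func newMockValidationServer(wantKey string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/api/v1/validate" {
			http.NotFound(w, r)
			return
		}
		if r.Header.Get("DD-API-KEY") != wantKey {
			w.WriteHeader(http.StatusForbidden)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"valid": true}`)
	}))
}

func main() {
	srv := newMockValidationServer("good-key")
	defer srv.Close()

	for _, key := range []string{"good-key", "bad-key"} {
		ok, err := validateAPIKey(srv.Client(), srv.URL, key)
		if err != nil {
			log.Printf("could not validate API key: %v", err)
			continue
		}
		if !ok {
			// Mirrors the described behavior: warn, do not fail.
			log.Printf("warning: API key validation failed")
			continue
		}
		fmt.Println("API key is valid")
	}
}
```

Emitting a warning rather than returning an error keeps exporter start-up from failing on a transient validation problem, which matches the behavior the commit message describes.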
* Bump jaeger version with thrift 0.13
  Signed-off-by: Pavol Loffay <[email protected]>
* Go mod tidy
  Signed-off-by: Pavol Loffay <[email protected]>
* Sync
  Signed-off-by: Pavol Loffay <[email protected]>
* Sync 2
  Signed-off-by: Pavol Loffay <[email protected]>
* clean modules
  Signed-off-by: Pavol Loffay <[email protected]>
Description: This PR adds utilities to translate the internal trace representation to Datadog format and to export those traces (via protobuf) and their respective statistics payloads (via JSON) to Datadog's API Intake.
For the trace translation we are not relying on OpenCensus helpers as other exporters do, based on this Gitter thread.
The hostname resolution is currently very simple but will be expanded in future PRs.
Additionally, this work is meant to build on top of my colleague @mx-psi's metric-related work, so I'm happy to rebase at any time if some work from his branch gets merged into master during the review process.
Lastly, while most of the added lines here are just from mocks in the tests, I'd be happy to split this PR into a `trace translation` section and a `trace export` section if that's preferable for the reviewers.
Link to tracking Issue: n/a
Testing: There are unit tests for the different methods and helper methods within the translation code, as well as integration tests for configuration and expected payload format of the export client.
Documentation: We'd like to eventually add some suggested best practices on using the `groupbytrace` and `batch` processors in pipelines, but for now the `testdata/config.yaml` and `example/config.yaml` samples demonstrate their usage.
example trace