Skip to content

Conversation

@Nicolas-MacBeth
Copy link
Owner

@Nicolas-MacBeth Nicolas-MacBeth commented Jun 22, 2020

This is a work in progress!

It works! Along with my fix in Collector core, I am successfully collecting metrics from 2 different databases at once (Postgres and MySQL)!!! 😄 Everything is implemented, so please go ahead and give feedback! The only thing that may change at a high-level is how I log the subprocesses' output, which we'll talk about in coming meetings. Also, I am currently working on tests, they are coming.

WIP PR for the prometheusexec receiver. Most of the logic is in receiver.go and manager.go. Both config.go files simply define the configuration structs and factory.go is mostly boilerplate code for all receivers in the collector.

@Nicolas-MacBeth Nicolas-MacBeth self-assigned this Jun 22, 2020
Copy link

@draffensperger draffensperger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you create tests for this? I realize it's trickier with processes, but it should be possible - the processes could be simple scripts that just fail, delay for a certain amount of time, etc.

@Nicolas-MacBeth
Copy link
Owner Author

Could you create tests for this? I realize it's trickier with processes, but it should be possible - the processes could be simple scripts that just fail, delay for a certain amount of time, etc.

Yes I will of course be creating tests for all my code, I'll figure out a way for the processes!

@Nicolas-MacBeth Nicolas-MacBeth marked this pull request as draft June 22, 2020 18:52
Nicolas-MacBeth and others added 23 commits June 26, 2020 19:49
Small update
* Revert "Revert "Export spans to hec (#359)" (#375)"

This reverts commit 891b1bc.

This unreverts the revert of #359, now that the release is out.
The jaeger legacy and kubeletstats receivers were not updated to the latest change in the ReceiverFactoryOld interface open-telemetry/opentelemetry-collector@2f6d603
Bump CPU limits for SAPM traces and OC metrics stability tests to align with the unstable CircleCI environment and avoid sporadic stability tests failures like these: open-telemetry#369 , open-telemetry#354
scrapeConfig.JobName = customName

// This is a default Prometheus scrape config value, which indicates that the scraped metrics can be modified
scrapeConfig.HonorLabels = false

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these things that the user might want to configure? Does the regular Prometheus receiver allow you to configure those?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add them! I recall having a conversation with @quentinmit about this and coming to the conclusion to not allow the user to have access to all those settings, and instead going mostly with defaults. Don't mind changing this to enable the user to choose some settings. If I recall correctly we chose not to since the exporters probably all function on the defaults.

For the pods that are just created, some stats can be absent. And it causes crashing the otel agent with nil pointer exception.
Copy link

@draffensperger draffensperger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High level suggestion: I think you should have more testing for this. Specifically:

  • Unit tests for any parts you can isolate fully
  • A functional test that actually starts a subprocess (or two) and restarts them with appropriate random ports
  • Ideally the functional test would be of a subprocess that actually exports a simple hello-world type Prometheus metric (can just run an HTTP server and make it expose a metric via text)

* Add SignalFx demo configuration

* collector.yaml
* k8s.yaml

* Update signalfx-k8s.yaml

* Enable Zipkin for Istio Mixer Adapter

* Update examples/signalfx/signalfx-collector.yaml

Co-authored-by: Paulo Janotti <[email protected]>

* Update examples/signalfx/signalfx-k8s.yaml

Co-authored-by: Paulo Janotti <[email protected]>

* Move to exporter directory

Co-authored-by: Paulo Janotti <[email protected]>
Comment on lines 142 to 145
subprocessConfig.Command = cfg.SubprocessConfig.CommandString
subprocessConfig.Port = cfg.SubprocessConfig.Port
subprocessConfig.Env = cfg.SubprocessConfig.Env
subprocessConfig.CustomName = customName

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's confusing to have two things called config here that have similar but differently named fields - maybe you can reuse the same struct for both?

Copy link
Owner Author

@Nicolas-MacBeth Nicolas-MacBeth Jul 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed the fields to have the same name, but I think the reason I made it this way is that receiver.go (in prometheus_exec) import subprocessmanager already and I can't import prometheus_exec in subprocessmanager to use the same struct since that's a cycle import!

owais and others added 14 commits July 7, 2020 11:34
Publish builds binaries for all supported platforms with the
cross-compile job. Running the build job is redundant for this workflow
as the amd64 linux binary is overwritten by cross-compile anyway. Also
this can cause CI job to fail if build and cross-compile both persist
the same files to CI workspace. However, using cross-compile for publish
workflow and only build for regular PR workflow is a bit awkward and
adds unnecessary complexity to the CI workflow definition. It is far
simpler to just replace build with cross-compile. This means all PR
builds will attempt to build binaries for all supported platforms even
if we loadtest only linux amd64 right now. I think is a good outcome
anyway as PR CI jobs can now catch issues that prevent the collector
from building for any of the supported architectures. This also paves
the way to enable functional/integration/load testing for all platforms
and architectures.
Signed-off-by: Bogdan Drutu <[email protected]>
…ce & added build step for generating Windows MSI (#408)
* Add logging on errors in Runnable

Kubeletstats receiver was missing log statements on
important errors.

* Fix self reported metrics

* Fix obsreport calls
Signed-off-by: Bogdan Drutu <[email protected]>
Signed-off-by: Bogdan Drutu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.