-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Add Host Metrics receiver scaffolding #848
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Host Metrics receiver scaffolding #848
Conversation
go.mod
Outdated
| github.com/grpc-ecosystem/grpc-gateway v1.11.1 | ||
| github.com/hashicorp/consul/api v1.2.0 // indirect | ||
| github.com/jaegertracing/jaeger v1.17.0 | ||
| github.com/james-bebbington/winperfcounters v0.0.0-20200417100953-e7a978334ec0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've temporarily put this code into a separate repo: https://github.com/james-bebbington/winperfcounters - it's just a fork of https://github.com/influxdata/telegraf/tree/master/plugins/inputs/win_perf_counters (didn't want to pull in that entire repo as it includes many unrelated dependencies)
I'm not sure where this should live in the long term. One possibility is to try and get this included in gopsutil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would try to avoid depending on a "personal" repo. At least create an issue to track this problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would directly depend on telegraf instead. If and when it is included in gopsutil change the dependency. If there is a concern that brings unwanted dependency then this could be moved to the contrib repo.
Another possibility is to copy the winperfcounters as an internal package. Not sure how licensing would work out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also breakdown the commit into two parts to make it easy to review.
- metrics that depend on runtime (with initial frame work).
- metrics that depend on winperfcounters
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would directly depend on telegraf instead. If and when it is included in gopsutil change the dependency. If there is a concern that brings unwanted dependency then this could be moved to the contrib repo.
I initially recommended not to depend on telegraf as it brings in too many dependencies. I would have the same concerns if this were in contrib as well. @bogdandrutu @tigrannajaryan thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would try to avoid depending on a "personal" repo. At least create an issue to track this problem.
While I understand the general aversion, we also depend on many libraries in personal repos. It's the nature of git. There isn't much risk given go modules proxying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm quite hesitant to pull in telegraf as per jrcamp's reasoning. Given that its only 5-6 relatively small files, I may just bring them into the collector. I've split this PR up now to make it easier to review, but I'll come back to this in the next one.
544ed7e to
1fba6c9
Compare
Codecov Report
@@ Coverage Diff @@
## master #848 +/- ##
==========================================
- Coverage 85.95% 85.62% -0.33%
==========================================
Files 157 160 +3
Lines 11841 11926 +85
==========================================
+ Hits 10178 10212 +34
- Misses 1296 1341 +45
- Partials 367 373 +6
Continue to review full report at Codecov.
|
1fba6c9 to
d0bdc37
Compare
bogdandrutu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest that initially you split this into 2 PRs:
- Adds receiver + scraper interfaces
- Adds some of the scrapers.
go.mod
Outdated
| github.com/grpc-ecosystem/grpc-gateway v1.11.1 | ||
| github.com/hashicorp/consul/api v1.2.0 // indirect | ||
| github.com/jaegertracing/jaeger v1.17.0 | ||
| github.com/james-bebbington/winperfcounters v0.0.0-20200417100953-e7a978334ec0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would try to avoid depending on a "personal" repo. At least create an issue to track this problem.
|
|
||
| config.Scrapers = map[string]hmcomponent.ScraperConfig{} | ||
|
|
||
| scrapersViperSection := componentViperSection.Sub("scrapers") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use viperSub from the config package. see there the reasons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
config.viperSub isn't exported at the moment. I can change it to be exported, but I'll do it in another PR to avoid the refactoring noise in this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
| @@ -0,0 +1,23 @@ | |||
| receivers: | |||
| hostmetrics: | |||
| scrapers: | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary to have these under scrapers instead of being top-level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'd be more confusing to not have a header to separate any top level config from the dynamic scraper config? (both for readability and for the code which would have to figure out which keys are top level or not)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it probably avoids some potential confusion at the expense of nesting. Though in the example config we did without the nesting it's not too confusing I think. @tigrannajaryan has a good eye for config so maybe he can chime in.
I think your way is the safer approach though. What do you think about making it collectors or collect to keep with the collection_interval naming scheme?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I see there is default_collection_interval at the top-level so we likely want the additional nesting for consistency. Otherwise we would have default_collection_interval and (for example) cpu as siblings, which does not seem to make sense.
If we don't need default_collection_interval then I would prefer to eliminate scraper and reduce the nesting. I glanced at the PR and cannot see how DefaultCollectionInterval is used in the code (I assume it will be in a future PR?). Is the idea that there is a "default" interval and each scraper can override this? Would it make sense to just hard-code the default and eliminate the config key?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense - I think I'll leave it as is for now though. While we could hardcode default_collection_interval (which as you correctly point out will be used in a future PR as the default interval for individual scrapers), there's likely to be other top level config that's needed, e.g. proc_path in the example config @jrcamp linked.
@jrcamp - regarding the naming, I originally called these collectors, but ended up changing to scrapers as I thought that was too confusing / broad given the name of this repo is "collector".
4870074 to
b02ae44
Compare
👍 I've updated this PR so it's just the first part: Receiver & Scraper interfaces |
b02ae44 to
40e2bb3
Compare
5050a85 to
c4ba9b4
Compare
c4ba9b4 to
c9c7267
Compare
| ScraperFactories map[string]internal.Factory | ||
| } | ||
|
|
||
| // NewFactory creates a new factory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // NewFactory creates a new factory | |
| // NewFactory creates a new factory for host metrics receiver. |
| } | ||
| } | ||
|
|
||
| // Type gets the type of the Receiver config created by this Factory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // Type gets the type of the Receiver config created by this Factory. | |
| // Type returns the type of the Receiver config created by this Factory. |
| return typeStr | ||
| } | ||
|
|
||
| // CustomUnmarshaler returns custom unmarshaler for this config. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // CustomUnmarshaler returns custom unmarshaler for this config. | |
| // CustomUnmarshaller returns custom unmarshaller for this factory. |
| } | ||
| } | ||
|
|
||
| // CreateTraceReceiver creates a trace receiver based on provided config. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this should be
CreateTraceReceiver returns error as trace receiver is not applicable to host metrics receiver.
Other receivers also have similar issue.
| "github.com/open-telemetry/opentelemetry-collector/receiver/hostmetricsreceiver/internal" | ||
| ) | ||
|
|
||
| // Receiver is the type used to handle metrics from VM metrics. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // Receiver is the type used to handle metrics from VM metrics. | |
| // Receiver is the type that provides scraping host metrics. |
| cancel context.CancelFunc | ||
| } | ||
|
|
||
| // NewHostMetricsReceiver creates a new set of VM and Process Metrics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // NewHostMetricsReceiver creates a new set of VM and Process Metrics | |
| // NewHostMetricsReceiver creates a host metric scraper. |
| @@ -0,0 +1,17 @@ | |||
| receivers: | |||
| hostmetrics: | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add one more hostmetrics receiver with a name, for e.g. 'hostmetrics/customname'. Similar to otlp
typical use would be that one scraper is configured with 10 second interval collecting memory metrics and another one configured with 60 second interval collecting cpu metrics. Unless each category allows configuring separate interval. In that case two instance of hostmetrics should be disallowed.
|
|
||
| service: | ||
| pipelines: | ||
| traces: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be metrics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed your comments in the next PR: #862
* Update core/contrib deps * Update renamed components * Update signalfx exporter for contrib open-telemetry#5756 * Update CHANGELOG.md * Fix breaking config map and tracer changes * Account for k8s_tagger rename * Rename k8s_tagger to k8sattributes * reduce collectd-cassandra test flake * improve k8s_tagger rename test * Update hec tls config map usage * move from retracted 0.37.0 to 0.37.1 * Update changelog * Update bundled Smart Agent to v5.14.2 * Update github.com/signalfx/signalfx-agent to main Co-authored-by: Ryan Fitzpatrick <[email protected]>
Link to tracking Issue:
Issue 847
Description:
Added initial scaffolding code for a new hostmetricsreceiver which will scrape Host Metrics (CPU, Memory, Disk, etc) based on the operating system.
This receiver supports configuration per metric type (or "Scraper"), e.g. to collect cpu, memory and disk metrics, config would look something like:
Not included in this PR:
+build windowsflag). At some stage, we'll presumably need to add a build step that ensures this code is built and tested as part of CI (running integration & e2e tests would require a windows agent, which circleci does appear to support). Likewise, may need to look at how to measure code coverage of the windows specific code.