[SPARK-28475][CORE] Add regex MetricFilter to GraphiteSink #25232
|
ok to test |
|
Thank you for your first contribution, @nkarpov . Could you elaborate more about the verification steps in your PR? |
|
Test build #108023 has finished for PR 25232 at commit
|
import com.codahale.metrics.MetricRegistry
import com.codahale.metrics.{Metric, MetricFilter, MetricRegistry}
import com.codahale.metrics.graphite.{Graphite, GraphiteReporter, GraphiteUDP}
|
|
This empty line is required by the Apache Spark coding style.
[error] /home/jenkins/workspace/SparkPullRequestBuilder/core/src/main/scala/org/apache/spark/metrics/sink/GraphiteSink.scala:25:0: There should at least one a single empty line separating groups 3rdParty and spark.
You can run the code style checker as follows.
$ dev/scalastyle
|
Hi @dongjoon-hyun! Setup: start a graphite server & add host, port etc. in For each scenario below step through w/ debugger to confirm MetricFilter is registered and corresponding metrics are posted to graphite server:
I couldn't find an existing testing suite for the GraphiteSink so I verified manually as above. The change seemed minor enough for that to be OK but let me know if a more robust suite should be added. Also, if we're happy with the naming convention here, I will add documentation in the following files as part of the PR https://github.com/apache/spark/blob/master/docs/monitoring.md & |
|
@nkarpov, please update this comment in the PR description under Also, since there is no test, it would be better to describe the steps from scratch, with commands. With explicit steps, we and later contributors can detect regressions when other fixes come in. Otherwise, we have to parse your steps and work out the proper commands every time. Ideally the steps should be copy-and-pastable to reduce human effort, since there's no test case that detects regressions automatically. |
|
Test build #108026 has finished for PR 25232 at commit
|
|
Thanks @HyukjinKwon - in that case it's worth just adding the tests. Added in the latest commit. |
|
Test build #108036 has finished for PR 25232 at commit
|
|
Thanks for adding a test. |
|
retest this please |
|
Test build #108039 has finished for PR 25232 at commit
|
val props = new Properties
props.put("host", "127.0.0.1")
props.put("port", "54321")
props.put("regex", "streaming")
It would be ideal to use an actual regex pattern in the test case, since we support regex.
Done!
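One way to check a regex filter without a running Graphite server is to ask a MetricRegistry which metrics pass the filter, since a scheduled reporter reads metrics from the registry through the same filter. The following is a minimal sketch, not the test added in this PR; the object name, the helper `visibleCounters`, and the metric names are hypothetical, and it assumes the filter keeps a metric when the pattern matches anywhere in its name.

```scala
import scala.collection.JavaConverters._ // deprecated in Scala 2.13 but still available

import com.codahale.metrics.{Metric, MetricFilter, MetricRegistry}

object FilterCheckSketch {
  // Hypothetical helper: names of the counters in `registry` that pass a regex filter.
  def visibleCounters(registry: MetricRegistry, regex: String): Set[String] = {
    val compiled = regex.r // compile the pattern once
    val filter = new MetricFilter {
      // Assumption: a metric passes when the pattern matches anywhere in its name.
      override def matches(name: String, metric: Metric): Boolean =
        compiled.findFirstMatchIn(name).isDefined
    }
    // getCounters(filter) returns only the counters accepted by the filter.
    registry.getCounters(filter).keySet().asScala.toSet
  }
}
```

With a pattern such as `driver\.(CodeGenerator|BlockManager)`, a counter registered under an illustrative name like `local-1.driver.CodeGenerator.compilationTime` would pass, while `local-1.driver.DAGScheduler.job.allJobs` would not.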
|
Thanks for the contribution. I see the benefit of the change, as I introduced a similar thing (not for Spark, though), but it ended up needing more advanced (complicated) support: accepting multiple patterns, with a mode of either "whitelist" or "blacklist" (not both, of course). Fortunately, I'm not seeing too many Dropwizard metrics in Spark, so this might be good enough to start. |
|
Thanks @HeartSaVioR - I've added a better regex example in the test. And you make a good point re: whitelist/blacklist - I think it'll be clear with an example in metrics.properties template? Should we rename the config? Otherwise, @HeartSaVioR @HyukjinKwon @dongjoon-hyun, it's my first commit so please help me understand what it takes to be merged :) My only outstanding item ATM is to include an example in metrics.properties template file if we are good on naming convention. |
|
Test build #108111 has finished for PR 25232 at commit
|
|
I'm also one of the contributors trying to help with reviewing. :) @HyukjinKwon and @dongjoon-hyun are able to review and approve the patch, so let's wait for their next round of reviews. |
|
Can we make the tests pass? Also, you can cc people who appear in Git blame. cc @jerryshao |
|
I've appended some documentation. I don't think the previous failures were related |
|
Test build #108180 has finished for PR 25232 at commit
|
HeartSaVioR
left a comment
LGTM, but as I said earlier you need to get approvals from committers.
|
LGTM. Trigger the test again. |
|
Jenkins, retest this please. |
|
Test build #108540 has finished for PR 25232 at commit
|
jerryshao
left a comment
LGTM.
|
Thanks for the contribution, merging to master branch. |
|
Thank you so much, @jerryshao ! :D |
What changes were proposed in this pull request?
Today all registered metric sources are reported to GraphiteSink with no filtering mechanism, although the codahale project does support it.
GraphiteReporter (ScheduledReporter) from the codahale project requires you to implement and supply the MetricFilter interface (there is only a single implementation by default in the codahale project, MetricFilter.ALL).
This PR proposes adding a regex config to GraphiteSink to match and filter metrics.
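The idea can be sketched as follows. This is a rough illustration under stated assumptions, not the merged implementation: the object name `RegexFilterSketch` and the helper `filterFor` are hypothetical, the property key is assumed to be `regex` as in the test snippet from the review, and the filter is assumed to keep a metric when the pattern matches anywhere in its name.

```scala
import java.util.Properties

import com.codahale.metrics.{Counter, Metric, MetricFilter}

object RegexFilterSketch {
  // Hypothetical helper: build a MetricFilter from the sink's "regex" property.
  def filterFor(props: Properties): MetricFilter =
    Option(props.getProperty("regex")) match {
      case Some(pattern) =>
        val compiled = pattern.r // compile once rather than once per metric
        new MetricFilter {
          // Assumption: keep a metric when the pattern matches anywhere in its name.
          override def matches(name: String, metric: Metric): Boolean =
            compiled.findFirstMatchIn(name).isDefined
        }
      case None =>
        MetricFilter.ALL // no regex configured: report every metric, as before
    }
}
```

Falling back to MetricFilter.ALL when the property is absent keeps the existing behavior for users who do not set a regex.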
How was this patch tested?
Included a GraphiteSinkSuite that tests:
regex=<regexexpr> correctly filters metric keys