feat: transition from standalone prometheus to kube-prometheus-stack #70
Conversation
feat: update test-forwards utility script for prometheus operator use
feat: convert prometheus to kube-prometheus-stack
feat: Update utility script to use new services from prometheus operator
feat: add extras script to fix permissions on kube-proxy metrics
feat: modifications to NGINX IC to allow prometheus service monitor to pull metrics
feat: added service monitor for ledgerdb and accountdb postgres
fix: adjust depends_on for prometheus deployment
> Notes:
> 1. The NGINX IC needs to be configured to expose prometheus metrics; this is currently done by default.
> 2. The default address binding of the `kube-proxy` component is set to `127.0.0.1` and as such will cause errors when the canned prometheus scrape configurations are run. The fix is to set this address to `0.0.0.0`. An example manifest
Could this be a security issue?
Based on everything I read, no, because:
- It's exposed on an internal address (whatever the cluster's internal addressing is)
- When connections are made, they're made over TLS using a shared secret, so without that secret you're not going to be allowed to connect.
So, I view it as most likely safe - but I'm leaving it as something everyone can decide for themselves whether they want to run or not. I suppose once we get more of an automated process in place we can turn this into a "do you want to run this? y/n" prompt.
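For anyone who wants to see what the change actually looks like, here is a minimal sketch of the relevant kube-proxy configuration, assuming a kubeadm-managed cluster where kube-proxy reads its settings from the `kube-proxy` ConfigMap in `kube-system` (the extras script in this PR may apply it differently, so treat this as illustrative only):

```yaml
# Sketch only: in a kubeadm cluster this lives under the "config.conf" key of
# the kube-proxy ConfigMap in kube-system; adjust to however your cluster
# manages the kube-proxy configuration.
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# Bind the metrics endpoint to all interfaces instead of loopback so the
# kube-prometheus-stack scrape targets can reach it on port 10249.
metricsBindAddress: 0.0.0.0:10249
```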
> ### Grafana
> **NOTE:** This deployment has been deprecated but the project has been left as an example on how to deploy Grafana in this
Let's just delete it and point folks to the git history. We don't want to carry this forward. Thoughts?
I went back and forth on this. Part of me wanted to delete it, but then another part started down the "well, what if the user wants to swap out prometheus for something else and still wants grafana?"
If we go to a modular approach where the user runs a script and answers prompts as to what they want / don't want, I feel that just keeping it in place (preferably with a few tests around it to make sure it works) would be fine - since I'm pulling from the mainline grafana builds, we could just manage it like the other dependencies.
That said, I'm not married to this idea - so let me know what you think in light of that.
I say, let's delete it. It will always be in the source history and we can always come back and add it again after we have better support for multiple options.
Deleted in last commit.
Any ideas why the build is failing?

re: why the build is failing - I have no idea. I've been digging into it and we keep hitting this: Nothing has changed w/ this code as far as I know...

Note that the issue with the tests was corrected by updating requirements.txt to a new version of pulumi; pretty sure it's not a matter of what was upgraded but more a matter of the fact that we upgraded it.
Proposed changes
This change moves us from a standalone, "à la carte" assembly of the prometheus services to an integrated prometheus-operator-based deployment using the prometheus community kube-prometheus-stack.
This update also installs the appropriate service monitors to scrape metrics from the gunicorn python apps in the bank of sirius project, the postgres/prometheus exporters in the bank of sirius postgres installs, and the NGINX KIC.
This change also includes an extras script (with a README) to handle the updates needed to read kube-proxy metrics.
Documentation updates are in progress.
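For context, here is a minimal sketch of the kind of ServiceMonitor this change adds; the names, labels, and port below are illustrative placeholders rather than the exact values used in this PR:

```yaml
# Illustrative only - the metadata, selector labels, and port name are placeholders.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: accountdb-postgres-exporter        # hypothetical name
  labels:
    release: kube-prometheus-stack         # must match the operator's serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: accountdb-postgres-exporter     # hypothetical label on the exporter Service
  endpoints:
    - port: metrics                        # named port on the exporter Service
      interval: 30s
```

The main gotcha is that the ServiceMonitor's labels have to match whatever `serviceMonitorSelector` the kube-prometheus-stack Prometheus instance was installed with, otherwise the target is silently ignored.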
Checklist
Before creating a PR, run through this checklist and mark each as complete.