fix(client): update icmp/ping logic to determine pinger privileged mode#1346
fix(client): update icmp/ping logic to determine pinger privileged mode#1346TwiN merged 11 commits intoTwiN:masterfrom
Conversation
|
I can work on adding unit tests once an approach is finalized. I always feel weird making reviews without tests. |
Signed-off-by: Zee Aslam <zeet6613@gmail.com>
Signed-off-by: Zee Aslam <zeet6613@gmail.com>
… permission error Signed-off-by: Zee Aslam <zeet6613@gmail.com>
|
I played around with precisely checking for I would expect folks running as root without having the uid set to 0 is not super common. In order to be thorough, I decided to check for cap_net_raw by simply testing if opening a raw socket throws a permission error. Let me know if you like or dislike it and I can re-adjust |
Signed-off-by: Zee Aslam <zeet6613@gmail.com>
|
After some testing of the syscall method, it came to light that it won't compile on windows due to the different flags, so I discarded that route and fell back to the initial implementation. |
Signed-off-by: Zee Aslam <zeet6613@gmail.com>
|
I built my own image for testing purposes. Confirmed it works. repository: |
| // ShouldUsePrivilegedPinger will determine whether or not to run pinger in privileged mode. | ||
| // It should be set to privileged when running as root, and always on windows. See https://pkg.go.dev/github.com/macrat/go-parallel-pinger#Pinger.SetPrivileged | ||
| func ShouldRunPingerAsPrivileged() bool { | ||
| // Set the pinger's privileged mode to false for darwin |
There was a problem hiding this comment.
I'm no expert in Darwin, is it guaranteed that
os.Geteuid() == 0
will always return false on Darwin? Otherwise this should be added as a special case here.
There was a problem hiding this comment.
If someone is running this as root on darwin, then they'd want the Privileged mode as per my understanding.
| } | ||
|
|
||
| // To actually check for cap_net_raw capabilities, we would need to add "kernel.org/pub/linux/libs/security/libcap/cap" to gatus. | ||
| // Or use a syscall and check for permission errors, but this requires platform specific compilation |
There was a problem hiding this comment.
Is it architecture or os specific implementations?
Because OS specific would be fairly easy in this case.
There was a problem hiding this comment.
The low level libraries needed for creating a raw socket on POSIX systems require C-linking. This project currently builds with CGO not enabled, and I am reluctant to make changes that would impact the build process as this will introduce some additional complexity around building for multiple systems and thus increase the scope.
If you have suggestions that don't require CGO, I'm open to trying them out.
heathcliff26
left a comment
There was a problem hiding this comment.
Thank you @h3mmy for the PR.
Nit: Squash Commits
| // As a backstop we can simply check the effective user id and run as privileged when running as root | ||
| return os.Geteuid() == 0 |
There was a problem hiding this comment.
I'm ok with giving this a try in latest and see what happens, but this may break people whose container is configured to use a custom user id.
There was a problem hiding this comment.
If using as a custom user ID with cap net raw, it should be running as privileged to work. I'd be curious if anyone is running it as unprivileged with CAP_NET_RAW and have it be working. I'm happy to work on something more thorough, but it will likely involve some sort of libc linking requirement unless I figure out a different approach.
|
Couple comments, but I've tested it on my end and it seems to work fine on my Kubernetes cluster. |
Co-authored-by: TwiN <twin@linux.com>
Match function name Co-authored-by: TwiN <twin@linux.com>
Remove extra line Co-authored-by: TwiN <twin@linux.com>
|
@h3mmy Thank you for the contribution! |
|
Can confirm this is working for me as well. ARM deployment on an Oracle Kubernetes Engine Cluster. Thank you so much @h3mmy! |
|
Thank you for confirming, I appreciate it! Having to validate changes alone isn't as scalable as it once used to be 🤣 |
|
Ping is broken for me now. I'm compiling gatus myself and run it with systemd: |
@roughnecks Would you try running it without the This seems to be exactly the edge case not covered in the PR. I can work on updating it to include this, although it will be tricky without CGO. But I'd like to confirm that removing that ambient capability resolves this for you |
|
Hey, I just commented out the AmbientCapabilities, still no dice. |
|
@roughnecks To confirm, you commented it out and restarted the systemd service to check, right? What is your output for You can allow your uid to use unprivileged ping by using something like The main thing here is just that the guid for the Linux user needs to be less than the right hand number. The other thing you can try is changing the I probably should have added debug logging to check the EUID. It's super late for me at the moment, but I'll work on this in the morning. I might be able to use the deprecated syscall method to test capabilities to make config changes not a required thing. |
|
yeah, "daemon-reloaded", restarted service. I changed this: 'net.ipv4.ping_group_range = 0 2147483647' and it's fixed now. I even had that line already in my sysctl.conf with a comment about gatus, but I believe at the time I thought it was better like I did with the systemd unit. Oh well. Thanks |
|
@TwiN I want to understand how you want this prioritized. It seems like this scenario will primarily impact people who needed to add a CAP_NET_RAW to get pinger to work in the first place (technically as a workaround). Should I prioritize trying to find a way to put together a raw packet without CGO in order to make the logic more precise, or focus on updating documentation with an explanation of config? |
|
@h3mmy If the fix you made impacts people who previously had to do some shenanigans to get it to work before, then it's fine to me, there's no need to fix anything. That said, a note in the documentation saying something along the lines of "Prior to v5.31.0, some environments required setting CAP_NET_RAW to [...]. As of v5.31.0, this is no longer necessary. See #1346." Would be greatly appreciated |
|
Thanks for the clarification! Note added. #1384 |
This PR contains the following updates: | Package | Update | Change | |---|---|---| | [ghcr.io/twin/gatus](https://github.com/TwiN/gatus) | minor | `v5.30.0` -> `v5.31.0` | --- ### Release Notes <details> <summary>TwiN/gatus (ghcr.io/twin/gatus)</summary> ### [`v5.31.0`](https://github.com/TwiN/gatus/releases/tag/v5.31.0) [Compare Source](TwiN/gatus@v5.30.0...v5.31.0) Highlight of this release are the ability to mark announcements as "archived", which renders said announcements in a new `Past Announcements` section at the bottom of the status page (only rendered if there is at least 1 archived announcements), support for markdown in announcements and support for monitoring gRPC health endpoints. <img width="1166" height="556" alt="image" src="https://github.com/user-attachments/assets/d22a0ea7-c035-4c35-a148-6de097a357b7" /> #### What's Changed * feat(announcements): Add support for archived announcements and add past announcement section in UI by @​TwiN in TwiN/gatus#1382 * feat(announcements): add markdown support by @​Sworyz in TwiN/gatus#1371 * feat(client): Add support for monitoring gRPC endpoints by @​diamanat in TwiN/gatus#1376 * fix(client): update icmp/ping logic to determine pinger privileged mode by @​h3mmy in TwiN/gatus#1346 * fix(api): Escape endpoint key in URL for raw APIs by @​Nedra1998 in TwiN/gatus#1381 * docs(readme): adds ECS fargate module in README by @​GiamPy5 in TwiN/gatus#1377 #### New Contributors * @​GiamPy5 made their first contribution in TwiN/gatus#1377 * @​h3mmy made their first contribution in TwiN/gatus#1346 * @​diamanat made their first contribution in TwiN/gatus#1376 * @​Nedra1998 made their first contribution in TwiN/gatus#1381 * @​Sworyz made their first contribution in TwiN/gatus#1371 **Full Changelog**: <TwiN/gatus@v5.30.0...v5.31.0> </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR is behind base branch, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0Mi41LjAiLCJ1cGRhdGVkSW5WZXIiOiI0Mi41LjAiLCJ0YXJnZXRCcmFuY2giOiJtYWluIiwibGFiZWxzIjpbImltYWdlIl19--> Reviewed-on: https://gitea.alexlebens.dev/alexlebens/infrastructure/pulls/2008 Co-authored-by: Renovate Bot <renovate-bot@alexlebens.net> Co-committed-by: Renovate Bot <renovate-bot@alexlebens.net>
Summary
#697 (comment)
Based around fix attempted in #748 since the experimental tagged image worked for pings in non-privileged containers. Apparently it broke for users in root containers, so this PR includes a check for root privilege, albeit a naive one.
As of v5.26.0, ICMP checks still do not work for my (non-root) deployment regardless of whether I add
CAP_NET_RAWChecklist
README.md, if applicable.