
fix(client): update icmp/ping logic to determine pinger privileged mode #1346

Merged
TwiN merged 11 commits into TwiN:master from h3mmy:master
Nov 5, 2025

Conversation

@h3mmy
Contributor

@h3mmy h3mmy commented Oct 18, 2025

Summary

#697 (comment)
Based on the fix attempted in #748, since the experimental tagged image worked for pings in non-privileged containers. Apparently it broke for users in root containers, so this PR includes a check for root privilege, albeit a naive one.

As of v5.26.0, ICMP checks still do not work for my (non-root) deployment, regardless of whether I add CAP_NET_RAW.

Checking individual capabilities will require adding "kernel.org/pub/linux/libs/security/libcap/cap" as a dependency. I'm not sure if you want to add more dependencies for a small check, so I can use some naive logic to check whether the app is running as root, such as checking that the EUID is 0.

Since SetPrivileged needs to be set to false for non-privileged processes running on linux or darwin, I figured this is a reasonable check unless you want a more precise check with the extra dependency.
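
A minimal sketch of the EUID-based check described above, not the merged code: the helper name `shouldRunPrivileged` and its parameters are hypothetical, and the Windows rule follows this PR's statement that the pinger should always be privileged on Windows. Passing the OS name and EUID as arguments keeps the decision testable without actually running as root.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// shouldRunPrivileged is a hypothetical helper that mirrors the naive rule
// discussed in this PR: always privileged on Windows, otherwise privileged
// only when the effective user ID is 0 (root).
func shouldRunPrivileged(goos string, euid int) bool {
	if goos == "windows" {
		return true
	}
	return euid == 0
}

func main() {
	// runtime.GOOS and os.Geteuid() supply the real values at runtime.
	fmt.Println(shouldRunPrivileged(runtime.GOOS, os.Geteuid()))
}
```

On Windows `os.Geteuid()` returns -1, so the explicit OS branch is what makes the check return true there.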

Checklist

  • Tested and/or added tests to validate that the changes work as intended, if applicable.
  • Updated documentation in README.md, if applicable.

@github-actions github-actions bot added the bug Something isn't working label Oct 18, 2025
@h3mmy
Contributor Author

h3mmy commented Oct 18, 2025

I can work on adding unit tests once an approach is finalized. I always feel weird making reviews without tests.

h3mmy added 3 commits October 18, 2025 19:52
Signed-off-by: Zee Aslam <zeet6613@gmail.com>
Signed-off-by: Zee Aslam <zeet6613@gmail.com>
… permission error

Signed-off-by: Zee Aslam <zeet6613@gmail.com>
@h3mmy
Contributor Author

h3mmy commented Oct 19, 2025

I played around with precisely checking for CAP_NET_RAW and CAP_NET_ADMIN, but a consequence of relying on "kernel.org/pub/linux/libs/security/libcap/cap" is that CGO_ENABLED would need to be set to 1 instead of 0. I hesitate to bring in a new dependency that will change the build process that way.

I would expect that running with root-level privileges without the UID set to 0 is not very common. To be thorough, I decided to check for cap_net_raw by simply testing whether opening a raw socket throws a permission error. Let me know if you like or dislike it and I can re-adjust.

@h3mmy
Contributor Author

h3mmy commented Oct 19, 2025

After some testing of the syscall method, it came to light that it won't compile on windows due to the different flags, so I discarded that route and fell back to the initial implementation.

Signed-off-by: Zee Aslam <zeet6613@gmail.com>
@h3mmy
Contributor Author

h3mmy commented Oct 19, 2025

I built my own image for testing purposes. Confirmed it works.

repository: ghcr.io/h3mmy/gatus
tag: testing@sha256:87f8e8ea0073d69dd6eecacecf93c55699f2ee9e2e062e547ce4b10b536743d1

// ShouldRunPingerAsPrivileged will determine whether or not to run pinger in privileged mode.
// It should be set to privileged when running as root, and always on windows. See https://pkg.go.dev/github.com/macrat/go-parallel-pinger#Pinger.SetPrivileged
func ShouldRunPingerAsPrivileged() bool {
	// Set the pinger's privileged mode to false for darwin

I'm no expert in Darwin, is it guaranteed that os.Geteuid() == 0 will always return false on Darwin? Otherwise this should be added as a special case here.

Contributor Author

If someone is running this as root on darwin, then they'd want the Privileged mode as per my understanding.

}

// To actually check for cap_net_raw capabilities, we would need to add "kernel.org/pub/linux/libs/security/libcap/cap" to gatus.
// Or use a syscall and check for permission errors, but this requires platform specific compilation

Would the implementations be architecture-specific or OS-specific?
OS-specific would be fairly easy in this case.

Contributor Author

The low-level libraries needed for creating a raw socket on POSIX systems require C linking. This project currently builds with CGO disabled, and I am reluctant to make changes that would affect the build process, since that would add complexity around building for multiple systems and thus increase the scope.

If you have suggestions that don't require CGO, I'm open to trying them out.

@heathcliff26 heathcliff26 left a comment

Thank you @h3mmy for the PR.

Nit: Squash Commits

Comment on lines +380 to +381
// As a backstop we can simply check the effective user id and run as privileged when running as root
return os.Geteuid() == 0
Owner

I'm ok with giving this a try in latest and seeing what happens, but this may break people whose container is configured to use a custom user id.

Contributor Author

If someone is using a custom user ID with CAP_NET_RAW, the pinger should run as privileged for it to work. I'd be curious whether anyone is running it unprivileged with CAP_NET_RAW and has it working. I'm happy to work on something more thorough, but it will likely involve some sort of libc linking requirement unless I figure out a different approach.

@TwiN
Owner

TwiN commented Nov 4, 2025

Couple comments, but I've tested it on my end and it seems to work fine on my Kubernetes cluster.

h3mmy and others added 3 commits November 3, 2025 22:33
Co-authored-by: TwiN <twin@linux.com>
Match function name

Co-authored-by: TwiN <twin@linux.com>
Remove extra line

Co-authored-by: TwiN <twin@linux.com>
@TwiN TwiN changed the title from "fix(pinger): update logic to determine pinger privileged mode" to "fix(client): update icmp/ping logic to determine pinger privileged mode" Nov 5, 2025
@TwiN TwiN merged commit 5fdc489 into TwiN:master Nov 5, 2025
2 checks passed
@TwiN
Owner

TwiN commented Nov 5, 2025

@h3mmy Thank you for the contribution!

@enorasec

enorasec commented Nov 5, 2025

Can confirm this is working for me as well. ARM deployment on an Oracle Kubernetes Engine Cluster. Thank you so much @h3mmy!

@TwiN
Owner

TwiN commented Nov 5, 2025

Thank you for confirming, I appreciate it!

Having to validate changes alone isn't as scalable as it once used to be 🤣

@roughnecks

Ping is broken for me now. I'm compiling gatus myself and run it with systemd:

systemctl cat gatus.service
# /etc/systemd/system/gatus.service
[Unit]
Description=Gatus daemon
After=network-online.target

[Service]
Type=simple
ExecStart=/home/user/go/bin/gatus
Environment=GATUS_CONFIG_PATH="/home/user/config/gatus/"
Environment=GATUS_LOG_LEVEL="WARN"
CapabilityBoundingSet=CAP_NET_RAW
AmbientCapabilities=CAP_NET_RAW
NoNewPrivileges=yes
Restart=always
RestartSec=20s
User=user
WorkingDirectory=/home/user/

[Install]
WantedBy=multi-user.target

@h3mmy
Contributor Author

h3mmy commented Nov 9, 2025

@roughnecks Would you try running it without the CAP_NET_RAW capability? You would only need to remove it from AmbientCapabilities.

This seems to be exactly the edge case not covered in the PR. I can work on updating it to include this, although it will be tricky without CGO. But first I'd like to confirm that removing that ambient capability resolves this for you.
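
For the CAP_NET_RAW edge case, one CGO-free possibility on Linux is to read the effective capability mask from `/proc/self/status`. The sketch below is an assumption about how that could look, not something this PR implements: `hasCapNetRaw` is a hypothetical helper, and the bit index 13 is the value of CAP_NET_RAW in the Linux capability headers. The file does not exist on other operating systems, so a caller would have to treat a read error as "unknown".

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// capNetRaw is the bit index of CAP_NET_RAW in the Linux capability sets.
const capNetRaw = 13

// hasCapNetRaw (hypothetical) scans the contents of /proc/<pid>/status for
// the CapEff line and reports whether the CAP_NET_RAW bit is set.
func hasCapNetRaw(status string) (bool, error) {
	scanner := bufio.NewScanner(strings.NewReader(status))
	for scanner.Scan() {
		line := scanner.Text()
		if !strings.HasPrefix(line, "CapEff:") {
			continue
		}
		// The value is a hexadecimal bitmask, e.g. "0000000000002000".
		hexMask := strings.TrimSpace(strings.TrimPrefix(line, "CapEff:"))
		mask, err := strconv.ParseUint(hexMask, 16, 64)
		if err != nil {
			return false, fmt.Errorf("parsing CapEff %q: %w", hexMask, err)
		}
		return mask&(1<<capNetRaw) != 0, nil
	}
	return false, errors.New("CapEff line not found")
}

func main() {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Println("cannot read capabilities:", err)
		return
	}
	ok, err := hasCapNetRaw(string(data))
	fmt.Println("CAP_NET_RAW effective:", ok, "error:", err)
}
```

With a unit file like the one reported here (`AmbientCapabilities=CAP_NET_RAW`, `User=user`), the process would be expected to show the bit set even though its EUID is not 0, which is the case the EUID check alone cannot see.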

@roughnecks

Hey, I just commented out the AmbientCapabilities, still no dice.

@h3mmy
Contributor Author

h3mmy commented Nov 9, 2025

@roughnecks To confirm, you commented it out and restarted the systemd service to check, right?

What is your output for sysctl net.ipv4.ping_group_range? Some distros make this range too narrow.

You can allow your user's group to use unprivileged ping with something like:
echo 'net.ipv4.ping_group_range = 0 2147483647' | sudo tee -a /etc/sysctl.conf

The main thing here is that the GID of the Linux user needs to fall within that range, so it must not exceed the right-hand number.
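
To make the range rule concrete, here is a small sketch that parses the two numbers in `net.ipv4.ping_group_range` and checks whether a given group ID falls inside them. `gidInPingGroupRange` is a hypothetical helper, and it only looks at a single GID, whereas the kernel, as far as I know, also takes supplementary groups into account.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// gidInPingGroupRange (hypothetical) parses a value such as "0\t2147483647"
// and reports whether gid lies within the inclusive range it describes.
func gidInPingGroupRange(rangeValue string, gid int) (bool, error) {
	fields := strings.Fields(rangeValue)
	if len(fields) != 2 {
		return false, fmt.Errorf("expected two numbers, got %q", rangeValue)
	}
	low, err := strconv.ParseInt(fields[0], 10, 64)
	if err != nil {
		return false, fmt.Errorf("parsing lower bound: %w", err)
	}
	high, err := strconv.ParseInt(fields[1], 10, 64)
	if err != nil {
		return false, fmt.Errorf("parsing upper bound: %w", err)
	}
	return int64(gid) >= low && int64(gid) <= high, nil
}

func main() {
	// On Linux the current setting is exposed under /proc/sys.
	data, err := os.ReadFile("/proc/sys/net/ipv4/ping_group_range")
	if err != nil {
		fmt.Println("cannot read ping_group_range:", err)
		return
	}
	ok, err := gidInPingGroupRange(string(data), os.Getegid())
	fmt.Println("egid allowed to use unprivileged ping:", ok, "error:", err)
}
```

The kernel default of `1 0` is an empty range, so the helper returns false for every GID, while `0 2147483647` admits any GID up to that maximum.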

The other thing you can try is changing User=user to User=root (this ensures the EUID is 0 and uses privileged mode).

I probably should have added debug logging to check the EUID.

It's super late for me at the moment, but I'll work on this in the morning. I might be able to use the deprecated syscall method to test capabilities so that config changes aren't required.

@roughnecks

yeah, "daemon-reloaded", restarted service.

I changed this: 'net.ipv4.ping_group_range = 0 2147483647' and it's fixed now.

I even had that line already in my sysctl.conf with a comment about gatus, but I believe at the time I thought it was better like I did with the systemd unit.

Oh well. Thanks

@h3mmy
Contributor Author

h3mmy commented Nov 9, 2025

@TwiN I want to understand how you want this prioritized. It seems like this scenario will primarily impact people who needed to add CAP_NET_RAW to get the pinger to work in the first place (technically as a workaround). Should I prioritize trying to find a way to put together a raw packet without CGO in order to make the logic more precise, or focus on updating the documentation with an explanation of the config?

@TwiN
Owner

TwiN commented Nov 9, 2025

@h3mmy If the fix you made impacts people who previously had to do some shenanigans to get it to work, then it's fine by me, there's no need to fix anything. That said, a note in the documentation saying something along the lines of "Prior to v5.31.0, some environments required setting CAP_NET_RAW to [...]. As of v5.31.0, this is no longer necessary. See #1346." would be greatly appreciated.

@h3mmy
Contributor Author

h3mmy commented Nov 9, 2025

Thanks for the clarification! Note added. #1384

alexlebens pushed a commit to alexlebens/infrastructure that referenced this pull request Nov 14, 2025
This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [ghcr.io/twin/gatus](https://github.com/TwiN/gatus) | minor | `v5.30.0` -> `v5.31.0` |

---

### Release Notes

<details>
<summary>TwiN/gatus (ghcr.io/twin/gatus)</summary>

### [`v5.31.0`](https://github.com/TwiN/gatus/releases/tag/v5.31.0)

[Compare Source](TwiN/gatus@v5.30.0...v5.31.0)

Highlights of this release are the ability to mark announcements as "archived", which renders said announcements in a new `Past Announcements` section at the bottom of the status page (only rendered if there is at least 1 archived announcement), support for markdown in announcements, and support for monitoring gRPC health endpoints.

<img width="1166" height="556" alt="image" src="https://github.com/user-attachments/assets/d22a0ea7-c035-4c35-a148-6de097a357b7" />

#### What's Changed
* feat(announcements): Add support for archived announcements and add past announcement section in UI by @TwiN in TwiN/gatus#1382
* feat(announcements): add markdown support by @Sworyz in TwiN/gatus#1371
* feat(client): Add support for monitoring gRPC endpoints by @diamanat in TwiN/gatus#1376
* fix(client): update icmp/ping logic to determine pinger privileged mode by @h3mmy in TwiN/gatus#1346
* fix(api): Escape endpoint key in URL for raw APIs by @Nedra1998 in TwiN/gatus#1381
* docs(readme): adds ECS fargate module in README by @GiamPy5 in TwiN/gatus#1377

#### New Contributors
* @GiamPy5 made their first contribution in TwiN/gatus#1377
* @h3mmy made their first contribution in TwiN/gatus#1346
* @diamanat made their first contribution in TwiN/gatus#1376
* @Nedra1998 made their first contribution in TwiN/gatus#1381
* @Sworyz made their first contribution in TwiN/gatus#1371

**Full Changelog**: <TwiN/gatus@v5.30.0...v5.31.0>

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).

Reviewed-on: https://gitea.alexlebens.dev/alexlebens/infrastructure/pulls/2008
Co-authored-by: Renovate Bot <renovate-bot@alexlebens.net>
Co-committed-by: Renovate Bot <renovate-bot@alexlebens.net>
5 participants