Description
This issue is very similar to #1483. If it seems familiar, please bear with me: I think there is an important difference. In #1483 the storage folder is inaccessible to the user running podman without their secondary groups, but in this issue it should still be accessible because the user in question owns the folder.
Running a container in rootless mode with --userns=keep-id fails if a parent directory of the storage is owned by a group that is not part of the container's user namespace and that directory also has no world execute permission.
For example, if the username is user with primary group user, then a parent directory of the storage location with permissions drwx------, owner user, and owning group userdata causes every container to fail to start.
Steps to reproduce:
- Use the following storage.conf (in ~/.config/containers/):
    [storage]
    driver = "overlay"
    graphroot = "/tmp/podman-test/storage"
    [storage.options]
    mount_program = "/usr/bin/fuse-overlayfs"
- Create a group userdata (it's not required to add user to this group)
- Set up the storage folder:
    mkdir /tmp/podman-test
    chown user:userdata /tmp/podman-test/
    chmod u+rwx /tmp/podman-test/
    chmod og-rwx /tmp/podman-test/
- Run a container with a non-root user:
    podman run --userns=keep-id alpine ls
Expected result:
The container is run successfully.
Actual result:
The following error is raised:
Error: crun: open `/tmp/graphroot/overlay/5fba3a9a250150294dcb692656d165ec6bd26c9c6be2c692183a70e24083b29c/merged`: Permission denied: OCI permission denied
After running chgrp user /tmp/podman-test/ or chmod o+x /tmp/podman-test, the container runs successfully.
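The failure condition described above (a parent directory whose owning group is not mapped and which lacks world execute permission) can be checked with a short script. This is only a sketch: the function name blocking_parents and the mapped_gids parameter are made up for illustration, and the caller has to supply the set of host gids that are mapped into the container's user namespace.

```python
import os
import stat


def blocking_parents(path, mapped_gids):
    """Return `path` and any ancestors that match the failure pattern.

    A directory is reported when its owning group is not in `mapped_gids`
    (hypothetical parameter: host gids mapped into the user namespace)
    and it has no world execute bit.
    """
    hits = []
    current = os.path.abspath(path)
    while True:
        st = os.stat(current)
        if stat.S_ISDIR(st.st_mode):
            unmapped_group = st.st_gid not in mapped_gids
            no_world_exec = not (st.st_mode & stat.S_IXOTH)
            if unmapped_group and no_world_exec:
                hits.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return hits
```

With the reproduction steps above, and once the storage directory exists, a call such as blocking_parents("/tmp/podman-test/storage", mapped_gids={1000}) would be expected to report /tmp/podman-test, assuming userdata has a gid other than 1000. A real check would also add the user's subgid range to mapped_gids.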
Analysis
Based on digging through the code and experimenting with strace, I think the diagram below describes what's going on:
Assuming the following user and group ids, and /etc/subuid:
> id user
uid=1000(user) gid=1000(user) groups=1000(user)
> grep user /etc/subuid
user:100000:65536
Then the simplified view of events is:
sequenceDiagram
participant Podman
participant Crun
participant Kernel
Podman->>Podman: Set up user namespace:<br> root -> user (Table 1)
Podman->>Crun: start container with:<br>user -> root (Table 2)
activate Crun
Crun->>Crun: re-invoke in user namespace
Crun->>Kernel: open(root.path)
deactivate Crun
activate Kernel
Note over Kernel: process uid: 0<br>process gid: 0<br>file perms: drwx------<br>file uid:1000<br>file gid:[invalid]
Kernel--xCrun: return: -1, errno: EACCES
deactivate Kernel
activate Crun
open(root.path) fails because the kernel requires both the uid and the gid of the accessed file to be mapped in the user namespace before root's capabilities take effect (see footnote 1). The normal access checks fail as well: at this point crun's uid is still 0, so the owner permission bits, which belong to uid 1000, don't apply.
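The two checks in the previous paragraph can be written out as a toy model. This is a deliberately simplified sketch, not the kernel's implementation: it ignores ACLs, LSMs and CAP_DAC_READ_SEARCH, the function name may_search is made up, and ids are given as seen from the process's user namespace, with None standing for "no valid mapping".

```python
import stat


def may_search(proc_uid, proc_gids, has_dac_override, file_uid, file_gid, mode):
    """Simplified model: may the process traverse this directory?

    `file_uid` / `file_gid` are the ids as seen from the process's user
    namespace; None means the id has no mapping there.
    """
    # The capability only counts if BOTH file ids are mapped.
    if has_dac_override and file_uid is not None and file_gid is not None:
        return True
    # Ordinary permission bits: owner, then group, then other.
    if file_uid is not None and proc_uid == file_uid:
        return bool(mode & stat.S_IXUSR)
    if file_gid is not None and file_gid in proc_gids:
        return bool(mode & stat.S_IXGRP)
    return bool(mode & stat.S_IXOTH)


# Values from the diagram note: uid 0, gid 0, drwx------, file uid 1000, gid unmapped.
failing = may_search(0, {0}, True, 1000, None, 0o700)
# After `chgrp user`: the gid becomes mapped (assumed here to appear as 1000).
after_chgrp = may_search(0, {0}, True, 1000, 1000, 0o700)
# After `chmod o+x`: the "other" execute bit grants access without any capability.
after_chmod = may_search(0, {0}, True, 1000, None, 0o701)
```

In this model the first call is denied and the other two are allowed, which matches the observed behaviour of the two workarounds. The gid value 1000 after chgrp assumes the gid mappings mirror the uid mappings in Table 2, which the issue does not show.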
Table 1: User Namespace Setup (Podman)
| UID in NS | UID in host |
|---|---|
| [0] | [1000] |
| [1, 65536] | [100000, 165535] |
Table 2: Container Spec linux.uidMappings
| UID in container | UID in outer NS |
|---|---|
| [0, 999] | [1, 1000] |
| [1000] | [0] |
| [1001, 65536] | [1001, 65536] |
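To see how the two tables compose, the sketch below encodes them as (first id inside, first id outside, count) triples, the layout used by /proc/<pid>/uid_map, and translates ids in both directions. The helper names are made up for illustration, and the host id 2000 used as a stand-in for the userdata group is hypothetical. Applying the uid tables to a gid is also an assumption, since the issue only lists uid mappings.

```python
# (first id inside, first id outside, count)
TABLE_1 = [(0, 1000, 1), (1, 100000, 65536)]           # outer NS -> host
TABLE_2 = [(0, 1, 1000), (1000, 0, 1), (1001, 1001, 64536)]  # container -> outer NS


def map_up(mapping, inner_id):
    """Translate an id from inside a namespace to its parent; None if unmapped."""
    for inside, outside, count in mapping:
        if inside <= inner_id < inside + count:
            return outside + (inner_id - inside)
    return None


def map_down(mapping, outer_id):
    """Translate an id from the parent into the namespace; None if unmapped."""
    for inside, outside, count in mapping:
        if outside <= outer_id < outside + count:
            return inside + (outer_id - outside)
    return None


def container_to_host(container_id):
    outer = map_up(TABLE_2, container_id)
    return None if outer is None else map_up(TABLE_1, outer)


def host_to_container(host_id):
    outer = map_down(TABLE_1, host_id)
    return None if outer is None else map_down(TABLE_2, outer)


crun_host_uid = container_to_host(0)       # uid 0 in the container
owner_in_container = host_to_container(1000)  # host user that owns the directory
userdata_in_container = host_to_container(2000)  # hypothetical id for userdata
```

Under these tables, uid 0 in the container corresponds to host id 100000, the host user 1000 shows up as 1000 in the container, and the hypothetical id 2000 has no mapping at all, which is the "[invalid]" case in the diagram.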
Possible fixes
1. I think crun could open the storage root path before entering the user namespace, in the same way as it's done for mount paths. At that point the root user in the namespace is still mapped to the host user that invoked podman.
2. Or crun could open it after setuid.
3. podman could open the graph storage root and pass a file descriptor for it to crun.
Option 3 would be ideal: if the user on the host can access the storage root, then IMO containers should be able to start from it, regardless of which permissions enable that access on the host (file ownership, primary or secondary group, or ACLs). It's probably not simple to implement, however.
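The descriptor-based options rely on the fact that permission checks happen when a directory is opened, not when an already open descriptor is read. The sketch below illustrates only that kernel behaviour; it is not crun or podman code. Removing all permission bits with chmod stands in for "entering the user namespace", and the names list_via_early_fd and demo are made up.

```python
import os
import tempfile


def list_via_early_fd(directory, lose_access):
    """Open `directory` first, then call `lose_access`, then read via the fd."""
    fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
    try:
        lose_access()  # stand-in for the step after which open() would fail
        return sorted(os.listdir(fd))
    finally:
        os.close(fd)


def demo():
    with tempfile.TemporaryDirectory() as root:
        open(os.path.join(root, "marker"), "w").close()
        try:
            # Drop every permission bit once the descriptor is already held.
            return list_via_early_fd(root, lambda: os.chmod(root, 0))
        finally:
            os.chmod(root, 0o700)  # restore access so cleanup can succeed
```

Here demo() still lists the marker file even though a fresh open of the directory would be refused for an unprivileged user at that point. Looking up further paths relative to such a descriptor is a separate question, because each lookup checks search permission on the directory again.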
Footnotes
1. Quoting from man 7 user_namespaces:
Certain capabilities allow a process to bypass various kernel-enforced restrictions when performing operations on files owned by other users or groups. These capabilities are: CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_DAC_READ_SEARCH, CAP_FOWNER, and CAP_FSETID.
Within a user namespace, these capabilities allow a process to bypass the rules if the process has the relevant capability over the file, meaning that:
- the process has the relevant effective capability in its user namespace; and
- the file's user ID and group ID both have valid mappings in the user namespace.
(Emphasis mine)