Skip to content

Commit 9a868cd

Browse files
committed
landlock: Fix handling of disconnected directories
We can get disconnected files or directories when they are visible and opened from a bind mount, before being renamed/moved from the source of the bind mount in a way that makes them inaccessible from the mount point (i.e. out of scope). Until now, access rights tied to files or directories opened through a disconnected directory were collected by walking the related hierarchy down to the root of this filesystem because the mount point couldn't be found. This could lead to inconsistent access results, and hard-to-debug renames, especially because such paths cannot be printed. For a sandboxed task to create a disconnected directory, it needs to have write access (i.e. FS_MAKE_REG, FS_REMOVE_FILE, and FS_REFER) to the underlying source of the bind mount, and read access to the related mount point. Because a sandboxed task cannot get more access than those defined by its Landlock domain, this could only lead to inconsistent access rights because of missing those that should be inherited from the mount point hierarchy and inheriting from the hierarchy of the mounted filesystem instead. Landlock now handles files/directories opened from disconnected directories like the mount point these disconnected directories were opened from. This gives the guarantee that access rights on a file/directory cannot be more than those at open time. The rationale is that disconnected hierarchies might not be visible nor accessible to a sandboxed task, and relying on the collected access rights from them could introduce unexpected results, especially for rename actions because of the access right comparison between the source and the destination (see LANDLOCK_ACCESS_FS_REFER). This new behavior is much less surprising to users and safer from an access point of view. Unlike follow_dotdot(), we don't need to check for each directory if it is part of the mount's root, but instead this is only checked when we reached a root dentry (not a mount point), or when the access request is about to be allowed. This limits the number of calls to is_subdir() which walks down the hierarchy (again). This also avoids checking path connection at the beginning of the walk for each mount point, which would be racy. Remove a wrong WARN_ON_ONCE() canary in collect_domain_accesses() and fix comment. This change increases the stack size with two Landlock layer masks backups and a boolean, that are needed to reset the collected and requested access rights from the latest mount point. Because opened files have their access rights stored in the related file security properties, their is no impact for disconnected or unlinked files. Make path_connected() public to stay consistent with the VFS. This helper is used when we are about to allowed an access. Cc: Günther Noack <[email protected]> Cc: Song Liu <[email protected]> Acked-by: Christian Brauner <[email protected]> Reported-by: Tingmao Wang <[email protected]> Closes: https://lore.kernel.org/r/[email protected] Closes: https://lore.kernel.org/r/09b24128f86973a6022e6aa8338945fcfb9a33e4.1749925391.git.m@maowtm.org Fixes: b91c3e4 ("landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER") Fixes: cb2c7d1 ("landlock: Support filesystem access-control") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mickaël Salaün <[email protected]>
1 parent 6803b6e commit 9a868cd

File tree

4 files changed

+176
-34
lines changed

4 files changed

+176
-34
lines changed

fs/namei.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -716,7 +716,7 @@ static bool nd_alloc_stack(struct nameidata *nd)
716716
* Rename can sometimes move a file or directory outside of a bind
717717
* mount, path_connected allows those cases to be detected.
718718
*/
719-
static bool path_connected(struct vfsmount *mnt, struct dentry *dentry)
719+
bool path_connected(struct vfsmount *mnt, struct dentry *dentry)
720720
{
721721
struct super_block *sb = mnt->mnt_sb;
722722

include/linux/fs.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3252,6 +3252,7 @@ extern struct file * open_exec(const char *);
32523252
/* fs/dcache.c -- generic fs support functions */
32533253
extern bool is_subdir(struct dentry *, struct dentry *);
32543254
extern bool path_is_under(const struct path *, const struct path *);
3255+
bool path_connected(struct vfsmount *mnt, struct dentry *dentry);
32553256

32563257
extern char *file_path(struct file *, char *, int);
32573258

security/landlock/errata/abi-1.h

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
1+
/* SPDX-License-Identifier: GPL-2.0-only */
2+
3+
/**
4+
* DOC: erratum_3
5+
*
6+
* Erratum 3: Disconnected directory handling
7+
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
8+
*
9+
* This fix addresses an issue with disconnected directories that occur when a
10+
* directory is moved outside the scope of a bind mount. The change ensures
11+
* that evaluated access rights exclude those inherited from disconnected file
12+
* hierarchies (no longer accessible from the related mount point), and instead
13+
* only consider rights tied to directories that remain visible. This prevents
14+
* access inconsistencies caused by missing access rights.
15+
*/
16+
LANDLOCK_ERRATUM(3)

security/landlock/fs.c

Lines changed: 158 additions & 33 deletions
Original file line numberDiff line numberDiff line change
@@ -764,11 +764,14 @@ static bool is_access_to_paths_allowed(
764764
struct dentry *const dentry_child2)
765765
{
766766
bool allowed_parent1 = false, allowed_parent2 = false, is_dom_check,
767-
child1_is_directory = true, child2_is_directory = true;
767+
is_dom_check_bkp, child1_is_directory = true,
768+
child2_is_directory = true;
768769
struct path walker_path;
769770
access_mask_t access_masked_parent1, access_masked_parent2;
770771
layer_mask_t _layer_masks_child1[LANDLOCK_NUM_ACCESS_FS],
771-
_layer_masks_child2[LANDLOCK_NUM_ACCESS_FS];
772+
_layer_masks_child2[LANDLOCK_NUM_ACCESS_FS],
773+
_layer_masks_parent1_bkp[LANDLOCK_NUM_ACCESS_FS],
774+
_layer_masks_parent2_bkp[LANDLOCK_NUM_ACCESS_FS];
772775
layer_mask_t(*layer_masks_child1)[LANDLOCK_NUM_ACCESS_FS] = NULL,
773776
(*layer_masks_child2)[LANDLOCK_NUM_ACCESS_FS] = NULL;
774777

@@ -784,12 +787,18 @@ static bool is_access_to_paths_allowed(
784787
if (WARN_ON_ONCE(!layer_masks_parent1))
785788
return false;
786789

787-
allowed_parent1 = is_layer_masks_allowed(layer_masks_parent1);
788-
789790
if (unlikely(layer_masks_parent2)) {
790791
if (WARN_ON_ONCE(!dentry_child1))
791792
return false;
792793

794+
/*
795+
* Creates a backup of the initial layer masks to be able to restore
796+
* them if we find out that we were walking a disconnected directory,
797+
* which would make the collected access rights inconsistent (cf.
798+
* reset_to_mount_root).
799+
*/
800+
memcpy(&_layer_masks_parent2_bkp, layer_masks_parent2,
801+
sizeof(_layer_masks_parent2_bkp));
793802
allowed_parent2 = is_layer_masks_allowed(layer_masks_parent2);
794803

795804
/*
@@ -809,6 +818,16 @@ static bool is_access_to_paths_allowed(
809818
is_dom_check = false;
810819
}
811820

821+
/*
822+
* Creates a backup of the initial layer masks to be able to restore them if
823+
* we find out that we were walking a disconnected directory, which would
824+
* make the collected access rights inconsistent (cf. reset_to_mount_root).
825+
*/
826+
memcpy(&_layer_masks_parent1_bkp, layer_masks_parent1,
827+
sizeof(_layer_masks_parent1_bkp));
828+
allowed_parent1 = is_layer_masks_allowed(layer_masks_parent1);
829+
is_dom_check_bkp = is_dom_check;
830+
812831
if (unlikely(dentry_child1)) {
813832
landlock_unmask_layers(
814833
find_rule(domain, dentry_child1),
@@ -874,13 +893,13 @@ static bool is_access_to_paths_allowed(
874893
allowed_parent2 ||
875894
scope_to_request(access_masked_parent2,
876895
layer_masks_parent2);
877-
878-
/* Stops when all accesses are granted. */
879-
if (allowed_parent1 && allowed_parent2)
880-
break;
881896
}
882897

883-
rule = find_rule(domain, walker_path.dentry);
898+
/* Looks for a rule when needed. */
899+
rule = unlikely(allowed_parent1 && allowed_parent2) ?
900+
NULL :
901+
find_rule(domain, walker_path.dentry);
902+
884903
allowed_parent1 = allowed_parent1 ||
885904
landlock_unmask_layers(
886905
rule, access_masked_parent1,
@@ -893,12 +912,47 @@ static bool is_access_to_paths_allowed(
893912
ARRAY_SIZE(*layer_masks_parent2));
894913

895914
/* Stops when a rule from each layer grants access. */
896-
if (allowed_parent1 && allowed_parent2)
915+
if (allowed_parent1 && allowed_parent2) {
916+
/*
917+
* Before allowing the access request, checks that the walk was not
918+
* in a disconnected directory.
919+
*/
920+
if (unlikely(!path_connected(walker_path.mnt,
921+
walker_path.dentry)))
922+
goto reset_to_mount_root;
923+
897924
break;
925+
}
898926

899927
jump_up:
900928
if (walker_path.dentry == walker_path.mnt->mnt_root) {
929+
/*
930+
* We reached a mount point which is not a disconnected directory.
931+
* We can now safely assume that the collected access rights are
932+
* valid, and we can save them to be able to get back to this state
933+
* later on. If we reached the real root, the walk will end, and we
934+
* will not need to restore anything, so we only need to save the
935+
* collected access rights if we are not at the real root.
936+
*/
901937
if (follow_up(&walker_path)) {
938+
/*
939+
* The mount point before this follow_up() call was connected.
940+
* We will know during the ongoing walk if the path from this
941+
* new mount point (i.e. walker_path) is disconnected. If it is
942+
* the case, we will restore the collected access rights from
943+
* here and jump to walker_path.mnt->mnt_root, short-circuiting
944+
* the disconnected hierarchy (cf. reset_to_mount_root).
945+
*/
946+
memcpy(&_layer_masks_parent1_bkp,
947+
layer_masks_parent1,
948+
sizeof(_layer_masks_parent1_bkp));
949+
if (layer_masks_parent2) {
950+
memcpy(&_layer_masks_parent2_bkp,
951+
layer_masks_parent2,
952+
sizeof(_layer_masks_parent2_bkp));
953+
is_dom_check_bkp = is_dom_check;
954+
}
955+
902956
/* Ignores hidden mount points. */
903957
goto jump_up;
904958
} else {
@@ -910,20 +964,69 @@ static bool is_access_to_paths_allowed(
910964
}
911965
}
912966
if (unlikely(IS_ROOT(walker_path.dentry))) {
913-
/*
914-
* Stops at disconnected root directories. Only allows
915-
* access to internal filesystems (e.g. nsfs, which is
916-
* reachable through /proc/<pid>/ns/<namespace>).
917-
*/
918-
if (walker_path.mnt->mnt_flags & MNT_INTERNAL) {
967+
if (likely(walker_path.mnt->mnt_flags & MNT_INTERNAL)) {
968+
/*
969+
* Stops and allows access when reaching disconnected root
970+
* directories that are part of internal filesystems (e.g. nsfs,
971+
* which is reachable through /proc/<pid>/ns/<namespace>).
972+
*/
919973
allowed_parent1 = true;
920974
allowed_parent2 = true;
975+
break;
921976
}
922-
break;
977+
978+
/*
979+
* We reached a disconnected root directory from a bind mount, and
980+
* we need to reset the walk to the current mount root.
981+
*/
982+
goto reset_to_mount_root;
923983
}
924984
parent_dentry = dget_parent(walker_path.dentry);
925985
dput(walker_path.dentry);
926986
walker_path.dentry = parent_dentry;
987+
continue;
988+
989+
reset_to_mount_root:
990+
/*
991+
* At this point, we know that the walk was in a disconnected file
992+
* hierarchy and we need to restore the layer masks from the last known
993+
* good values. These were either built from the domain (at the
994+
* beginning of the walk) or from the collected access rights up to the
995+
* previous connected mount point. This ensures we don't use
996+
* potentially invalid access rights from the disconnected path
997+
* traversal.
998+
*/
999+
memcpy(layer_masks_parent1, &_layer_masks_parent1_bkp,
1000+
sizeof(_layer_masks_parent1_bkp));
1001+
allowed_parent1 =
1002+
is_layer_masks_allowed(&_layer_masks_parent1_bkp);
1003+
if (layer_masks_parent2) {
1004+
memcpy(layer_masks_parent2, &_layer_masks_parent2_bkp,
1005+
sizeof(_layer_masks_parent2_bkp));
1006+
allowed_parent2 = is_layer_masks_allowed(
1007+
&_layer_masks_parent2_bkp);
1008+
1009+
/*
1010+
* Restores domain check mode if needed: increases back the scope of
1011+
* the access checks to the domain's handled accesses, which are a
1012+
* superset of the requested ones.
1013+
*/
1014+
if (is_dom_check_bkp) {
1015+
access_masked_parent1 = access_masked_parent2 =
1016+
landlock_union_access_masks(domain).fs;
1017+
is_dom_check = true;
1018+
}
1019+
}
1020+
1021+
/*
1022+
* Jumps to the root of the current mount point to short-circuit the
1023+
* disconnected walk with a reachable directory. This ensures we
1024+
* continue the access check from a known good location in the
1025+
* filesystem hierarchy.
1026+
*/
1027+
dput(walker_path.dentry);
1028+
walker_path.dentry = walker_path.mnt->mnt_root;
1029+
dget(walker_path.dentry);
9271030
}
9281031
path_put(&walker_path);
9291032

@@ -1011,15 +1114,18 @@ static access_mask_t maybe_remove(const struct dentry *const dentry)
10111114
* collect_domain_accesses - Walk through a file path and collect accesses
10121115
*
10131116
* @domain: Domain to check against.
1014-
* @mnt_root: Last directory to check.
1117+
* @mnt_dir: Mount point directory to stop the walk at.
10151118
* @dir: Directory to start the walk from.
10161119
* @layer_masks_dom: Where to store the collected accesses.
10171120
*
1018-
* This helper is useful to begin a path walk from the @dir directory to a
1019-
* @mnt_root directory used as a mount point. This mount point is the common
1020-
* ancestor between the source and the destination of a renamed and linked
1021-
* file. While walking from @dir to @mnt_root, we record all the domain's
1022-
* allowed accesses in @layer_masks_dom.
1121+
* This helper is useful to begin a path walk from the @dir directory to
1122+
* @mnt_dir. This mount point is the common ancestor between the source and the
1123+
* destination of a renamed and linked file. While walking from @dir to
1124+
* @mnt_dir, we record all the domain's allowed accesses in @layer_masks_dom.
1125+
*
1126+
* Because of disconnected directories, this walk may not reach @mnt_dir. In
1127+
* this case, the walk is canceled and the collected accesses are reset to the
1128+
* domain handled ones.
10231129
*
10241130
* This is similar to is_access_to_paths_allowed() but much simpler because it
10251131
* only handles walking on the same mount point and only checks one set of
@@ -1031,13 +1137,13 @@ static access_mask_t maybe_remove(const struct dentry *const dentry)
10311137
*/
10321138
static bool collect_domain_accesses(
10331139
const struct landlock_ruleset *const domain,
1034-
const struct dentry *const mnt_root, struct dentry *dir,
1140+
const struct path *const mnt_dir, struct dentry *dir,
10351141
layer_mask_t (*const layer_masks_dom)[LANDLOCK_NUM_ACCESS_FS])
10361142
{
1037-
unsigned long access_dom;
1143+
access_mask_t access_dom;
10381144
bool ret = false;
10391145

1040-
if (WARN_ON_ONCE(!domain || !mnt_root || !dir || !layer_masks_dom))
1146+
if (WARN_ON_ONCE(!domain || !mnt_dir || !dir || !layer_masks_dom))
10411147
return true;
10421148
if (is_nouser_or_private(dir))
10431149
return true;
@@ -1054,6 +1160,13 @@ static bool collect_domain_accesses(
10541160
if (landlock_unmask_layers(find_rule(domain, dir), access_dom,
10551161
layer_masks_dom,
10561162
ARRAY_SIZE(*layer_masks_dom))) {
1163+
/*
1164+
* Before allowing this side of the access request, checks that the
1165+
* walk was not in a disconnected directory.
1166+
*/
1167+
if (unlikely(!path_connected(mnt_dir->mnt, dir)))
1168+
goto cancel_walk;
1169+
10571170
/*
10581171
* Stops when all handled accesses are allowed by at
10591172
* least one rule in each layer.
@@ -1062,13 +1175,27 @@ static bool collect_domain_accesses(
10621175
break;
10631176
}
10641177

1065-
/* We should not reach a root other than @mnt_root. */
1066-
if (dir == mnt_root || WARN_ON_ONCE(IS_ROOT(dir)))
1178+
/* Stops at the mount point. */
1179+
if (dir == mnt_dir->dentry)
10671180
break;
10681181

1182+
/* Ignores the walk if we end up in a disconnected root directory. */
1183+
if (unlikely(IS_ROOT(dir)))
1184+
goto cancel_walk;
1185+
10691186
parent_dentry = dget_parent(dir);
10701187
dput(dir);
10711188
dir = parent_dentry;
1189+
continue;
1190+
1191+
cancel_walk:
1192+
/*
1193+
* Resets the inconsistent collected access rights to the domain's
1194+
* handled access rights since we encountered a disconnected directory.
1195+
*/
1196+
landlock_init_layer_masks(domain, LANDLOCK_MASK_ACCESS_FS,
1197+
layer_masks_dom, LANDLOCK_KEY_INODE);
1198+
break;
10721199
}
10731200
dput(dir);
10741201
return ret;
@@ -1199,13 +1326,11 @@ static int current_check_refer_path(struct dentry *const old_dentry,
11991326
old_dentry->d_parent;
12001327

12011328
/* new_dir->dentry is equal to new_dentry->d_parent */
1202-
allow_parent1 = collect_domain_accesses(subject->domain, mnt_dir.dentry,
1203-
old_parent,
1204-
&layer_masks_parent1);
1205-
allow_parent2 = collect_domain_accesses(subject->domain, mnt_dir.dentry,
1329+
allow_parent1 = collect_domain_accesses(
1330+
subject->domain, &mnt_dir, old_parent, &layer_masks_parent1);
1331+
allow_parent2 = collect_domain_accesses(subject->domain, &mnt_dir,
12061332
new_dir->dentry,
12071333
&layer_masks_parent2);
1208-
12091334
if (allow_parent1 && allow_parent2)
12101335
return 0;
12111336

0 commit comments

Comments
 (0)