History log of /linux/tools/testing/selftests/filesystems/empty_mntns/empty_mntns.h (Results 1 – 4 of 4)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 7c8a4671 15-Apr-2026 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'vfs-7.1-rc1.mount.v2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs mount updates from Christian Brauner:

- Add FSMOUNT_NAMESPACE flag to fsmount() that creates a ne

Merge tag 'vfs-7.1-rc1.mount.v2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs mount updates from Christian Brauner:

- Add FSMOUNT_NAMESPACE flag to fsmount() that creates a new mount
namespace with the newly created filesystem attached to a copy of the
real rootfs. This returns a namespace file descriptor instead of an
O_PATH mount fd, similar to how OPEN_TREE_NAMESPACE works for
open_tree().

This allows creating a new filesystem and immediately placing it in a
new mount namespace in a single operation, which is useful for
container runtimes and other namespace-based isolation mechanisms.

This accompanies OPEN_TREE_NAMESPACE and avoids a needless detour via
OPEN_TREE_NAMESPACE to get the same effect. Will be especially useful
when you mount an actual filesystem to be used as the container
rootfs.

- Currently, creating a new mount namespace always copies the entire
mount tree from the caller's namespace. For containers and sandboxes
that intend to build their mount table from scratch this is wasteful:
they inherit a potentially large mount tree only to immediately tear
it down.

This series adds support for creating a mount namespace that contains
only a clone of the root mount, with none of the child mounts. Two
new flags are introduced:

- CLONE_EMPTY_MNTNS (0x400000000) for clone3(), using the 64-bit flag space
- UNSHARE_EMPTY_MNTNS (0x00100000) for unshare()

Both flags imply CLONE_NEWNS. The resulting namespace contains a
single nullfs root mount with an immutable empty directory. The
intended workflow is to then mount a real filesystem (e.g., tmpfs)
over the root and build the mount table from there.

- Allow MOVE_MOUNT_BENEATH to target the caller's rootfs, allowing to
switch out the rootfs without pivot_root(2).

The traditional approach to switching the rootfs involves
pivot_root(2) or a chroot_fs_refs()-based mechanism that atomically
updates fs->root for all tasks sharing the same fs_struct. This has
consequences for fork(), unshare(CLONE_FS), and setns().

This series instead decomposes root-switching into individually
atomic, locally-scoped steps:

fd_tree = open_tree(-EBADF, "/newroot", OPEN_TREE_CLONE | OPEN_TREE_CLOEXEC);
fchdir(fd_tree);
move_mount(fd_tree, "", AT_FDCWD, "/", MOVE_MOUNT_BENEATH | MOVE_MOUNT_F_EMPTY_PATH);
chroot(".");
umount2(".", MNT_DETACH);

Since each step only modifies the caller's own state, the
fork/unshare/setns races are eliminated by design.

A key step to making this possible is to remove the locked mount
restriction. Originally MOVE_MOUNT_BENEATH doesn't support mounting
beneath a mount that is locked. The locked mount protects the
underlying mount from being revealed. This is a core mechanism of
unshare(CLONE_NEWUSER | CLONE_NEWNS). The mounts in the new mount
namespace become locked. That effectively makes the new mount table
useless as the caller cannot ever get rid of any of the mounts no
matter how useless they are.

We can lift this restriction though. We simply transfer the locked
property from the top mount to the mount beneath. This works because
what we care about is to protect the underlying mount aka the parent.
The mount mounted between the parent and the top mount takes over the
job of protecting the parent mount from the top mount mount. This
leaves us free to remove the locked property from the top mount which
can consequently be unmounted:

unshare(CLONE_NEWUSER | CLONE_NEWNS)

and we inherit a clone of procfs on /proc then currently we cannot
unmount it as:

umount -l /proc

will fail with EINVAL because the procfs mount is locked.

After this series we can now do:

mount --beneath -t tmpfs tmpfs /proc
umount -l /proc

after which a tmpfs mount has been placed beneath the procfs mount.
The tmpfs mount has become locked and the procfs mount has become
unlocked.

This means you can safely modify an inherited mount table after
unprivileged namespace creation.

Afterwards we simply make it possible to move a mount beneath the
rootfs allowing to upgrade the rootfs.

Removing the locked restriction makes this very useful for containers
created with unshare(CLONE_NEWUSER | CLONE_NEWNS) to reshuffle an
inherited mount table safely and MOVE_MOUNT_BENEATH makes it possible
to switch out the rootfs instead of using the costly pivot_root(2).

* tag 'vfs-7.1-rc1.mount.v2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
selftests/namespaces: remove unused utils.h include from listns_efault_test
selftests/fsmount_ns: add missing TARGETS and fix cap test
selftests/empty_mntns: fix wrong CLONE_EMPTY_MNTNS hex value in comment
selftests/empty_mntns: fix statmount_alloc() signature mismatch
selftests/statmount: remove duplicate wait_for_pid()
mount: always duplicate mount
selftests/filesystems: add MOVE_MOUNT_BENEATH rootfs tests
move_mount: allow MOVE_MOUNT_BENEATH on the rootfs
move_mount: transfer MNT_LOCKED
selftests/filesystems: add clone3 tests for empty mount namespaces
selftests/filesystems: add tests for empty mount namespaces
namespace: allow creating empty mount namespaces
selftests: add FSMOUNT_NAMESPACE tests
selftests/statmount: add statmount_alloc() helper
tools: update mount.h header
mount: add FSMOUNT_NAMESPACE
mount: simplify __do_loopback()
mount: start iterating from start of rbtree

show more ...


Revision tags: v7.0, v7.0-rc7, v7.0-rc6
# 1a398a23 23-Mar-2026 Christian Brauner <brauner@kernel.org>

selftests/empty_mntns: fix statmount_alloc() signature mismatch

empty_mntns.h includes ../statmount/statmount.h which provides a
4-argument statmount_alloc(mnt_id, mnt_ns_id, mask, flags), but then

selftests/empty_mntns: fix statmount_alloc() signature mismatch

empty_mntns.h includes ../statmount/statmount.h which provides a
4-argument statmount_alloc(mnt_id, mnt_ns_id, mask, flags), but then
redefines its own 3-argument version without the flags parameter. This
causes a build failure due to conflicting types.

Remove the duplicate definition from empty_mntns.h and update all
callers to pass 0 for the flags argument.

Fixes: 32f54f2bbccf ("selftests/filesystems: add tests for empty mount namespaces")
Signed-off-by: Christian Brauner <brauner@kernel.org>

show more ...


Revision tags: v7.0-rc5, v7.0-rc4
# 4e9f7592 11-Mar-2026 Christian Brauner <brauner@kernel.org>

Merge patch series "namespace: allow creating empty mount namespaces"

Christian Brauner <brauner@kernel.org> says:

Currently, creating a new mount namespace always copies the entire mount
tree from

Merge patch series "namespace: allow creating empty mount namespaces"

Christian Brauner <brauner@kernel.org> says:

Currently, creating a new mount namespace always copies the entire mount
tree from the caller's namespace. For containers and sandboxes that
intend to build their mount table from scratch this is wasteful: they
inherit a potentially large mount tree only to immediately tear it down.

This series adds support for creating a mount namespace that contains
only a clone of the root mount, with none of the child mounts. Two new
flags are introduced:

- CLONE_EMPTY_MNTNS (0x400000000) for clone3(), using the 64-bit flag
space.
- UNSHARE_EMPTY_MNTNS (0x00100000) for unshare(), reusing the
CLONE_PARENT_SETTID bit which has no meaning for unshare.

Both flags imply CLONE_NEWNS. The resulting namespace contains a single
nullfs root mount with an immutable empty directory. The intended
workflow is to then mount a real filesystem (e.g., tmpfs) over the root
and build the mount table from there.

* patches from https://patch.msgid.link/20260306-work-empty-mntns-consolidated-v1-0-6eb30529bbb0@kernel.org:
selftests/filesystems: add clone3 tests for empty mount namespaces
selftests/filesystems: add tests for empty mount namespaces
namespace: allow creating empty mount namespaces

Link: https://patch.msgid.link/20260306-work-empty-mntns-consolidated-v1-0-6eb30529bbb0@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

show more ...


Revision tags: v7.0-rc3
# 32f54f2b 06-Mar-2026 Christian Brauner <brauner@kernel.org>

selftests/filesystems: add tests for empty mount namespaces

Add a test suite for the UNSHARE_EMPTY_MNTNS and CLONE_EMPTY_MNTNS
flags exercising the empty mount namespace functionality through the
ks

selftests/filesystems: add tests for empty mount namespaces

Add a test suite for the UNSHARE_EMPTY_MNTNS and CLONE_EMPTY_MNTNS
flags exercising the empty mount namespace functionality through the
kselftest harness.

The tests cover:

- basic functionality: unshare succeeds, exactly one mount exists in
the new namespace, root and cwd point to the same mount
- flag interactions: UNSHARE_EMPTY_MNTNS works standalone without
explicit CLONE_NEWNS, combines correctly with CLONE_NEWUSER and
other namespace flags (CLONE_NEWUTS, CLONE_NEWIPC)
- edge cases: EPERM without capabilities, works from a user namespace,
many source mounts still result in one mount, cwd on a different
mount gets reset to root
- error paths: invalid flags return EINVAL
- regression: plain CLONE_NEWNS still copies the full mount tree,
other namespace unshares are unaffected
- mount properties: the root mount has the expected statmount
properties, is its own parent, and is the only entry returned by
listmount
- repeated unshare: consecutive UNSHARE_EMPTY_MNTNS calls each
produce a new namespace with a distinct mount ID
- overmount workflow: verifies the intended usage pattern of creating
an empty mount namespace with a nullfs root and then mounting tmpfs
over it to build a writable filesystem from scratch

Link: https://patch.msgid.link/20260306-work-empty-mntns-consolidated-v1-2-6eb30529bbb0@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>

show more ...