.. SPDX-License-Identifier: GPL-2.0
.. Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
.. Copyright © 2019-2020 ANSSI
.. Copyright © 2021-2022 Microsoft Corporation

=====================================
Landlock: unprivileged access control
=====================================

:Author: Mickaël Salaün
:Date: October 2023

The goal of Landlock is to restrict ambient rights (e.g. global filesystem or
network access) for a set of processes.  Because Landlock is a stackable LSM,
it makes it possible to create safe security sandboxes as new security layers
in addition to the existing system-wide access-controls.  This kind of sandbox
is expected to help mitigate the security impact of bugs or
unexpected/malicious behaviors in user space applications.  Landlock empowers
any process, including unprivileged ones, to securely restrict themselves.

We can quickly make sure that Landlock is enabled in the running system by
looking for "landlock: Up and running" in kernel logs (as root): ``dmesg | grep
landlock || journalctl -kg landlock``.  Developers can also easily check for
Landlock support with a :ref:`related system call <landlock_abi_versions>`.  If
Landlock is not currently supported, we need to :ref:`configure the kernel
appropriately <kernel_support>`.

Landlock rules
==============

A Landlock rule describes an action on an object which the process intends to
perform.  A set of rules is aggregated in a ruleset, which can then restrict
the thread enforcing it, and its future children.

The two existing types of rules are:

Filesystem rules
    For these rules, the object is a file hierarchy,
    and the related filesystem actions are defined with
    `filesystem access rights`.

Network rules (since ABI v4)
    For these rules, the object is a TCP port,
    and the related actions are defined with `network access rights`.

Defining and enforcing a security policy
----------------------------------------

We first need to define the ruleset that will contain our rules.

For this example, the ruleset will contain rules that only allow filesystem
read actions and the establishment of a specific TCP connection.  Filesystem
write actions and other TCP actions will be denied.

The ruleset then needs to handle both these kinds of actions.  This is
required for backward and forward compatibility (i.e. the kernel and user
space may not know each other's supported restrictions), hence the need
to be explicit about the denied-by-default access rights.

.. code-block:: c

    struct landlock_ruleset_attr ruleset_attr = {
        .handled_access_fs =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_WRITE_FILE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_FILE |
            LANDLOCK_ACCESS_FS_MAKE_CHAR |
            LANDLOCK_ACCESS_FS_MAKE_DIR |
            LANDLOCK_ACCESS_FS_MAKE_REG |
            LANDLOCK_ACCESS_FS_MAKE_SOCK |
            LANDLOCK_ACCESS_FS_MAKE_FIFO |
            LANDLOCK_ACCESS_FS_MAKE_BLOCK |
            LANDLOCK_ACCESS_FS_MAKE_SYM |
            LANDLOCK_ACCESS_FS_REFER |
            LANDLOCK_ACCESS_FS_TRUNCATE,
        .handled_access_net =
            LANDLOCK_ACCESS_NET_BIND_TCP |
            LANDLOCK_ACCESS_NET_CONNECT_TCP,
    };

Because we may not know on which kernel version an application will be
executed, it is safer to follow a best-effort security approach.  Indeed, we
should try to protect users as much as possible whatever kernel they are
using.  To avoid binary enforcement (i.e. either all security features or
none), we can leverage a dedicated Landlock command to get the current version
of the Landlock ABI and adapt the handled accesses.  Let's check if we should
remove access rights which are only supported in higher versions of the ABI.

.. code-block:: c

    int abi;

    abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
    if (abi < 0) {
        /* Degrades gracefully if Landlock is not handled. */
        perror("The running kernel does not support Landlock");
        return 0;
    }
    switch (abi) {
    case 1:
        /* Removes LANDLOCK_ACCESS_FS_REFER for ABI < 2 */
        ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_REFER;
        __attribute__((fallthrough));
    case 2:
        /* Removes LANDLOCK_ACCESS_FS_TRUNCATE for ABI < 3 */
        ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_TRUNCATE;
        __attribute__((fallthrough));
    case 3:
        /* Removes network support for ABI < 4 */
        ruleset_attr.handled_access_net &=
            ~(LANDLOCK_ACCESS_NET_BIND_TCP |
              LANDLOCK_ACCESS_NET_CONNECT_TCP);
    }

This enables the creation of an inclusive ruleset that will contain our rules.

.. code-block:: c

    int ruleset_fd;

    ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
    if (ruleset_fd < 0) {
        perror("Failed to create a ruleset");
        return 1;
    }

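The snippets in this section call ``landlock_create_ruleset()``,
``landlock_add_rule()`` and ``landlock_restrict_self()`` directly.  The C
library may not provide these wrappers, in which case they can be defined on
top of :manpage:`syscall(2)`.  The following is only a minimal sketch, modeled
on what `samples/landlock/sandboxer.c`_ does, and it assumes that the installed
kernel headers are recent enough to define the ``__NR_landlock_*`` numbers:

.. code-block:: c

    #define _GNU_SOURCE
    #include <linux/landlock.h>
    #include <stddef.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Each wrapper is skipped if the C library already provides it as a macro. */
    #ifndef landlock_create_ruleset
    static inline int
    landlock_create_ruleset(const struct landlock_ruleset_attr *const attr,
                            const size_t size, const __u32 flags)
    {
        return syscall(__NR_landlock_create_ruleset, attr, size, flags);
    }
    #endif

    #ifndef landlock_add_rule
    static inline int
    landlock_add_rule(const int ruleset_fd,
                      const enum landlock_rule_type rule_type,
                      const void *const rule_attr, const __u32 flags)
    {
        return syscall(__NR_landlock_add_rule, ruleset_fd, rule_type,
                       rule_attr, flags);
    }
    #endif

    #ifndef landlock_restrict_self
    static inline int
    landlock_restrict_self(const int ruleset_fd, const __u32 flags)
    {
        return syscall(__NR_landlock_restrict_self, ruleset_fd, flags);
    }
    #endif

With these definitions, each wrapper returns -1 and sets ``errno`` on failure,
which is what the error handling in the surrounding snippets relies on.
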
We can now add a new rule to this ruleset thanks to the returned file
descriptor referring to this ruleset.  The rule will only allow reading the
file hierarchy ``/usr``.  Without another rule, write actions would then be
denied by the ruleset.  To add ``/usr`` to the ruleset, we open it with the
``O_PATH`` flag and fill the &struct landlock_path_beneath_attr with this file
descriptor.

.. code-block:: c

    int err;
    struct landlock_path_beneath_attr path_beneath = {
        .allowed_access =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR,
    };

    path_beneath.parent_fd = open("/usr", O_PATH | O_CLOEXEC);
    if (path_beneath.parent_fd < 0) {
        perror("Failed to open file");
        close(ruleset_fd);
        return 1;
    }
    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
                            &path_beneath, 0);
    close(path_beneath.parent_fd);
    if (err) {
        perror("Failed to update ruleset");
        close(ruleset_fd);
        return 1;
    }

It may also be required to create rules following the same logic as explained
for the ruleset creation, by filtering access rights according to the Landlock
ABI version.  In this example, this is not required because all of the requested
``allowed_access`` rights are already available in ABI 1.

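When a rule does need rights introduced after ABI 1, the same filtering can be
applied to its ``allowed_access`` before calling ``landlock_add_rule()``, which
otherwise rejects a rule whose accesses are not a subset of the ruleset's
handled accesses.  The following is a minimal sketch: the ``EX_FS_*`` macros
and ``ex_filter_fs_rights()`` are hypothetical names used for illustration, and
real code should use the ``LANDLOCK_ACCESS_FS_*`` definitions from
``<linux/landlock.h>`` instead of these stand-ins.

.. code-block:: c

    #include <stdint.h>

    /*
     * Illustrative stand-ins for some LANDLOCK_ACCESS_FS_* bits; take the
     * real definitions from <linux/landlock.h> in actual code.
     */
    #define EX_FS_EXECUTE    (1ULL << 0)
    #define EX_FS_WRITE_FILE (1ULL << 1)
    #define EX_FS_READ_FILE  (1ULL << 2)
    #define EX_FS_REFER      (1ULL << 13)
    #define EX_FS_TRUNCATE   (1ULL << 14)

    /* Drops from @wanted the rights that the given ABI version does not know. */
    static uint64_t ex_filter_fs_rights(int abi, uint64_t wanted)
    {
        if (abi < 2)
            wanted &= ~EX_FS_REFER;     /* Only handled since ABI 2. */
        if (abi < 3)
            wanted &= ~EX_FS_TRUNCATE;  /* Only handled since ABI 3. */
        return wanted;
    }

A simpler alternative, under the same assumptions, is to mask the rule with the
already filtered ruleset attribute, e.g.
``path_beneath.allowed_access &= ruleset_attr.handled_access_fs``.
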
For network access-control, we can add a set of rules that allow using a port
number for a specific action: HTTPS connections.

.. code-block:: c

    struct landlock_net_port_attr net_port = {
        .allowed_access = LANDLOCK_ACCESS_NET_CONNECT_TCP,
        .port = 443,
    };

    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
                            &net_port, 0);

The next step is to restrict the current thread from gaining more privileges
(e.g. through a SUID binary).  We now have a ruleset with the first rule
allowing read access to ``/usr`` while denying all other handled accesses for
the filesystem, and a second rule allowing HTTPS connections.

.. code-block:: c

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
        perror("Failed to restrict privileges");
        close(ruleset_fd);
        return 1;
    }

The current thread is now ready to sandbox itself with the ruleset.

.. code-block:: c

    if (landlock_restrict_self(ruleset_fd, 0)) {
        perror("Failed to enforce ruleset");
        close(ruleset_fd);
        return 1;
    }
    close(ruleset_fd);

If the ``landlock_restrict_self`` system call succeeds, the current thread is
now restricted and this policy will be enforced on all its subsequently created
children as well.  Once a thread is landlocked, there is no way to remove its
security policy; only adding more restrictions is allowed.  These threads are
now in a new Landlock domain, which is the merge of their parent's domain (if
any) with the new ruleset.

Full working code can be found in `samples/landlock/sandboxer.c`_.

Good practices
--------------

It is recommended to set access rights on file hierarchy leaves as much as
possible.  For instance, it is better to be able to have ``~/doc/`` as a
read-only hierarchy and ``~/tmp/`` as a read-write hierarchy, compared to
``~/`` as a read-only hierarchy and ``~/tmp/`` as a read-write hierarchy.
Following this good practice leads to self-sufficient hierarchies that do not
depend on their location (i.e. parent directories).  This is particularly
relevant when we want to allow linking or renaming.  Indeed, having consistent
access rights per directory makes it possible to change the location of such a
directory without relying on the destination directory access rights (except
those that are required for this operation, see ``LANDLOCK_ACCESS_FS_REFER``
documentation).
Having self-sufficient hierarchies also helps to tighten the required access
rights to the minimal set of data.  This also helps avoid sinkhole directories,
i.e. directories where data can be linked to but not linked from.  However,
this depends on data organization, which might not be controlled by developers.
In this case, granting read-write access to ``~/tmp/``, instead of write-only
access, would potentially allow moving ``~/tmp/`` to a non-readable directory
while still keeping the ability to list the content of ``~/tmp/``.

Layers of file path access rights
---------------------------------

Each time a thread enforces a ruleset on itself, it updates its Landlock domain
with a new layer of policy.  Indeed, this complementary policy is stacked with
any other rulesets already restricting this thread.  A sandboxed thread can
then safely add more constraints to itself with a new enforced ruleset.

One policy layer grants access to a file path if at least one of its rules
encountered on the path grants the access.  A sandboxed thread can only access
a file path if all its enforced policy layers grant the access as well as all
the other system access controls (e.g. filesystem DAC, other LSM policies,
etc.).

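The combination rule above can be pictured with a small user space model.
This is only an illustrative sketch, not the kernel implementation:
``ex_layers_allow()`` is a hypothetical helper, each array entry is assumed to
hold the union of the rights granted by the rules of one layer that were
encountered on the path, and every requested right is assumed to be handled by
every layer.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /*
     * Returns true if every layer grants all the bits of @request.
     * @granted_per_layer[i] is the union of the rights granted by the rules
     * of layer i found on the path (hypothetical model).
     */
    static bool ex_layers_allow(const uint64_t *granted_per_layer,
                                size_t n_layers, uint64_t request)
    {
        for (size_t i = 0; i < n_layers; i++) {
            /* A single layer missing one requested right denies the access. */
            if ((granted_per_layer[i] & request) != request)
                return false;
        }
        /* No layer (i.e. not sandboxed) or all layers agree. */
        return true;
    }

In this model, adding a layer can only turn an allowed access into a denied
one, which matches the fact that a sandboxed thread can only add constraints.
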
Bind mounts and OverlayFS
-------------------------

Landlock makes it possible to restrict access to file hierarchies, which means
that these access rights can be propagated with bind mounts (cf.
Documentation/filesystems/sharedsubtree.rst) but not with
Documentation/filesystems/overlayfs.rst.

A bind mount mirrors a source file hierarchy to a destination.  The destination
hierarchy is then composed of the exact same files, on which Landlock rules can
be tied, either via the source or the destination path.  These rules restrict
access when they are encountered on a path, which means that they can restrict
access to multiple file hierarchies at the same time, whether these hierarchies
are the result of bind mounts or not.

An OverlayFS mount point consists of upper and lower layers.  These layers are
combined in a merge directory, which is the result of the mount point.  This
merge hierarchy may include files from the upper and lower layers, but
modifications performed on the merge hierarchy are only reflected in the upper
layer.  From a Landlock policy point of view, each OverlayFS layer and merge
hierarchy is standalone and contains its own set of files and directories,
which is different from bind mounts.  A policy restricting an OverlayFS layer
will not restrict the resulting merged hierarchy, and vice versa.  Landlock
users should then only think about file hierarchies they want to allow access
to, regardless of the underlying filesystem.

Inheritance
-----------

Every new thread resulting from a :manpage:`clone(2)` inherits Landlock domain
restrictions from its parent.  This is similar to the seccomp inheritance (cf.
Documentation/userspace-api/seccomp_filter.rst) or any other LSM dealing with
task's :manpage:`credentials(7)`.  For instance, one process's thread may apply
Landlock rules to itself, but they will not be automatically applied to other
sibling threads (unlike POSIX thread credential changes, cf.
:manpage:`nptl(7)`).

When a thread sandboxes itself, we have the guarantee that the related security
policy will stay enforced on all this thread's descendants.  This allows
creating standalone and modular security policies per application, which will
automatically be composed between themselves according to their runtime parent
policies.

Ptrace restrictions
-------------------

A sandboxed process has fewer privileges than a non-sandboxed process and must
then be subject to additional restrictions when manipulating another process.
To be allowed to use :manpage:`ptrace(2)` and related syscalls on a target
process, a sandboxed process should have a subset of the target process rules,
which means the tracee must be in a sub-domain of the tracer.

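The sub-domain condition can be illustrated with a toy model in which each
enforced ruleset creates a child of the previous domain.  This sketch is an
assumption-laden illustration, not kernel code: ``struct ex_domain`` and
``ex_may_ptrace()`` are hypothetical names, ``NULL`` stands for a task that is
not sandboxed, and a tracee in the very same domain as the tracer is assumed to
be acceptable.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical model of a Landlock domain and its parent domain. */
    struct ex_domain {
        const struct ex_domain *parent;
    };

    /* Returns true if @tracee is in @tracer's domain or one of its sub-domains. */
    static bool ex_may_ptrace(const struct ex_domain *tracer,
                              const struct ex_domain *tracee)
    {
        /* A tracer without any domain is not restricted by this check. */
        if (!tracer)
            return true;
        /* Walks up the tracee's domain hierarchy looking for the tracer's. */
        for (const struct ex_domain *d = tracee; d; d = d->parent) {
            if (d == tracer)
                return true;
        }
        return false;
    }

In this model, a task that sandboxes itself further than its tracer stays
traceable, while a task can never trace one that is less restricted than
itself.
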
Truncating files
----------------

The operations covered by ``LANDLOCK_ACCESS_FS_WRITE_FILE`` and
``LANDLOCK_ACCESS_FS_TRUNCATE`` both change the contents of a file and sometimes
overlap in non-intuitive ways.  It is recommended to always specify both of
these together.

A particularly surprising example is :manpage:`creat(2)`.  The name suggests
that this system call requires the rights to create and write files.  However,
it also requires the truncate right if an existing file under the same name is
already present.

It should also be noted that truncating files does not require the
``LANDLOCK_ACCESS_FS_WRITE_FILE`` right.  Apart from the :manpage:`truncate(2)`
system call, this can also be done through :manpage:`open(2)` with the flags
``O_RDONLY | O_TRUNC``.

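The ``O_RDONLY | O_TRUNC`` case can be observed without Landlock at all.  The
following sketch creates a scratch file, reopens it read-only with ``O_TRUNC``
and reports the resulting size; ``ex_size_after_rdonly_trunc()`` and the
``/tmp`` path template are hypothetical choices made for this illustration.

.. code-block:: c

    #include <fcntl.h>
    #include <stdlib.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /*
     * Returns the size of a scratch file after it was reopened with
     * O_RDONLY | O_TRUNC, or -1 on error.
     */
    static long ex_size_after_rdonly_trunc(void)
    {
        char path[] = "/tmp/ex-trunc-XXXXXX";  /* Hypothetical scratch file. */
        struct stat st;
        long size = -1;
        int fd;

        fd = mkstemp(path);
        if (fd < 0)
            return -1;
        /* Gives the file some content to lose. */
        if (write(fd, "data", 4) != 4) {
            close(fd);
            unlink(path);
            return -1;
        }
        close(fd);

        /* No write access mode is requested, yet the content is dropped. */
        fd = open(path, O_RDONLY | O_TRUNC | O_CLOEXEC);
        if (fd >= 0) {
            if (fstat(fd, &st) == 0)
                size = (long)st.st_size;
            close(fd);
        }
        unlink(path);
        return size;
    }

On Linux, this is expected to report an empty file, which is why handling only
``LANDLOCK_ACCESS_FS_WRITE_FILE`` is not enough to protect file contents.
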
When opening a file, the availability of the ``LANDLOCK_ACCESS_FS_TRUNCATE``
right is associated with the newly created file descriptor and will be used for
subsequent truncation attempts using :manpage:`ftruncate(2)`.  The behavior is
similar to opening a file for reading or writing, where permissions are checked
during :manpage:`open(2)`, but not during the subsequent :manpage:`read(2)` and
:manpage:`write(2)` calls.

As a consequence, it is possible to have multiple open file descriptors for the
same file, where one grants the right to truncate the file and the other does
not.  It is also possible to pass such file descriptors between processes,
keeping their Landlock properties, even when these processes do not have an
enforced Landlock ruleset.

Compatibility
=============

Backward and forward compatibility
----------------------------------

Landlock is designed to be compatible with past and future versions of the
kernel.  This is achieved thanks to the system call attributes and the
associated bitflags, particularly the ruleset's ``handled_access_fs``.  Making
handled access rights explicit enables the kernel and user space to have a
clear contract with each other.  This is required to make sure sandboxing will
not get stricter with a system update, which could break applications.

Developers can subscribe to the `Landlock mailing list
<https://subspace.kernel.org/lists.linux.dev.html>`_ to knowingly update and
test their applications with the latest available features.  In the interest of
users, and because they may use different kernel versions, it is strongly
encouraged to follow a best-effort security approach by checking the Landlock
ABI version at runtime and only enforcing the supported features.

.. _landlock_abi_versions:

Landlock ABI versions
---------------------

The Landlock ABI version can be read with the sys_landlock_create_ruleset()
system call:

.. code-block:: c

    int abi;

    abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
    if (abi < 0) {
        switch (errno) {
        case ENOSYS:
            printf("Landlock is not supported by the current kernel.\n");
            break;
        case EOPNOTSUPP:
            printf("Landlock is currently disabled.\n");
            break;
        }
        return 0;
    }
    if (abi >= 2) {
        printf("Landlock supports LANDLOCK_ACCESS_FS_REFER.\n");
    }

The following kernel interfaces are implicitly supported by the first ABI
version.  Features only supported from a specific version are explicitly marked
as such.

Kernel interface
================

Access rights
-------------

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: fs_access net_access

Creating a new ruleset
----------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_create_ruleset

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_ruleset_attr

Extending a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_add_rule

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_rule_type landlock_path_beneath_attr
                  landlock_net_port_attr

Enforcing a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_restrict_self

Current limitations
===================

Filesystem topology modification
--------------------------------

Threads sandboxed with filesystem restrictions cannot modify filesystem
topology, whether via :manpage:`mount(2)` or :manpage:`pivot_root(2)`.
However, :manpage:`chroot(2)` calls are not denied.

Special filesystems
-------------------

Access to regular files and directories can be restricted by Landlock,
according to the handled accesses of a ruleset.  However, files that do not
come from a user-visible filesystem (e.g. pipe, socket), but can still be
accessed through ``/proc/<pid>/fd/*``, cannot currently be explicitly
restricted.  Likewise, some special kernel filesystems such as nsfs, which can
be accessed through ``/proc/<pid>/ns/*``, cannot currently be explicitly
restricted.  However, thanks to the `ptrace restrictions`_, access to such
sensitive ``/proc`` files is automatically restricted according to domain
hierarchies.  Future Landlock evolutions could still make it possible to
explicitly restrict such paths with dedicated ruleset flags.

Ruleset layers
--------------

There is a limit of 16 layers of stacked rulesets.  This can be an issue for a
task willing to enforce a new ruleset in complement to its 16 inherited
rulesets.  Once this limit is reached, sys_landlock_restrict_self() returns
E2BIG.  It is then strongly suggested to carefully build rulesets once in the
life of a thread, especially for applications able to launch other applications
that may also want to sandbox themselves (e.g. shells, container managers,
etc.).

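An application may want to tell this situation apart from other failures when
``landlock_restrict_self()`` fails.  The following is a minimal sketch of such
error reporting; ``ex_restrict_self_hint()`` is a hypothetical helper and the
messages are only suggestions.

.. code-block:: c

    #include <errno.h>

    /*
     * Maps the errno value left by a failed landlock_restrict_self() call
     * to a short hint (hypothetical helper, messages are illustrative).
     */
    static const char *ex_restrict_self_hint(int err)
    {
        switch (err) {
        case E2BIG:
            /* The limit of stacked rulesets is already reached. */
            return "too many stacked rulesets for this thread";
        case EPERM:
            return "no_new_privs is not set (and CAP_SYS_ADMIN is missing)";
        default:
            return "unexpected error";
        }
    }

A caller could, for instance, print ``ex_restrict_self_hint(errno)`` next to
the ``perror()`` output and then decide whether continuing with the already
inherited restrictions is acceptable for its threat model.
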
Memory usage
------------

Kernel memory allocated to create rulesets is accounted and can be restricted
by the Documentation/admin-guide/cgroup-v1/memory.rst.

Previous limitations
====================

File renaming and linking (ABI < 2)
-----------------------------------

Because Landlock targets unprivileged access controls, it needs to properly
handle composition of rules.  Such a property also implies rule nesting.
Properly handling multiple layers of rulesets, each one of them able to
restrict access to files, also implies inheritance of the ruleset restrictions
from a parent to its hierarchy.  Because files are identified and restricted by
their hierarchy, moving or linking a file from one directory to another implies
propagation of the hierarchy constraints, or restriction of these actions
according to the potentially lost constraints.  To protect against privilege
escalations through renaming or linking, and for the sake of simplicity,
Landlock previously limited linking and renaming to the same directory.
Starting with the Landlock ABI version 2, it is now possible to securely
control renaming and linking thanks to the new ``LANDLOCK_ACCESS_FS_REFER``
access right.

File truncation (ABI < 3)
-------------------------

File truncation could not be denied before the third Landlock ABI, so it is
always allowed when using a kernel that only supports the first or second ABI.

Starting with the Landlock ABI version 3, it is now possible to securely control
truncation thanks to the new ``LANDLOCK_ACCESS_FS_TRUNCATE`` access right.

Network support (ABI < 4)
-------------------------

Starting with the Landlock ABI version 4, it is now possible to restrict TCP
bind and connect actions to only a set of allowed ports thanks to the new
``LANDLOCK_ACCESS_NET_BIND_TCP`` and ``LANDLOCK_ACCESS_NET_CONNECT_TCP``
access rights.

.. _kernel_support:

Kernel support
==============

Landlock was first introduced in Linux 5.13 but it must be configured at build
time with ``CONFIG_SECURITY_LANDLOCK=y``.  Landlock must also be enabled at boot
time like the other security modules.  The list of security modules enabled by
default is set with ``CONFIG_LSM``.  The kernel configuration should then
contain ``CONFIG_LSM=landlock,[...]`` with ``[...]`` as the list of other
potentially useful security modules for the running system (see the
``CONFIG_LSM`` help).

If the running kernel does not have ``landlock`` in ``CONFIG_LSM``, then we can
still enable it by adding ``lsm=landlock,[...]`` to
Documentation/admin-guide/kernel-parameters.rst thanks to the bootloader
configuration.

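Whether ``landlock`` ended up in the list of active security modules can also
be checked from user space.  On systems where securityfs is mounted, this list
is typically readable as a comma-separated string from
``/sys/kernel/security/lsm``.  The following is a minimal sketch of the parsing
step only; ``ex_lsm_list_contains()`` is a hypothetical helper, and reading the
file into a buffer is left to the caller.

.. code-block:: c

    #include <stdbool.h>
    #include <string.h>

    /* Returns true if @name is one of the comma-separated entries of @list. */
    static bool ex_lsm_list_contains(const char *list, const char *name)
    {
        const size_t len = strlen(name);

        while (*list) {
            /* Length of the current entry, up to a separator or newline. */
            const size_t n = strcspn(list, ",\n");

            if (n == len && strncmp(list, name, len) == 0)
                return true;
            list += n;
            if (*list)
                list++;  /* Skips the separator. */
        }
        return false;
    }

Comparing whole entries rather than using a substring search avoids false
positives on module names that merely contain ``landlock``.  The ABI version
check described earlier remains the most direct way to detect Landlock support
from an application.
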
To be able to explicitly allow TCP operations (e.g., adding a network rule with
``LANDLOCK_ACCESS_NET_BIND_TCP``), the kernel must support TCP
(``CONFIG_INET=y``).  Otherwise, sys_landlock_add_rule() returns an
``EAFNOSUPPORT`` error, which can safely be ignored because this kind of TCP
operation is already not possible.

Questions and answers
=====================

What about user space sandbox managers?
---------------------------------------

Using a user space process to enforce restrictions on kernel resources can lead
to race conditions or inconsistent evaluations (i.e. `Incorrect mirroring of
the OS code and state
<https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools/>`_).

What about namespaces and containers?
-------------------------------------

Namespaces can help create sandboxes but they are not designed for
access-control and thus lack useful features for such a use case (e.g. no
fine-grained restrictions).  Moreover, their complexity can lead to security
issues, especially when untrusted processes can manipulate them (cf.
`Controlling access to user namespaces <https://lwn.net/Articles/673597/>`_).

Additional documentation
========================

* Documentation/security/landlock.rst
* https://landlock.io

.. Links
.. _samples/landlock/sandboxer.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/samples/landlock/sandboxer.c