.. SPDX-License-Identifier: GPL-2.0
.. Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
.. Copyright © 2019-2020 ANSSI
.. Copyright © 2021-2022 Microsoft Corporation

=====================================
Landlock: unprivileged access control
=====================================

:Author: Mickaël Salaün
:Date: October 2023

The goal of Landlock is to enable restricting ambient rights (e.g. global
filesystem or network access) for a set of processes.  Because Landlock
is a stackable LSM, it makes it possible to create safe security sandboxes as
new security layers in addition to the existing system-wide access-controls.
This kind of sandbox is expected to help mitigate the security impact of bugs or
unexpected/malicious behaviors in user space applications.  Landlock empowers
any process, including unprivileged ones, to securely restrict themselves.

We can quickly make sure that Landlock is enabled in the running system by
looking for "landlock: Up and running" in kernel logs (as root):
``dmesg | grep landlock || journalctl -kb -g landlock``.
Developers can also easily check for Landlock support with a
:ref:`related system call <landlock_abi_versions>`.
If Landlock is not currently supported, we need to
:ref:`configure the kernel appropriately <kernel_support>`.

Landlock rules
==============

A Landlock rule describes an action on an object which the process intends to
perform.  A set of rules is aggregated in a ruleset, which can then restrict
the thread enforcing it, and its future children.

The two existing types of rules are:

Filesystem rules
    For these rules, the object is a file hierarchy,
    and the related filesystem actions are defined with
    `filesystem access rights`.

Network rules (since ABI v4)
    For these rules, the object is a TCP port,
    and the related actions are defined with `network access rights`.

Defining and enforcing a security policy
----------------------------------------

We first need to define the ruleset that will contain our rules.

For this example, the ruleset will contain rules that only allow filesystem
read actions and establish a specific TCP connection. Filesystem write
actions and other TCP actions will be denied.

The ruleset then needs to handle both these kinds of actions.  This is
required for backward and forward compatibility (i.e. the kernel and user
space may not know each other's supported restrictions), hence the need
to be explicit about the denied-by-default access rights.

.. code-block:: c

    struct landlock_ruleset_attr ruleset_attr = {
        .handled_access_fs =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_WRITE_FILE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_FILE |
            LANDLOCK_ACCESS_FS_MAKE_CHAR |
            LANDLOCK_ACCESS_FS_MAKE_DIR |
            LANDLOCK_ACCESS_FS_MAKE_REG |
            LANDLOCK_ACCESS_FS_MAKE_SOCK |
            LANDLOCK_ACCESS_FS_MAKE_FIFO |
            LANDLOCK_ACCESS_FS_MAKE_BLOCK |
            LANDLOCK_ACCESS_FS_MAKE_SYM |
            LANDLOCK_ACCESS_FS_REFER |
            LANDLOCK_ACCESS_FS_TRUNCATE,
        .handled_access_net =
            LANDLOCK_ACCESS_NET_BIND_TCP |
            LANDLOCK_ACCESS_NET_CONNECT_TCP,
    };

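The snippets in this document call ``landlock_create_ruleset()``,
``landlock_add_rule()`` and ``landlock_restrict_self()`` as if they were
ordinary C functions.  The C library may not provide such wrappers, in which
case they can be built on top of :manpage:`syscall(2)`.  The following is a
minimal sketch of that approach, similar in spirit to
`samples/landlock/sandboxer.c`_.  It assumes that the ``<linux/landlock.h>``
UAPI header and the ``__NR_landlock_*`` syscall numbers are available at build
time.

.. code-block:: c

    #define _GNU_SOURCE
    #include <linux/landlock.h>
    #include <stddef.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Each wrapper is skipped if a macro of the same name already exists. */
    #ifndef landlock_create_ruleset
    static inline int
    landlock_create_ruleset(const struct landlock_ruleset_attr *const attr,
                            const size_t size, const __u32 flags)
    {
        return syscall(__NR_landlock_create_ruleset, attr, size, flags);
    }
    #endif

    #ifndef landlock_add_rule
    static inline int landlock_add_rule(const int ruleset_fd,
                                        const enum landlock_rule_type rule_type,
                                        const void *const rule_attr,
                                        const __u32 flags)
    {
        return syscall(__NR_landlock_add_rule, ruleset_fd, rule_type,
                       rule_attr, flags);
    }
    #endif

    #ifndef landlock_restrict_self
    static inline int landlock_restrict_self(const int ruleset_fd,
                                             const __u32 flags)
    {
        return syscall(__NR_landlock_restrict_self, ruleset_fd, flags);
    }
    #endif

As with any raw system call, these wrappers return -1 and set ``errno`` on
failure.
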
Because we may not know on which kernel version an application will be
executed, it is safer to follow a best-effort security approach.  Indeed, we
should try to protect users as much as possible whatever the kernel they are
using.  To avoid binary enforcement (i.e. either all security features or
none), we can leverage a dedicated Landlock command to get the current version
of the Landlock ABI and adapt the handled accesses.  Let's check if we should
remove access rights which are only supported in higher versions of the ABI.

.. code-block:: c

    int abi;

    abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
    if (abi < 0) {
        /* Degrades gracefully if Landlock is not handled. */
        perror("The running kernel does not enable to use Landlock");
        return 0;
    }
    switch (abi) {
    case 1:
        /* Removes LANDLOCK_ACCESS_FS_REFER for ABI < 2 */
        ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_REFER;
        __attribute__((fallthrough));
    case 2:
        /* Removes LANDLOCK_ACCESS_FS_TRUNCATE for ABI < 3 */
        ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_TRUNCATE;
        __attribute__((fallthrough));
    case 3:
        /* Removes network support for ABI < 4 */
        ruleset_attr.handled_access_net &=
            ~(LANDLOCK_ACCESS_NET_BIND_TCP |
              LANDLOCK_ACCESS_NET_CONNECT_TCP);
    }

This enables the creation of an inclusive ruleset that will contain our rules.

.. code-block:: c

    int ruleset_fd;

    ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
    if (ruleset_fd < 0) {
        perror("Failed to create a ruleset");
        return 1;
    }

We can now add a new rule to this ruleset thanks to the returned file
descriptor referring to this ruleset.  The rule will only allow reading the
file hierarchy ``/usr``.  Without another rule, write actions would then be
denied by the ruleset.  To add ``/usr`` to the ruleset, we open it with the
``O_PATH`` flag and fill the &struct landlock_path_beneath_attr with this file
descriptor.

.. code-block:: c

    int err;
    struct landlock_path_beneath_attr path_beneath = {
        .allowed_access =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR,
    };

    path_beneath.parent_fd = open("/usr", O_PATH | O_CLOEXEC);
    if (path_beneath.parent_fd < 0) {
        perror("Failed to open file");
        close(ruleset_fd);
        return 1;
    }
    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
                            &path_beneath, 0);
    close(path_beneath.parent_fd);
    if (err) {
        perror("Failed to update ruleset");
        close(ruleset_fd);
        return 1;
    }

It may also be required to create rules following the same logic as explained
for the ruleset creation, by filtering access rights according to the Landlock
ABI version.  In this example, this is not required because all of the requested
``allowed_access`` rights are already available in ABI 1.

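When a rule does request rights introduced after ABI 1, one possible way to
filter them is sketched below.  The helper names ``fs_rights_for_abi()`` and
``filter_rule_access()`` are hypothetical and only cover the filesystem rights
described in this document.  The fallback definitions are an assumption for
builds against older UAPI headers.

.. code-block:: c

    #include <linux/landlock.h>

    /* Fallbacks for UAPI headers that predate these access rights. */
    #ifndef LANDLOCK_ACCESS_FS_REFER
    #define LANDLOCK_ACCESS_FS_REFER (1ULL << 13)
    #endif
    #ifndef LANDLOCK_ACCESS_FS_TRUNCATE
    #define LANDLOCK_ACCESS_FS_TRUNCATE (1ULL << 14)
    #endif

    /*
     * Hypothetical helper: returns the filesystem access rights known to a
     * kernel advertising the given ABI version, limited to the rights
     * described in this document.
     */
    static __u64 fs_rights_for_abi(const int abi)
    {
        /* ABI 1: every right from EXECUTE up to and including MAKE_SYM. */
        __u64 rights = (LANDLOCK_ACCESS_FS_MAKE_SYM << 1) - 1;

        if (abi < 1)
            return 0;
        if (abi >= 2)
            rights |= LANDLOCK_ACCESS_FS_REFER;
        if (abi >= 3)
            rights |= LANDLOCK_ACCESS_FS_TRUNCATE;
        return rights;
    }

    /* Drops from a rule the rights that the running kernel does not know. */
    static __u64 filter_rule_access(const __u64 requested, const int abi)
    {
        return requested & fs_rights_for_abi(abi);
    }

A rule could then set ``allowed_access`` to the result of
``filter_rule_access()`` before calling ``landlock_add_rule()``.  Masking the
requested rights with ``ruleset_attr.handled_access_fs`` is an alternative that
should give the same result once the ruleset attribute has been adapted to the
running kernel.
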
For network access-control, we can add a set of rules that allow using a port
number for a specific action: HTTPS connections.

.. code-block:: c

    struct landlock_net_port_attr net_port = {
        .allowed_access = LANDLOCK_ACCESS_NET_CONNECT_TCP,
        .port = 443,
    };

    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
                            &net_port, 0);

The next step is to restrict the current thread from gaining more privileges
(e.g. through a SUID binary).  We now have a ruleset with the first rule
allowing read access to ``/usr`` while denying all other handled accesses for
the filesystem, and a second rule allowing HTTPS connections.

.. code-block:: c

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
        perror("Failed to restrict privileges");
        close(ruleset_fd);
        return 1;
    }

The current thread is now ready to sandbox itself with the ruleset.

.. code-block:: c

    if (landlock_restrict_self(ruleset_fd, 0)) {
        perror("Failed to enforce ruleset");
        close(ruleset_fd);
        return 1;
    }
    close(ruleset_fd);

If the ``landlock_restrict_self`` system call succeeds, the current thread is
now restricted and this policy will be enforced on all its subsequently created
children as well.  Once a thread is landlocked, there is no way to remove its
security policy; only adding more restrictions is allowed.  These threads are
now in a new Landlock domain, which is the merge of their parent's domain (if
any) with the new ruleset.

Full working code can be found in `samples/landlock/sandboxer.c`_.

Good practices
--------------

It is recommended to set access rights on file hierarchy leaves as much as
possible.  For instance, it is better to be able to have ``~/doc/`` as a
read-only hierarchy and ``~/tmp/`` as a read-write hierarchy, compared to
``~/`` as a read-only hierarchy and ``~/tmp/`` as a read-write hierarchy.
Following this good practice leads to self-sufficient hierarchies that do not
depend on their location (i.e. parent directories).  This is particularly
relevant when we want to allow linking or renaming.  Indeed, having consistent
access rights per directory makes it possible to change the location of such a
directory without relying on the destination directory access rights (except
those that are required for this operation, see ``LANDLOCK_ACCESS_FS_REFER``
documentation).
Having self-sufficient hierarchies also helps to tighten the required access
rights to the minimal set of data.  This also helps avoid sinkhole directories,
i.e. directories where data can be linked to but not linked from.  However,
this depends on data organization, which might not be controlled by developers.
In this case, granting read-write access to ``~/tmp/``, instead of write-only
access, would potentially allow moving ``~/tmp/`` to a non-readable directory
while still keeping the ability to list the content of ``~/tmp/``.

Layers of file path access rights
---------------------------------

Each time a thread enforces a ruleset on itself, it updates its Landlock domain
with a new layer of policy.  Indeed, this complementary policy is stacked with
the potentially other rulesets already restricting this thread.  A sandboxed
thread can then safely add more constraints to itself with a new enforced
ruleset.

One policy layer grants access to a file path if at least one of its rules
encountered on the path grants the access.  A sandboxed thread can only access
a file path if all its enforced policy layers grant the access as well as all
the other system access controls (e.g. filesystem DAC, other LSM policies,
etc.).

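The evaluation described above can be illustrated with a toy model.  This is
only a sketch for reasoning about the semantics, not the kernel's
implementation: the ``toy_*`` names are hypothetical, each layer is reduced to
what it says about one given file path, and it is assumed that a layer does not
restrict access rights it does not handle.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Toy view of one policy layer for a single file path. */
    struct toy_layer {
        /* Access rights restricted by this layer. */
        uint64_t handled;
        /* Union of the rights allowed by this layer's rules on the path. */
        uint64_t granted;
    };

    /*
     * A layer grants a request if every requested right that it handles is
     * allowed by at least one of its rules encountered on the path.
     */
    static bool toy_layer_grants(const struct toy_layer *const layer,
                                 const uint64_t request)
    {
        return (request & layer->handled & ~layer->granted) == 0;
    }

    /* A domain grants a request only if all of its layers grant it. */
    static bool toy_domain_grants(const struct toy_layer *const layers,
                                  const size_t num_layers,
                                  const uint64_t request)
    {
        size_t i;

        for (i = 0; i < num_layers; i++) {
            if (!toy_layer_grants(&layers[i], request))
                return false;
        }
        return true;
    }

In this model, stacking a second layer that only grants read access on top of a
layer granting read and write access leaves read access as the only usable
right, which matches the rule that a new layer can only add restrictions.
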
Bind mounts and OverlayFS
-------------------------

Landlock enables restricting access to file hierarchies, which means that these
access rights can be propagated with bind mounts (cf.
Documentation/filesystems/sharedsubtree.rst) but not with
Documentation/filesystems/overlayfs.rst.

A bind mount mirrors a source file hierarchy to a destination.  The destination
hierarchy is then composed of the exact same files, on which Landlock rules can
be tied, either via the source or the destination path.  These rules restrict
access when they are encountered on a path, which means that they can restrict
access to multiple file hierarchies at the same time, whether these hierarchies
are the result of bind mounts or not.

An OverlayFS mount point consists of upper and lower layers.  These layers are
combined in a merge directory, which is the result of the mount point.  This
merge hierarchy may include files from the upper and lower layers, but
modifications performed on the merge hierarchy are only reflected on the upper
layer.  From a Landlock policy point of view, each OverlayFS layer and merge
hierarchy is standalone and contains its own set of files and directories,
which is different from bind mounts.  A policy restricting an OverlayFS layer
will not restrict the resulting merged hierarchy, and vice versa.  Landlock
users should then only think about file hierarchies they want to allow access
to, regardless of the underlying filesystem.

Inheritance
-----------

Every new thread resulting from a :manpage:`clone(2)` inherits Landlock domain
restrictions from its parent.  This is similar to the seccomp inheritance (cf.
Documentation/userspace-api/seccomp_filter.rst) or any other LSM dealing with
task's :manpage:`credentials(7)`.  For instance, one process's thread may apply
Landlock rules to itself, but they will not be automatically applied to other
sibling threads (unlike POSIX thread credential changes, cf.
:manpage:`nptl(7)`).

When a thread sandboxes itself, we have the guarantee that the related security
policy will stay enforced on all this thread's descendants.  This allows
creating standalone and modular security policies per application, which will
automatically be composed between themselves according to their runtime parent
policies.

Ptrace restrictions
-------------------

A sandboxed process has fewer privileges than a non-sandboxed process and must
then be subject to additional restrictions when manipulating another process.
To be allowed to use :manpage:`ptrace(2)` and related syscalls on a target
process, a sandboxed process should have a subset of the target process rules,
which means the tracee must be in a sub-domain of the tracer.

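The sub-domain relation can be pictured with a small model in which each domain
only remembers the domain it was stacked on.  This is a sketch under
assumptions, not the kernel's code: the ``toy_*`` names are hypothetical, a
``NULL`` domain stands for a process that is not sandboxed, and a tracee in the
very same domain as the tracer is treated as allowed.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Toy domain: only records the domain it was created on top of. */
    struct toy_domain {
        const struct toy_domain *parent;
    };

    /*
     * Returns true if child is the same domain as ancestor, or is nested
     * below it.  A NULL ancestor means "not sandboxed", which contains
     * every domain.
     */
    static bool toy_domain_is_within(const struct toy_domain *const ancestor,
                                     const struct toy_domain *child)
    {
        if (!ancestor)
            return true;
        for (; child; child = child->parent) {
            if (child == ancestor)
                return true;
        }
        return false;
    }

    /* The tracee must be in the tracer's domain or in one of its sub-domains. */
    static bool toy_may_ptrace(const struct toy_domain *const tracer,
                               const struct toy_domain *const tracee)
    {
        return toy_domain_is_within(tracer, tracee);
    }

With this model, a process in an outer domain may trace a process in a nested
domain, while the reverse is denied, as is tracing a process from an unrelated
domain or a process that is not sandboxed at all.
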
Truncating files
----------------

The operations covered by ``LANDLOCK_ACCESS_FS_WRITE_FILE`` and
``LANDLOCK_ACCESS_FS_TRUNCATE`` both change the contents of a file and sometimes
overlap in non-intuitive ways.  It is recommended to always specify both of
these together.

A particularly surprising example is :manpage:`creat(2)`.  The name suggests
that this system call requires the rights to create and write files.  However,
it also requires the truncate right if an existing file under the same name is
already present.

It should also be noted that truncating files does not require the
``LANDLOCK_ACCESS_FS_WRITE_FILE`` right.  Apart from the :manpage:`truncate(2)`
system call, this can also be done through :manpage:`open(2)` with the flags
``O_RDONLY | O_TRUNC``.

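The relation between :manpage:`open(2)` flags and these access rights can be
summarized with a small helper.  The following is a rough sketch, not the
kernel's implementation: ``open_required_access()`` is a hypothetical name, it
only considers an already existing regular file, and it ignores the rights tied
to file creation and execution.  The fallback definition is an assumption for
builds against older UAPI headers.

.. code-block:: c

    #include <fcntl.h>
    #include <linux/landlock.h>

    /* Fallback for UAPI headers that predate the truncate access right. */
    #ifndef LANDLOCK_ACCESS_FS_TRUNCATE
    #define LANDLOCK_ACCESS_FS_TRUNCATE (1ULL << 14)
    #endif

    /*
     * Hypothetical helper: approximates the Landlock rights needed to open
     * an existing regular file with the given open(2) flags.
     */
    static __u64 open_required_access(const int flags)
    {
        __u64 access = 0;

        switch (flags & O_ACCMODE) {
        case O_RDONLY:
            access |= LANDLOCK_ACCESS_FS_READ_FILE;
            break;
        case O_WRONLY:
            access |= LANDLOCK_ACCESS_FS_WRITE_FILE;
            break;
        case O_RDWR:
            access |= LANDLOCK_ACCESS_FS_READ_FILE |
                      LANDLOCK_ACCESS_FS_WRITE_FILE;
            break;
        }
        /* Truncation is a separate right, whatever the access mode. */
        if (flags & O_TRUNC)
            access |= LANDLOCK_ACCESS_FS_TRUNCATE;
        return access;
    }

Because :manpage:`creat(2)` is documented as equivalent to :manpage:`open(2)`
with ``O_CREAT | O_WRONLY | O_TRUNC``, this helper reports both the write and
the truncate rights for it, whereas ``O_RDONLY | O_TRUNC`` yields the read and
truncate rights without the write right.
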
When opening a file, the availability of the ``LANDLOCK_ACCESS_FS_TRUNCATE``
right is associated with the newly created file descriptor and will be used for
subsequent truncation attempts using :manpage:`ftruncate(2)`.  The behavior is
similar to opening a file for reading or writing, where permissions are checked
during :manpage:`open(2)`, but not during the subsequent :manpage:`read(2)` and
:manpage:`write(2)` calls.

As a consequence, it is possible to have multiple open file descriptors for the
same file, where one grants the right to truncate the file and the other does
not.  It is also possible to pass such file descriptors between processes,
keeping their Landlock properties, even when these processes do not have an
enforced Landlock ruleset.

Compatibility
=============

Backward and forward compatibility
----------------------------------

Landlock is designed to be compatible with past and future versions of the
kernel.  This is achieved thanks to the system call attributes and the
associated bitflags, particularly the ruleset's ``handled_access_fs``.  Making
handled access rights explicit enables the kernel and user space to have a clear
contract with each other.  This is required to make sure sandboxing will not
get stricter with a system update, which could break applications.

Developers can subscribe to the `Landlock mailing list
<https://subspace.kernel.org/lists.linux.dev.html>`_ to knowingly update and
test their applications with the latest available features.  In the interest of
users, and because they may use different kernel versions, it is strongly
encouraged to follow a best-effort security approach by checking the Landlock
ABI version at runtime and only enforcing the supported features.

.. _landlock_abi_versions:

Landlock ABI versions
---------------------

The Landlock ABI version can be read with the sys_landlock_create_ruleset()
system call:

.. code-block:: c

    int abi;

    abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
    if (abi < 0) {
        switch (errno) {
        case ENOSYS:
            printf("Landlock is not supported by the current kernel.\n");
            break;
        case EOPNOTSUPP:
            printf("Landlock is currently disabled.\n");
            break;
        }
        return 0;
    }
    if (abi >= 2) {
        printf("Landlock supports LANDLOCK_ACCESS_FS_REFER.\n");
    }

The following kernel interfaces are implicitly supported by the first ABI
version.  Features only supported from a specific version are explicitly marked
as such.

Kernel interface
================

Access rights
-------------

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: fs_access net_access

Creating a new ruleset
----------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_create_ruleset

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_ruleset_attr

Extending a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_add_rule

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_rule_type landlock_path_beneath_attr
                  landlock_net_port_attr

Enforcing a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_restrict_self

Current limitations
===================

Filesystem topology modification
--------------------------------

Threads sandboxed with filesystem restrictions cannot modify filesystem
topology, whether via :manpage:`mount(2)` or :manpage:`pivot_root(2)`.
However, :manpage:`chroot(2)` calls are not denied.

Special filesystems
-------------------

Access to regular files and directories can be restricted by Landlock,
according to the handled accesses of a ruleset.  However, files that do not
come from a user-visible filesystem (e.g. pipe, socket), but can still be
accessed through ``/proc/<pid>/fd/*``, cannot currently be explicitly
restricted.  Likewise, some special kernel filesystems such as nsfs, which can
be accessed through ``/proc/<pid>/ns/*``, cannot currently be explicitly
restricted.  However, thanks to the `ptrace restrictions`_, access to such
sensitive ``/proc`` files is automatically restricted according to domain
hierarchies.  Future Landlock evolutions could still make it possible to
explicitly restrict such paths with dedicated ruleset flags.

Ruleset layers
--------------

There is a limit of 16 layers of stacked rulesets.  This can be an issue for a
task willing to enforce a new ruleset in complement to its 16 inherited
rulesets.  Once this limit is reached, sys_landlock_restrict_self() returns
E2BIG.  It is then strongly suggested to carefully build rulesets once in the
life of a thread, especially for applications able to launch other applications
that may also want to sandbox themselves (e.g. shells, container managers,
etc.).

Memory usage
------------

Kernel memory allocated to create rulesets is accounted and can be restricted
by the Documentation/admin-guide/cgroup-v1/memory.rst.

Previous limitations
====================

File renaming and linking (ABI < 2)
-----------------------------------

Because Landlock targets unprivileged access controls, it needs to properly
handle composition of rules.  Such property also implies rules nesting.
Properly handling multiple layers of rulesets, each one of them able to
restrict access to files, also implies inheritance of the ruleset restrictions
from a parent to its hierarchy.  Because files are identified and restricted by
their hierarchy, moving or linking a file from one directory to another implies
propagation of the hierarchy constraints, or restriction of these actions
according to the potentially lost constraints.  To protect against privilege
escalations through renaming or linking, and for the sake of simplicity,
Landlock previously limited linking and renaming to the same directory.
Starting with the Landlock ABI version 2, it is now possible to securely
control renaming and linking thanks to the new ``LANDLOCK_ACCESS_FS_REFER``
access right.

File truncation (ABI < 3)
-------------------------

File truncation could not be denied before the third Landlock ABI, so it is
always allowed when using a kernel that only supports the first or second ABI.

Starting with the Landlock ABI version 3, it is now possible to securely control
truncation thanks to the new ``LANDLOCK_ACCESS_FS_TRUNCATE`` access right.

Network support (ABI < 4)
-------------------------

Starting with the Landlock ABI version 4, it is now possible to restrict TCP
bind and connect actions to only a set of allowed ports thanks to the new
``LANDLOCK_ACCESS_NET_BIND_TCP`` and ``LANDLOCK_ACCESS_NET_CONNECT_TCP``
access rights.

.. _kernel_support:

Kernel support
==============

Build time configuration
------------------------

Landlock was first introduced in Linux 5.13 but it must be configured at build
time with ``CONFIG_SECURITY_LANDLOCK=y``.  Landlock must also be enabled at boot
time like the other security modules.  The list of security modules enabled by
default is set with ``CONFIG_LSM``.  The kernel configuration should then
contain ``CONFIG_LSM=landlock,[...]`` with ``[...]`` as the list of other
potentially useful security modules for the running system (see the
``CONFIG_LSM`` help).

Boot time configuration
-----------------------

If the running kernel does not have ``landlock`` in ``CONFIG_LSM``, then we can
enable Landlock by adding ``lsm=landlock,[...]`` to
Documentation/admin-guide/kernel-parameters.rst in the boot loader
configuration.

For example, if the current built-in configuration is:

.. code-block:: console

    $ zgrep -h "^CONFIG_LSM=" "/boot/config-$(uname -r)" /proc/config.gz 2>/dev/null
    CONFIG_LSM="lockdown,yama,integrity,apparmor"

...and if the cmdline doesn't contain ``landlock`` either:

.. code-block:: console

    $ sed -n 's/.*\(\<lsm=\S\+\).*/\1/p' /proc/cmdline
    lsm=lockdown,yama,integrity,apparmor

...we should configure the boot loader to set a cmdline extending the ``lsm``
list with the ``landlock,`` prefix::

  lsm=landlock,lockdown,yama,integrity,apparmor

After a reboot, we can check that Landlock is up and running by looking at
kernel logs:

.. code-block:: console

    # dmesg | grep landlock || journalctl -kb -g landlock
    [    0.000000] Command line: [...] lsm=landlock,lockdown,yama,integrity,apparmor
    [    0.000000] Kernel command line: [...] lsm=landlock,lockdown,yama,integrity,apparmor
    [    0.000000] LSM: initializing lsm=lockdown,capability,landlock,yama,integrity,apparmor
    [    0.000000] landlock: Up and running.

The kernel may be configured at build time to always load the ``lockdown`` and
``capability`` LSMs.  In that case, these LSMs will appear at the beginning of
the ``LSM: initializing`` log line as well, even if they are not configured in
the boot loader.

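The manual checks in this section all come down to looking for ``landlock`` in
a comma-separated list of LSM names.  A program that wants to perform the same
check could use a helper along these lines.  This is only a sketch:
``lsm_list_contains()`` is a hypothetical name, and the caller is assumed to
have already extracted the list (e.g. the value of ``CONFIG_LSM`` or of the
``lsm=`` parameter) into a string.

.. code-block:: c

    #include <stdbool.h>
    #include <string.h>

    /* Returns true if name is one of the comma-separated entries of list. */
    static bool lsm_list_contains(const char *list, const char *const name)
    {
        const size_t name_len = strlen(name);

        while (*list) {
            /* Length of the current entry, up to the next comma or the end. */
            const size_t entry_len = strcspn(list, ",");

            if (entry_len == name_len && !strncmp(list, name, name_len))
                return true;
            list += entry_len;
            if (*list == ',')
                list++;
        }
        return false;
    }

Matching whole entries rather than substrings avoids false positives on names
that merely start with ``landlock``.  The same helper could be applied to the
content of ``/sys/kernel/security/lsm``, assuming securityfs is mounted on the
running system.
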
Network support
---------------

To be able to explicitly allow TCP operations (e.g., adding a network rule with
``LANDLOCK_ACCESS_NET_BIND_TCP``), the kernel must support TCP
(``CONFIG_INET=y``).  Otherwise, sys_landlock_add_rule() returns an
``EAFNOSUPPORT`` error, which can safely be ignored because this kind of TCP
operation is already not possible.

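One way to express this tolerance in code is a small predicate applied to the
result of the rule addition.  This is a minimal sketch, and ``net_rule_ok()``
is a hypothetical name: it takes the return value of the
``landlock_add_rule()`` call and the ``errno`` value saved right after it.

.. code-block:: c

    #include <errno.h>
    #include <stdbool.h>

    /*
     * Hypothetical helper: tells whether sandboxing can proceed after an
     * attempt to add a TCP rule, given the call's return value and the
     * errno value it left behind.
     */
    static bool net_rule_ok(const int ret, const int err)
    {
        if (ret == 0)
            return true;
        /* Without TCP support in the kernel, there is nothing to allow. */
        return err == EAFNOSUPPORT;
    }

A caller would first store the return value, as in
``err = landlock_add_rule(...)``, and only then evaluate
``net_rule_ok(err, errno)``, so that ``errno`` is read after the system call
has returned.
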
Questions and answers
=====================

What about user space sandbox managers?
---------------------------------------

Using a user space process to enforce restrictions on kernel resources can lead
to race conditions or inconsistent evaluations (i.e. `Incorrect mirroring of
the OS code and state
<https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools/>`_).

What about namespaces and containers?
-------------------------------------

Namespaces can help create sandboxes but they are not designed for
access-control and therefore lack useful features for such a use case (e.g. no
fine-grained restrictions).  Moreover, their complexity can lead to security
issues, especially when untrusted processes can manipulate them (cf.
`Controlling access to user namespaces <https://lwn.net/Articles/673597/>`_).

Additional documentation
========================

* Documentation/security/landlock.rst
* https://landlock.io

.. Links
.. _samples/landlock/sandboxer.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/samples/landlock/sandboxer.c
