.. SPDX-License-Identifier: GPL-2.0
.. Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
.. Copyright © 2019-2020 ANSSI
.. Copyright © 2021-2022 Microsoft Corporation

=====================================
Landlock: unprivileged access control
=====================================

:Author: Mickaël Salaün
:Date: October 2023

The goal of Landlock is to make it possible to restrict ambient rights (e.g.
global filesystem or network access) for a set of processes. Because Landlock
is a stackable LSM, it makes it possible to create safe security sandboxes as
new security layers in addition to the existing system-wide access controls.
This kind of sandbox is expected to help mitigate the security impact of bugs
or unexpected/malicious behaviors in user space applications. Landlock
empowers any process, including unprivileged ones, to securely restrict
themselves.

We can quickly make sure that Landlock is enabled in the running system by
looking for "landlock: Up and running" in kernel logs (as root):
``dmesg | grep landlock || journalctl -kb -g landlock``.
Developers can also easily check for Landlock support with a
:ref:`related system call <landlock_abi_versions>`.
If Landlock is not currently supported, we need to
:ref:`configure the kernel appropriately <kernel_support>`.

Landlock rules
==============

A Landlock rule describes an action on an object which the process intends to
perform. A set of rules is aggregated in a ruleset, which can then restrict
the thread enforcing it, and its future children.

The two existing types of rules are:

Filesystem rules
    For these rules, the object is a file hierarchy,
    and the related filesystem actions are defined with
    `filesystem access rights`.

Network rules (since ABI v4)
    For these rules, the object is a TCP port,
    and the related actions are defined with `network access rights`.

Defining and enforcing a security policy
----------------------------------------

We first need to define the ruleset that will contain our rules.

For this example, the ruleset will contain rules that only allow filesystem
read actions and establishing a specific TCP connection. Filesystem write
actions and other TCP actions will be denied.

The ruleset then needs to handle both these kinds of actions. This is
required for backward and forward compatibility (i.e. the kernel and user
space may not know each other's supported restrictions), hence the need
to be explicit about the denied-by-default access rights.

.. code-block:: c

    struct landlock_ruleset_attr ruleset_attr = {
        .handled_access_fs =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_WRITE_FILE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_FILE |
            LANDLOCK_ACCESS_FS_MAKE_CHAR |
            LANDLOCK_ACCESS_FS_MAKE_DIR |
            LANDLOCK_ACCESS_FS_MAKE_REG |
            LANDLOCK_ACCESS_FS_MAKE_SOCK |
            LANDLOCK_ACCESS_FS_MAKE_FIFO |
            LANDLOCK_ACCESS_FS_MAKE_BLOCK |
            LANDLOCK_ACCESS_FS_MAKE_SYM |
            LANDLOCK_ACCESS_FS_REFER |
            LANDLOCK_ACCESS_FS_TRUNCATE,
        .handled_access_net =
            LANDLOCK_ACCESS_NET_BIND_TCP |
            LANDLOCK_ACCESS_NET_CONNECT_TCP,
    };

Because we may not know on which kernel version an application will be
executed, it is safer to follow a best-effort security approach. Indeed, we
should try to protect users as much as possible whatever the kernel they are
using. To avoid binary enforcement (i.e. either all security features or
none), we can leverage a dedicated Landlock command to get the current version
of the Landlock ABI and adapt the handled accesses. Let's check if we should
remove access rights which are only supported in higher versions of the ABI.

.. code-block:: c

    int abi;

    abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
    if (abi < 0) {
        /* Degrades gracefully if Landlock is not handled. */
        perror("The running kernel does not support Landlock");
        return 0;
    }
    switch (abi) {
    case 1:
        /* Removes LANDLOCK_ACCESS_FS_REFER for ABI < 2 */
        ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_REFER;
        __attribute__((fallthrough));
    case 2:
        /* Removes LANDLOCK_ACCESS_FS_TRUNCATE for ABI < 3 */
        ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_TRUNCATE;
        __attribute__((fallthrough));
    case 3:
        /* Removes network support for ABI < 4 */
        ruleset_attr.handled_access_net &=
            ~(LANDLOCK_ACCESS_NET_BIND_TCP |
              LANDLOCK_ACCESS_NET_CONNECT_TCP);
    }

This makes it possible to create an inclusive ruleset that will contain our
rules.

.. code-block:: c

    int ruleset_fd;

    ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
    if (ruleset_fd < 0) {
        perror("Failed to create a ruleset");
        return 1;
    }

We can now add a new rule to this ruleset thanks to the returned file
descriptor referring to this ruleset. The rule will only allow reading the
file hierarchy ``/usr``. Without another rule, write actions would then be
denied by the ruleset. To add ``/usr`` to the ruleset, we open it with the
``O_PATH`` flag and fill the &struct landlock_path_beneath_attr with this file
descriptor.

.. code-block:: c

    int err;
    struct landlock_path_beneath_attr path_beneath = {
        .allowed_access =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR,
    };

    path_beneath.parent_fd = open("/usr", O_PATH | O_CLOEXEC);
    if (path_beneath.parent_fd < 0) {
        perror("Failed to open file");
        close(ruleset_fd);
        return 1;
    }
    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
                            &path_beneath, 0);
    close(path_beneath.parent_fd);
    if (err) {
        perror("Failed to update ruleset");
        close(ruleset_fd);
        return 1;
    }

It may also be required to create rules following the same logic as explained
for the ruleset creation, by filtering access rights according to the Landlock
ABI version. In this example, this is not required because all of the requested
``allowed_access`` rights are already available in ABI 1.

For network access control, we can add a set of rules that allow the use of a
port number for a specific action: HTTPS connections.

.. code-block:: c

    struct landlock_net_port_attr net_port = {
        .allowed_access = LANDLOCK_ACCESS_NET_CONNECT_TCP,
        .port = 443,
    };

    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
                            &net_port, 0);

The next step is to restrict the current thread from gaining more privileges
(e.g. through a SUID binary). We now have a ruleset with the first rule
allowing read access to ``/usr`` while denying all other handled accesses for
the filesystem, and a second rule allowing HTTPS connections.

.. code-block:: c

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
        perror("Failed to restrict privileges");
        close(ruleset_fd);
        return 1;
    }

The current thread is now ready to sandbox itself with the ruleset.

.. code-block:: c

    if (landlock_restrict_self(ruleset_fd, 0)) {
        perror("Failed to enforce ruleset");
        close(ruleset_fd);
        return 1;
    }
    close(ruleset_fd);

If the ``landlock_restrict_self`` system call succeeds, the current thread is
now restricted and this policy will be enforced on all its subsequently created
children as well. Once a thread is landlocked, there is no way to remove its
security policy; only adding more restrictions is allowed. These threads are
now in a new Landlock domain, which is a merge of their parent domain (if any)
with the new ruleset.

Full working code can be found in `samples/landlock/sandboxer.c`_.

Good practices
--------------

It is recommended to set access rights on file hierarchy leaves as much as
possible. For instance, it is better to be able to have ``~/doc/`` as a
read-only hierarchy and ``~/tmp/`` as a read-write hierarchy, compared to
``~/`` as a read-only hierarchy and ``~/tmp/`` as a read-write hierarchy.
Following this good practice leads to self-sufficient hierarchies that do not
depend on their location (i.e. parent directories). This is particularly
relevant when we want to allow linking or renaming. Indeed, having consistent
access rights per directory makes it possible to change the location of such a
directory without relying on the destination directory access rights (except
those that are required for this operation, see ``LANDLOCK_ACCESS_FS_REFER``
documentation).
Having self-sufficient hierarchies also helps to tighten the required access
rights to the minimal set of data. This also helps avoid sinkhole directories,
i.e. directories where data can be linked to but not linked from. However,
this depends on data organization, which might not be controlled by developers.
In this case, granting read-write access to ``~/tmp/``, instead of write-only
access, would potentially allow moving ``~/tmp/`` to a non-readable directory
while still keeping the ability to list the content of ``~/tmp/``.

Layers of file path access rights
---------------------------------

Each time a thread enforces a ruleset on itself, it updates its Landlock domain
with a new layer of policy. Indeed, this complementary policy is stacked with
any other rulesets already restricting this thread. A sandboxed thread can
then safely add more constraints to itself with a new enforced ruleset.

One policy layer grants access to a file path if at least one of its rules
encountered on the path grants the access. A sandboxed thread can only access
a file path if all its enforced policy layers grant the access as well as all
the other system access controls (e.g. filesystem DAC, other LSM policies,
etc.).

Bind mounts and OverlayFS
-------------------------

Landlock makes it possible to restrict access to file hierarchies, which means
that these access rights can be propagated with bind mounts (cf.
Documentation/filesystems/sharedsubtree.rst) but not with
Documentation/filesystems/overlayfs.rst.

A bind mount mirrors a source file hierarchy to a destination. The destination
hierarchy is then composed of the exact same files, on which Landlock rules can
be tied, either via the source or the destination path. These rules restrict
access when they are encountered on a path, which means that they can restrict
access to multiple file hierarchies at the same time, whether these hierarchies
are the result of bind mounts or not.

An OverlayFS mount point consists of upper and lower layers. These layers are
combined in a merge directory, which is the result of the mount point. This
merge hierarchy may include files from the upper and lower layers, but
modifications performed on the merge hierarchy are only reflected on the upper
layer. From a Landlock policy point of view, each OverlayFS layer and merge
hierarchy is standalone and contains its own set of files and directories,
which is different from bind mounts. A policy restricting an OverlayFS layer
will not restrict the resulting merged hierarchy, and vice versa. Landlock
users should then only think about file hierarchies they want to allow access
to, regardless of the underlying filesystem.

Inheritance
-----------

Every new thread resulting from a :manpage:`clone(2)` inherits Landlock domain
restrictions from its parent. This is similar to the seccomp inheritance (cf.
Documentation/userspace-api/seccomp_filter.rst) or any other LSM dealing with
task's :manpage:`credentials(7)`. For instance, one process's thread may apply
Landlock rules to itself, but they will not be automatically applied to other
sibling threads (unlike POSIX thread credential changes, cf.
:manpage:`nptl(7)`).

When a thread sandboxes itself, we have the guarantee that the related security
policy will stay enforced on all this thread's descendants. This allows
creating standalone and modular security policies per application, which will
automatically be composed between themselves according to their runtime parent
policies.

Ptrace restrictions
-------------------

A sandboxed process has fewer privileges than a non-sandboxed process and must
then be subject to additional restrictions when manipulating another process.
To be allowed to use :manpage:`ptrace(2)` and related syscalls on a target
process, a sandboxed process should have a subset of the target process rules,
which means the tracee must be in a sub-domain of the tracer.

Truncating files
----------------

The operations covered by ``LANDLOCK_ACCESS_FS_WRITE_FILE`` and
``LANDLOCK_ACCESS_FS_TRUNCATE`` both change the contents of a file and sometimes
overlap in non-intuitive ways. It is recommended to always specify both of
these together.

A particularly surprising example is :manpage:`creat(2)`. The name suggests
that this system call requires the rights to create and write files. However,
it also requires the truncate right if an existing file under the same name is
already present.

It should also be noted that truncating files does not require the
``LANDLOCK_ACCESS_FS_WRITE_FILE`` right. Apart from the :manpage:`truncate(2)`
system call, this can also be done through :manpage:`open(2)` with the flags
``O_RDONLY | O_TRUNC``.

When opening a file, the availability of the ``LANDLOCK_ACCESS_FS_TRUNCATE``
right is associated with the newly created file descriptor and will be used for
subsequent truncation attempts using :manpage:`ftruncate(2)`. The behavior is
similar to opening a file for reading or writing, where permissions are checked
during :manpage:`open(2)`, but not during the subsequent :manpage:`read(2)` and
:manpage:`write(2)` calls.

As a consequence, it is possible to have multiple open file descriptors for the
same file, where one grants the right to truncate the file and the other does
not. It is also possible to pass such file descriptors between processes,
keeping their Landlock properties, even when these processes do not have an
enforced Landlock ruleset.

Compatibility
=============

Backward and forward compatibility
----------------------------------

Landlock is designed to be compatible with past and future versions of the
kernel. This is achieved thanks to the system call attributes and the
associated bitflags, particularly the ruleset's ``handled_access_fs``. Making
handled access rights explicit enables the kernel and user space to have a
clear contract with each other. This is required to make sure sandboxing will
not get stricter with a system update, which could break applications.

Developers can subscribe to the `Landlock mailing list
<https://subspace.kernel.org/lists.linux.dev.html>`_ to knowingly update and
test their applications with the latest available features. In the interest of
users, and because they may use different kernel versions, it is strongly
encouraged to follow a best-effort security approach by checking the Landlock
ABI version at runtime and only enforcing the supported features.

.. _landlock_abi_versions:

Landlock ABI versions
---------------------

The Landlock ABI version can be read with the sys_landlock_create_ruleset()
system call:

.. code-block:: c

    int abi;

    abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
    if (abi < 0) {
        switch (errno) {
        case ENOSYS:
            printf("Landlock is not supported by the current kernel.\n");
            break;
        case EOPNOTSUPP:
            printf("Landlock is currently disabled.\n");
            break;
        }
        return 0;
    }
    if (abi >= 2) {
        printf("Landlock supports LANDLOCK_ACCESS_FS_REFER.\n");
    }

The following kernel interfaces are implicitly supported by the first ABI
version. Features only supported from a specific version are explicitly marked
as such.

Kernel interface
================

Access rights
-------------

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: fs_access net_access

Creating a new ruleset
----------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_create_ruleset

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_ruleset_attr

Extending a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_add_rule

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_rule_type landlock_path_beneath_attr
                  landlock_net_port_attr

Enforcing a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_restrict_self

Current limitations
===================

Filesystem topology modification
--------------------------------

Threads sandboxed with filesystem restrictions cannot modify filesystem
topology, whether via :manpage:`mount(2)` or :manpage:`pivot_root(2)`.
However, :manpage:`chroot(2)` calls are not denied.

Special filesystems
-------------------

Access to regular files and directories can be restricted by Landlock,
according to the handled accesses of a ruleset. However, files that do not
come from a user-visible filesystem (e.g. pipe, socket), but can still be
accessed through ``/proc/<pid>/fd/*``, cannot currently be explicitly
restricted. Likewise, some special kernel filesystems such as nsfs, which can
be accessed through ``/proc/<pid>/ns/*``, cannot currently be explicitly
restricted. However, thanks to the `ptrace restrictions`_, access to such
sensitive ``/proc`` files is automatically restricted according to domain
hierarchies. Future Landlock evolutions could still make it possible to
explicitly restrict such paths with dedicated ruleset flags.

Ruleset layers
--------------

There is a limit of 16 layers of stacked rulesets. This can be an issue for a
task willing to enforce a new ruleset in addition to its 16 inherited
rulesets. Once this limit is reached, sys_landlock_restrict_self() returns
E2BIG. It is then strongly suggested to carefully build rulesets once in the
life of a thread, especially for applications able to launch other applications
that may also want to sandbox themselves (e.g. shells, container managers,
etc.).

Memory usage
------------

Kernel memory allocated to create rulesets is accounted and can be restricted
by the Documentation/admin-guide/cgroup-v1/memory.rst.

Previous limitations
====================

File renaming and linking (ABI < 2)
-----------------------------------

Because Landlock targets unprivileged access controls, it needs to properly
handle composition of rules. Such a property also implies rule nesting.
Properly handling multiple layers of rulesets, each one of them able to
restrict access to files, also implies inheritance of the ruleset restrictions
from a parent to its hierarchy. Because files are identified and restricted by
their hierarchy, moving or linking a file from one directory to another implies
propagation of the hierarchy constraints, or restriction of these actions
according to the potentially lost constraints. To protect against privilege
escalations through renaming or linking, and for the sake of simplicity,
Landlock previously limited linking and renaming to the same directory.
Starting with the Landlock ABI version 2, it is now possible to securely
control renaming and linking thanks to the new ``LANDLOCK_ACCESS_FS_REFER``
access right.

File truncation (ABI < 3)
-------------------------

File truncation could not be denied before the third Landlock ABI, so it is
always allowed when using a kernel that only supports the first or second ABI.

Starting with the Landlock ABI version 3, it is now possible to securely control
truncation thanks to the new ``LANDLOCK_ACCESS_FS_TRUNCATE`` access right.

Network support (ABI < 4)
-------------------------

Starting with the Landlock ABI version 4, it is now possible to restrict TCP
bind and connect actions to only a set of allowed ports thanks to the new
``LANDLOCK_ACCESS_NET_BIND_TCP`` and ``LANDLOCK_ACCESS_NET_CONNECT_TCP``
access rights.

.. _kernel_support:

Kernel support
==============

Build time configuration
------------------------

Landlock was first introduced in Linux 5.13 but it must be configured at build
time with ``CONFIG_SECURITY_LANDLOCK=y``. Landlock must also be enabled at boot
time like the other security modules. The list of security modules enabled by
default is set with ``CONFIG_LSM``. The kernel configuration should then
contain ``CONFIG_LSM=landlock,[...]`` with ``[...]`` as the list of other
potentially useful security modules for the running system (see the
``CONFIG_LSM`` help).

Boot time configuration
-----------------------

If the running kernel does not have ``landlock`` in ``CONFIG_LSM``, then we can
enable Landlock by adding ``lsm=landlock,[...]`` to
Documentation/admin-guide/kernel-parameters.rst in the boot loader
configuration.

For example, if the current built-in configuration is:

.. code-block:: console

    $ zgrep -h "^CONFIG_LSM=" "/boot/config-$(uname -r)" /proc/config.gz 2>/dev/null
    CONFIG_LSM="lockdown,yama,integrity,apparmor"

...and if the cmdline doesn't contain ``landlock`` either:

.. code-block:: console

    $ sed -n 's/.*\(\<lsm=\S\+\).*/\1/p' /proc/cmdline
    lsm=lockdown,yama,integrity,apparmor

...we should configure the boot loader to set a cmdline extending the ``lsm``
list with the ``landlock,`` prefix::

    lsm=landlock,lockdown,yama,integrity,apparmor

After a reboot, we can check that Landlock is up and running by looking at
kernel logs:

.. code-block:: console

    # dmesg | grep landlock || journalctl -kb -g landlock
    [    0.000000] Command line: [...] lsm=landlock,lockdown,yama,integrity,apparmor
    [    0.000000] Kernel command line: [...] lsm=landlock,lockdown,yama,integrity,apparmor
    [    0.000000] LSM: initializing lsm=lockdown,capability,landlock,yama,integrity,apparmor
    [    0.000000] landlock: Up and running.

The kernel may be configured at build time to always load the ``lockdown`` and
``capability`` LSMs. In that case, these LSMs will appear at the beginning of
the ``LSM: initializing`` log line as well, even if they are not configured in
the boot loader.

Network support
---------------

To be able to explicitly allow TCP operations (e.g., adding a network rule with
``LANDLOCK_ACCESS_NET_BIND_TCP``), the kernel must support TCP
(``CONFIG_INET=y``). Otherwise, sys_landlock_add_rule() returns an
``EAFNOSUPPORT`` error, which can safely be ignored because this kind of TCP
operation is already not possible.

Questions and answers
=====================

What about user space sandbox managers?
---------------------------------------

Using a user space process to enforce restrictions on kernel resources can lead
to race conditions or inconsistent evaluations (i.e. `Incorrect mirroring of
the OS code and state
<https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools/>`_).

What about namespaces and containers?
-------------------------------------

Namespaces can help create sandboxes but they are not designed for
access control and thus lack useful features for such a use case (e.g. no
fine-grained restrictions). Moreover, their complexity can lead to security
issues, especially when untrusted processes can manipulate them (cf.
`Controlling access to user namespaces <https://lwn.net/Articles/673597/>`_).

Additional documentation
========================

* Documentation/security/landlock.rst
* https://landlock.io

.. Links
.. _samples/landlock/sandboxer.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/samples/landlock/sandboxer.c