.. SPDX-License-Identifier: GPL-2.0
.. Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
.. Copyright © 2019-2020 ANSSI
.. Copyright © 2021-2022 Microsoft Corporation

=====================================
Landlock: unprivileged access control
=====================================

:Author: Mickaël Salaün
:Date: October 2023

The goal of Landlock is to restrict ambient rights (e.g. global filesystem or
network access) for a set of processes. Because Landlock is a stackable LSM, it
makes it possible to create safe security sandboxes as new security layers in
addition to the existing system-wide access-controls. This kind of sandbox is
expected to help mitigate the security impact of bugs or unexpected/malicious
behaviors in user space applications. Landlock empowers any process, including
unprivileged ones, to securely restrict itself.

We can quickly make sure that Landlock is enabled in the running system by
looking for "landlock: Up and running" in kernel logs (as root):
``dmesg | grep landlock || journalctl -kg landlock``. Developers can also
easily check for Landlock support with a
:ref:`related system call <landlock_abi_versions>`. If Landlock is not
currently supported, we need to
:ref:`configure the kernel appropriately <kernel_support>`.

Landlock rules
==============

A Landlock rule describes an action on an object which the process intends to
perform. A set of rules is aggregated in a ruleset, which can then restrict
the thread enforcing it, and its future children.

The two existing types of rules are:

Filesystem rules
    For these rules, the object is a file hierarchy,
    and the related filesystem actions are defined with
    `filesystem access rights`.

Network rules (since ABI v4)
    For these rules, the object is a TCP port,
    and the related actions are defined with `network access rights`.
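
The code examples in this document call ``landlock_create_ruleset()``,
``landlock_add_rule()`` and ``landlock_restrict_self()`` as plain C functions.
The C library in use may not provide such wrappers. In that case, a minimal
sketch like the following can define them on top of :manpage:`syscall(2)`. It
assumes that the installed kernel headers are recent enough to provide
``<linux/landlock.h>`` and the ``__NR_landlock_*`` system call numbers.

.. code-block:: c

    #define _GNU_SOURCE
    #include <linux/landlock.h>
    #include <stddef.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Thin wrappers: each returns -1 and sets errno on error. */
    static inline int
    landlock_create_ruleset(const struct landlock_ruleset_attr *attr,
                            size_t size, __u32 flags)
    {
        return syscall(__NR_landlock_create_ruleset, attr, size, flags);
    }

    static inline int
    landlock_add_rule(int ruleset_fd, enum landlock_rule_type rule_type,
                      const void *rule_attr, __u32 flags)
    {
        return syscall(__NR_landlock_add_rule, ruleset_fd, rule_type,
                       rule_attr, flags);
    }

    static inline int landlock_restrict_self(int ruleset_fd, __u32 flags)
    {
        return syscall(__NR_landlock_restrict_self, ruleset_fd, flags);
    }

With these definitions, an invalid call such as
``landlock_restrict_self(-1, 0)`` fails with -1 on any kernel, whether Landlock
is supported or not, and ``errno`` tells the cases apart.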

Defining and enforcing a security policy
----------------------------------------

We first need to define the ruleset that will contain our rules.

For this example, the ruleset will contain rules that only allow filesystem
read actions and establish a specific TCP connection. Filesystem write
actions and other TCP actions will be denied.

The ruleset then needs to handle both these kinds of actions. This is
required for backward and forward compatibility (i.e. the kernel and user
space may not know each other's supported restrictions), hence the need
to be explicit about the denied-by-default access rights.

.. code-block:: c

    struct landlock_ruleset_attr ruleset_attr = {
        .handled_access_fs =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_WRITE_FILE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_FILE |
            LANDLOCK_ACCESS_FS_MAKE_CHAR |
            LANDLOCK_ACCESS_FS_MAKE_DIR |
            LANDLOCK_ACCESS_FS_MAKE_REG |
            LANDLOCK_ACCESS_FS_MAKE_SOCK |
            LANDLOCK_ACCESS_FS_MAKE_FIFO |
            LANDLOCK_ACCESS_FS_MAKE_BLOCK |
            LANDLOCK_ACCESS_FS_MAKE_SYM |
            LANDLOCK_ACCESS_FS_REFER |
            LANDLOCK_ACCESS_FS_TRUNCATE,
        .handled_access_net =
            LANDLOCK_ACCESS_NET_BIND_TCP |
            LANDLOCK_ACCESS_NET_CONNECT_TCP,
    };

Because we may not know on which kernel version an application will be
executed, it is safer to follow a best-effort security approach. Indeed, we
should try to protect users as much as possible whatever the kernel they are
using. To avoid binary enforcement (i.e. either all security features or
none), we can leverage a dedicated Landlock command to get the current version
of the Landlock ABI and adapt the handled accesses. Let's check if we should
remove access rights which are only supported in higher versions of the ABI.

.. code-block:: c

    int abi;

    abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
    if (abi < 0) {
        /* Degrades gracefully if Landlock is not handled. */
        perror("The running kernel does not enable the use of Landlock");
        return 0;
    }
    switch (abi) {
    case 1:
        /* Removes LANDLOCK_ACCESS_FS_REFER for ABI < 2 */
        ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_REFER;
        __attribute__((fallthrough));
    case 2:
        /* Removes LANDLOCK_ACCESS_FS_TRUNCATE for ABI < 3 */
        ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_TRUNCATE;
        __attribute__((fallthrough));
    case 3:
        /* Removes network support for ABI < 4 */
        ruleset_attr.handled_access_net &=
            ~(LANDLOCK_ACCESS_NET_BIND_TCP |
              LANDLOCK_ACCESS_NET_CONNECT_TCP);
    }

This enables the creation of an inclusive ruleset that will contain our rules.

.. code-block:: c

    int ruleset_fd;

    ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
    if (ruleset_fd < 0) {
        perror("Failed to create a ruleset");
        return 1;
    }

We can now add a new rule to this ruleset thanks to the returned file
descriptor referring to this ruleset. The rule will only allow reading the
file hierarchy ``/usr``. Without another rule, write actions would then be
denied by the ruleset. To add ``/usr`` to the ruleset, we open it with the
``O_PATH`` flag and fill the &struct landlock_path_beneath_attr with this file
descriptor.

.. code-block:: c

    int err;
    struct landlock_path_beneath_attr path_beneath = {
        .allowed_access =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR,
    };

    path_beneath.parent_fd = open("/usr", O_PATH | O_CLOEXEC);
    if (path_beneath.parent_fd < 0) {
        perror("Failed to open file");
        close(ruleset_fd);
        return 1;
    }
    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
                            &path_beneath, 0);
    close(path_beneath.parent_fd);
    if (err) {
        perror("Failed to update ruleset");
        close(ruleset_fd);
        return 1;
    }

It may also be required to create rules following the same logic as explained
for the ruleset creation, by filtering access rights according to the Landlock
ABI version. In this example, this is not required because all of the requested
``allowed_access`` rights are already available in ABI 1.

For network access-control, we can add a set of rules that allow using a port
number for a specific action: HTTPS connections.

.. code-block:: c

    struct landlock_net_port_attr net_port = {
        .allowed_access = LANDLOCK_ACCESS_NET_CONNECT_TCP,
        .port = 443,
    };

    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_NET_PORT,
                            &net_port, 0);

The next step is to restrict the current thread from gaining more privileges
(e.g. through a SUID binary). We now have a ruleset with the first rule
allowing read access to ``/usr`` while denying all other handled accesses for
the filesystem, and a second rule allowing HTTPS connections.

.. code-block:: c

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
        perror("Failed to restrict privileges");
        close(ruleset_fd);
        return 1;
    }

The current thread is now ready to sandbox itself with the ruleset.

.. code-block:: c

    if (landlock_restrict_self(ruleset_fd, 0)) {
        perror("Failed to enforce ruleset");
        close(ruleset_fd);
        return 1;
    }
    close(ruleset_fd);

If the ``landlock_restrict_self`` system call succeeds, the current thread is
now restricted and this policy will be enforced on all its subsequently created
children as well. Once a thread is landlocked, there is no way to remove its
security policy; only adding more restrictions is allowed. These threads are
now in a new Landlock domain, which is the merge of their parent domain (if
any) with the new ruleset.

Full working code can be found in `samples/landlock/sandboxer.c`_.

Good practices
--------------

It is recommended to set access rights to file hierarchy leaves as much as
possible. For instance, it is better to be able to have ``~/doc/`` as a
read-only hierarchy and ``~/tmp/`` as a read-write hierarchy, compared to
``~/`` as a read-only hierarchy and ``~/tmp/`` as a read-write hierarchy.
Following this good practice leads to self-sufficient hierarchies that do not
depend on their location (i.e. parent directories). This is particularly
relevant when we want to allow linking or renaming. Indeed, having consistent
access rights per directory makes it possible to change the location of such
directories without relying on the destination directory access rights (except
those that are required for this operation, see ``LANDLOCK_ACCESS_FS_REFER``
documentation).
Having self-sufficient hierarchies also helps to tighten the required access
rights to the minimal set of data. This also helps avoid sinkhole directories,
i.e. directories where data can be linked to but not linked from. However,
this depends on data organization, which might not be controlled by developers.
In this case, granting read-write access to ``~/tmp/``, instead of write-only
access, would potentially allow moving ``~/tmp/`` to a non-readable directory
while still keeping the ability to list the content of ``~/tmp/``.

Layers of file path access rights
---------------------------------

Each time a thread enforces a ruleset on itself, it updates its Landlock domain
with a new layer of policy. Indeed, this complementary policy is stacked with
any other rulesets already restricting this thread. A sandboxed thread can then
safely add more constraints to itself with a new enforced ruleset.

One policy layer grants access to a file path if at least one of its rules
encountered on the path grants the access. A sandboxed thread can only access
a file path if all its enforced policy layers grant the access as well as all
the other system access controls (e.g. filesystem DAC, other LSM policies,
etc.).

Bind mounts and OverlayFS
-------------------------

Landlock makes it possible to restrict access to file hierarchies, which means
that these access rights can be propagated with bind mounts (cf.
Documentation/filesystems/sharedsubtree.rst) but not with
Documentation/filesystems/overlayfs.rst.

A bind mount mirrors a source file hierarchy to a destination. The destination
hierarchy is then composed of the exact same files, on which Landlock rules can
be tied, either via the source or the destination path. These rules restrict
access when they are encountered on a path, which means that they can restrict
access to multiple file hierarchies at the same time, whether these hierarchies
are the result of bind mounts or not.

An OverlayFS mount point consists of upper and lower layers. These layers are
combined in a merge directory, which is the result of the mount point.
This merge hierarchy may include files from the upper and lower layers, but
modifications performed on the merge hierarchy are only reflected on the upper
layer. From a Landlock policy point of view, all OverlayFS layers and merge
hierarchies are standalone and each contains its own set of files and
directories, which is different from bind mounts. A policy restricting an
OverlayFS layer will not restrict the resulting merged hierarchy, and vice
versa. Landlock users should then only think about file hierarchies they want
to allow access to, regardless of the underlying filesystem.

Inheritance
-----------

Every new thread resulting from a :manpage:`clone(2)` inherits Landlock domain
restrictions from its parent. This is similar to the seccomp inheritance (cf.
Documentation/userspace-api/seccomp_filter.rst) or any other LSM dealing with
task's :manpage:`credentials(7)`. For instance, one process's thread may apply
Landlock rules to itself, but they will not be automatically applied to other
sibling threads (unlike POSIX thread credential changes, cf.
:manpage:`nptl(7)`).

When a thread sandboxes itself, we have the guarantee that the related security
policy will stay enforced on all this thread's descendants. This allows
creating standalone and modular security policies per application, which will
automatically be composed between themselves according to their runtime parent
policies.

Ptrace restrictions
-------------------

A sandboxed process has fewer privileges than a non-sandboxed process and must
then be subject to additional restrictions when manipulating another process.
To be allowed to use :manpage:`ptrace(2)` and related syscalls on a target
process, a sandboxed process should have a subset of the target process rules,
which means the tracee must be in a sub-domain of the tracer.
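
The sub-domain rule above can be illustrated with a small user space model.
This is only a sketch for reasoning about the rule, not the kernel
implementation: ``struct domain``, ``may_ptrace()`` and the three sample
domains are hypothetical names, a ``NULL`` pointer stands for a non-sandboxed
process, and the model assumes that a domain counts as a sub-domain of itself.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical model: each enforced ruleset creates a child domain. */
    struct domain {
        const struct domain *parent; /* NULL for a first-level domain */
    };

    /* Sample hierarchy: worker is nested in app, other is unrelated. */
    static const struct domain app = { .parent = NULL };
    static const struct domain worker = { .parent = &app };
    static const struct domain other = { .parent = NULL };

    /*
     * Returns true if the tracer's domain is the tracee's domain or one of
     * its ancestors, i.e. the tracee is at least as restricted.
     */
    static bool may_ptrace(const struct domain *tracer,
                           const struct domain *tracee)
    {
        const struct domain *walker;

        if (!tracer)
            return true; /* Non-sandboxed tracer: no Landlock restriction. */
        for (walker = tracee; walker; walker = walker->parent) {
            if (walker == tracer)
                return true;
        }
        return false;
    }

In this model, ``may_ptrace(&app, &worker)`` is true, while
``may_ptrace(&worker, &app)``, ``may_ptrace(&app, &other)`` and
``may_ptrace(&app, NULL)`` are all false: a sandboxed tracer cannot reach a
process that is less restricted than itself or that lives in an unrelated
domain.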

Truncating files
----------------

The operations covered by ``LANDLOCK_ACCESS_FS_WRITE_FILE`` and
``LANDLOCK_ACCESS_FS_TRUNCATE`` both change the contents of a file and sometimes
overlap in non-intuitive ways. It is recommended to always specify both of
these together.

A particularly surprising example is :manpage:`creat(2)`. The name suggests
that this system call requires the rights to create and write files. However,
it also requires the truncate right if an existing file under the same name is
already present.

It should also be noted that truncating files does not require the
``LANDLOCK_ACCESS_FS_WRITE_FILE`` right. Apart from the :manpage:`truncate(2)`
system call, this can also be done through :manpage:`open(2)` with the flags
``O_RDONLY | O_TRUNC``.

When opening a file, the availability of the ``LANDLOCK_ACCESS_FS_TRUNCATE``
right is associated with the newly created file descriptor and will be used for
subsequent truncation attempts using :manpage:`ftruncate(2)`. The behavior is
similar to opening a file for reading or writing, where permissions are checked
during :manpage:`open(2)`, but not during the subsequent :manpage:`read(2)` and
:manpage:`write(2)` calls.

As a consequence, it is possible to have multiple open file descriptors for the
same file, where one grants the right to truncate the file and the other does
not. It is also possible to pass such file descriptors between processes,
keeping their Landlock properties, even when these processes do not have an
enforced Landlock ruleset.

Compatibility
=============

Backward and forward compatibility
----------------------------------

Landlock is designed to be compatible with past and future versions of the
kernel. This is achieved thanks to the system call attributes and the
associated bitflags, particularly the ruleset's ``handled_access_fs``.
Making handled access rights explicit enables the kernel and user space to have
a clear contract with each other. This is required to make sure sandboxing will
not get stricter with a system update, which could break applications.

Developers can subscribe to the `Landlock mailing list
<https://subspace.kernel.org/lists.linux.dev.html>`_ to knowingly update and
test their applications with the latest available features. In the interest of
users, and because they may use different kernel versions, it is strongly
encouraged to follow a best-effort security approach by checking the Landlock
ABI version at runtime and only enforcing the supported features.

.. _landlock_abi_versions:

Landlock ABI versions
---------------------

The Landlock ABI version can be read with the sys_landlock_create_ruleset()
system call:

.. code-block:: c

    int abi;

    abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
    if (abi < 0) {
        switch (errno) {
        case ENOSYS:
            printf("Landlock is not supported by the current kernel.\n");
            break;
        case EOPNOTSUPP:
            printf("Landlock is currently disabled.\n");
            break;
        }
        return 0;
    }
    if (abi >= 2) {
        printf("Landlock supports LANDLOCK_ACCESS_FS_REFER.\n");
    }

The following kernel interfaces are implicitly supported by the first ABI
version. Features only supported from a specific version are explicitly marked
as such.

Kernel interface
================

Access rights
-------------

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: fs_access net_access

Creating a new ruleset
----------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_create_ruleset

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_ruleset_attr

Extending a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_add_rule

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_rule_type landlock_path_beneath_attr
                  landlock_net_port_attr

Enforcing a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_restrict_self

Current limitations
===================

Filesystem topology modification
--------------------------------

Threads sandboxed with filesystem restrictions cannot modify filesystem
topology, whether via :manpage:`mount(2)` or :manpage:`pivot_root(2)`.
However, :manpage:`chroot(2)` calls are not denied.

Special filesystems
-------------------

Access to regular files and directories can be restricted by Landlock,
according to the handled accesses of a ruleset. However, files that do not
come from a user-visible filesystem (e.g. pipe, socket), but can still be
accessed through ``/proc/<pid>/fd/*``, cannot currently be explicitly
restricted. Likewise, some special kernel filesystems such as nsfs, which can
be accessed through ``/proc/<pid>/ns/*``, cannot currently be explicitly
restricted. However, thanks to the `ptrace restrictions`_, access to such
sensitive ``/proc`` files is automatically restricted according to domain
hierarchies. Future Landlock evolutions could still make it possible to
explicitly restrict such paths with dedicated ruleset flags.

Ruleset layers
--------------

There is a limit of 16 layers of stacked rulesets. This can be an issue for a
task willing to enforce a new ruleset in addition to its 16 inherited
rulesets. Once this limit is reached, sys_landlock_restrict_self() returns
E2BIG. It is then strongly suggested to carefully build rulesets once in the
life of a thread, especially for applications able to launch other applications
that may also want to sandbox themselves (e.g.
shells, container managers, etc.).

Memory usage
------------

Kernel memory allocated to create rulesets is accounted and can be restricted
by the Documentation/admin-guide/cgroup-v1/memory.rst.

Previous limitations
====================

File renaming and linking (ABI < 2)
-----------------------------------

Because Landlock targets unprivileged access controls, it needs to properly
handle composition of rules. Such a property also implies rule nesting.
Properly handling multiple layers of rulesets, each one of them able to
restrict access to files, also implies inheritance of the ruleset restrictions
from a parent to its hierarchy. Because files are identified and restricted by
their hierarchy, moving or linking a file from one directory to another implies
propagation of the hierarchy constraints, or restriction of these actions
according to the potentially lost constraints. To protect against privilege
escalations through renaming or linking, and for the sake of simplicity,
Landlock previously limited linking and renaming to the same directory.
Starting with the Landlock ABI version 2, it is now possible to securely
control renaming and linking thanks to the new ``LANDLOCK_ACCESS_FS_REFER``
access right.

File truncation (ABI < 3)
-------------------------

File truncation could not be denied before the third Landlock ABI, so it is
always allowed when using a kernel that only supports the first or second ABI.

Starting with the Landlock ABI version 3, it is now possible to securely control
truncation thanks to the new ``LANDLOCK_ACCESS_FS_TRUNCATE`` access right.
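
An application that wants to grant write access, and that must also run on
kernels limited to ABI 1 or 2, can compute the write-related rights at run
time so that ``LANDLOCK_ACCESS_FS_TRUNCATE`` is requested only when it is
known to the kernel. The following is a minimal sketch:
``write_access_for_abi()`` is a hypothetical helper name, and the fallback
definition is an assumption that only matters when building against kernel
headers that predate the truncate right.

.. code-block:: c

    #include <assert.h>
    #include <linux/landlock.h>
    #include <stdint.h>

    /* Fallback for older headers; same bit as in the UAPI header. */
    #ifndef LANDLOCK_ACCESS_FS_TRUNCATE
    #define LANDLOCK_ACCESS_FS_TRUNCATE (1ULL << 14)
    #endif

    /*
     * Returns the rights to use for a writable hierarchy, given the ABI
     * version returned by the LANDLOCK_CREATE_RULESET_VERSION query.
     */
    static uint64_t write_access_for_abi(int abi)
    {
        uint64_t access = LANDLOCK_ACCESS_FS_WRITE_FILE;

        assert(abi >= 1);
        /* Truncation can only be handled starting with ABI 3. */
        if (abi >= 3)
            access |= LANDLOCK_ACCESS_FS_TRUNCATE;
        return access;
    }

The returned mask is meant for a rule's ``allowed_access``. The ruleset's
``handled_access_fs`` must be filtered the same way, otherwise the truncate
bit is rejected by kernels that do not know it.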

Network support (ABI < 4)
-------------------------

Starting with the Landlock ABI version 4, it is now possible to restrict TCP
bind and connect actions to only a set of allowed ports thanks to the new
``LANDLOCK_ACCESS_NET_BIND_TCP`` and ``LANDLOCK_ACCESS_NET_CONNECT_TCP``
access rights.

.. _kernel_support:

Kernel support
==============

Landlock was first introduced in Linux 5.13 but it must be configured at build
time with ``CONFIG_SECURITY_LANDLOCK=y``. Landlock must also be enabled at boot
time, like the other security modules. The list of security modules enabled by
default is set with ``CONFIG_LSM``. The kernel configuration should then
contain ``CONFIG_LSM=landlock,[...]`` with ``[...]`` as the list of other
potentially useful security modules for the running system (see the
``CONFIG_LSM`` help).

If the running kernel does not have ``landlock`` in ``CONFIG_LSM``, then we can
still enable it by adding ``lsm=landlock,[...]`` to
Documentation/admin-guide/kernel-parameters.rst thanks to the bootloader
configuration.

To be able to explicitly allow TCP operations (e.g., adding a network rule with
``LANDLOCK_ACCESS_NET_BIND_TCP``), the kernel must support TCP
(``CONFIG_INET=y``). Otherwise, sys_landlock_add_rule() returns an
``EAFNOSUPPORT`` error, which can safely be ignored because this kind of TCP
operation is already not possible.

Questions and answers
=====================

What about user space sandbox managers?
---------------------------------------

Using user space processes to enforce restrictions on kernel resources can lead
to race conditions or inconsistent evaluations (i.e. `Incorrect mirroring of
the OS code and state
<https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools/>`_).

What about namespaces and containers?
-------------------------------------

Namespaces can help create sandboxes but they are not designed for
access-control and thus lack useful features for such a use case (e.g. no
fine-grained restrictions). Moreover, their complexity can lead to security
issues, especially when untrusted processes can manipulate them (cf.
`Controlling access to user namespaces <https://lwn.net/Articles/673597/>`_).

Additional documentation
========================

* Documentation/security/landlock.rst
* https://landlock.io

.. Links
.. _samples/landlock/sandboxer.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/samples/landlock/sandboxer.c