The Linux Kernel threat model ============================= There are a lot of assumptions regarding what the kernel does and does not protect against. These assumptions tend to cause confusion for bug reports (:doc:`security-related ones ` vs :doc:`non-security ones <../admin-guide/reporting-issues>`), and can complicate security enforcement when the responsibilities for some boundaries is not clear between the kernel, distros, administrators and users. This document tries to clarify the responsibilities of the kernel in this domain. The kernel's responsibilities ----------------------------- The kernel abstracts access to local hardware resources and to remote systems in a way that allows multiple local users to get a fair share of the available resources granted to them, and, when the underlying hardware permits, to assign a level of confidentiality to their communications and to the data they are processing or storing. The kernel assumes that the underlying hardware behaves according to its specifications. This includes the integrity of the CPU's instruction set, the transparency of the branch prediction unit and the cache units, the consistency of the Memory Management Unit (MMU), the isolation of DMA-capable peripherals (e.g., via IOMMU), state transitions in controllers, ranges of values read from registers, the respect of documented hardware limitations, etc. When hardware fails to maintain its specified isolation (e.g., CPU bugs, side-channels, hardware response to unexpected inputs), the kernel will usually attempt to implement reasonable mitigations. These are best-effort measures intended to reduce the attack surface or elevate the cost of an attack within the limits of the hardware's facilities; they do not constitute a kernel-provided safety guarantee. Users always perform their activities under the authority of an administrator who is able to grant or deny various types of permissions that may affect how users benefit from available resources, or the level of confidentiality of their activities. Administrators may also delegate all or part of their own permissions to some users, particularly via capabilities but not only. All this is performed via configuration (sysctl, file-system permissions etc). The Linux Kernel applies a certain collection of default settings that match its threat model. Distros have their own threat model and will come with their own configuration presets, that the administrator may have to adjust to better suit their expectations (relax or restrict). By default, the Linux Kernel guarantees the following protections when running on common processors featuring privilege levels and memory management units: * **User-based isolation**: an unprivileged user may restrict access to their own data from other unprivileged users running on the same system. This includes: * stored data, via file system permissions * in-memory data (pages are not accessible by default to other users) * process activity (ptrace is not permitted to other users) * inter-process communication (other users may not observe data exchanged via UNIX domain sockets or other IPC mechanisms). * network communications within the same or with other systems * **Capability-based protection**: * users not having elevated capabilities (including but not limited to CAP_SYS_ADMIN) may not alter the kernel's configuration, memory nor state, change other users' view of the file system layout, grant any user capabilities they do not have, nor affect the system's availability (shutdown, reboot, panic, hang, or making the system unresponsive via unbounded resource exhaustion). * users not having the ``CAP_NET_ADMIN`` capability may not alter the network configuration, intercept nor spoof network communications from other users nor systems. * users not having ``CAP_SYS_PTRACE`` may not observe other users' processes activities. When ``CONFIG_USER_NS`` is set, the kernel also permits unprivileged users to create their own user namespace in which they have all capabilities, but with a number of restrictions (they may not perform actions that have impacts on the initial user namespace, such as changing time, loading modules or mounting block devices). Please refer to ``user_namespaces(7)`` for more details, the possibilities of user namespaces are not covered in this document. The kernel also offers a lot of troubleshooting and debugging facilities, which can constitute attack vectors when placed in wrong hands. While some of them are designed to be accessible to regular local users with a low risk (e.g. kernel logs via ``/proc/kmsg``), some would expose enough information to represent a risk in most places and the decision to expose them is under the administrator's responsibility (perf events, traces), and others are not designed to be accessed by non-privileged users (e.g. debugfs). Access to these facilities by a user who has been explicitly granted permission by an administrator does not constitute a security breach. Bugs that permit to violate the principles above constitute security breaches. However, bugs that permit one violation only once another one was already achieved are only weaknesses. The kernel applies a number of self-protection measures whose purpose is to avoid crossing a security boundary when certain classes of bugs are found, but a failure of these extra protections do not constitute a vulnerability alone. What does not constitute a security bug --------------------------------------- In the Linux kernel's threat model, the following classes of problems are **NOT** considered as Linux Kernel security bugs. However, when it is believed that the kernel could do better, they should be reported, so that they can be reviewed and fixed where reasonably possible, but they will be handled as any regular bug: * **Configuration**: * outdated kernels and particularly end-of-life branches are out of the scope of the kernel's threat model: administrators are responsible for keeping their system up to date. For a bug to qualify as a security bug, it must be demonstrated that it affects actively maintained versions. * build-level: changes to the kernel configuration that are explicitly documented as lowering the security level (e.g. ``CONFIG_NOMMU``), or targeted at developers only. * OS-level: changes to command line parameters, sysctls, filesystem permissions, user capabilities, exposure of privileged interfaces, that explicitly increase exposure by either offering non-default access to unprivileged users, or reduce the kernel's ability to enforce some protections or mitigations. Example: write access to procfs or debugfs. * issues triggered only when using features intended for development or debugging (e.g., LOCKDEP, KASAN, FAULT_INJECTION): these features are known to introduce overhead and potential instability and are not intended for production use. * issues affecting drivers exposed under CONFIG_STAGING, as well as features marked EXPERIMENTAL in the configuration. * loading of explicitly insecure/broken/staging modules, and generally any using any subsystem marked as experimental or not intended for production use. * running out-of-tree modules or unofficial kernel forks; these should be reported to the relevant vendor. * **Excess of initial privileges**: * actions performed by a user already possessing the privileges required to perform that action or modify that state (e.g. ``CAP_SYS_ADMIN``, ``CAP_NET_ADMIN``, ``CAP_SYS_RAWIO``, ``CAP_SYS_MODULE`` with no further boundary being crossed). * actions performed in user namespace that do not bypass the restrictions imposed to the initial user (e.g. ptrace usage, signal delivery, resource usage, access to FS/device/sysctl/memory, network binding, system/network configuration etc). * anything performed by the root user in the initial namespace (e.g. kernel oops when writing to a privileged device). * **Out of production use**: This covers theoretical/probabilistic attacks that rely on laboratory conditions with zero system noise, or those requiring an unrealistic number of attempts (e.g., billions of trials) that would be detected by standard system monitoring long before success, such as: * prediction of random numbers that only works in a totally silent environment (such as IP ID, TCP ports or sequence numbers that can only be guessed in a lab). * activity observation and information leaks based on probabilistic approaches that are prone to measurement noise and not realistically reproducible on a production system. * issues that can only be triggered by heavy attacks (e.g. brute force) whose impact on the system makes it unlikely or impossible to remain undetected before they succeed (e.g. consuming all memory before succeeding). * problems seen only under development simulators, emulators, or combinations that do not exist on real systems at the time of reporting (issues involving tens of millions of threads, tens of thousands of CPUs, unrealistic CPU frequencies, RAM sizes or disk capacities, network speeds. * issues whose reproduction requires hardware modification or emulation, including fake USB devices that pretend to be another one. * as well as issues that can be triggered at a cost that is orders of magnitude higher than the expected benefits (e.g. fully functional keyboard emulator only to retrieve 7 uninitialized bytes in a structure, or brute-force method involving millions of connection attempts to guess a port number). * **Hardening failures**: * ability to bypass some of the kernel's hardening measures with no demonstrable exploit path (e.g. ASLR bypass, events timing or probing with no demonstrable consequence). These are just weaknesses, not vulnerabilities. * missing argument checks and failure to report certain errors with no immediate consequence. * **Random information leaks**: This concerns information leaks of small data parts that happen to be there and that cannot be chosen by the attacker, or face access restrictions: * structure padding reported by syscalls or other interfaces. * identifiers, partial data, non-terminated strings reported in error messages. * Leaks of kernel memory addresses/pointers do not constitute an immediately exploitable vector and are not security bugs, though they must be reported and fixed. * **Crafted file system images**: * bugs triggered by mounting a corrupted or maliciously crafted file system image are generally not security bugs, as the kernel assumes the underlying storage media is under the administrator's control, unless the filesystem driver is specifically documented as being hardened against untrusted media. * issues that are resolved, mitigated, or detected by running a filesystem consistency check (fsck) on the image prior to mounting. * **Physical access**: Issues that require physical access to the machine, hardware modification, or the use of specialized hardware (e.g., logic analyzers, DMA-attack tools over PCI-E/Thunderbolt) are out of scope unless the system is explicitly configured with technologies meant to defend against such attacks (e.g. IOMMU). * **Functional and performance regressions**: Any issue that can be mitigated by setting proper permissions and limits doesn't qualify as a security bug.