.. SPDX-License-Identifier: GPL-2.0

====================
Considering hardware
====================

:Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

The way a workload is handled can be influenced by the hardware it runs on.
Key components include the CPU, memory, and the buses that connect them.
These resources are shared among all applications on the system.
As a result, heavy utilization of one resource by a single application
can affect the deterministic handling of workloads in other applications.

Below is a brief overview.

System memory and cache
-----------------------

Main memory and the associated caches are the most common shared resources among
tasks in a system. One task can dominate the available caches, forcing another
task to wait until a cache line is written back to main memory before it can
proceed. The impact of this contention varies based on write patterns and the
size of the caches available. Larger caches may reduce stalls because more lines
can be buffered before being written back. Conversely, certain write patterns
may trigger the cache controller to flush many lines at once, causing
applications to stall until the operation completes.

This issue can be partly mitigated if applications do not share the same CPU
cache. The kernel is aware of the cache topology and exports this information to
user space. Tools such as **lstopo** from the Portable Hardware Locality (hwloc)
project (https://www.open-mpi.org/projects/hwloc/) can visualize the hierarchy.

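If the topology shows cores that do not share an L2 or L3 cache, a
latency-sensitive task can be pinned to one of them so that it does not
compete for cache space. A minimal sketch using the CPU affinity API
(CPU 2 is an arbitrary example; choose a core based on the reported
topology)::

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        cpu_set_t set;

        CPU_ZERO(&set);
        CPU_SET(2, &set);                /* arbitrary example CPU */

        /* pid 0 applies the mask to the calling thread */
        if (sched_setaffinity(0, sizeof(set), &set)) {
            perror("sched_setaffinity");
            return 1;
        }
        return 0;
    }
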
Avoiding shared L2 or L3 caches is not always possible. Even when cache sharing
is minimized, bottlenecks can still occur when accessing system memory. Memory
is used not only by the CPU but also by peripheral devices via DMA, such as
graphics cards or network adapters.

In some cases, cache and memory bottlenecks can be controlled if the hardware
provides the necessary support. On x86 systems, Intel offers Cache Allocation
Technology (CAT), which enables cache partitioning among applications and
provides control over the interconnect. AMD provides similar functionality under
Platform Quality of Service (PQoS). On Arm64, the equivalent is Memory
System Resource Partitioning and Monitoring (MPAM).

These features can be configured through the Linux Resource Control interface.
For details, see Documentation/filesystems/resctrl.rst.

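As a sketch of how the interface is used (the group name "rtgrp" and the
cache mask are arbitrary examples; valid masks are hardware specific and
listed under the info/ directory of a mounted resctrl file system)::

    #include <errno.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        FILE *f;

        /* Create a control group; assumes resctrl is mounted at
         * /sys/fs/resctrl (mount -t resctrl resctrl /sys/fs/resctrl). */
        if (mkdir("/sys/fs/resctrl/rtgrp", 0755) && errno != EEXIST) {
            perror("mkdir");
            return 1;
        }

        /* Dedicate a portion of L3 cache domain 0 to the group. */
        f = fopen("/sys/fs/resctrl/rtgrp/schemata", "w");
        if (!f)
            return 1;
        fprintf(f, "L3:0=f\n");
        fclose(f);

        /* Move the calling process into the group. */
        f = fopen("/sys/fs/resctrl/rtgrp/tasks", "w");
        if (!f)
            return 1;
        fprintf(f, "%d\n", (int)getpid());
        fclose(f);
        return 0;
    }
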
The **perf** tool can be used to monitor cache behavior. It can analyze
an application's cache misses and compare how they change under
different workloads on a neighboring CPU. Even more powerful is
**perf c2c**, which helps identify cache-to-cache issues, where multiple
CPU cores repeatedly access and modify data on the same cache line.

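The following illustrates the pattern **perf c2c** is designed to catch:
two threads updating different variables that happen to share one cache
line, so the line bounces between the cores (false sharing). The structure
layout and iteration count are arbitrary::

    #include <pthread.h>
    #include <stdint.h>

    /* Both counters live on the same cache line (64 bytes assumed).
     * Aligning each member with __attribute__((aligned(64))) would
     * give each its own line and remove the contention. */
    static struct {
        uint64_t a;
        uint64_t b;
    } shared;

    static void *bump(void *p)
    {
        uint64_t *ctr = p;

        for (long i = 0; i < 100000000L; i++)
            __atomic_fetch_add(ctr, 1, __ATOMIC_RELAXED);
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;

        pthread_create(&t1, NULL, bump, &shared.a);
        pthread_create(&t2, NULL, bump, &shared.b);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
    }

Running both the packed and a padded variant under **perf c2c record**
and comparing the reports makes the shared cache line visible.
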
Hardware buses
--------------

Real-time systems often need to access hardware directly to perform their work.
Any latency in this process is undesirable, as it can affect the outcome of the
task. For example, a changed output on an I/O bus may not become visible
immediately but instead appears after a variable delay, depending on the
latency of the bus used for communication.

A bus such as PCI is relatively simple because register accesses are routed
directly to the connected device. In the worst case, a read operation stalls the
CPU until the device responds.

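In kernel terms this difference looks roughly as follows (a hypothetical
driver fragment; "base" is an ioremap()ed BAR and the register offsets
are made up)::

    writel(START_CMD, base + REG_CTRL);   /* posted write: CPU continues */
    status = readl(base + REG_STATUS);    /* read: CPU stalls until the
                                           * device delivers the value */
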
A bus such as USB is more complex, involving multiple layers. A register read
or write is wrapped in a USB Request Block (URB), which is then sent by the
USB host controller to the device. Timing and latency are influenced by the
underlying USB bus. Requests cannot be sent immediately; they must align with
the next frame boundary according to the endpoint type and the host controller's
scheduling rules, which introduces additional latency. For example, a network
device connected via USB may still deliver sufficient throughput, but the added
latency when sending or receiving packets may fail to meet the requirements of
certain real-time use cases.

Additional restrictions on bus latency can arise from power management. For
instance, PCIe with Active State Power Management (ASPM) enabled can suspend
the link between the device and the host. While this behavior is beneficial for
power savings, it delays device access and adds latency to responses. This issue
is not limited to PCIe; internal buses within a System-on-Chip (SoC) can also be
affected by power management mechanisms.

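One possible knob is the ASPM policy exported by the kernel. A sketch that
selects the latency-friendly "performance" policy (requires root; the same
effect can be achieved at boot time with pcie_aspm.policy=performance)::

    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/sys/module/pcie_aspm/parameters/policy", "w");

        if (!f) {
            perror("fopen");
            return 1;
        }
        fputs("performance", f);
        fclose(f);
        return 0;
    }
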
Virtualization
--------------

In a virtualized environment such as KVM, each guest CPU is represented as a
thread on the host. If such a thread runs with real-time priority, the system
should be tested to confirm it can sustain this behavior over extended periods.
Because of its priority, the thread will not be preempted by lower-priority
threads (such as SCHED_OTHER), which may then receive no CPU time. This can
cause problems if a lower-priority thread is pinned to a CPU already occupied by
a real-time task and is therefore unable to make progress. Even if a CPU has
been isolated, the system may still (accidentally) start a per-CPU thread on
that CPU. Ensuring that a guest CPU goes idle is difficult, as it requires
avoiding both task scheduling and interrupt handling. Furthermore, if the guest
CPU does go idle but the guest system is booted with the option **idle=poll**,
the guest CPU will never enter an idle state and will instead spin until an
event arrives.

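Assigning a real-time priority to a vCPU thread can be done with tools such
as **chrt** or via libvirt; as a sketch of the underlying call (the thread
ID and the priority are placeholders, not recommendations)::

    #include <sched.h>
    #include <stdio.h>
    #include <sys/types.h>

    int main(void)
    {
        const pid_t vcpu_tid = 12345;    /* placeholder vCPU thread ID */
        struct sched_param param = { .sched_priority = 50 };

        if (sched_setscheduler(vcpu_tid, SCHED_FIFO, &param)) {
            perror("sched_setscheduler");
            return 1;
        }
        return 0;
    }
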
Device handling introduces additional considerations. Emulated PCI devices or
VirtIO devices require a counterpart on the host to complete requests. This
adds latency because the host must intercept and either process the request
directly or schedule a thread for its completion. These delays can be avoided if
the required PCI device is passed directly through to the guest. Some devices,
such as networking or storage controllers, support the PCIe SR-IOV feature.
SR-IOV allows a single PCIe device to be divided into multiple virtual functions,
which can then be assigned to different guests.

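Virtual functions are typically created through sysfs. A sketch using a
made-up PCI address (equivalent to echoing the count into the file from a
shell)::

    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/sys/bus/pci/devices/0000:01:00.0/sriov_numvfs",
                        "w");

        if (!f) {
            perror("fopen");
            return 1;
        }
        fputs("2", f);               /* create two virtual functions */
        fclose(f);
        return 0;
    }
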
Networking
----------

For low-latency networking, the full networking stack may be undesirable, as it
can introduce additional sources of delay. In this context, XDP can be used
as a shortcut to bypass much of the stack while still relying on the kernel's
network driver.

The requirements are that the network driver supports XDP (preferably using
an "skb pool") and that the application uses an XDP socket. Additional
configuration may involve BPF filters, tuning networking queues, or configuring
qdiscs for time-based transmission. These techniques are often
applied in Time-Sensitive Networking (TSN) environments.

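A minimal sketch of the starting point, creating the XDP socket itself (a
real application must additionally register a UMEM area, set up the RX/TX
and fill/completion rings, and bind to an interface queue; the libxdp
library wraps these steps)::

    #include <stdio.h>
    #include <sys/socket.h>

    int main(void)
    {
        int fd = socket(AF_XDP, SOCK_RAW, 0);

        if (fd < 0) {
            perror("socket(AF_XDP)");
            return 1;
        }
        /* XDP_UMEM_REG, ring setup and bind(2) would follow here,
         * using the constants from <linux/if_xdp.h>. */
        return 0;
    }
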
Documenting all required steps exceeds the scope of this text. For detailed
guidance, see the TSN documentation at https://tsn.readthedocs.io.

Another useful resource is the Linux Real-Time Communication Testbench at
https://github.com/Linutronix/RTC-Testbench.
The goal of this project is to validate real-time network communication. It can
be thought of as a "cyclictest" for networking and also serves as a starting
point for application development.
133