.. SPDX-License-Identifier: GPL-2.0

.. include:: <isonum.txt>

===============================
Bus lock detection and handling
===============================

:Copyright: |copy| 2021 Intel Corporation
:Authors: - Fenghua Yu <fenghua.yu@intel.com>
          - Tony Luck <tony.luck@intel.com>

Problem
=======

A split lock is any atomic operation whose operand crosses two cache lines.
Since the operand spans two cache lines and the operation must be atomic,
the system locks the bus while the CPU accesses the two cache lines.

A bus lock is acquired through either split locked access to writeback (WB)
memory or any locked access to non-WB memory. This is typically thousands of
cycles slower than an atomic operation within a cache line. It also stalls
memory accesses from other cores while the lock is held, which degrades
performance across the whole system.

Detection
=========

Intel processors may support either or both of the following hardware
mechanisms to detect split locks and bus locks. Some AMD processors also
support bus lock detect.

#AC exception for split lock detection
--------------------------------------

Beginning with the Tremont Atom CPU, an attempted split lock operation may
raise an Alignment Check (#AC) exception.

#DB exception for bus lock detection
------------------------------------

Some CPUs have the ability to notify the kernel by an #DB trap after a user
instruction acquires a bus lock and is executed. This allows the kernel to
terminate the application or to enforce throttling.

Software handling
=================

The kernel #AC and #DB handlers handle bus locks based on the kernel
parameter "split_lock_detect".
Here is a summary of the different options:

+------------------+----------------------------+-----------------------+
|split_lock_detect=|#AC for split lock          |#DB for bus lock       |
+------------------+----------------------------+-----------------------+
|off               |Do nothing                  |Do nothing             |
+------------------+----------------------------+-----------------------+
|warn              |Kernel OOPs                 |Warn once per task and |
|(default)         |Warn once per task, add a   |continue to run.       |
|                  |delay, add synchronization  |                       |
|                  |to prevent more than one    |                       |
|                  |core from executing a       |                       |
|                  |split lock in parallel.     |                       |
|                  |sysctl split_lock_mitigate  |                       |
|                  |can be used to avoid the    |                       |
|                  |delay and synchronization   |                       |
|                  |When both features are      |                       |
|                  |supported, warn in #AC      |                       |
+------------------+----------------------------+-----------------------+
|fatal             |Kernel OOPs                 |Send SIGBUS to user.   |
|                  |Send SIGBUS to user         |                       |
|                  |When both features are      |                       |
|                  |supported, fatal in #AC     |                       |
+------------------+----------------------------+-----------------------+
|ratelimit:N       |Do nothing                  |Limit bus lock rate to |
|(0 < N <= 1000)   |                            |N bus locks per second |
|                  |                            |system wide and warn on|
|                  |                            |bus locks.             |
+------------------+----------------------------+-----------------------+

Usages
======

Detecting and handling bus locks is useful in several areas:

It is critical for real time system designers who build consolidated real
time systems. These systems run hard real time code on some cores and run
"untrusted" user processes on other cores. The hard real time code cannot
afford to have any bus lock from the untrusted processes hurt real time
performance. To date the designers have been unable to deploy these
solutions as they have no way to prevent the "untrusted" user code from
generating split locks and bus locks that block the hard real time code
from accessing memory while the bus is locked.


It's also useful for general computing to prevent guests or user
applications from slowing down the overall system by executing instructions
that take a bus lock.


Guidance
========

off
---

Disable checking for split lock and bus lock. This option can be useful if
there are legacy applications that trigger these events at a low rate so
that mitigation is not needed.

warn
----

A warning is emitted when a bus lock is detected, which makes it possible
to identify the offending application. This is the default behavior.

fatal
-----

In this case, the bus lock is not tolerated and the process is killed.

ratelimit
---------

A system wide bus lock rate limit N is specified where 0 < N <= 1000. This
allows a bus lock rate up to N bus locks per second. When the bus lock rate
is exceeded, any task which is caught via the bus lock #DB exception is
throttled by enforced sleeps until the rate drops below the limit again.

This is an effective mitigation in cases where a minimal impact can be
tolerated, but an eventual Denial of Service attack has to be prevented. It
makes it possible to identify the offending processes and analyze whether
they are malicious or just badly written.

Selecting a rate limit of 1000 allows the bus to be locked for up to about
seven million cycles each second (assuming 7000 cycles for each bus
lock). On a 2 GHz processor that would be about 0.35% system slowdown.
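
The slowdown estimate above is simple arithmetic and can be checked for
other rate limits or clock speeds. The following is a minimal sketch, not
kernel code: the helper name ``bus_lock_slowdown`` is hypothetical, and the
7000 cycles per bus lock is the same assumption the text makes rather than
a measured value.

```python
def bus_lock_slowdown(rate_limit, cycles_per_lock, cpu_hz):
    """Return the worst-case fraction of each second spent under bus lock.

    rate_limit      -- N from split_lock_detect=ratelimit:N (0 < N <= 1000)
    cycles_per_lock -- assumed cost of one bus lock, in CPU cycles
    cpu_hz          -- CPU clock frequency in cycles per second
    """
    if not 0 < rate_limit <= 1000:
        raise ValueError("ratelimit N must satisfy 0 < N <= 1000")
    # Cycles per second during which the bus may be locked.
    locked_cycles = rate_limit * cycles_per_lock
    return locked_cycles / cpu_hz


# Figures from the text: N = 1000, 7000 cycles per lock (assumed), 2 GHz.
fraction = bus_lock_slowdown(1000, 7000, 2_000_000_000)
print(f"{fraction:.2%}")
```

With the figures from the text this prints 0.35%. Lowering N scales the
bound linearly, so under the same assumptions ratelimit:100 would bound the
slowdown at roughly a tenth of that.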