.. SPDX-License-Identifier: GPL-2.0

:Author: Chris Li <chrisl@kernel.org>, Kairui Song <kasong@tencent.com>

==========
Swap Table
==========

The swap table implements the swap cache as a per-cluster array of swap cache
values.

Swap Entry
----------

A swap entry contains the information required to serve an anonymous page
fault.

A swap entry is encoded in two parts: the swap type and the swap offset.

The swap type indicates which swap device to use.
The swap offset is the offset within the swap file from which to read the
page data.
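
The two-part encoding described above can be sketched in user-space C as
follows. This is only an illustration: every ``demo_``-prefixed name is
hypothetical, and the 5-bit type field and 64-bit entry width are assumptions
chosen for the example, not the layout the kernel actually uses.

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>

   /* Assumed layout: the top bits hold the swap type, the rest the offset. */
   #define DEMO_SWP_TYPE_BITS   5
   #define DEMO_SWP_OFFSET_BITS (64 - DEMO_SWP_TYPE_BITS)
   #define DEMO_SWP_OFFSET_MASK ((UINT64_C(1) << DEMO_SWP_OFFSET_BITS) - 1)

   /* Hypothetical stand-in for a swap entry: one opaque integer. */
   typedef struct {
       uint64_t val;
   } demo_swp_entry_t;

   /* Pack a swap type (which device) and a swap offset into one entry. */
   static inline demo_swp_entry_t demo_swp_entry(unsigned int type,
                                                 uint64_t offset)
   {
       demo_swp_entry_t entry;

       assert(type < (1u << DEMO_SWP_TYPE_BITS));
       assert(offset <= DEMO_SWP_OFFSET_MASK);
       entry.val = ((uint64_t)type << DEMO_SWP_OFFSET_BITS) | offset;
       return entry;
   }

   /* Recover the swap type: which swap device to use. */
   static inline unsigned int demo_swp_type(demo_swp_entry_t entry)
   {
       return (unsigned int)(entry.val >> DEMO_SWP_OFFSET_BITS);
   }

   /* Recover the swap offset: where in the swap file the page data lives. */
   static inline uint64_t demo_swp_offset(demo_swp_entry_t entry)
   {
       return entry.val & DEMO_SWP_OFFSET_MASK;
   }

Packing both parts into a single integer keeps the entry the size of one
machine word, which is what allows it to be used directly as a lookup key.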

Swap Cache
----------

The swap cache is a map that looks up folios using a swap entry as the key.
The resulting value can be one of three types, depending on which stage the
swap entry is in.

1. NULL: This swap entry is not used.

2. folio: A folio has been allocated and bound to this swap entry. This is
   the transient state during swap out or swap in. The data can be in the
   folio, in the swap file, or in both.

3. shadow: The shadow contains the working set information of the swapped
   out folio. This is the normal state for a swapped out page.
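
One way to hold all three kinds of value in a single pointer-sized slot is to
tag the low bit, in the same spirit as XArray value entries. The sketch below
assumes that scheme purely for illustration; the ``demo_`` names are
hypothetical and the real swap table encoding may differ.

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical stand-in for a folio; only its address matters here. */
   struct demo_folio {
       int id;
   };

   enum demo_swp_tb_kind {
       DEMO_SWP_TB_NULL,   /* swap entry is not used */
       DEMO_SWP_TB_FOLIO,  /* folio bound to the entry (swap in/out) */
       DEMO_SWP_TB_SHADOW, /* working set information of a swapped out folio */
   };

   /* Assumption: a set low bit marks a shadow; the payload sits above it. */
   static inline uintptr_t demo_shadow_to_swp_tb(uintptr_t shadow)
   {
       return (shadow << 1) | 1;
   }

   static inline uintptr_t demo_swp_tb_to_shadow(uintptr_t swp_tb)
   {
       return swp_tb >> 1;
   }

   /* Folio pointers are at least 2-byte aligned, so their low bit is clear. */
   static inline uintptr_t demo_folio_to_swp_tb(struct demo_folio *folio)
   {
       assert(((uintptr_t)folio & 1) == 0);
       return (uintptr_t)folio;
   }

   /* Classify a slot value into one of the three states listed above. */
   static inline enum demo_swp_tb_kind demo_swp_tb_classify(uintptr_t swp_tb)
   {
       if (!swp_tb)
           return DEMO_SWP_TB_NULL;
       if (swp_tb & 1)
           return DEMO_SWP_TB_SHADOW;
       return DEMO_SWP_TB_FOLIO;
   }

Because the state is carried in the value itself, a single read of the slot is
enough to tell which of the three cases applies.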

Swap Table Internals
--------------------

The previous swap cache was implemented with an XArray. The XArray is a tree
structure, so each lookup walks through multiple nodes. Can we do better?

Notice that most of the time when we look up the swap cache, we are in either
a swap-in or a swap-out path, so we should already have the swap cluster that
contains the swap entry.

If each cluster has its own array to store swap cache values, a swap cache
lookup within the cluster becomes a very simple array lookup.

We give such a per-cluster swap cache value array a name: the swap table.

A swap table is an array of pointers. Each pointer is the same size as a
PTE. The size of a swap table for one swap cluster typically matches a PTE
page table, which is one page on modern 64-bit systems.

With the swap table, swap cache lookups have better locality and are simpler
and faster.
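
The array lookup described in this section can be sketched as below. The
sketch assumes 512 slots per cluster (512 pointers of 8 bytes fill one 4 KiB
page on a typical 64-bit system); the slot count and all ``demo_`` names are
assumptions made for the example rather than kernel definitions.

.. code-block:: c

   #include <assert.h>
   #include <stdint.h>

   /* Assumed cluster size: 512 slots, one pointer-sized value per slot. */
   #define DEMO_CLUSTER_SLOTS 512

   /* Hypothetical cluster: carries its own swap table. */
   struct demo_swap_cluster {
       uintptr_t table[DEMO_CLUSTER_SLOTS];
   };

   /* Which cluster a swap offset falls into. */
   static inline uint64_t demo_cluster_index(uint64_t offset)
   {
       return offset / DEMO_CLUSTER_SLOTS;
   }

   /* Which slot of that cluster's table the swap offset maps to. */
   static inline unsigned int demo_slot_index(uint64_t offset)
   {
       return (unsigned int)(offset % DEMO_CLUSTER_SLOTS);
   }

   /* With the cluster already in hand, a lookup is one array access. */
   static inline uintptr_t
   demo_swap_table_get(const struct demo_swap_cluster *ci, uint64_t offset)
   {
       return ci->table[demo_slot_index(offset)];
   }

   static inline void demo_swap_table_set(struct demo_swap_cluster *ci,
                                          uint64_t offset, uintptr_t value)
   {
       ci->table[demo_slot_index(offset)] = value;
   }

Compared with walking a tree, the lookup touches a single array that belongs
to the cluster the caller is already working on, which is where the locality
benefit comes from.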

Locking
-------

Swap table modification requires taking the cluster lock. If a folio
is being added to or removed from the swap table, the folio must be
locked before the cluster lock is taken. Once the addition or removal is
done, the folio shall be unlocked.

Swap table lookups are protected by RCU and atomic reads. If a lookup
returns a folio, the caller must lock the folio before using it.
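
The lock ordering above can be illustrated with a user-space sketch built on
C11 atomics. It is a loose model under stated assumptions: the spin locks
stand in for the folio lock and the cluster lock, RCU is not modelled at all,
and every ``demo_`` name is hypothetical.

.. code-block:: c

   #include <assert.h>
   #include <stdatomic.h>
   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>

   #define DEMO_CLUSTER_SLOTS 512      /* assumed slots per cluster */

   struct demo_folio {
       atomic_bool lock;               /* stand-in for the folio lock */
       int id;
   };

   struct demo_swap_cluster {
       atomic_bool lock;               /* stand-in for the cluster lock */
       _Atomic uintptr_t table[DEMO_CLUSTER_SLOTS];
   };

   /* Minimal spin lock, used only to make the ordering visible. */
   static inline void demo_lock(atomic_bool *lock)
   {
       while (atomic_exchange_explicit(lock, true, memory_order_acquire))
           ;
   }

   static inline void demo_unlock(atomic_bool *lock)
   {
       atomic_store_explicit(lock, false, memory_order_release);
   }

   /* Modification: folio lock first, then cluster lock, then the store. */
   static void demo_swap_table_update(struct demo_swap_cluster *ci,
                                      unsigned int slot,
                                      struct demo_folio *folio, bool add)
   {
       uintptr_t value = add ? (uintptr_t)folio : 0;

       demo_lock(&folio->lock);
       demo_lock(&ci->lock);
       atomic_store_explicit(&ci->table[slot], value, memory_order_release);
       demo_unlock(&ci->lock);
       demo_unlock(&folio->lock);      /* unlock once the update is done */
   }

   /* Lookup: a lockless atomic read; the result is not yet safe to use. */
   static struct demo_folio *
   demo_swap_table_lookup(struct demo_swap_cluster *ci, unsigned int slot)
   {
       return (struct demo_folio *)atomic_load_explicit(&ci->table[slot],
                                                        memory_order_acquire);
   }

A caller that gets a folio back from ``demo_swap_table_lookup()`` would take
``demo_lock(&folio->lock)`` before touching the folio, mirroring the rule that
a folio returned by a lookup must be locked before use.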