1.. SPDX-License-Identifier: GPL-2.0-only 2.. Copyright (C) 2022 Red Hat, Inc. 3.. Copyright (C) 2022-2023 Isovalent, Inc. 4 5=============================================== 6BPF_MAP_TYPE_HASH, with PERCPU and LRU Variants 7=============================================== 8 9.. note:: 10 - ``BPF_MAP_TYPE_HASH`` was introduced in kernel version 3.19 11 - ``BPF_MAP_TYPE_PERCPU_HASH`` was introduced in version 4.6 12 - Both ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH`` 13 were introduced in version 4.10 14 15``BPF_MAP_TYPE_HASH`` and ``BPF_MAP_TYPE_PERCPU_HASH`` provide general 16purpose hash map storage. Both the key and the value can be structs, 17allowing for composite keys and values. 18 19The kernel is responsible for allocating and freeing key/value pairs, up 20to the max_entries limit that you specify. Hash maps use pre-allocation 21of hash table elements by default. The ``BPF_F_NO_PREALLOC`` flag can be 22used to disable pre-allocation when it is too memory expensive. 23 24``BPF_MAP_TYPE_PERCPU_HASH`` provides a separate value slot per 25CPU. The per-cpu values are stored internally in an array. 26 27The ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH`` 28variants add LRU semantics to their respective hash tables. An LRU hash 29will automatically evict the least recently used entries when the hash 30table reaches capacity. An LRU hash maintains an internal LRU list that 31is used to select elements for eviction. This internal LRU list is 32shared across CPUs but it is possible to request a per CPU LRU list with 33the ``BPF_F_NO_COMMON_LRU`` flag when calling ``bpf_map_create``. The 34following table outlines the properties of LRU maps depending on the a 35map type and the flags used to create the map. 36 37======================== ========================= ================================ 38Flag ``BPF_MAP_TYPE_LRU_HASH`` ``BPF_MAP_TYPE_LRU_PERCPU_HASH`` 39======================== ========================= ================================ 40**BPF_F_NO_COMMON_LRU** Per-CPU LRU, global map Per-CPU LRU, per-cpu map 41**!BPF_F_NO_COMMON_LRU** Global LRU, global map Global LRU, per-cpu map 42======================== ========================= ================================ 43 44Usage 45===== 46 47Kernel BPF 48---------- 49 50bpf_map_update_elem() 51~~~~~~~~~~~~~~~~~~~~~ 52 53.. code-block:: c 54 55 long bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags) 56 57Hash entries can be added or updated using the ``bpf_map_update_elem()`` 58helper. This helper replaces existing elements atomically. The ``flags`` 59parameter can be used to control the update behaviour: 60 61- ``BPF_ANY`` will create a new element or update an existing element 62- ``BPF_NOEXIST`` will create a new element only if one did not already 63 exist 64- ``BPF_EXIST`` will update an existing element 65 66``bpf_map_update_elem()`` returns 0 on success, or negative error in 67case of failure. 68 69bpf_map_lookup_elem() 70~~~~~~~~~~~~~~~~~~~~~ 71 72.. code-block:: c 73 74 void *bpf_map_lookup_elem(struct bpf_map *map, const void *key) 75 76Hash entries can be retrieved using the ``bpf_map_lookup_elem()`` 77helper. This helper returns a pointer to the value associated with 78``key``, or ``NULL`` if no entry was found. 79 80bpf_map_delete_elem() 81~~~~~~~~~~~~~~~~~~~~~ 82 83.. code-block:: c 84 85 long bpf_map_delete_elem(struct bpf_map *map, const void *key) 86 87Hash entries can be deleted using the ``bpf_map_delete_elem()`` 88helper. This helper will return 0 on success, or negative error in case 89of failure. 90 91Per CPU Hashes 92-------------- 93 94For ``BPF_MAP_TYPE_PERCPU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH`` 95the ``bpf_map_update_elem()`` and ``bpf_map_lookup_elem()`` helpers 96automatically access the hash slot for the current CPU. 97 98bpf_map_lookup_percpu_elem() 99~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 100 101.. code-block:: c 102 103 void *bpf_map_lookup_percpu_elem(struct bpf_map *map, const void *key, u32 cpu) 104 105The ``bpf_map_lookup_percpu_elem()`` helper can be used to lookup the 106value in the hash slot for a specific CPU. Returns value associated with 107``key`` on ``cpu`` , or ``NULL`` if no entry was found or ``cpu`` is 108invalid. 109 110Concurrency 111----------- 112 113Values stored in ``BPF_MAP_TYPE_HASH`` can be accessed concurrently by 114programs running on different CPUs. Since Kernel version 5.1, the BPF 115infrastructure provides ``struct bpf_spin_lock`` to synchronise access. 116See ``tools/testing/selftests/bpf/progs/test_spin_lock.c``. 117 118Userspace 119--------- 120 121bpf_map_get_next_key() 122~~~~~~~~~~~~~~~~~~~~~~ 123 124.. code-block:: c 125 126 int bpf_map_get_next_key(int fd, const void *cur_key, void *next_key) 127 128In userspace, it is possible to iterate through the keys of a hash using 129libbpf's ``bpf_map_get_next_key()`` function. The first key can be fetched by 130calling ``bpf_map_get_next_key()`` with ``cur_key`` set to 131``NULL``. Subsequent calls will fetch the next key that follows the 132current key. ``bpf_map_get_next_key()`` returns 0 on success, -ENOENT if 133cur_key is the last key in the hash, or negative error in case of 134failure. 135 136Note that if ``cur_key`` gets deleted then ``bpf_map_get_next_key()`` 137will instead return the *first* key in the hash table which is 138undesirable. It is recommended to use batched lookup if there is going 139to be key deletion intermixed with ``bpf_map_get_next_key()``. 140 141Examples 142======== 143 144Please see the ``tools/testing/selftests/bpf`` directory for functional 145examples. The code snippets below demonstrates API usage. 146 147This example shows how to declare an LRU Hash with a struct key and a 148struct value. 149 150.. code-block:: c 151 152 #include <linux/bpf.h> 153 #include <bpf/bpf_helpers.h> 154 155 struct key { 156 __u32 srcip; 157 }; 158 159 struct value { 160 __u64 packets; 161 __u64 bytes; 162 }; 163 164 struct { 165 __uint(type, BPF_MAP_TYPE_LRU_HASH); 166 __uint(max_entries, 32); 167 __type(key, struct key); 168 __type(value, struct value); 169 } packet_stats SEC(".maps"); 170 171This example shows how to create or update hash values using atomic 172instructions: 173 174.. code-block:: c 175 176 static void update_stats(__u32 srcip, int bytes) 177 { 178 struct key key = { 179 .srcip = srcip, 180 }; 181 struct value *value = bpf_map_lookup_elem(&packet_stats, &key); 182 183 if (value) { 184 __sync_fetch_and_add(&value->packets, 1); 185 __sync_fetch_and_add(&value->bytes, bytes); 186 } else { 187 struct value newval = { 1, bytes }; 188 189 bpf_map_update_elem(&packet_stats, &key, &newval, BPF_NOEXIST); 190 } 191 } 192 193Userspace walking the map elements from the map declared above: 194 195.. code-block:: c 196 197 #include <bpf/libbpf.h> 198 #include <bpf/bpf.h> 199 200 static void walk_hash_elements(int map_fd) 201 { 202 struct key *cur_key = NULL; 203 struct key next_key; 204 struct value value; 205 int err; 206 207 for (;;) { 208 err = bpf_map_get_next_key(map_fd, cur_key, &next_key); 209 if (err) 210 break; 211 212 bpf_map_lookup_elem(map_fd, &next_key, &value); 213 214 // Use key and value here 215 216 cur_key = &next_key; 217 } 218 } 219 220Internals 221========= 222 223This section of the document is targeted at Linux developers and describes 224aspects of the map implementations that are not considered stable ABI. The 225following details are subject to change in future versions of the kernel. 226 227``BPF_MAP_TYPE_LRU_HASH`` and variants 228-------------------------------------- 229 230Updating elements in LRU maps may trigger eviction behaviour when the capacity 231of the map is reached. There are various steps that the update algorithm 232attempts in order to enforce the LRU property which have increasing impacts on 233other CPUs involved in the following operation attempts: 234 235- Attempt to use CPU-local state to batch operations 236- Attempt to fetch free nodes from global lists 237- Attempt to pull any node from a global list and remove it from the hashmap 238- Attempt to pull any node from any CPU's list and remove it from the hashmap 239 240This algorithm is described visually in the following diagram. See the 241description in commit 3a08c2fd7634 ("bpf: LRU List") for a full explanation of 242the corresponding operations: 243 244.. kernel-figure:: map_lru_hash_update.dot 245 :alt: Diagram outlining the LRU eviction steps taken during map update. 246 247 LRU hash eviction during map update for ``BPF_MAP_TYPE_LRU_HASH`` and 248 variants. See the dot file source for kernel function name code references. 249 250Map updates start from the oval in the top right "begin ``bpf_map_update()``" 251and progress through the graph towards the bottom where the result may be 252either a successful update or a failure with various error codes. The key in 253the top right provides indicators for which locks may be involved in specific 254operations. This is intended as a visual hint for reasoning about how map 255contention may impact update operations, though the map type and flags may 256impact the actual contention on those locks, based on the logic described in 257the table above. For instance, if the map is created with type 258``BPF_MAP_TYPE_LRU_PERCPU_HASH`` and flags ``BPF_F_NO_COMMON_LRU`` then all map 259properties would be per-cpu. 260