.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

================================================
BPF_MAP_TYPE_ARRAY and BPF_MAP_TYPE_PERCPU_ARRAY
================================================

.. note::
   - ``BPF_MAP_TYPE_ARRAY`` was introduced in kernel version 3.19
   - ``BPF_MAP_TYPE_PERCPU_ARRAY`` was introduced in kernel version 4.6

``BPF_MAP_TYPE_ARRAY`` and ``BPF_MAP_TYPE_PERCPU_ARRAY`` provide generic array
storage. The key type is an unsigned 32-bit integer (4 bytes) and the map is
of constant size. The size of the array is defined in ``max_entries`` at
creation time. All array elements are pre-allocated and zero-initialized when
created. ``BPF_MAP_TYPE_PERCPU_ARRAY`` uses a different memory region for each
CPU whereas ``BPF_MAP_TYPE_ARRAY`` uses the same memory region. The value
stored can be of any size; however, all array elements are aligned to 8
bytes.

Since kernel 5.5, memory mapping may be enabled for ``BPF_MAP_TYPE_ARRAY`` by
setting the flag ``BPF_F_MMAPABLE``. The map definition is page-aligned and
starts on the first page. Sufficient page-sized and page-aligned blocks of
memory are allocated to store all array values, starting on the second page,
which in some cases will result in over-allocation of memory.
The benefit of using this is increased performance and ease of use since
userspace programs would not be required to use helper functions to access
and mutate data.

Usage
=====

Kernel BPF
----------

bpf_map_lookup_elem()
~~~~~~~~~~~~~~~~~~~~~

.. code-block:: c

    void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

Array elements can be retrieved using the ``bpf_map_lookup_elem()`` helper.
This helper returns a pointer into the array element, so to avoid data races
with userspace reading the value, the user must use primitives like
``__sync_fetch_and_add()`` when updating the value in-place.

bpf_map_update_elem()
~~~~~~~~~~~~~~~~~~~~~

.. code-block:: c

    long bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags)

Array elements can be updated using the ``bpf_map_update_elem()`` helper.

``bpf_map_update_elem()`` returns 0 on success, or negative error in case of
failure.

Since the array is of constant size, ``bpf_map_delete_elem()`` is not supported.
To clear an array element, you may use ``bpf_map_update_elem()`` to insert a
zero value to that index.
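
The contract described above (a fixed number of slots, in-place updates
through the pointer returned by a lookup, and clearing by writing zero instead
of deleting) can be sketched without a kernel. The block below is a plain C
model, not the kernel implementation: ``model_lookup()``, ``model_update()``,
``model_clear()``, ``model_count()`` and ``MODEL_MAX_ENTRIES`` are hypothetical
names, and the ``-EINVAL`` error code is an assumption of the model rather
than a documented return value of the real helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ENTRIES 256	/* plays the role of max_entries */

/* Hypothetical stand-in for an array map; static storage starts zeroed */
static long model_values[MODEL_MAX_ENTRIES];

/* Model of a lookup: pointer into the array, or NULL if out of range */
static long *model_lookup(uint32_t key)
{
	if (key >= MODEL_MAX_ENTRIES)
		return NULL;
	return &model_values[key];
}

/* Model of an update: overwrite the slot, 0 on success, negative on error */
static int model_update(uint32_t key, long value)
{
	long *slot = model_lookup(key);

	if (!slot)
		return -EINVAL;	/* error code chosen by the model */
	*slot = value;
	return 0;
}

/* There is no delete; clearing is an update with a zero value */
static int model_clear(uint32_t key)
{
	return model_update(key, 0);
}

/* In-place update through the looked-up pointer, using an atomic add */
static int model_count(uint32_t key, long delta)
{
	long *slot = model_lookup(key);

	if (!slot)
		return -EINVAL;
	__sync_fetch_and_add(slot, delta);
	return 0;
}
```

In a real BPF program the same pattern would use ``bpf_map_lookup_elem()``
and ``bpf_map_update_elem()`` with the signatures shown above; the model only
makes the fixed-size and clear-by-zero behaviour easy to try in isolation.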

Per CPU Array
-------------

Values stored in ``BPF_MAP_TYPE_ARRAY`` can be accessed by multiple programs
across different CPUs. To restrict storage to a single CPU, you may use a
``BPF_MAP_TYPE_PERCPU_ARRAY``.

When using a ``BPF_MAP_TYPE_PERCPU_ARRAY`` the ``bpf_map_update_elem()`` and
``bpf_map_lookup_elem()`` helpers automatically access the slot for the current
CPU.

bpf_map_lookup_percpu_elem()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: c

    void *bpf_map_lookup_percpu_elem(struct bpf_map *map, const void *key, u32 cpu)

The ``bpf_map_lookup_percpu_elem()`` helper can be used to look up the array
value for a specific CPU. It returns a pointer to the value on success, or
``NULL`` if no entry was found or ``cpu`` is invalid.

Concurrency
-----------

Since kernel version 5.1, the BPF infrastructure provides ``struct bpf_spin_lock``
to synchronize access.

Userspace
---------

Access from userspace uses libbpf APIs with the same names as above, with
the map identified by its ``fd``.

Examples
========

Please see the ``tools/testing/selftests/bpf`` directory for functional
examples.
The code samples below demonstrate API usage.

Kernel BPF
----------

This snippet shows how to declare an array in a BPF program.

.. code-block:: c

    struct {
            __uint(type, BPF_MAP_TYPE_ARRAY);
            __type(key, u32);
            __type(value, long);
            __uint(max_entries, 256);
    } my_map SEC(".maps");

This example BPF program shows how to access an array element.

.. code-block:: c

    int bpf_prog(struct __sk_buff *skb)
    {
            struct iphdr ip;
            u32 index;
            long *value;

            /* Copy the IP header that follows the Ethernet header */
            if (bpf_skb_load_bytes(skb, ETH_HLEN, &ip, sizeof(ip)) < 0)
                    return 0;

            /* The IP protocol number selects the array element */
            index = ip.protocol;
            value = bpf_map_lookup_elem(&my_map, &index);
            /* Atomic add, so concurrent in-place updates are not lost */
            if (value)
                    __sync_fetch_and_add(value, skb->len);

            return 0;
    }

Userspace
---------

BPF_MAP_TYPE_ARRAY
~~~~~~~~~~~~~~~~~~

This snippet shows how to create an array, using ``bpf_map_create_opts`` to
set flags.

.. code-block:: c

    #include <bpf/libbpf.h>
    #include <bpf/bpf.h>

    int create_array(void)
    {
            int fd;
            LIBBPF_OPTS(bpf_map_create_opts, opts, .map_flags = BPF_F_MMAPABLE);

            fd = bpf_map_create(BPF_MAP_TYPE_ARRAY,
                                "example_array",       /* name */
                                sizeof(__u32),         /* key size */
                                sizeof(long),          /* value size */
                                256,                   /* max entries */
                                &opts);                /* create opts */
            return fd;
    }

This snippet shows how to initialize the elements of an array.

.. code-block:: c

    int initialize_array(int fd)
    {
            __u32 i;
            long value;
            int ret;

            /* Store the index itself as the value of each element */
            for (i = 0; i < 256; i++) {
                    value = i;
                    ret = bpf_map_update_elem(fd, &i, &value, BPF_ANY);
                    if (ret < 0)
                            return ret;
            }

            return ret;
    }

This snippet shows how to retrieve an element value from an array.

.. code-block:: c

    #include <assert.h>

    int lookup(int fd)
    {
            __u32 index = 42;
            long value;
            int ret;

            /* The element is copied into value */
            ret = bpf_map_lookup_elem(fd, &index, &value);
            if (ret < 0)
                    return ret;

            /* use value here */
            assert(value == 42);

            return ret;
    }

BPF_MAP_TYPE_PERCPU_ARRAY
~~~~~~~~~~~~~~~~~~~~~~~~~

This snippet shows how to initialize the elements of a per CPU array.

.. code-block:: c

    int initialize_array(int fd)
    {
            int ncpus = libbpf_num_possible_cpus();
            long values[ncpus];
            __u32 i, j;
            int ret;

            for (i = 0; i < 256 ; i++) {
                    /* One slot per possible CPU, all set to the index */
                    for (j = 0; j < ncpus; j++)
                            values[j] = i;
                    ret = bpf_map_update_elem(fd, &i, values, BPF_ANY);
                    if (ret < 0)
                            return ret;
            }

            return ret;
    }

This snippet shows how to access the per CPU elements of an array value.

.. code-block:: c

    #include <assert.h>

    int lookup(int fd)
    {
            int ncpus = libbpf_num_possible_cpus();
            __u32 index = 42, j;
            long values[ncpus];
            int ret;

            /* Copies the slots of all possible CPUs into values */
            ret = bpf_map_lookup_elem(fd, &index, values);
            if (ret < 0)
                    return ret;

            for (j = 0; j < ncpus; j++) {
                    /* Use per CPU value here */
                    assert(values[j] == 42);
            }

            return ret;
    }

Semantics
=========

As shown in the example above, when accessing a ``BPF_MAP_TYPE_PERCPU_ARRAY``
in userspace, each value is an array with ``ncpus`` elements.

When calling ``bpf_map_update_elem()`` the flag ``BPF_NOEXIST`` cannot be used
for these maps.
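
The size of the userspace buffer for a per CPU array can be sketched as
follows. Since array elements are aligned to 8 bytes, the buffer passed to a
lookup or update is expected to hold one slot per possible CPU, with each slot
being the value size rounded up to a multiple of 8. ``round_up_8()`` and
``percpu_value_buf_size()`` are hypothetical helper names used only for this
illustration, and ``ncpus`` is assumed to come from
``libbpf_num_possible_cpus()``.

```c
#include <assert.h>
#include <stddef.h>

/* Round a size up to the next multiple of 8 bytes */
static size_t round_up_8(size_t size)
{
	return (size + 7) & ~(size_t)7;
}

/*
 * Hypothetical helper: number of bytes a userspace buffer needs for one
 * lookup or update on a per CPU array with the given value size.
 */
static size_t percpu_value_buf_size(size_t value_size, int ncpus)
{
	assert(ncpus > 0);	/* the caller must have checked for errors */
	return round_up_8(value_size) * (size_t)ncpus;
}
```

For value sizes that are already a multiple of 8, such as an 8-byte ``long``,
a plain array of ``ncpus`` values has exactly this size. For smaller values,
for example a 4-byte counter, each per CPU slot still occupies 8 bytes, so the
buffer should be laid out accordingly.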