.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

================================================
BPF_MAP_TYPE_ARRAY and BPF_MAP_TYPE_PERCPU_ARRAY
================================================

.. note::
   - ``BPF_MAP_TYPE_ARRAY`` was introduced in kernel version 3.19
   - ``BPF_MAP_TYPE_PERCPU_ARRAY`` was introduced in version 4.6

``BPF_MAP_TYPE_ARRAY`` and ``BPF_MAP_TYPE_PERCPU_ARRAY`` provide generic array
storage. The key type is an unsigned 32-bit integer (4 bytes) and the map is
of constant size. The size of the array is defined in ``max_entries`` at
creation time. All array elements are pre-allocated and zero-initialized when
the map is created. ``BPF_MAP_TYPE_PERCPU_ARRAY`` uses a different memory
region for each CPU, whereas ``BPF_MAP_TYPE_ARRAY`` uses a single memory region
shared by all CPUs. The value stored can be of any size; however, all array
elements are aligned to 8 bytes.

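The 8-byte alignment means that the storage consumed per element is the value
size rounded up to a multiple of 8. The following is a minimal sketch of that
arithmetic. ``elem_stride()`` and ``array_bytes()`` are hypothetical helper
names used only for illustration; they are not kernel or libbpf APIs.

```c
#include <stdint.h>

/* Hypothetical helper: round a value size up to the next multiple of 8. */
static inline uint64_t elem_stride(uint32_t value_size)
{
        return ((uint64_t)value_size + 7) & ~(uint64_t)7;
}

/* Hypothetical helper: bytes needed to hold max_entries aligned elements. */
static inline uint64_t array_bytes(uint32_t value_size, uint32_t max_entries)
{
        return elem_stride(value_size) * max_entries;
}
```

For instance, a 12-byte value occupies 16 bytes per element under this rule.
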
Since kernel 5.5, memory mapping may be enabled for ``BPF_MAP_TYPE_ARRAY`` by
setting the flag ``BPF_F_MMAPABLE``. The map definition is page-aligned and
starts on the first page. Sufficient page-sized and page-aligned blocks of
memory are allocated to store all array values, starting on the second page,
which in some cases will result in over-allocation of memory. The benefit of
using this is increased performance and ease of use, since userspace programs
do not need to use helper functions to access and mutate data.

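As an illustration of the layout described above, the sketch below computes
how many bytes of array values can be mapped and then maps them with
``mmap(2)``. It assumes that the values region is the 8-byte-aligned element
size multiplied by ``max_entries``, rounded up to the page size, and that
``fd`` refers to a ``BPF_MAP_TYPE_ARRAY`` created with ``BPF_F_MMAPABLE``.
``array_mmap_len()`` and ``array_mmap()`` are hypothetical helper names, not
libbpf APIs.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: bytes spanned by the page-aligned values region. */
static size_t array_mmap_len(uint32_t value_size, uint32_t max_entries,
                             size_t page_size)
{
        size_t stride = ((size_t)value_size + 7) & ~(size_t)7;
        size_t len = stride * max_entries;

        /* Round up to a whole number of pages. */
        return (len + page_size - 1) / page_size * page_size;
}

/* Hypothetical helper: map the values; returns MAP_FAILED on error. */
static void *array_mmap(int fd, uint32_t value_size, uint32_t max_entries)
{
        size_t len = array_mmap_len(value_size, max_entries,
                                    (size_t)sysconf(_SC_PAGESIZE));

        return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

With such a mapping, element ``i`` would be found at byte offset ``i`` times
the aligned element size from the start of the returned address.
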
Usage
=====

Kernel BPF
----------

bpf_map_lookup_elem()
~~~~~~~~~~~~~~~~~~~~~

.. code-block:: c

   void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

Array elements can be retrieved using the ``bpf_map_lookup_elem()`` helper.
This helper returns a pointer to the array element, or ``NULL`` if the index is
out of range. Because the pointer refers to the stored value itself, the user
must use primitives like ``__sync_fetch_and_add()`` when updating the value in
place, to avoid data races with userspace reading the value.

bpf_map_update_elem()
~~~~~~~~~~~~~~~~~~~~~

.. code-block:: c

   long bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags)

Array elements can be updated using the ``bpf_map_update_elem()`` helper.

``bpf_map_update_elem()`` returns 0 on success, or a negative error code on
failure.

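The failure cases can be summarized with a small userspace model. This is a
simplified sketch, not the kernel implementation:
``model_array_update_check()`` is a hypothetical name, the ``MODEL_*``
constants are local stand-ins assumed to mirror the values of ``BPF_ANY``,
``BPF_NOEXIST``, ``BPF_EXIST`` and ``BPF_F_LOCK`` in the UAPI header, and the
specific error codes are assumptions based on the array map update path in
``kernel/bpf/arraymap.c``. Validation of ``BPF_F_LOCK`` against the map layout
is omitted.

```c
#include <errno.h>
#include <stdint.h>

/* Local stand-ins assumed to mirror the UAPI flag values. */
enum {
        MODEL_BPF_ANY     = 0,
        MODEL_BPF_NOEXIST = 1,
        MODEL_BPF_EXIST   = 2,
        MODEL_BPF_F_LOCK  = 4,
};

/* Hypothetical model of the checks made before an array element is written. */
static int model_array_update_check(uint32_t index, uint32_t max_entries,
                                    uint64_t flags)
{
        /* Only ANY, NOEXIST or EXIST, optionally combined with F_LOCK. */
        if ((flags & ~(uint64_t)MODEL_BPF_F_LOCK) > MODEL_BPF_EXIST)
                return -EINVAL;

        /* The array cannot grow, so out-of-range indexes are rejected. */
        if (index >= max_entries)
                return -E2BIG;

        /* Every element already exists, so "create only" cannot succeed. */
        if (flags & MODEL_BPF_NOEXIST)
                return -EEXIST;

        return 0;
}
```
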
Since the array is of constant size, ``bpf_map_delete_elem()`` is not supported.
To clear an array element, you may use ``bpf_map_update_elem()`` to write a
zero value to that index.

Per CPU Array
-------------

Values stored in ``BPF_MAP_TYPE_ARRAY`` can be accessed by multiple programs
across different CPUs. To restrict storage to a single CPU, you may use a
``BPF_MAP_TYPE_PERCPU_ARRAY``.

When using a ``BPF_MAP_TYPE_PERCPU_ARRAY``, the ``bpf_map_update_elem()`` and
``bpf_map_lookup_elem()`` helpers automatically access the slot for the current
CPU.

bpf_map_lookup_percpu_elem()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: c

   void *bpf_map_lookup_percpu_elem(struct bpf_map *map, const void *key, u32 cpu)

The ``bpf_map_lookup_percpu_elem()`` helper can be used to look up the array
value for a specific CPU. It returns a pointer to the value on success, or
``NULL`` if no entry was found or ``cpu`` is invalid.

Concurrency
-----------

Since kernel version 5.1, the BPF infrastructure provides ``struct bpf_spin_lock``
to synchronize access.

Userspace
---------

Access from userspace uses libbpf APIs with the same names as above, with
the map identified by its ``fd``.

Examples
========

Please see the ``tools/testing/selftests/bpf`` directory for functional
examples. The code samples below demonstrate API usage.

Kernel BPF
----------

This snippet shows how to declare an array in a BPF program.

.. code-block:: c

    struct {
            __uint(type, BPF_MAP_TYPE_ARRAY);
            __type(key, u32);
            __type(value, long);
            __uint(max_entries, 256);
    } my_map SEC(".maps");


This example BPF program shows how to access an array element.

.. code-block:: c

    int bpf_prog(struct __sk_buff *skb)
    {
            struct iphdr ip;
            u32 index;      /* array keys are unsigned 32-bit integers */
            long *value;

            /* Copy the IPv4 header that follows the Ethernet header. */
            if (bpf_skb_load_bytes(skb, ETH_HLEN, &ip, sizeof(ip)) < 0)
                    return 0;

            /* Use the IP protocol number (0-255) as the array index. */
            index = ip.protocol;
            value = bpf_map_lookup_elem(&my_map, &index);
            if (value)
                    /* Atomically add the packet length to the counter. */
                    __sync_fetch_and_add(value, skb->len);

            return 0;
    }

Userspace
---------

BPF_MAP_TYPE_ARRAY
~~~~~~~~~~~~~~~~~~

This snippet shows how to create an array, using ``bpf_map_create_opts`` to
set flags.

.. code-block:: c

    #include <bpf/libbpf.h>
    #include <bpf/bpf.h>

    int create_array(void)
    {
            int fd;
            LIBBPF_OPTS(bpf_map_create_opts, opts, .map_flags = BPF_F_MMAPABLE);

            fd = bpf_map_create(BPF_MAP_TYPE_ARRAY,
                                "example_array",       /* name */
                                sizeof(__u32),         /* key size */
                                sizeof(long),          /* value size */
                                256,                   /* max entries */
                                &opts);                /* create opts */
            return fd;
    }

This snippet shows how to initialize the elements of an array.

.. code-block:: c

    int initialize_array(int fd)
    {
            __u32 i;
            long value;
            int ret;

            for (i = 0; i < 256; i++) {
                    value = i;
                    ret = bpf_map_update_elem(fd, &i, &value, BPF_ANY);
                    if (ret < 0)
                            return ret;
            }

            return ret;
    }

This snippet shows how to retrieve an element value from an array.

.. code-block:: c

    #include <assert.h>

    int lookup(int fd)
    {
            __u32 index = 42;
            long value;
            int ret;

            ret = bpf_map_lookup_elem(fd, &index, &value);
            if (ret < 0)
                    return ret;

            /* use value here */
            assert(value == 42);

            return ret;
    }

BPF_MAP_TYPE_PERCPU_ARRAY
~~~~~~~~~~~~~~~~~~~~~~~~~

This snippet shows how to initialize the elements of a per CPU array.

.. code-block:: c

    int initialize_array(int fd)
    {
            int ncpus = libbpf_num_possible_cpus();
            __u32 i;
            int j, ret;

            /* libbpf_num_possible_cpus() returns a negative error on failure */
            if (ncpus < 0)
                    return ncpus;

            long values[ncpus];

            for (i = 0; i < 256; i++) {
                    /* Store the same value in every CPU's slot. */
                    for (j = 0; j < ncpus; j++)
                            values[j] = i;
                    ret = bpf_map_update_elem(fd, &i, &values, BPF_ANY);
                    if (ret < 0)
                            return ret;
            }

            return ret;
    }

This snippet shows how to access the per CPU elements of an array value.

.. code-block:: c

    int lookup(int fd)
    {
            int ncpus = libbpf_num_possible_cpus();
            __u32 index = 42;
            int j, ret;

            /* libbpf_num_possible_cpus() returns a negative error on failure */
            if (ncpus < 0)
                    return ncpus;

            long values[ncpus];

            ret = bpf_map_lookup_elem(fd, &index, &values);
            if (ret < 0)
                    return ret;

            for (j = 0; j < ncpus; j++) {
                    /* Use per CPU value here */
                    assert(values[j] == 42);
            }

            return ret;
    }

Semantics
=========

As shown in the example above, when accessing a ``BPF_MAP_TYPE_PERCPU_ARRAY``
in userspace, each value is an array with ``ncpus`` elements.

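Because each per-CPU slot is aligned to 8 bytes, a userspace buffer is
commonly sized as the value size rounded up to 8, multiplied by the number of
possible CPUs. The sketch below shows that sizing and how per-CPU counters
might be aggregated after a lookup. ``percpu_buf_size()`` and ``percpu_sum()``
are hypothetical helper names, and the values are assumed to be 64-bit
counters.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes needed to receive one value for every CPU. */
static size_t percpu_buf_size(uint32_t value_size, int ncpus)
{
        size_t stride = ((size_t)value_size + 7) & ~(size_t)7;

        return stride * (size_t)ncpus;
}

/* Hypothetical helper: add up the per-CPU copies of a 64-bit counter. */
static uint64_t percpu_sum(const uint64_t *values, int ncpus)
{
        uint64_t total = 0;

        for (int cpu = 0; cpu < ncpus; cpu++)
                total += values[cpu];

        return total;
}
```

Note that under this rule a 4-byte value still occupies 8 bytes per CPU, so an
array of 4-byte integers would be too small to receive the result.
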
When calling ``bpf_map_update_elem()``, the flag ``BPF_NOEXIST`` cannot be used
for these maps: every element already exists, so such an update always fails.