=====================
BPF Type Format (BTF)
=====================

1. Introduction
===============

BTF (BPF Type Format) is the metadata format which encodes the debug info
related to BPF programs and maps. The name BTF was used initially to describe
data types. BTF was later extended to include function info for defined
subroutines, and line info for source/line information.

The debug info is used for map pretty print, function signatures, etc. The
function signature enables better bpf program/function kernel symbols. The line
info helps generate source annotated translated byte code, jited code and
verifier log.

The BTF specification contains two parts,
  * BTF kernel API
  * BTF ELF file format

The kernel API is the contract between user space and kernel. The kernel
verifies the BTF info before using it. The ELF file format is a user space
contract between ELF file and libbpf loader.

The type and string sections are part of the BTF kernel API, describing the
debug info (mostly types related) referenced by the bpf program. These two
sections are discussed in detail in :ref:`BTF_Type_String`.

.. _BTF_Type_String:

2. BTF Type and String Encoding
===============================

The file ``include/uapi/linux/btf.h`` provides a high-level definition of how
types/strings are encoded.

The beginning of the data blob must be::

    struct btf_header {
        __u16   magic;
        __u8    version;
        __u8    flags;
        __u32   hdr_len;

        /* All offsets are in bytes relative to the end of this header */
        __u32   type_off;       /* offset of type section       */
        __u32   type_len;       /* length of type section       */
        __u32   str_off;        /* offset of string section     */
        __u32   str_len;        /* length of string section     */
    };

The magic is ``0xeB9F``, which has different encoding for big and little
endian systems, and can be used to test whether BTF is generated for a big- or
little-endian target. The ``btf_header`` is designed to be extensible with
``hdr_len`` equal to ``sizeof(struct btf_header)`` when a data blob is
generated.
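
As a rough illustration of the header rules above, the following sketch checks
the magic and locates the type and string sections of a blob in host byte
order. It is not kernel or libbpf code: ``struct btf_header_mirror`` is a local
copy of the layout shown above, ``struct hdr_view`` and ``btf_hdr_parse()`` are
hypothetical names, and the bounds checks are one conservative choice rather
than the kernel's exact validation::

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define BTF_MAGIC 0xeB9F

    /* Local mirror of struct btf_header as shown above. */
    struct btf_header_mirror {
        uint16_t magic;
        uint8_t  version;
        uint8_t  flags;
        uint32_t hdr_len;
        uint32_t type_off;
        uint32_t type_len;
        uint32_t str_off;
        uint32_t str_len;
    };

    /* Hypothetical result: where the two sections live inside the blob. */
    struct hdr_view {
        const uint8_t *types;
        uint32_t type_len;
        const char *strs;
        uint32_t str_len;
    };

    /*
     * Returns 0 on success, -1 if the blob is too short or inconsistent,
     * and -2 if the magic is byte-swapped, i.e. the blob was generated
     * for a target of the other endianness.
     */
    static int btf_hdr_parse(const void *data, size_t len, struct hdr_view *out)
    {
        struct btf_header_mirror h;
        size_t body;

        if (len < sizeof(h))
            return -1;
        memcpy(&h, data, sizeof(h));
        if (h.magic == 0x9FeB)
            return -2;
        if (h.magic != BTF_MAGIC || h.hdr_len < sizeof(h) || h.hdr_len > len)
            return -1;

        /* All offsets are relative to the end of the header. */
        body = len - h.hdr_len;
        if (h.type_off > body || h.type_len > body - h.type_off ||
            h.str_off > body || h.str_len > body - h.str_off)
            return -1;

        out->types = (const uint8_t *)data + h.hdr_len + h.type_off;
        out->type_len = h.type_len;
        out->strs = (const char *)data + h.hdr_len + h.str_off;
        out->str_len = h.str_len;
        return 0;
    }

Using ``hdr_len`` from the blob, rather than ``sizeof()``, to find the end of
the header is what keeps such a reader working if the header grows.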

2.1 String Encoding
-------------------

The first string in the string section must be a null string. The rest of
the string table is a concatenation of other null-terminated strings.
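
A ``name_off`` value is simply a byte offset into this table. The sketch below
shows one way to resolve such an offset defensively; ``btf_str_at()`` is a
hypothetical helper, not an existing kernel or libbpf function::

    #include <stddef.h>
    #include <stdint.h>

    /*
     * Return the string starting at byte offset off in a string section
     * of str_len bytes. Returns NULL if off is out of range, or if the
     * section does not start with the null string and end in a NUL byte.
     */
    static const char *btf_str_at(const char *strs, uint32_t str_len,
                                  uint32_t off)
    {
        if (str_len == 0 || strs[0] != '\0' || strs[str_len - 1] != '\0')
            return NULL;
        if (off >= str_len)
            return NULL;
        return strs + off;
    }

Because the table must end in a NUL byte, any in-range offset yields a
terminated string, and offset ``0`` always yields the empty string.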

2.2 Type Encoding
-----------------

The type id ``0`` is reserved for ``void`` type. The type section is parsed
sequentially and a type id is assigned to each recognized type starting from id
``1``. Currently, the following types are supported::

    #define BTF_KIND_INT            1       /* Integer      */
    #define BTF_KIND_PTR            2       /* Pointer      */
    #define BTF_KIND_ARRAY          3       /* Array        */
    #define BTF_KIND_STRUCT         4       /* Struct       */
    #define BTF_KIND_UNION          5       /* Union        */
    #define BTF_KIND_ENUM           6       /* Enumeration  */
    #define BTF_KIND_FWD            7       /* Forward      */
    #define BTF_KIND_TYPEDEF        8       /* Typedef      */
    #define BTF_KIND_VOLATILE       9       /* Volatile     */
    #define BTF_KIND_CONST          10      /* Const        */
    #define BTF_KIND_RESTRICT       11      /* Restrict     */
    #define BTF_KIND_FUNC           12      /* Function     */
    #define BTF_KIND_FUNC_PROTO     13      /* Function Proto       */
    #define BTF_KIND_VAR            14      /* Variable     */
    #define BTF_KIND_DATASEC        15      /* Section      */
    #define BTF_KIND_FLOAT          16      /* Floating point       */
    #define BTF_KIND_DECL_TAG       17      /* Decl Tag     */
    #define BTF_KIND_TYPE_TAG       18      /* Type Tag     */

Note that the type section encodes debug info, not just pure types.
``BTF_KIND_FUNC`` is not a type, and it represents a defined subprogram.

Each type contains the following common data::

    struct btf_type {
        __u32 name_off;
        /* "info" bits arrangement
         * bits  0-15: vlen (e.g. # of struct's members)
         * bits 16-23: unused
         * bits 24-28: kind (e.g. int, ptr, array...etc)
         * bits 29-30: unused
         * bit     31: kind_flag, currently used by
         *             struct, union and fwd
         */
        __u32 info;
        /* "size" is used by INT, ENUM, STRUCT and UNION.
         * "size" tells the size of the type it is describing.
         *
         * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
         * FUNC, FUNC_PROTO, DECL_TAG and TYPE_TAG.
         * "type" is a type_id referring to another type.
         */
        union {
                __u32 size;
                __u32 type;
        };
    };

For certain kinds, the common data are followed by kind-specific data. The
``name_off`` in ``struct btf_type`` specifies the offset in the string table.
The following sections detail the encoding of each kind.
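
The ``info`` bit arrangement described in the comment above can be unpacked
with a few shifts and masks. The helpers below are a minimal sketch with
hypothetical names (``info_vlen()``, ``info_kind()``, ``info_kind_flag()``,
``info_make()``); they follow the documented bit positions only::

    #include <stdint.h>

    /* bits 0-15: vlen */
    static inline uint32_t info_vlen(uint32_t info)
    {
        return info & 0xffff;
    }

    /* bits 24-28: kind */
    static inline uint32_t info_kind(uint32_t info)
    {
        return (info >> 24) & 0x1f;
    }

    /* bit 31: kind_flag */
    static inline uint32_t info_kind_flag(uint32_t info)
    {
        return info >> 31;
    }

    /* Pack the three fields back into an info word; unused bits stay 0. */
    static inline uint32_t info_make(uint32_t kind, uint32_t vlen,
                                     uint32_t kind_flag)
    {
        return ((kind_flag & 1) << 31) | ((kind & 0x1f) << 24) |
               (vlen & 0xffff);
    }

For example, a struct (kind 4) with three members and ``kind_flag`` set packs
to ``0x84000003``.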

2.2.1 BTF_KIND_INT
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: any valid offset
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_INT
 * ``info.vlen``: 0
 * ``size``: the size of the int type in bytes.

``btf_type`` is followed by a ``u32`` with the following bits arrangement::

  #define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
  #define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
  #define BTF_INT_BITS(VAL)       ((VAL)  & 0x000000ff)

The ``BTF_INT_ENCODING`` has the following attributes::

  #define BTF_INT_SIGNED  (1 << 0)
  #define BTF_INT_CHAR    (1 << 1)
  #define BTF_INT_BOOL    (1 << 2)

The ``BTF_INT_ENCODING()`` provides extra information: signedness, char, or
bool, for the int type. The char and bool encoding are mostly useful for
pretty print. At most one encoding can be specified for the int type.

The ``BTF_INT_BITS()`` specifies the number of actual bits held by this int
type. For example, a 4-bit bitfield encodes ``BTF_INT_BITS()`` equal to 4.
The ``btf_type.size * 8`` must be equal to or greater than ``BTF_INT_BITS()``
for the type. The maximum value of ``BTF_INT_BITS()`` is 128.

The ``BTF_INT_OFFSET()`` specifies the starting bit offset to calculate values
for this int. For example, a bitfield struct member has:

 * btf member bit offset 100 from the start of the structure,
 * btf member pointing to an int type,
 * the int type has ``BTF_INT_OFFSET() = 2`` and ``BTF_INT_BITS() = 4``

Then in the struct memory layout, this member will occupy ``4`` bits starting
from bits ``100 + 2 = 102``.

Alternatively, the bitfield struct member can be the following to access the
same bits as the above:

 * btf member bit offset 102,
 * btf member pointing to an int type,
 * the int type has ``BTF_INT_OFFSET() = 0`` and ``BTF_INT_BITS() = 4``

The original intention of ``BTF_INT_OFFSET()`` is to provide flexibility of
bitfield encoding. Currently, both llvm and pahole generate
``BTF_INT_OFFSET() = 0`` for all int types.
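
The rules above for the trailing ``u32`` can be sketched as follows. The three
macros are copied from this section; ``int_data_ok()`` and
``int_member_first_bit()`` are hypothetical helpers that cover only the
constraints stated here, which may be fewer than the kernel enforces::

    #include <stdbool.h>
    #include <stdint.h>

    #define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
    #define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
    #define BTF_INT_BITS(VAL)       ((VAL)  & 0x000000ff)

    /* Check the encoding and bit-count rules for an int of size bytes. */
    static bool int_data_ok(uint32_t size, uint32_t data)
    {
        uint32_t enc = BTF_INT_ENCODING(data);
        uint32_t bits = BTF_INT_BITS(data);

        if (enc & (enc - 1))        /* more than one encoding bit set */
            return false;
        if (bits > 128 || bits > (uint64_t)size * 8)
            return false;
        return true;
    }

    /* First bit of a member's value, counted from the start of the struct. */
    static uint32_t int_member_first_bit(uint32_t member_bit_off, uint32_t data)
    {
        return member_bit_off + BTF_INT_OFFSET(data);
    }

With the bitfield example from this section, a member at bit offset 100 whose
int type has ``BTF_INT_OFFSET() = 2`` starts at bit 102. A plain signed 32-bit
``int`` would carry the value ``0x01000020``.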

2.2.2 BTF_KIND_PTR
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_PTR
  * ``info.vlen``: 0
  * ``type``: the pointee type of the pointer

No additional type data follow ``btf_type``.

2.2.3 BTF_KIND_ARRAY
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_ARRAY
  * ``info.vlen``: 0
  * ``size/type``: 0, not used

``btf_type`` is followed by one ``struct btf_array``::

    struct btf_array {
        __u32   type;
        __u32   index_type;
        __u32   nelems;
    };

The ``struct btf_array`` encoding:
  * ``type``: the element type
  * ``index_type``: the index type
  * ``nelems``: the number of elements for this array (``0`` is also allowed).

The ``index_type`` can be any regular int type (``u8``, ``u16``, ``u32``,
``u64``, ``unsigned __int128``). The original design of including
``index_type`` follows DWARF, which has an ``index_type`` for its array type.
Currently in BTF, beyond type verification, the ``index_type`` is not used.

The ``struct btf_array`` allows chaining through element type to represent
multidimensional arrays. For example, for ``int a[5][6]``, the following type
information illustrates the chaining:

  * [1]: int
  * [2]: array, ``btf_array.type = [1]``, ``btf_array.nelems = 6``
  * [3]: array, ``btf_array.type = [2]``, ``btf_array.nelems = 5``

Currently, both pahole and llvm collapse a multidimensional array into a
one-dimensional array, e.g., for ``a[5][6]``, the ``btf_array.nelems`` is
equal to ``30``. This is because the original use case is map pretty print
where the whole array is dumped out so a one-dimensional array is enough. As
more BTF usage is explored, pahole and llvm can be changed to generate a proper
chained representation for multidimensional arrays.

2.2.4 BTF_KIND_STRUCT
~~~~~~~~~~~~~~~~~~~~~
2.2.5 BTF_KIND_UNION
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0 or offset to a valid C identifier
  * ``info.kind_flag``: 0 or 1
  * ``info.kind``: BTF_KIND_STRUCT or BTF_KIND_UNION
  * ``info.vlen``: the number of struct/union members
  * ``size``: the size of the struct/union in bytes

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_member``.::

    struct btf_member {
        __u32   name_off;
        __u32   type;
        __u32   offset;
    };

``struct btf_member`` encoding:
  * ``name_off``: offset to a valid C identifier
  * ``type``: the member type
  * ``offset``: <see below>

If the type info ``kind_flag`` is not set, the offset contains only the bit
offset of the member. Note that the base type of the bitfield can only be int
or enum type. If the bitfield size is 32, the base type can be either int or
enum type. If the bitfield size is not 32, the base type must be int, and int
type ``BTF_INT_BITS()`` encodes the bitfield size.

If the ``kind_flag`` is set, the ``btf_member.offset`` contains both member
bitfield size and bit offset. The bitfield size and bit offset are calculated
as below.::

  #define BTF_MEMBER_BITFIELD_SIZE(val)   ((val) >> 24)
  #define BTF_MEMBER_BIT_OFFSET(val)      ((val) & 0xffffff)

In this case, if the base type is an int type, it must be a regular int type:

  * ``BTF_INT_OFFSET()`` must be 0.
  * ``BTF_INT_BITS()`` must be equal to ``{1,2,4,8,16} * 8``.

The following kernel patch introduced ``kind_flag`` and explained why both
modes exist:

  https://github.com/torvalds/linux/commit/9d5f9f701b1891466fb3dbb1806ad97716f95cc3#diff-fa650a64fdd3968396883d2fe8215ff3
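
The two interpretations of ``btf_member.offset`` can be sketched as below. The
macros are copied from this section; ``member_bit_offset()`` and
``member_bitfield_size()`` are hypothetical helpers. Treating a bitfield size
of ``0`` in ``kind_flag`` mode as "not a bitfield" is an assumption based on
the referenced kernel patch, not something this section states::

    #include <stdint.h>

    #define BTF_MEMBER_BITFIELD_SIZE(val)   ((val) >> 24)
    #define BTF_MEMBER_BIT_OFFSET(val)      ((val) & 0xffffff)

    /* Bit offset of the member from the start of the struct/union. */
    static uint32_t member_bit_offset(uint32_t kind_flag, uint32_t offset)
    {
        return kind_flag ? BTF_MEMBER_BIT_OFFSET(offset) : offset;
    }

    /*
     * Bitfield size carried by the member itself. Without kind_flag the
     * member carries none and this returns 0; the size then comes from
     * BTF_INT_BITS() of the int type the member points to.
     */
    static uint32_t member_bitfield_size(uint32_t kind_flag, uint32_t offset)
    {
        return kind_flag ? BTF_MEMBER_BITFIELD_SIZE(offset) : 0;
    }

For instance, a 3-bit member at bit offset 2 in a ``kind_flag = 1`` struct has
``offset = (3 << 24) | 2``, i.e. ``0x03000002``.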

2.2.6 BTF_KIND_ENUM
~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0 or offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_ENUM
  * ``info.vlen``: number of enum values
  * ``size``: 4

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum``.::

    struct btf_enum {
        __u32   name_off;
        __s32   val;
    };

The ``btf_enum`` encoding:
  * ``name_off``: offset to a valid C identifier
  * ``val``: any value

2.2.7 BTF_KIND_FWD
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0 for struct, 1 for union
  * ``info.kind``: BTF_KIND_FWD
  * ``info.vlen``: 0
  * ``type``: 0

No additional type data follow ``btf_type``.

2.2.8 BTF_KIND_TYPEDEF
~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_TYPEDEF
  * ``info.vlen``: 0
  * ``type``: the type which can be referred by name at ``name_off``

No additional type data follow ``btf_type``.

2.2.9 BTF_KIND_VOLATILE
~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_VOLATILE
  * ``info.vlen``: 0
  * ``type``: the type with ``volatile`` qualifier

No additional type data follow ``btf_type``.

2.2.10 BTF_KIND_CONST
~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_CONST
  * ``info.vlen``: 0
  * ``type``: the type with ``const`` qualifier

No additional type data follow ``btf_type``.

2.2.11 BTF_KIND_RESTRICT
~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_RESTRICT
  * ``info.vlen``: 0
  * ``type``: the type with ``restrict`` qualifier

No additional type data follow ``btf_type``.

2.2.12 BTF_KIND_FUNC
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_FUNC
  * ``info.vlen``: 0
  * ``type``: a BTF_KIND_FUNC_PROTO type

No additional type data follow ``btf_type``.

A BTF_KIND_FUNC defines not a type, but a subprogram (function) whose
signature is defined by ``type``. The subprogram is thus an instance of that
type. The BTF_KIND_FUNC may in turn be referenced by a func_info in the
:ref:`BTF_Ext_Section` (ELF) or in the arguments to :ref:`BPF_Prog_Load`
(ABI).

2.2.13 BTF_KIND_FUNC_PROTO
~~~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_FUNC_PROTO
  * ``info.vlen``: # of parameters
  * ``type``: the return type

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_param``.::

    struct btf_param {
        __u32   name_off;
        __u32   type;
    };

If a BTF_KIND_FUNC_PROTO type is referred by a BTF_KIND_FUNC type, then
``btf_param.name_off`` must point to a valid C identifier except for the
possible last argument representing the variable argument. The
``btf_param.type`` refers to the parameter type.

If the function has variable arguments, the last parameter is encoded with
``name_off = 0`` and ``type = 0``.

2.2.14 BTF_KIND_VAR
~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_VAR
  * ``info.vlen``: 0
  * ``type``: the type of the variable

``btf_type`` is followed by a single ``struct btf_var`` with the
following data::

    struct btf_var {
        __u32   linkage;
    };

``struct btf_var`` encoding:
  * ``linkage``: currently only static variable 0, or globally allocated
                 variable in ELF sections 1

Not all types of global variables are supported by LLVM at this point.
The following is currently available:

  * static variables with or without section attributes
  * global variables with section attributes

The latter is for future extraction of map key/value type id's from a
map definition.

2.2.15 BTF_KIND_DATASEC
~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid name associated with a variable or
                  one of .data/.bss/.rodata
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_DATASEC
  * ``info.vlen``: # of variables
  * ``size``: total section size in bytes (0 at compilation time, patched
              to actual size by BPF loaders such as libbpf)

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_var_secinfo``.::

    struct btf_var_secinfo {
        __u32   type;
        __u32   offset;
        __u32   size;
    };

``struct btf_var_secinfo`` encoding:
  * ``type``: the type of the BTF_KIND_VAR variable
  * ``offset``: the in-section offset of the variable
  * ``size``: the size of the variable in bytes

2.2.16 BTF_KIND_FLOAT
~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: any valid offset
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_FLOAT
 * ``info.vlen``: 0
 * ``size``: the size of the float type in bytes: 2, 4, 8, 12 or 16.

No additional type data follow ``btf_type``.

2.2.17 BTF_KIND_DECL_TAG
~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: offset to a non-empty string
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_DECL_TAG
 * ``info.vlen``: 0
 * ``type``: ``struct``, ``union``, ``func``, ``var`` or ``typedef``

``btf_type`` is followed by ``struct btf_decl_tag``.::

    struct btf_decl_tag {
        __u32   component_idx;
    };

The ``name_off`` encodes the btf_decl_tag attribute string.
The ``type`` should be ``struct``, ``union``, ``func``, ``var`` or ``typedef``.
For ``var`` or ``typedef`` type, ``btf_decl_tag.component_idx`` must be ``-1``.
For the other three types, if the btf_decl_tag attribute is
applied to the ``struct``, ``union`` or ``func`` itself,
``btf_decl_tag.component_idx`` must be ``-1``. Otherwise,
the attribute is applied to a ``struct``/``union`` member or
a ``func`` argument, and ``btf_decl_tag.component_idx`` should be a
valid index (starting from 0) pointing to a member or an argument.

2.2.18 BTF_KIND_TYPE_TAG
~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: offset to a non-empty string
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_TYPE_TAG
 * ``info.vlen``: 0
 * ``type``: the type with ``btf_type_tag`` attribute
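
Since the type section is parsed sequentially, a reader has to know how many
bytes each record occupies in order to find the next one. The sketch below
summarises the per-kind trailing data described in the subsections of 2.2.
``btf_record_size()`` is a hypothetical helper, and the struct sizes (12 bytes
for ``struct btf_type``, and so on) are derived from the definitions shown in
this document::

    #include <stdint.h>

    #define BTF_KIND_INT            1
    #define BTF_KIND_PTR            2
    #define BTF_KIND_ARRAY          3
    #define BTF_KIND_STRUCT         4
    #define BTF_KIND_UNION          5
    #define BTF_KIND_ENUM           6
    #define BTF_KIND_FWD            7
    #define BTF_KIND_TYPEDEF        8
    #define BTF_KIND_VOLATILE       9
    #define BTF_KIND_CONST          10
    #define BTF_KIND_RESTRICT       11
    #define BTF_KIND_FUNC           12
    #define BTF_KIND_FUNC_PROTO     13
    #define BTF_KIND_VAR            14
    #define BTF_KIND_DATASEC        15
    #define BTF_KIND_FLOAT          16
    #define BTF_KIND_DECL_TAG       17
    #define BTF_KIND_TYPE_TAG       18

    /*
     * Bytes occupied by one record: struct btf_type (12 bytes) plus the
     * kind-specific data. Returns -1 for a kind not listed above.
     */
    static long btf_record_size(uint32_t info)
    {
        uint32_t kind = (info >> 24) & 0x1f;
        long vlen = info & 0xffff;

        switch (kind) {
        case BTF_KIND_INT:              /* one u32 */
        case BTF_KIND_VAR:              /* struct btf_var */
        case BTF_KIND_DECL_TAG:         /* struct btf_decl_tag */
            return 12 + 4;
        case BTF_KIND_ARRAY:            /* struct btf_array */
            return 12 + 12;
        case BTF_KIND_STRUCT:           /* vlen * struct btf_member */
        case BTF_KIND_UNION:
        case BTF_KIND_DATASEC:          /* vlen * struct btf_var_secinfo */
            return 12 + 12 * vlen;
        case BTF_KIND_ENUM:             /* vlen * struct btf_enum */
        case BTF_KIND_FUNC_PROTO:       /* vlen * struct btf_param */
            return 12 + 8 * vlen;
        case BTF_KIND_PTR:
        case BTF_KIND_FWD:
        case BTF_KIND_TYPEDEF:
        case BTF_KIND_VOLATILE:
        case BTF_KIND_CONST:
        case BTF_KIND_RESTRICT:
        case BTF_KIND_FUNC:
        case BTF_KIND_FLOAT:
        case BTF_KIND_TYPE_TAG:
            return 12;                  /* no additional type data */
        default:
            return -1;
        }
    }

A walker would start at the beginning of the type section with type id ``1``
and advance by this size until ``type_len`` bytes are consumed, stopping with
an error on ``-1``.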

3. BTF Kernel API
=================

The following bpf syscall commands involve BTF:
   * BPF_BTF_LOAD: load a blob of BTF data into kernel
   * BPF_MAP_CREATE: map creation with btf key and value type info.
   * BPF_PROG_LOAD: prog load with btf function and line info.
   * BPF_BTF_GET_FD_BY_ID: get a btf fd
   * BPF_OBJ_GET_INFO_BY_FD: btf, func_info, line_info
     and other btf related info are returned.

The workflow typically looks like:
::

  Application:
      BPF_BTF_LOAD
          |
          v
      BPF_MAP_CREATE and BPF_PROG_LOAD
          |
          V
      ......

  Introspection tool:
      ......
      BPF_{PROG,MAP}_GET_NEXT_ID (get prog/map id's)
          |
          V
      BPF_{PROG,MAP}_GET_FD_BY_ID (get a prog/map fd)
          |
          V
      BPF_OBJ_GET_INFO_BY_FD (get bpf_prog_info/bpf_map_info with btf_id)
          |                                     |
          V                                     |
      BPF_BTF_GET_FD_BY_ID (get btf_fd)         |
          |                                     |
          V                                     |
      BPF_OBJ_GET_INFO_BY_FD (get btf)          |
          |                                     |
          V                                     V
      pretty print types, dump func signatures and line info, etc.


3.1 BPF_BTF_LOAD
----------------

Load a blob of BTF data into kernel. A blob of data, described in
:ref:`BTF_Type_String`, can be directly loaded into the kernel. A ``btf_fd``
is returned to userspace.

3.2 BPF_MAP_CREATE
------------------

A map can be created with ``btf_fd`` and specified key/value type id.::

    __u32   btf_fd;         /* fd pointing to a BTF type data */
    __u32   btf_key_type_id;        /* BTF type_id of the key */
    __u32   btf_value_type_id;      /* BTF type_id of the value */

In libbpf, the map can be defined with extra annotation like below:
::

    struct bpf_map_def SEC("maps") btf_map = {
        .type = BPF_MAP_TYPE_ARRAY,
        .key_size = sizeof(int),
        .value_size = sizeof(struct ipv_counts),
        .max_entries = 4,
    };
    BPF_ANNOTATE_KV_PAIR(btf_map, int, struct ipv_counts);

Here, the parameters for macro BPF_ANNOTATE_KV_PAIR are map name, key and
value types for the map. During ELF parsing, libbpf is able to extract
key/value type_id's and assign them to BPF_MAP_CREATE attributes
automatically.

.. _BPF_Prog_Load:

3.3 BPF_PROG_LOAD
-----------------

During prog_load, func_info and line_info can be passed to kernel with proper
values for the following attributes:
::

    __u32           insn_cnt;
    __aligned_u64   insns;
    ......
    __u32           prog_btf_fd;    /* fd pointing to BTF type data */
    __u32           func_info_rec_size;     /* userspace bpf_func_info size */
    __aligned_u64   func_info;      /* func info */
    __u32           func_info_cnt;  /* number of bpf_func_info records */
    __u32           line_info_rec_size;     /* userspace bpf_line_info size */
    __aligned_u64   line_info;      /* line info */
    __u32           line_info_cnt;  /* number of bpf_line_info records */

The func_info and line_info are each an array of the records below,
respectively.::

    struct bpf_func_info {
        __u32   insn_off; /* [0, insn_cnt - 1] */
        __u32   type_id;  /* pointing to a BTF_KIND_FUNC type */
    };
    struct bpf_line_info {
        __u32   insn_off; /* [0, insn_cnt - 1] */
        __u32   file_name_off; /* offset to string table for the filename */
        __u32   line_off; /* offset to string table for the source line */
        __u32   line_col; /* line number and column number */
    };

func_info_rec_size is the size of each func_info record, and
line_info_rec_size is the size of each line_info record. Passing the record
size to kernel makes it possible to extend the record itself in the future.

Below are requirements for func_info:
  * func_info[0].insn_off must be 0.
  * the func_info insn_off is in strictly increasing order and matches
    bpf func boundaries.

Below are requirements for line_info:
  * the first insn in each func must have a line_info record pointing to it.
  * the line_info insn_off is in strictly increasing order.

For line_info, the line number and column number are defined as below:
::

    #define BPF_LINE_INFO_LINE_NUM(line_col)        ((line_col) >> 10)
    #define BPF_LINE_INFO_LINE_COL(line_col)        ((line_col) & 0x3ff)
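
In other words, the low 10 bits of ``line_col`` hold the column and the
remaining 22 bits hold the line number. The sketch below repeats the two
macros and adds a hypothetical inverse, ``line_col_pack()``, purely for
illustration::

    #include <stdint.h>

    #define BPF_LINE_INFO_LINE_NUM(line_col)        ((line_col) >> 10)
    #define BPF_LINE_INFO_LINE_COL(line_col)        ((line_col) & 0x3ff)

    /* Combine a line number and a column (0..1023) into line_col. */
    static uint32_t line_col_pack(uint32_t line, uint32_t col)
    {
        return (line << 10) | (col & 0x3ff);
    }

Line 42, column 7 packs to ``(42 << 10) | 7 = 43015``. Columns above 1023
cannot be represented and are truncated by the mask in this sketch.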

3.4 BPF_{PROG,MAP}_GET_NEXT_ID
------------------------------

In the kernel, every loaded program, map or btf has a unique id. The id won't
change during the lifetime of a program, map, or btf.

The bpf syscall command BPF_{PROG,MAP}_GET_NEXT_ID returns all id's, one for
each command, to user space, for bpf programs or maps, respectively, so an
inspection tool can inspect all programs and maps.

3.5 BPF_{PROG,MAP}_GET_FD_BY_ID
-------------------------------

An introspection tool cannot use an id to get details about a program or map.
A file descriptor needs to be obtained first for reference-counting purposes.

3.6 BPF_OBJ_GET_INFO_BY_FD
--------------------------

Once a program/map fd is acquired, an introspection tool can get the detailed
information from kernel about this fd, some of which are BTF-related. For
example, ``bpf_map_info`` returns ``btf_id`` and key/value type ids.
``bpf_prog_info`` returns ``btf_id``, func_info, and line info for translated
bpf byte codes, and jited_line_info.

3.7 BPF_BTF_GET_FD_BY_ID
------------------------

With ``btf_id`` obtained in ``bpf_map_info`` and ``bpf_prog_info``, bpf
syscall command BPF_BTF_GET_FD_BY_ID can retrieve a btf fd. Then, with
command BPF_OBJ_GET_INFO_BY_FD, the btf blob, originally loaded into the
kernel with BPF_BTF_LOAD, can be retrieved.

With the btf blob, ``bpf_map_info``, and ``bpf_prog_info``, an introspection
tool has full btf knowledge and is able to pretty print map key/values, dump
func signatures and line info, along with byte/jit codes.

4. ELF File Format Interface
============================

4.1 .BTF section
----------------

The .BTF section contains type and string data. The format of this section is
the same as the one described in :ref:`BTF_Type_String`.

.. _BTF_Ext_Section:

4.2 .BTF.ext section
--------------------

The .BTF.ext section encodes func_info and line_info which need loader
manipulation before loading into the kernel.

The specification for .BTF.ext section is defined at ``tools/lib/bpf/btf.h``
and ``tools/lib/bpf/btf.c``.

The current header of .BTF.ext section::

    struct btf_ext_header {
        __u16   magic;
        __u8    version;
        __u8    flags;
        __u32   hdr_len;

        /* All offsets are in bytes relative to the end of this header */
        __u32   func_info_off;
        __u32   func_info_len;
        __u32   line_info_off;
        __u32   line_info_len;
    };

It is very similar to the .BTF section. Instead of type/string sections, it
contains func_info and line_info sections. See :ref:`BPF_Prog_Load` for details
about func_info and line_info record format.

The func_info is organized as below.::

     func_info_rec_size
     btf_ext_info_sec for section #1 /* func_info for section #1 */
     btf_ext_info_sec for section #2 /* func_info for section #2 */
     ...

``func_info_rec_size`` specifies the size of ``bpf_func_info`` structure when
.BTF.ext is generated. ``btf_ext_info_sec``, defined below, is a collection of
func_info for each specific ELF section.::

     struct btf_ext_info_sec {
        __u32   sec_name_off; /* offset to section name */
        __u32   num_info;
        /* Followed by num_info * record_size number of bytes */
        __u8    data[0];
     };

Here, num_info must be greater than 0.
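
To make the layout concrete, the sketch below walks one func_info (or
line_info) area: a leading ``u32`` record size followed by back-to-back
``btf_ext_info_sec`` blocks. ``ext_info_count()`` is a hypothetical helper that
only counts records and checks bounds. It assumes host byte order and that the
area ends exactly at the end of the last block::

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /*
     * Count the records in an info area of len bytes starting at p.
     * Returns the total number of records, or -1 if the area is
     * truncated, rec_size is 0, or some block has num_info == 0.
     */
    static long ext_info_count(const uint8_t *p, size_t len)
    {
        uint32_t rec_size, num_info;
        long total = 0;

        if (len < 4)
            return -1;
        memcpy(&rec_size, p, 4);
        p += 4;
        len -= 4;
        if (rec_size == 0)
            return -1;

        while (len > 0) {
            /* sec_name_off and num_info precede the records. */
            if (len < 8)
                return -1;
            memcpy(&num_info, p + 4, 4);
            p += 8;
            len -= 8;
            if (num_info == 0 || num_info > len / rec_size)
                return -1;
            p += (size_t)num_info * rec_size;
            len -= (size_t)num_info * rec_size;
            total += num_info;
        }
        return total;
    }

Stepping by the ``rec_size`` stored in the file, rather than by
``sizeof(struct bpf_func_info)``, is what lets an older reader skip over
records that a newer producer has extended.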

The line_info is organized as below.::

     line_info_rec_size
     btf_ext_info_sec for section #1 /* line_info for section #1 */
     btf_ext_info_sec for section #2 /* line_info for section #2 */
     ...

``line_info_rec_size`` specifies the size of ``bpf_line_info`` structure when
.BTF.ext is generated.

The interpretation of ``bpf_func_info->insn_off`` and
``bpf_line_info->insn_off`` is different between kernel API and ELF API. For
kernel API, the ``insn_off`` is the instruction offset in the unit of ``struct
bpf_insn``. For ELF API, the ``insn_off`` is the byte offset from the
beginning of the section (``btf_ext_info_sec->sec_name_off``).
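
The unit difference above amounts to a division by the instruction size. The
sketch below shows that single step with a hypothetical helper,
``elf_off_to_insn_idx()``. It assumes the ELF section is loaded as one program
starting at instruction 0; a real loader may need further adjustment when it
relocates or combines code::

    #include <stdint.h>

    /* sizeof(struct bpf_insn): every BPF instruction slot is 8 bytes. */
    #define BPF_INSN_SZ 8

    /*
     * Convert an ELF-API insn_off (byte offset within the section) to a
     * kernel-API insn_off (instruction index). Returns 0 on success and
     * -1 if the byte offset is not instruction-aligned.
     */
    static int elf_off_to_insn_idx(uint32_t byte_off, uint32_t *idx)
    {
        if (byte_off % BPF_INSN_SZ)
            return -1;
        *idx = byte_off / BPF_INSN_SZ;
        return 0;
    }

For example, a record with ELF ``insn_off = 32`` refers to instruction index 4
when passed to the kernel.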

4.3 .BTF_ids section
--------------------

The .BTF_ids section encodes BTF ID values that are used within the kernel.

This section is created during the kernel compilation with the help of
macros defined in ``include/linux/btf_ids.h`` header file. Kernel code can
use them to create lists and sets (sorted lists) of BTF ID values.

The ``BTF_ID_LIST`` and ``BTF_ID`` macros define an unsorted list of BTF ID
values, with the following syntax::

  BTF_ID_LIST(list)
  BTF_ID(type1, name1)
  BTF_ID(type2, name2)

resulting in the following layout in .BTF_ids section::

  __BTF_ID__type1__name1__1:
  .zero 4
  __BTF_ID__type2__name2__2:
  .zero 4

The ``u32 list[];`` variable is defined to access the list.

The ``BTF_ID_UNUSED`` macro defines 4 zero bytes. It's used when we
want to define an unused entry in BTF_ID_LIST, like::

      BTF_ID_LIST(bpf_skb_output_btf_ids)
      BTF_ID(struct, sk_buff)
      BTF_ID_UNUSED
      BTF_ID(struct, task_struct)

The ``BTF_SET_START/END`` macro pair defines a sorted list of BTF ID values
and their count, with the following syntax::

  BTF_SET_START(set)
  BTF_ID(type1, name1)
  BTF_ID(type2, name2)
  BTF_SET_END(set)

resulting in the following layout in .BTF_ids section::

  __BTF_ID__set__set:
  .zero 4
  __BTF_ID__type1__name1__3:
  .zero 4
  __BTF_ID__type2__name2__4:
  .zero 4

The ``struct btf_id_set set;`` variable is defined to access the list.

The ``typeX`` name can be one of the following::

   struct, union, typedef, func

and is used as a filter when resolving the BTF ID value.

All the BTF ID lists and sets are compiled in the .BTF_ids section and
resolved during the linking phase of the kernel build by the
``resolve_btfids`` tool.

5. Using BTF
============

5.1 bpftool map pretty print
----------------------------

With BTF, the map key/value can be printed based on fields rather than simply
raw bytes. This is especially valuable for large structures or if your data
structure has bitfields. For example, for the following map,::

      enum A { A1, A2, A3, A4, A5 };
      typedef enum A ___A;
      struct tmp_t {
           char a1:4;
           int  a2:4;
           int  :4;
           __u32 a3:4;
           int b;
           ___A b1:4;
           enum A b2:4;
      };
      struct bpf_map_def SEC("maps") tmpmap = {
           .type = BPF_MAP_TYPE_ARRAY,
           .key_size = sizeof(__u32),
           .value_size = sizeof(struct tmp_t),
           .max_entries = 1,
      };
      BPF_ANNOTATE_KV_PAIR(tmpmap, int, struct tmp_t);

bpftool is able to pretty print like below:
::

      [{
            "key": 0,
            "value": {
                "a1": 0x2,
                "a2": 0x4,
                "a3": 0x6,
                "b": 7,
                "b1": 0x8,
                "b2": 0xa
            }
        }
      ]
850
5.2 bpftool prog dump
---------------------

The following is an example showing how func_info and line_info can help prog
dump with better kernel symbol names, function prototypes and line
information::

    $ bpftool prog dump jited pinned /sys/fs/bpf/test_btf_haskv
    [...]
    int test_long_fname_2(struct dummy_tracepoint_args * arg):
    bpf_prog_44a040bf25481309_test_long_fname_2:
    ; static int test_long_fname_2(struct dummy_tracepoint_args *arg)
       0:   push   %rbp
       1:   mov    %rsp,%rbp
       4:   sub    $0x30,%rsp
       b:   sub    $0x28,%rbp
       f:   mov    %rbx,0x0(%rbp)
      13:   mov    %r13,0x8(%rbp)
      17:   mov    %r14,0x10(%rbp)
      1b:   mov    %r15,0x18(%rbp)
      1f:   xor    %eax,%eax
      21:   mov    %rax,0x20(%rbp)
      25:   xor    %esi,%esi
    ; int key = 0;
      27:   mov    %esi,-0x4(%rbp)
    ; if (!arg->sock)
      2a:   mov    0x8(%rdi),%rdi
    ; if (!arg->sock)
      2e:   cmp    $0x0,%rdi
      32:   je     0x0000000000000070
      34:   mov    %rbp,%rsi
    ; counts = bpf_map_lookup_elem(&btf_map, &key);
    [...]

5.3 Verifier Log
----------------

The following is an example of how line_info can help debug a verification
failure::

       /* The code at tools/testing/selftests/bpf/test_xdp_noinline.c
        * is modified as below.
        */
       data = (void *)(long)xdp->data;
       data_end = (void *)(long)xdp->data_end;
       /*
       if (data + 4 > data_end)
               return XDP_DROP;
       */
       *(u32 *)data = dst->dst;

    $ bpftool prog load ./test_xdp_noinline.o /sys/fs/bpf/test_xdp_noinline type xdp
        ; data = (void *)(long)xdp->data;
        224: (79) r2 = *(u64 *)(r10 -112)
        225: (61) r2 = *(u32 *)(r2 +0)
        ; *(u32 *)data = dst->dst;
        226: (63) *(u32 *)(r2 +0) = r1
        invalid access to packet, off=0 size=4, R2(id=0,off=0,r=0)
        R2 offset is outside of the packet
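
The verifier rejects the store above because nothing proves that four bytes
are available between ``data`` and ``data_end``. The following is a minimal
user-space sketch of the guard that was commented out, not the selftest code
itself: ``guarded_store()`` and the ``SKETCH_XDP_*`` names are hypothetical,
and the two constants are assumed to carry the values of ``XDP_DROP`` and
``XDP_PASS`` from ``enum xdp_action``.

```c
#include <stdint.h>
#include <string.h>

/* Assumed to match XDP_DROP and XDP_PASS in enum xdp_action
 * (include/uapi/linux/bpf.h); renamed so the sketch builds without
 * kernel headers.
 */
enum { SKETCH_XDP_DROP = 1, SKETCH_XDP_PASS = 2 };

/* Hypothetical helper: write val at data only when all four bytes lie
 * before data_end. This mirrors "if (data + 4 > data_end) return XDP_DROP;".
 */
static int guarded_store(void *data, void *data_end, uint32_t val)
{
        /* Compare as integers so the check itself never forms an
         * out-of-bounds pointer.
         */
        if ((uintptr_t)data + sizeof(val) > (uintptr_t)data_end)
                return SKETCH_XDP_DROP;

        memcpy(data, &val, sizeof(val));
        return SKETCH_XDP_PASS;
}
```

With an 8-byte buffer the store succeeds. With ``data_end`` only 3 bytes past
``data`` the helper returns the drop value and leaves the buffer untouched.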

6. BTF Generation
=================

You need the latest pahole

  https://git.kernel.org/pub/scm/devel/pahole/pahole.git/

or llvm (8.0 or later). pahole acts as a dwarf2btf converter. It doesn't yet
support .BTF.ext or the BTF_KIND_FUNC type. For example::

      -bash-4.4$ cat t.c
      struct t {
        int a:2;
        int b:3;
        int c:2;
      } g;
      -bash-4.4$ gcc -c -O2 -g t.c
      -bash-4.4$ pahole -JV t.o
      File t.o:
      [1] STRUCT t kind_flag=1 size=4 vlen=3
              a type_id=2 bitfield_size=2 bits_offset=0
              b type_id=2 bitfield_size=3 bits_offset=2
              c type_id=2 bitfield_size=2 bits_offset=5
      [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
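
The ``bitfield_size`` and ``bits_offset`` values printed by pahole above are
what a consumer needs to pull a bitfield member out of raw struct data. The
following is a minimal sketch, not taken from pahole or bpftool: the helpers
``member_offset()`` and ``extract_bitfield()`` are hypothetical, the two
macros repeat the definitions in ``include/uapi/linux/btf.h`` so that the
example builds without kernel headers, and a little-endian target is assumed
where ``struct t`` occupies the low bits of one 32-bit word.

```c
#include <stdint.h>

/* Same definitions as include/uapi/linux/btf.h, repeated here so the
 * sketch compiles without kernel headers.
 */
#define BTF_MEMBER_BITFIELD_SIZE(val)   ((val) >> 24)
#define BTF_MEMBER_BIT_OFFSET(val)      ((val) & 0xffffff)

/* Hypothetical helper: build the btf_member offset word used when the
 * struct has kind_flag=1 (bitfield size in the top 8 bits, bit offset
 * in the low 24 bits).
 */
static uint32_t member_offset(uint32_t bitfield_size, uint32_t bits_offset)
{
        return (bitfield_size << 24) | (bits_offset & 0xffffff);
}

/* Hypothetical helper: read a signed bitfield from a little-endian
 * 32-bit word. Returns 0 if the field does not fit in the word.
 */
static int32_t extract_bitfield(uint32_t raw, uint32_t offset)
{
        uint32_t size = BTF_MEMBER_BITFIELD_SIZE(offset);
        uint32_t shift = BTF_MEMBER_BIT_OFFSET(offset);
        uint32_t val;

        if (size == 0 || size > 32 || shift > 32 - size)
                return 0;
        if (size == 32)
                return (int32_t)raw;

        val = (raw >> shift) & ((1u << size) - 1);

        /* Sign-extend, because the member type is a signed int. */
        if (val & (1u << (size - 1)))
                return (int32_t)((int64_t)val - ((int64_t)1 << size));
        return (int32_t)val;
}
```

With these helpers, ``member_offset(3, 2)`` yields ``0x03000002`` for member
``b``, and a raw word of ``57`` decodes to ``a = 1``, ``b = -2``, ``c = 1``.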

LLVM is able to generate .BTF and .BTF.ext directly with -g, for the bpf
target only. The assembly code (-S) shows the BTF encoding in assembly
format::

    -bash-4.4$ cat t2.c
    typedef int __int32;
    struct t2 {
      int a2;
      int (*f2)(char q1, __int32 q2, ...);
      int (*f3)();
    } g2;
    int main() { return 0; }
    int test() { return 0; }
    -bash-4.4$ clang -c -g -O2 -target bpf t2.c
    -bash-4.4$ readelf -S t2.o
      ......
      [ 8] .BTF              PROGBITS         0000000000000000  00000247
           000000000000016e  0000000000000000           0     0     1
      [ 9] .BTF.ext          PROGBITS         0000000000000000  000003b5
           0000000000000060  0000000000000000           0     0     1
      [10] .rel.BTF.ext      REL              0000000000000000  000007e0
           0000000000000040  0000000000000010          16     9     8
      ......
    -bash-4.4$ clang -S -g -O2 -target bpf t2.c
    -bash-4.4$ cat t2.s
      ......
            .section        .BTF,"",@progbits
            .short  60319                   # 0xeb9f
            .byte   1
            .byte   0
            .long   24
            .long   0
            .long   220
            .long   220
            .long   122
            .long   0                       # BTF_KIND_FUNC_PROTO(id = 1)
            .long   218103808               # 0xd000000
            .long   2
            .long   83                      # BTF_KIND_INT(id = 2)
            .long   16777216                # 0x1000000
            .long   4
            .long   16777248                # 0x1000020
      ......
            .byte   0                       # string offset=0
            .ascii  ".text"                 # string offset=1
            .byte   0
            .ascii  "/home/yhs/tmp-pahole/t2.c" # string offset=7
            .byte   0
            .ascii  "int main() { return 0; }" # string offset=33
            .byte   0
            .ascii  "int test() { return 0; }" # string offset=58
            .byte   0
            .ascii  "int"                   # string offset=83
      ......
            .section        .BTF.ext,"",@progbits
            .short  60319                   # 0xeb9f
            .byte   1
            .byte   0
            .long   24
            .long   0
            .long   28
            .long   28
            .long   44
            .long   8                       # FuncInfo
            .long   1                       # FuncInfo section string offset=1
            .long   2
            .long   .Lfunc_begin0
            .long   3
            .long   .Lfunc_begin1
            .long   5
            .long   16                      # LineInfo
            .long   1                       # LineInfo section string offset=1
            .long   2
            .long   .Ltmp0
            .long   7
            .long   33
            .long   7182                    # Line 7 Col 14
            .long   .Ltmp3
            .long   7
            .long   58
            .long   8206                    # Line 8 Col 14
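
The constants in the assembly listing above can be decoded by hand. The
following is a minimal sketch, not part of any kernel tool: the ``BTF_INT_*``
and ``BPF_LINE_INFO_*`` macros repeat the definitions in
``include/uapi/linux/btf.h`` and ``include/uapi/linux/bpf.h`` so the example
builds without kernel headers, while ``info_kind()``, ``info_vlen()`` and
``section_size()`` are hypothetical helpers written for this illustration.
The 5-bit kind mask is an assumption; it gives the same result as a 4-bit
mask for the values shown here.

```c
#include <stdint.h>

/* Repeated from the UAPI headers so the sketch is self-contained. */
#define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
#define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
#define BTF_INT_BITS(VAL)       ((VAL) & 0x000000ff)
#define BPF_LINE_INFO_LINE_NUM(line_col)        ((line_col) >> 10)
#define BPF_LINE_INFO_LINE_COL(line_col)        ((line_col) & 0x3ff)

/* Hypothetical helper: kind field of btf_type.info (mask width assumed). */
static uint32_t info_kind(uint32_t info)
{
        return (info >> 24) & 0x1f;
}

/* Hypothetical helper: vlen field of btf_type.info (low 16 bits). */
static uint32_t info_vlen(uint32_t info)
{
        return info & 0xffff;
}

/* Hypothetical helper: total section size when the last sub-section
 * starts at last_off (relative to the end of the header) and is
 * last_len bytes long.
 */
static uint32_t section_size(uint32_t hdr_len, uint32_t last_off,
                             uint32_t last_len)
{
        return hdr_len + last_off + last_len;
}
```

Applied to the listing, ``info_kind(218103808)`` is 13 (BTF_KIND_FUNC_PROTO)
and ``info_kind(16777216)`` is 1 (BTF_KIND_INT). The value ``16777248``
decodes to a signed 32-bit integer at bit offset 0, and ``7182`` and ``8206``
decode to line 7 column 14 and line 8 column 14. ``section_size(24, 220, 122)``
is 0x16e and ``section_size(24, 28, 44)`` is 0x60, which match the .BTF and
.BTF.ext sizes reported by readelf.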

7. Testing
==========

The kernel bpf selftest `test_btf.c` provides an extensive set of BTF-related
tests.