=====================
BPF Type Format (BTF)
=====================

1. Introduction
===============

BTF (BPF Type Format) is the metadata format which encodes the debug info
related to BPF programs and maps. The name BTF was used initially to describe
data types. BTF was later extended to include function info for defined
subroutines, and line info for source/line information.

The debug info is used for map pretty print, function signature, etc. The
function signature enables better kernel symbols for bpf programs and
functions. The line info helps generate source annotated translated byte code,
jited code and verifier log.

The BTF specification contains two parts:
  * BTF kernel API
  * BTF ELF file format

The kernel API is the contract between user space and kernel. The kernel
verifies the BTF info before using it. The ELF file format is a user space
contract between ELF file and libbpf loader.

The type and string sections are part of the BTF kernel API, describing the
debug info (mostly types related) referenced by the bpf program. These two
sections are discussed in detail in :ref:`BTF_Type_String`.

.. _BTF_Type_String:

2. BTF Type and String Encoding
===============================

The file ``include/uapi/linux/btf.h`` provides a high-level definition of how
types/strings are encoded.

The data blob must begin with::

    struct btf_header {
        __u16   magic;
        __u8    version;
        __u8    flags;
        __u32   hdr_len;

        /* All offsets are in bytes relative to the end of this header */
        __u32   type_off;       /* offset of type section       */
        __u32   type_len;       /* length of type section       */
        __u32   str_off;        /* offset of string section     */
        __u32   str_len;        /* length of string section     */
    };

The magic is ``0xeB9F``, which has different encoding for big and little
endian systems, and can be used to test whether BTF is generated for a big- or
little-endian target.
The ``btf_header`` is designed to be extensible with
``hdr_len`` equal to ``sizeof(struct btf_header)`` when a data blob is
generated.

2.1 String Encoding
-------------------

The first string in the string section must be a null string. The rest of
the string table is a concatenation of other null-terminated strings.

2.2 Type Encoding
-----------------

The type id ``0`` is reserved for ``void`` type. The type section is parsed
sequentially and a type id is assigned to each recognized type starting from
id ``1``. Currently, the following types are supported::

    #define BTF_KIND_INT            1       /* Integer      */
    #define BTF_KIND_PTR            2       /* Pointer      */
    #define BTF_KIND_ARRAY          3       /* Array        */
    #define BTF_KIND_STRUCT         4       /* Struct       */
    #define BTF_KIND_UNION          5       /* Union        */
    #define BTF_KIND_ENUM           6       /* Enumeration up to 32-bit values */
    #define BTF_KIND_FWD            7       /* Forward      */
    #define BTF_KIND_TYPEDEF        8       /* Typedef      */
    #define BTF_KIND_VOLATILE       9       /* Volatile     */
    #define BTF_KIND_CONST          10      /* Const        */
    #define BTF_KIND_RESTRICT       11      /* Restrict     */
    #define BTF_KIND_FUNC           12      /* Function     */
    #define BTF_KIND_FUNC_PROTO     13      /* Function Proto       */
    #define BTF_KIND_VAR            14      /* Variable     */
    #define BTF_KIND_DATASEC        15      /* Section      */
    #define BTF_KIND_FLOAT          16      /* Floating point       */
    #define BTF_KIND_DECL_TAG       17      /* Decl Tag     */
    #define BTF_KIND_TYPE_TAG       18      /* Type Tag     */
    #define BTF_KIND_ENUM64         19      /* Enumeration up to 64-bit values */

Note that the type section encodes debug info, not just pure types.
``BTF_KIND_FUNC`` is not a type, and it represents a defined subprogram.

Each type contains the following common data::

    struct btf_type {
        __u32 name_off;
        /* "info" bits arrangement
         * bits  0-23: vlen (e.g. # of struct's members)
         * bits 24-30: kind (e.g. int, ptr, array...etc)
         * bit     31: kind_flag, currently used by
         *             struct, union, enum, fwd, enum64,
         *             decl_tag and type_tag
         */
        __u32 info;
        /* "size" is used by INT, ENUM, STRUCT, UNION and ENUM64.
         * "size" tells the size of the type it is describing.
         *
         * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
         * FUNC, FUNC_PROTO, DECL_TAG and TYPE_TAG.
         * "type" is a type_id referring to another type.
         */
        union {
                __u32 size;
                __u32 type;
        };
    };

For certain kinds, the common data are followed by kind-specific data. The
``name_off`` in ``struct btf_type`` specifies the offset in the string table.
The following sections detail the encoding of each kind.

2.2.1 BTF_KIND_INT
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: any valid offset
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_INT
 * ``info.vlen``: 0
 * ``size``: the size of the int type in bytes.

``btf_type`` is followed by a ``u32`` with the following bits arrangement::

  #define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
  #define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
  #define BTF_INT_BITS(VAL)       ((VAL)  & 0x000000ff)

The ``BTF_INT_ENCODING`` has the following attributes::

  #define BTF_INT_SIGNED  (1 << 0)
  #define BTF_INT_CHAR    (1 << 1)
  #define BTF_INT_BOOL    (1 << 2)

The ``BTF_INT_ENCODING()`` provides extra information: signedness, char, or
bool, for the int type. The char and bool encodings are mostly useful for
pretty print. At most one encoding can be specified for the int type.

The ``BTF_INT_BITS()`` specifies the number of actual bits held by this int
type. For example, a 4-bit bitfield encodes ``BTF_INT_BITS()`` equal to 4.
The ``btf_type.size * 8`` must be equal to or greater than ``BTF_INT_BITS()``
for the type. The maximum value of ``BTF_INT_BITS()`` is 128.
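
The packing of the ``u32`` and the constraints above can be exercised with a
small user-space sketch. It is only an illustration: the macros are copied from
this section so that it compiles without ``linux/btf.h``, and
``btf_int_pack()`` and ``btf_int_is_valid()`` are hypothetical helper names
that check nothing beyond the rules stated here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Copied from the definitions in this section. */
#define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
#define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
#define BTF_INT_BITS(VAL)       ((VAL)  & 0x000000ff)

#define BTF_INT_SIGNED  (1 << 0)
#define BTF_INT_CHAR    (1 << 1)
#define BTF_INT_BOOL    (1 << 2)

/* Hypothetical helper: build the u32 that follows struct btf_type. */
static uint32_t btf_int_pack(uint32_t encoding, uint32_t offset, uint32_t bits)
{
        assert(encoding <= 0xf && offset <= 0xff && bits <= 0xff);
        return (encoding << 24) | (offset << 16) | bits;
}

/* Hypothetical helper: check the constraints stated in this section. */
static bool btf_int_is_valid(uint32_t size, uint32_t int_data)
{
        uint32_t enc = BTF_INT_ENCODING(int_data);
        uint32_t bits = BTF_INT_BITS(int_data);

        /* At most one encoding: clearing the lowest set bit must leave 0. */
        if (enc & (enc - 1))
                return false;
        /* BTF_INT_BITS() may not exceed 128. */
        if (bits > 128)
                return false;
        /* btf_type.size * 8 must be equal to or greater than the bit count. */
        return (uint64_t)size * 8 >= bits;
}
```

For a 4-byte signed ``int`` the packed value is ``0x01000020``: encoding 1 in
bits 24-27, offset 0, and 32 bits.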

The ``BTF_INT_OFFSET()`` specifies the starting bit offset to calculate values
for this int. For example, a bitfield struct member has:

 * btf member bit offset 100 from the start of the structure,
 * btf member pointing to an int type,
 * the int type has ``BTF_INT_OFFSET() = 2`` and ``BTF_INT_BITS() = 4``

Then in the struct memory layout, this member will occupy ``4`` bits starting
from bit ``100 + 2 = 102``.

Alternatively, the bitfield struct member can be the following to access the
same bits as the above:

 * btf member bit offset 102,
 * btf member pointing to an int type,
 * the int type has ``BTF_INT_OFFSET() = 0`` and ``BTF_INT_BITS() = 4``

The original intention of ``BTF_INT_OFFSET()`` is to provide flexibility of
bitfield encoding. Currently, both llvm and pahole generate
``BTF_INT_OFFSET() = 0`` for all int types.

2.2.2 BTF_KIND_PTR
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_PTR
  * ``info.vlen``: 0
  * ``type``: the pointee type of the pointer

No additional type data follow ``btf_type``.

2.2.3 BTF_KIND_ARRAY
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_ARRAY
  * ``info.vlen``: 0
  * ``size/type``: 0, not used

``btf_type`` is followed by one ``struct btf_array``::

    struct btf_array {
        __u32   type;
        __u32   index_type;
        __u32   nelems;
    };

The ``struct btf_array`` encoding:
  * ``type``: the element type
  * ``index_type``: the index type
  * ``nelems``: the number of elements for this array (``0`` is also allowed).

The ``index_type`` can be any regular int type (``u8``, ``u16``, ``u32``,
``u64``, ``unsigned __int128``).
The original design of including
``index_type`` follows DWARF, which has an ``index_type`` for its array type.
Currently in BTF, beyond type verification, the ``index_type`` is not used.

The ``struct btf_array`` allows chaining through element type to represent
multidimensional arrays. For example, for ``int a[5][6]``, the following type
information illustrates the chaining:

  * [1]: int
  * [2]: array, ``btf_array.type = [1]``, ``btf_array.nelems = 6``
  * [3]: array, ``btf_array.type = [2]``, ``btf_array.nelems = 5``

Currently, both pahole and llvm collapse a multidimensional array into a
one-dimensional array, e.g., for ``a[5][6]``, the ``btf_array.nelems`` is
equal to ``30``. This is because the original use case is map pretty print
where the whole array is dumped out so a one-dimensional array is enough. As
more BTF usage is explored, pahole and llvm can be changed to generate a
proper chained representation for multidimensional arrays.

2.2.4 BTF_KIND_STRUCT
~~~~~~~~~~~~~~~~~~~~~
2.2.5 BTF_KIND_UNION
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0 or offset to a valid C identifier
  * ``info.kind_flag``: 0 or 1
  * ``info.kind``: BTF_KIND_STRUCT or BTF_KIND_UNION
  * ``info.vlen``: the number of struct/union members
  * ``size``: the size of the struct/union in bytes

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_member``::

    struct btf_member {
        __u32   name_off;
        __u32   type;
        __u32   offset;
    };

``struct btf_member`` encoding:
  * ``name_off``: offset to a valid C identifier
  * ``type``: the member type
  * ``offset``: <see below>

If the type info ``kind_flag`` is not set, the offset contains only the bit
offset of the member. Note that the base type of the bitfield can only be int
or enum type. If the bitfield size is 32, the base type can be either int or
enum type.
If the bitfield size is not 32, the base type must be int, and int type
``BTF_INT_BITS()`` encodes the bitfield size.

If the ``kind_flag`` is set, the ``btf_member.offset`` contains both member
bitfield size and bit offset. The bitfield size and bit offset are calculated
as below::

  #define BTF_MEMBER_BITFIELD_SIZE(val)   ((val) >> 24)
  #define BTF_MEMBER_BIT_OFFSET(val)      ((val) & 0xffffff)

In this case, if the base type is an int type, it must be a regular int type:

  * ``BTF_INT_OFFSET()`` must be 0.
  * ``BTF_INT_BITS()`` must be equal to ``{1,2,4,8,16} * 8``.

Commit 9d5f9f701b18 introduced ``kind_flag`` and explains why both modes
exist.

2.2.6 BTF_KIND_ENUM
~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0 or offset to a valid C identifier
  * ``info.kind_flag``: 0 for unsigned, 1 for signed
  * ``info.kind``: BTF_KIND_ENUM
  * ``info.vlen``: number of enum values
  * ``size``: 1/2/4/8

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum``::

    struct btf_enum {
        __u32   name_off;
        __s32   val;
    };

The ``btf_enum`` encoding:
  * ``name_off``: offset to a valid C identifier
  * ``val``: any value

If the original enum value is signed and the size is less than 4,
that value will be sign extended into 4 bytes. If the size is 8,
the value will be truncated into 4 bytes.

2.2.7 BTF_KIND_FWD
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0 for struct, 1 for union
  * ``info.kind``: BTF_KIND_FWD
  * ``info.vlen``: 0
  * ``type``: 0

No additional type data follow ``btf_type``.
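
How ``kind_flag`` selects between a struct and a union forward declaration can
be sketched as follows. This is a minimal illustration under stated
assumptions: the masks follow the bit arrangement given in the
``struct btf_type`` comment in section 2.2, the ``btf_info_*()`` and
``btf_fwd_target()`` names are hypothetical, and real code should prefer the
``BTF_INFO_KIND()``, ``BTF_INFO_VLEN()`` and ``BTF_INFO_KFLAG()`` macros from
``include/uapi/linux/btf.h``.

```c
#include <assert.h>
#include <stdint.h>

#define BTF_KIND_FWD    7       /* value from the kind list in section 2.2 */

/* Masks mirror the "info" comment in section 2.2 (an assumption of this sketch). */
static uint32_t btf_info_vlen(uint32_t info)  { return info & 0x00ffffff; }
static uint32_t btf_info_kind(uint32_t info)  { return (info >> 24) & 0x7f; }
static uint32_t btf_info_kflag(uint32_t info) { return info >> 31; }

/* Hypothetical helper: name what a BTF_KIND_FWD record forward-declares. */
static const char *btf_fwd_target(uint32_t info)
{
        assert(btf_info_kind(info) == BTF_KIND_FWD);
        assert(btf_info_vlen(info) == 0);       /* vlen must be 0 for FWD */
        return btf_info_kflag(info) ? "union" : "struct";
}
```

With these helpers, an ``info`` word of ``0x07000000`` describes a forward
declaration of a struct, and ``0x87000000`` one of a union.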

2.2.8 BTF_KIND_TYPEDEF
~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_TYPEDEF
  * ``info.vlen``: 0
  * ``type``: the type which can be referred to by the name at ``name_off``

No additional type data follow ``btf_type``.

2.2.9 BTF_KIND_VOLATILE
~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_VOLATILE
  * ``info.vlen``: 0
  * ``type``: the type with ``volatile`` qualifier

No additional type data follow ``btf_type``.

2.2.10 BTF_KIND_CONST
~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_CONST
  * ``info.vlen``: 0
  * ``type``: the type with ``const`` qualifier

No additional type data follow ``btf_type``.

2.2.11 BTF_KIND_RESTRICT
~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_RESTRICT
  * ``info.vlen``: 0
  * ``type``: the type with ``restrict`` qualifier

No additional type data follow ``btf_type``.

2.2.12 BTF_KIND_FUNC
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_FUNC
  * ``info.vlen``: linkage information (BTF_FUNC_STATIC, BTF_FUNC_GLOBAL
    or BTF_FUNC_EXTERN - see :ref:`BTF_Function_Linkage_Constants`)
  * ``type``: a BTF_KIND_FUNC_PROTO type

No additional type data follow ``btf_type``.

A BTF_KIND_FUNC defines not a type, but a subprogram (function) whose
signature is defined by ``type``. The subprogram is thus an instance of that
type.
The BTF_KIND_FUNC may in turn be referenced by a func_info in the
:ref:`BTF_Ext_Section` (ELF) or in the arguments to :ref:`BPF_Prog_Load`
(ABI).

Currently, only linkage values of BTF_FUNC_STATIC and BTF_FUNC_GLOBAL are
supported in the kernel.

2.2.13 BTF_KIND_FUNC_PROTO
~~~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_FUNC_PROTO
  * ``info.vlen``: # of parameters
  * ``type``: the return type

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_param``::

    struct btf_param {
        __u32   name_off;
        __u32   type;
    };

If a BTF_KIND_FUNC_PROTO type is referred to by a BTF_KIND_FUNC type, then
``btf_param.name_off`` must point to a valid C identifier except for the
possible last argument representing the variable argument. The
``btf_param.type`` refers to the parameter type.

If the function has variable arguments, the last parameter is encoded with
``name_off = 0`` and ``type = 0``.

2.2.14 BTF_KIND_VAR
~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_VAR
  * ``info.vlen``: 0
  * ``type``: the type of the variable

``btf_type`` is followed by a single ``struct btf_var`` with the
following data::

    struct btf_var {
        __u32   linkage;
    };

``btf_var.linkage`` may take the values: BTF_VAR_STATIC,
BTF_VAR_GLOBAL_ALLOCATED or BTF_VAR_GLOBAL_EXTERN -
see :ref:`BTF_Var_Linkage_Constants`.

Not all types of global variables are supported by LLVM at this point.
The following is currently available:

  * static variables with or without section attributes
  * global variables with section attributes

The latter is for future extraction of map key/value type id's from a
map definition.

2.2.15 BTF_KIND_DATASEC
~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid name associated with a variable or
    one of .data/.bss/.rodata
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_DATASEC
  * ``info.vlen``: # of variables
  * ``size``: total section size in bytes (0 at compilation time, patched
    to actual size by BPF loaders such as libbpf)

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_var_secinfo``::

    struct btf_var_secinfo {
        __u32   type;
        __u32   offset;
        __u32   size;
    };

``struct btf_var_secinfo`` encoding:
  * ``type``: the type of the BTF_KIND_VAR variable
  * ``offset``: the in-section offset of the variable
  * ``size``: the size of the variable in bytes

2.2.16 BTF_KIND_FLOAT
~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: any valid offset
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_FLOAT
 * ``info.vlen``: 0
 * ``size``: the size of the float type in bytes: 2, 4, 8, 12 or 16.

No additional type data follow ``btf_type``.

2.2.17 BTF_KIND_DECL_TAG
~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: offset to a non-empty string
 * ``info.kind_flag``: 0 or 1
 * ``info.kind``: BTF_KIND_DECL_TAG
 * ``info.vlen``: 0
 * ``type``: ``struct``, ``union``, ``func``, ``var`` or ``typedef``

``btf_type`` is followed by ``struct btf_decl_tag``::

    struct btf_decl_tag {
        __u32   component_idx;
    };

The ``type`` should be ``struct``, ``union``, ``func``, ``var`` or ``typedef``.
For ``var`` or ``typedef`` type, ``btf_decl_tag.component_idx`` must be ``-1``.
For the other three types, if the btf_decl_tag attribute is
applied to the ``struct``, ``union`` or ``func`` itself,
``btf_decl_tag.component_idx`` must be ``-1``.
Otherwise,
the attribute is applied to a ``struct``/``union`` member or
a ``func`` argument, and ``btf_decl_tag.component_idx`` should be a
valid index (starting from 0) pointing to a member or an argument.

If ``info.kind_flag`` is 0, then this is a normal decl tag, and the
``name_off`` encodes the btf_decl_tag attribute string.

If ``info.kind_flag`` is 1, then the decl tag represents an arbitrary
__attribute__. In this case, ``name_off`` encodes a string
representing the attribute-list of the attribute specifier. For
example, for an ``__attribute__((aligned(4)))`` the string's content
is ``aligned(4)``.

2.2.18 BTF_KIND_TYPE_TAG
~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: offset to a non-empty string
 * ``info.kind_flag``: 0 or 1
 * ``info.kind``: BTF_KIND_TYPE_TAG
 * ``info.vlen``: 0
 * ``type``: the type with ``btf_type_tag`` attribute

Currently, ``BTF_KIND_TYPE_TAG`` is only emitted for pointer types.
It has the following btf type chain:
::

  ptr -> [type_tag]*
      -> [const | volatile | restrict | typedef]*
      -> base_type

Basically, a pointer type points to zero or more
type_tag, then zero or more const/volatile/restrict/typedef
and finally the base type. The base type is one of
int, ptr, array, struct, union, enum, func_proto and float types.

Similarly to decl tags, if the ``info.kind_flag`` is 0, then this is a
normal type tag, and the ``name_off`` encodes the btf_type_tag attribute
string.

If ``info.kind_flag`` is 1, then the type tag represents an arbitrary
__attribute__, and the ``name_off`` encodes a string representing the
attribute-list of the attribute specifier.
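
The pointer chain described in this section can be walked with a few lines of
C. The following is a sketch under stated assumptions rather than libbpf or
kernel code: ``struct mini_type`` is a hypothetical, simplified stand-in that
keeps only the kind and the referenced ``type`` id, the table is indexed by
type id with index 0 standing for ``void``, and the input is assumed to be
well-formed (valid ids, no cycles).

```c
#include <assert.h>
#include <stdint.h>

/* Kind values from the list in section 2.2. */
enum {
        BTF_KIND_INT      = 1,
        BTF_KIND_PTR      = 2,
        BTF_KIND_TYPEDEF  = 8,
        BTF_KIND_VOLATILE = 9,
        BTF_KIND_CONST    = 10,
        BTF_KIND_RESTRICT = 11,
        BTF_KIND_TYPE_TAG = 18,
};

/* Hypothetical simplified type record: the kind and the type it refers to. */
struct mini_type {
        uint32_t kind;
        uint32_t type;
};

static int is_modifier(uint32_t kind)
{
        return kind == BTF_KIND_CONST || kind == BTF_KIND_VOLATILE ||
               kind == BTF_KIND_RESTRICT || kind == BTF_KIND_TYPEDEF;
}

/*
 * Follow ptr -> [type_tag]* -> [const | volatile | restrict | typedef]*
 * and return the id of the base type. The number of type tags that were
 * skipped is stored in *nr_tags.
 */
static uint32_t ptr_base_type(const struct mini_type *types, uint32_t nr_types,
                              uint32_t ptr_id, uint32_t *nr_tags)
{
        uint32_t id;

        assert(ptr_id < nr_types && types[ptr_id].kind == BTF_KIND_PTR);
        id = types[ptr_id].type;
        *nr_tags = 0;

        /* Zero or more type tags come first. */
        while (id != 0 && types[id].kind == BTF_KIND_TYPE_TAG) {
                (*nr_tags)++;
                id = types[id].type;
                assert(id < nr_types);
        }
        /* Then zero or more modifiers and typedefs. */
        while (id != 0 && is_modifier(types[id].kind)) {
                id = types[id].type;
                assert(id < nr_types);
        }
        return id;
}
```

For a table holding ``[1] int``, ``[2] const -> [1]``, ``[3] type_tag -> [2]``
and ``[4] ptr -> [3]``, the walk starting at id 4 skips one type tag and ends
at the base type with id 1.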

2.2.19 BTF_KIND_ENUM64
~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0 or offset to a valid C identifier
  * ``info.kind_flag``: 0 for unsigned, 1 for signed
  * ``info.kind``: BTF_KIND_ENUM64
  * ``info.vlen``: number of enum values
  * ``size``: 1/2/4/8

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum64``::

    struct btf_enum64 {
        __u32   name_off;
        __u32   val_lo32;
        __u32   val_hi32;
    };

The ``btf_enum64`` encoding:
  * ``name_off``: offset to a valid C identifier
  * ``val_lo32``: lower 32-bit value for a 64-bit value
  * ``val_hi32``: high 32-bit value for a 64-bit value

If the original enum value is signed and the size is less than 8,
that value will be sign extended into 8 bytes.

2.3 Constant Values
-------------------

.. _BTF_Function_Linkage_Constants:

2.3.1 Function Linkage Constant Values
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. table:: Function Linkage Values and Meanings

  =================== ===== ===========
  kind                value description
  =================== ===== ===========
  ``BTF_FUNC_STATIC`` 0x0   definition of subprogram not visible outside containing compilation unit
  ``BTF_FUNC_GLOBAL`` 0x1   definition of subprogram visible outside containing compilation unit
  ``BTF_FUNC_EXTERN`` 0x2   declaration of a subprogram whose definition is outside the containing compilation unit
  =================== ===== ===========


.. _BTF_Var_Linkage_Constants:

2.3.2 Variable Linkage Constant Values
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. table:: Variable Linkage Values and Meanings

  ============================ ===== ===========
  kind                         value description
  ============================ ===== ===========
  ``BTF_VAR_STATIC``           0x0   definition of global variable not visible outside containing compilation unit
  ``BTF_VAR_GLOBAL_ALLOCATED`` 0x1   definition of global variable visible outside containing compilation unit
  ``BTF_VAR_GLOBAL_EXTERN``    0x2   declaration of global variable whose definition is outside the containing compilation unit
  ============================ ===== ===========

3. BTF Kernel API
=================

The following bpf syscall commands involve BTF:
   * BPF_BTF_LOAD: load a blob of BTF data into kernel
   * BPF_MAP_CREATE: map creation with btf key and value type info.
   * BPF_PROG_LOAD: prog load with btf function and line info.
   * BPF_BTF_GET_FD_BY_ID: get a btf fd
   * BPF_OBJ_GET_INFO_BY_FD: btf, func_info, line_info
     and other btf related info are returned.

The workflow typically looks like:
::

  Application:
      BPF_BTF_LOAD
          |
          v
      BPF_MAP_CREATE and BPF_PROG_LOAD
          |
          V
      ......

  Introspection tool:
      ......
      BPF_{PROG,MAP}_GET_NEXT_ID (get prog/map id's)
          |
          V
      BPF_{PROG,MAP}_GET_FD_BY_ID (get a prog/map fd)
          |
          V
      BPF_OBJ_GET_INFO_BY_FD (get bpf_prog_info/bpf_map_info with btf_id)
          |                                     |
          V                                     |
      BPF_BTF_GET_FD_BY_ID (get btf_fd)         |
          |                                     |
          V                                     |
      BPF_OBJ_GET_INFO_BY_FD (get btf)          |
          |                                     |
          V                                     V
      pretty print types, dump func signatures and line info, etc.


3.1 BPF_BTF_LOAD
----------------

Load a blob of BTF data into kernel. A blob of data, described in
:ref:`BTF_Type_String`, can be directly loaded into the kernel. A ``btf_fd``
is returned to user space.
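
To make the shape of such a blob concrete, the sketch below assembles the
smallest useful one in memory: a header, a type section holding a single
signed 32-bit ``int``, and a string section. It is an illustration under
stated assumptions, not the libbpf implementation: the ``min_btf_*`` names are
hypothetical local mirrors of the structures in :ref:`BTF_Type_String`,
``version = 1`` is assumed to match ``BTF_VERSION`` in the UAPI header, and the
blob is written in host byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BTF_MAGIC       0xeB9F
#define BTF_KIND_INT    1
#define BTF_INT_SIGNED  (1 << 0)

/* Local mirrors of struct btf_header and struct btf_type (hypothetical names). */
struct min_btf_header {
        uint16_t magic;
        uint8_t  version;
        uint8_t  flags;
        uint32_t hdr_len;
        uint32_t type_off;
        uint32_t type_len;
        uint32_t str_off;
        uint32_t str_len;
};

struct min_btf_type {
        uint32_t name_off;
        uint32_t info;
        uint32_t size;          /* the size/type union; INT uses size */
};

/* String section: the mandatory leading null string, then "int". */
static const char min_btf_strs[] = "\0int";

/* Write the blob into buf; return its length, or 0 if buf is too small. */
static size_t build_min_btf(uint8_t *buf, size_t cap)
{
        struct min_btf_header hdr;
        struct min_btf_type t;
        uint32_t int_data;
        size_t len = sizeof(hdr) + sizeof(t) + sizeof(int_data) +
                     sizeof(min_btf_strs);

        assert(sizeof(hdr) == 24 && sizeof(t) == 12);
        if (cap < len)
                return 0;

        memset(&hdr, 0, sizeof(hdr));
        hdr.magic = BTF_MAGIC;
        hdr.version = 1;                        /* assumed: BTF_VERSION */
        hdr.hdr_len = sizeof(hdr);
        hdr.type_off = 0;                       /* relative to end of header */
        hdr.type_len = sizeof(t) + sizeof(int_data);
        hdr.str_off = hdr.type_len;             /* strings follow the types */
        hdr.str_len = sizeof(min_btf_strs);

        t.name_off = 1;                         /* "int" follows the null string */
        t.info = BTF_KIND_INT << 24;            /* kind_flag = 0, vlen = 0 */
        t.size = 4;
        int_data = (BTF_INT_SIGNED << 24) | 32; /* signed, offset 0, 32 bits */

        memcpy(buf, &hdr, sizeof(hdr));
        buf += sizeof(hdr);
        memcpy(buf, &t, sizeof(t));
        buf += sizeof(t);
        memcpy(buf, &int_data, sizeof(int_data));
        buf += sizeof(int_data);
        memcpy(buf, min_btf_strs, sizeof(min_btf_strs));
        return len;
}
```

The resulting blob is 45 bytes long: 24 for the header, 16 for the type
section and 5 for the strings. Passing it to BPF_BTF_LOAD is left out because
that step needs a running kernel and suitable privileges.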

3.2 BPF_MAP_CREATE
------------------

A map can be created with ``btf_fd`` and specified key/value type id::

    __u32   btf_fd;         /* fd pointing to a BTF type data */
    __u32   btf_key_type_id;        /* BTF type_id of the key */
    __u32   btf_value_type_id;      /* BTF type_id of the value */

In libbpf, the map can be defined with extra annotation like below:
::

    struct {
        __uint(type, BPF_MAP_TYPE_ARRAY);
        __type(key, int);
        __type(value, struct ipv_counts);
        __uint(max_entries, 4);
    } btf_map SEC(".maps");

During ELF parsing, libbpf is able to extract key/value type_id's and assign
them to BPF_MAP_CREATE attributes automatically.

.. _BPF_Prog_Load:

3.3 BPF_PROG_LOAD
-----------------

During prog_load, func_info and line_info can be passed to kernel with proper
values for the following attributes:
::

    __u32           insn_cnt;
    __aligned_u64   insns;
    ......
    __u32           prog_btf_fd;    /* fd pointing to BTF type data */
    __u32           func_info_rec_size;     /* userspace bpf_func_info size */
    __aligned_u64   func_info;      /* func info */
    __u32           func_info_cnt;  /* number of bpf_func_info records */
    __u32           line_info_rec_size;     /* userspace bpf_line_info size */
    __aligned_u64   line_info;      /* line info */
    __u32           line_info_cnt;  /* number of bpf_line_info records */

The func_info and line_info are arrays of the following structures,
respectively::

    struct bpf_func_info {
        __u32   insn_off; /* [0, insn_cnt - 1] */
        __u32   type_id;  /* pointing to a BTF_KIND_FUNC type */
    };
    struct bpf_line_info {
        __u32   insn_off; /* [0, insn_cnt - 1] */
        __u32   file_name_off; /* offset to string table for the filename */
        __u32   line_off; /* offset to string table for the source line */
        __u32   line_col; /* line number and column number */
    };

func_info_rec_size is the size of each func_info record, and
line_info_rec_size is the size of each line_info record.
Passing the record
size to the kernel makes it possible to extend the record itself in the
future.

Below are requirements for func_info:
  * func_info[0].insn_off must be 0.
  * the func_info insn_off is in strictly increasing order and matches
    bpf func boundaries.

Below are requirements for line_info:
  * the first insn in each func must have a line_info record pointing to it.
  * the line_info insn_off is in strictly increasing order.

For line_info, the line number and column number are defined as below:
::

    #define BPF_LINE_INFO_LINE_NUM(line_col)        ((line_col) >> 10)
    #define BPF_LINE_INFO_LINE_COL(line_col)        ((line_col) & 0x3ff)

3.4 BPF_{PROG,MAP}_GET_NEXT_ID
------------------------------

In the kernel, every loaded program, map or btf has a unique id. The id won't
change during the lifetime of a program, map, or btf.

The bpf syscall command BPF_{PROG,MAP}_GET_NEXT_ID returns all id's, one for
each command, to user space, for bpf programs or maps, respectively, so an
inspection tool can inspect all programs and maps.

3.5 BPF_{PROG,MAP}_GET_FD_BY_ID
-------------------------------

An introspection tool cannot use an id to get details about a program or map.
A file descriptor needs to be obtained first for reference-counting purposes.

3.6 BPF_OBJ_GET_INFO_BY_FD
--------------------------

Once a program/map fd is acquired, an introspection tool can get the detailed
information from the kernel about this fd, some of which is BTF-related. For
example, ``bpf_map_info`` returns ``btf_id`` and key/value type ids.
``bpf_prog_info`` returns ``btf_id``, func_info, and line info for translated
bpf byte codes, and jited_line_info.

3.7 BPF_BTF_GET_FD_BY_ID
------------------------

With ``btf_id`` obtained in ``bpf_map_info`` and ``bpf_prog_info``, bpf
syscall command BPF_BTF_GET_FD_BY_ID can retrieve a btf fd.
Then, with
command BPF_OBJ_GET_INFO_BY_FD, the btf blob, originally loaded into the
kernel with BPF_BTF_LOAD, can be retrieved.

With the btf blob, ``bpf_map_info``, and ``bpf_prog_info``, an introspection
tool has full btf knowledge and is able to pretty print map key/values, dump
func signatures and line info, along with byte/jit codes.

4. ELF File Format Interface
============================

4.1 .BTF section
----------------

The .BTF section contains type and string data. The format of this section is
the same as the one described in :ref:`BTF_Type_String`.

.. _BTF_Ext_Section:

4.2 .BTF.ext section
--------------------

The .BTF.ext section encodes func_info, line_info and CO-RE relocations
which need loader manipulation before loading into the kernel.

The specification for .BTF.ext section is defined at ``tools/lib/bpf/btf.h``
and ``tools/lib/bpf/btf.c``.

The current header of .BTF.ext section::

    struct btf_ext_header {
        __u16   magic;
        __u8    version;
        __u8    flags;
        __u32   hdr_len;

        /* All offsets are in bytes relative to the end of this header */
        __u32   func_info_off;
        __u32   func_info_len;
        __u32   line_info_off;
        __u32   line_info_len;

        /* optional part of .BTF.ext header */
        __u32   core_relo_off;
        __u32   core_relo_len;
    };

It is very similar to the .BTF section. Instead of type/string sections, it
contains func_info, line_info and core_relo sub-sections.
See :ref:`BPF_Prog_Load` for details about func_info and line_info
record format.

The func_info is organized as below::

     func_info_rec_size              /* __u32 value */
     btf_ext_info_sec for section #1 /* func_info for section #1 */
     btf_ext_info_sec for section #2 /* func_info for section #2 */
     ...

``func_info_rec_size`` specifies the size of ``bpf_func_info`` structure when
.BTF.ext is generated.
``btf_ext_info_sec``, defined below, is a collection of
func_info for each specific ELF section::

     struct btf_ext_info_sec {
        __u32   sec_name_off; /* offset to section name */
        __u32   num_info;
        /* Followed by num_info * record_size number of bytes */
        __u8    data[0];
     };

Here, num_info must be greater than 0.

The line_info is organized as below::

     line_info_rec_size              /* __u32 value */
     btf_ext_info_sec for section #1 /* line_info for section #1 */
     btf_ext_info_sec for section #2 /* line_info for section #2 */
     ...

``line_info_rec_size`` specifies the size of ``bpf_line_info`` structure when
.BTF.ext is generated.

The interpretation of ``bpf_func_info->insn_off`` and
``bpf_line_info->insn_off`` is different between kernel API and ELF API. For
kernel API, the ``insn_off`` is the instruction offset in the unit of ``struct
bpf_insn``. For ELF API, the ``insn_off`` is the byte offset from the
beginning of section (``btf_ext_info_sec->sec_name_off``).

The core_relo is organized as below::

     core_relo_rec_size              /* __u32 value */
     btf_ext_info_sec for section #1 /* core_relo for section #1 */
     btf_ext_info_sec for section #2 /* core_relo for section #2 */

``core_relo_rec_size`` specifies the size of ``bpf_core_relo``
structure when .BTF.ext is generated. All ``bpf_core_relo`` structures
within a single ``btf_ext_info_sec`` describe relocations applied to
section named by ``btf_ext_info_sec->sec_name_off``.

See :ref:`Documentation/bpf/llvm_reloc.rst <btf-co-re-relocations>`
for more information on CO-RE relocations.

4.3 .BTF_ids section
--------------------

The .BTF_ids section encodes BTF ID values that are used within the kernel.

This section is created during the kernel compilation with the help of
macros defined in ``include/linux/btf_ids.h`` header file.
Kernel code can
use them to create lists and sets (sorted lists) of BTF ID values.

The ``BTF_ID_LIST`` and ``BTF_ID`` macros define an unsorted list of BTF ID
values, with the following syntax::

  BTF_ID_LIST(list)
  BTF_ID(type1, name1)
  BTF_ID(type2, name2)

resulting in the following layout in the .BTF_ids section::

  __BTF_ID__type1__name1__1:
  .zero 4
  __BTF_ID__type2__name2__2:
  .zero 4

The ``u32 list[];`` variable is defined to access the list.

The ``BTF_ID_UNUSED`` macro defines 4 zero bytes. It's used when we
want to define an unused entry in BTF_ID_LIST, like::

      BTF_ID_LIST(bpf_skb_output_btf_ids)
      BTF_ID(struct, sk_buff)
      BTF_ID_UNUSED
      BTF_ID(struct, task_struct)

The ``BTF_SET_START/END`` macro pair defines a sorted list of BTF ID values
and their count, with the following syntax::

  BTF_SET_START(set)
  BTF_ID(type1, name1)
  BTF_ID(type2, name2)
  BTF_SET_END(set)

resulting in the following layout in the .BTF_ids section::

  __BTF_ID__set__set:
  .zero 4
  __BTF_ID__type1__name1__3:
  .zero 4
  __BTF_ID__type2__name2__4:
  .zero 4

The ``struct btf_id_set set;`` variable is defined to access the list.

The ``typeX`` name can be one of the following::

   struct, union, typedef, func

and is used as a filter when resolving the BTF ID value.

All the BTF ID lists and sets are compiled in the .BTF_ids section and
resolved during the linking phase of kernel build by the ``resolve_btfids``
tool.

4.4 .BTF.base section
---------------------

Split BTF - where the .BTF section only contains types not in the associated
base .BTF section - is an extremely efficient way to encode type information
for kernel modules, since they generally consist of a few module-specific
types along with a large set of shared kernel types.
The former are encoded
in split BTF, while the latter are encoded in base BTF, resulting in more
compact representations. A type in split BTF that refers to a type in
base BTF refers to it using its base BTF ID, and split BTF IDs start
at last_base_BTF_ID + 1.

The downside of this approach, however, is that it makes the split BTF
somewhat brittle - when the base BTF changes, base BTF ID references are
no longer valid and the split BTF itself becomes useless. The role of the
.BTF.base section is to make split BTF more resilient for cases where
the base BTF may change, as is the case, for example, for kernel modules
that are not rebuilt every time the kernel is. .BTF.base contains named base
types: INTs, FLOATs, STRUCTs, UNIONs, ENUM[64]s and FWDs. INTs and FLOATs are
fully described in .BTF.base sections, while composite types like structs
and unions are not fully defined - the .BTF.base type simply serves as
a description of the type the split BTF referred to, so structs/unions
have 0 members in the .BTF.base section. ENUM[64]s are similarly recorded
with 0 members. Any other types are added to the split BTF. This
distillation process then leaves us with a .BTF.base section with
such minimal descriptions of base types and a split .BTF section which refers
to those base types. Later, we can relocate the split BTF using both the
information stored in the .BTF.base section and the new .BTF base; the type
information in the .BTF.base section allows us to update the split BTF
references to point at the corresponding new base BTF IDs.

BTF relocation happens on kernel module load when a kernel module has a
.BTF.base section, and libbpf also provides a btf__relocate() API to
accomplish this.
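
The ID arithmetic behind relocation can be sketched in C. This is only an
illustration under stated assumptions, not the libbpf or kernel
implementation: ``remap_type_id()`` and its parameters are hypothetical names,
and the ``base_map`` table (.BTF.base ID to new base BTF ID) is assumed to have
been built already, for instance by matching the name and kind of each
.BTF.base type against the new base BTF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a libbpf or kernel API): translate one type ID
 * used by split BTF from the ID space built on top of .BTF.base to the ID
 * space built on top of the new base BTF.
 *
 * base_map[i]  - new base BTF ID for .BTF.base type ID i; base_map[0] is 0
 *                because ID 0 (void) never moves. Assumed to be precomputed.
 * last_dist_id - last type ID in .BTF.base
 * last_new_id  - last type ID in the new base BTF
 */
static uint32_t remap_type_id(uint32_t id, const uint32_t *base_map,
			      uint32_t last_dist_id, uint32_t last_new_id)
{
	assert(base_map != NULL);

	/* IDs up to last_dist_id refer to base types: look them up. */
	if (id <= last_dist_id)
		return base_map[id];

	/*
	 * Everything above belongs to the split BTF itself. Split IDs start
	 * at last_base_BTF_ID + 1, so they shift with the size of the base.
	 */
	return id - last_dist_id + last_new_id;
}
```

With ``base_map = { 0, 2, 3 }``, ``last_dist_id = 2`` and ``last_new_id = 3``
(the values of the example that follows), split type 3 becomes type 4 and its
reference to type 2 becomes a reference to type 3.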

As an example consider the following base BTF::

    [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
    [2] STRUCT 'foo' size=8 vlen=2
            'f1' type_id=1 bits_offset=0
            'f2' type_id=1 bits_offset=32

...and associated split BTF::

    [3] PTR '(anon)' type_id=2

i.e. split BTF describes a pointer to ``struct foo { int f1; int f2; };``

.BTF.base will consist of::

    [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
    [2] STRUCT 'foo' size=8 vlen=0

If we relocate the split BTF later using the following new base BTF::

    [1] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64 encoding=(none)
    [2] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
    [3] STRUCT 'foo' size=8 vlen=2
            'f1' type_id=2 bits_offset=0
            'f2' type_id=2 bits_offset=32

...we can use our .BTF.base description to know that the split BTF reference
is to struct foo, and relocation results in new split BTF::

    [4] PTR '(anon)' type_id=3

Note that both the referenced BTF ID and the start BTF ID of the split BTF
had to be updated.

So we see how .BTF.base plays the role of facilitating later relocation,
leading to more resilient split BTF.

.BTF.base sections will be generated automatically for out-of-tree kernel module
builds - i.e. where KBUILD_EXTMOD is set (as it would be for "make M=path/2/mod"
cases). .BTF.base generation requires pahole support for the "distilled_base"
BTF feature; this is available in pahole v1.28 and later.

5. Using BTF
============

5.1 bpftool map pretty print
----------------------------

With BTF, the map key/value can be printed based on fields rather than simply
raw bytes. This is especially valuable for large structures or if your data
structure has bitfields.
For example, for the following map::

    enum A { A1, A2, A3, A4, A5 };
    typedef enum A ___A;
    struct tmp_t {
         char a1:4;
         int  a2:4;
         int  :4;
         __u32 a3:4;
         int b;
         ___A b1:4;
         enum A b2:4;
    };
    struct {
         __uint(type, BPF_MAP_TYPE_ARRAY);
         __type(key, int);
         __type(value, struct tmp_t);
         __uint(max_entries, 1);
    } tmpmap SEC(".maps");

bpftool is able to pretty print like below::

    [{
          "key": 0,
          "value": {
              "a1": 0x2,
              "a2": 0x4,
              "a3": 0x6,
              "b": 7,
              "b1": 0x8,
              "b2": 0xa
          }
      }
    ]

5.2 bpftool prog dump
---------------------

The following is an example showing how func_info and line_info can help prog
dump with better kernel symbol names, function prototypes and line
information::

    $ bpftool prog dump jited pinned /sys/fs/bpf/test_btf_haskv
    [...]
    int test_long_fname_2(struct dummy_tracepoint_args * arg):
    bpf_prog_44a040bf25481309_test_long_fname_2:
    ; static int test_long_fname_2(struct dummy_tracepoint_args *arg)
       0:   push   %rbp
       1:   mov    %rsp,%rbp
       4:   sub    $0x30,%rsp
       b:   sub    $0x28,%rbp
       f:   mov    %rbx,0x0(%rbp)
      13:   mov    %r13,0x8(%rbp)
      17:   mov    %r14,0x10(%rbp)
      1b:   mov    %r15,0x18(%rbp)
      1f:   xor    %eax,%eax
      21:   mov    %rax,0x20(%rbp)
      25:   xor    %esi,%esi
    ; int key = 0;
      27:   mov    %esi,-0x4(%rbp)
    ; if (!arg->sock)
      2a:   mov    0x8(%rdi),%rdi
    ; if (!arg->sock)
      2e:   cmp    $0x0,%rdi
      32:   je     0x0000000000000070
      34:   mov    %rbp,%rsi
    ; counts = bpf_map_lookup_elem(&btf_map, &key);
    [...]

5.3 Verifier Log
----------------

The following is an example of how line_info can help debug a verification
failure::

    /* The code at tools/testing/selftests/bpf/test_xdp_noinline.c
     * is modified as below.
     */
    data = (void *)(long)xdp->data;
    data_end = (void *)(long)xdp->data_end;
    /*
    if (data + 4 > data_end)
            return XDP_DROP;
    */
    *(u32 *)data = dst->dst;

    $ bpftool prog load ./test_xdp_noinline.o /sys/fs/bpf/test_xdp_noinline type xdp
        ; data = (void *)(long)xdp->data;
        224: (79) r2 = *(u64 *)(r10 -112)
        225: (61) r2 = *(u32 *)(r2 +0)
        ; *(u32 *)data = dst->dst;
        226: (63) *(u32 *)(r2 +0) = r1
        invalid access to packet, off=0 size=4, R2(id=0,off=0,r=0)
        R2 offset is outside of the packet

6. BTF Generation
=================

You need the latest pahole

  https://git.kernel.org/pub/scm/devel/pahole/pahole.git/

or llvm (8.0 or later). pahole acts as a dwarf2btf converter. It doesn't
support .BTF.ext and the BTF_KIND_FUNC type yet. For example::

    -bash-4.4$ cat t.c
    struct t {
      int a:2;
      int b:3;
      int c:2;
    } g;
    -bash-4.4$ gcc -c -O2 -g t.c
    -bash-4.4$ pahole -JV t.o
    File t.o:
    [1] STRUCT t kind_flag=1 size=4 vlen=3
            a type_id=2 bitfield_size=2 bits_offset=0
            b type_id=2 bitfield_size=3 bits_offset=2
            c type_id=2 bitfield_size=2 bits_offset=5
    [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED

llvm is able to generate .BTF and .BTF.ext directly with -g, for the bpf
target only. The assembly code (-S) shows the BTF encoding in assembly
format::

    -bash-4.4$ cat t2.c
    typedef int __int32;
    struct t2 {
      int a2;
      int (*f2)(char q1, __int32 q2, ...);
      int (*f3)();
    } g2;
    int main() { return 0; }
    int test() { return 0; }
    -bash-4.4$ clang -c -g -O2 --target=bpf t2.c
    -bash-4.4$ readelf -S t2.o
      ......
      [ 8] .BTF              PROGBITS         0000000000000000  00000247
           000000000000016e  0000000000000000           0     0     1
      [ 9] .BTF.ext          PROGBITS         0000000000000000  000003b5
           0000000000000060  0000000000000000           0     0     1
      [10] .rel.BTF.ext      REL              0000000000000000  000007e0
           0000000000000040  0000000000000010          16     9     8
      ......
    -bash-4.4$ clang -S -g -O2 --target=bpf t2.c
    -bash-4.4$ cat t2.s
      ......
            .section        .BTF,"",@progbits
            .short  60319                   # 0xeb9f
            .byte   1
            .byte   0
            .long   24
            .long   0
            .long   220
            .long   220
            .long   122
            .long   0                       # BTF_KIND_FUNC_PROTO(id = 1)
            .long   218103808               # 0xd000000
            .long   2
            .long   83                      # BTF_KIND_INT(id = 2)
            .long   16777216                # 0x1000000
            .long   4
            .long   16777248                # 0x1000020
      ......
            .byte   0                       # string offset=0
            .ascii  ".text"                 # string offset=1
            .byte   0
            .ascii  "/home/yhs/tmp-pahole/t2.c" # string offset=7
            .byte   0
            .ascii  "int main() { return 0; }" # string offset=33
            .byte   0
            .ascii  "int test() { return 0; }" # string offset=58
            .byte   0
            .ascii  "int"                   # string offset=83
      ......
            .section        .BTF.ext,"",@progbits
            .short  60319                   # 0xeb9f
            .byte   1
            .byte   0
            .long   24
            .long   0
            .long   28
            .long   28
            .long   44
            .long   8                       # FuncInfo
            .long   1                       # FuncInfo section string offset=1
            .long   2
            .long   .Lfunc_begin0
            .long   3
            .long   .Lfunc_begin1
            .long   5
            .long   16                      # LineInfo
            .long   1                       # LineInfo section string offset=1
            .long   2
            .long   .Ltmp0
            .long   7
            .long   33
            .long   7182                    # Line 7 Col 14
            .long   .Ltmp3
            .long   7
            .long   58
            .long   8206                    # Line 8 Col 14

7. Testing
==========

The kernel BPF selftest `tools/testing/selftests/bpf/prog_tests/btf.c`_
provides an extensive set of BTF-related tests.

.. Links
.. _tools/testing/selftests/bpf/prog_tests/btf.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/tools/testing/selftests/bpf/prog_tests/btf.c