=====================
BPF Type Format (BTF)
=====================

1. Introduction
===============

BTF (BPF Type Format) is the metadata format which encodes the debug info
related to BPF program/map. The name BTF was used initially to describe data
types. The BTF was later extended to include function info for defined
subroutines, and line info for source/line information.

The debug info is used for map pretty print, function signature, etc. The
function signature enables better kernel symbols for bpf programs/functions.
The line info helps generate source annotated translated byte code, jited code
and verifier log.

The BTF specification contains two parts:
  * BTF kernel API
  * BTF ELF file format

The kernel API is the contract between user space and kernel. The kernel
verifies the BTF info before using it. The ELF file format is a user space
contract between ELF file and libbpf loader.

The type and string sections are part of the BTF kernel API, describing the
debug info (mostly types related) referenced by the bpf program. These two
sections are discussed in detail in :ref:`BTF_Type_String`.

.. _BTF_Type_String:

2. BTF Type and String Encoding
===============================

The file ``include/uapi/linux/btf.h`` provides high-level definition of how
types/strings are encoded.

The beginning of data blob must be::

    struct btf_header {
        __u16   magic;
        __u8    version;
        __u8    flags;
        __u32   hdr_len;

        /* All offsets are in bytes relative to the end of this header */
        __u32   type_off;       /* offset of type section       */
        __u32   type_len;       /* length of type section       */
        __u32   str_off;        /* offset of string section     */
        __u32   str_len;        /* length of string section     */
    };

The magic is ``0xeB9F``, which has different encoding for big and little
endian systems, and can be used to test whether BTF is generated for big- or
little-endian target.
The ``btf_header`` is designed to be extensible with
``hdr_len`` equal to ``sizeof(struct btf_header)`` when a data blob is
generated.

2.1 String Encoding
-------------------

The first string in the string section must be a null string. The rest of
the string table is a concatenation of other null-terminated strings.

2.2 Type Encoding
-----------------

The type id ``0`` is reserved for ``void`` type. The type section is parsed
sequentially and type id is assigned to each recognized type starting from id
``1``. Currently, the following types are supported::

    #define BTF_KIND_INT            1       /* Integer      */
    #define BTF_KIND_PTR            2       /* Pointer      */
    #define BTF_KIND_ARRAY          3       /* Array        */
    #define BTF_KIND_STRUCT         4       /* Struct       */
    #define BTF_KIND_UNION          5       /* Union        */
    #define BTF_KIND_ENUM           6       /* Enumeration up to 32-bit values */
    #define BTF_KIND_FWD            7       /* Forward      */
    #define BTF_KIND_TYPEDEF        8       /* Typedef      */
    #define BTF_KIND_VOLATILE       9       /* Volatile     */
    #define BTF_KIND_CONST          10      /* Const        */
    #define BTF_KIND_RESTRICT       11      /* Restrict     */
    #define BTF_KIND_FUNC           12      /* Function     */
    #define BTF_KIND_FUNC_PROTO     13      /* Function Proto       */
    #define BTF_KIND_VAR            14      /* Variable     */
    #define BTF_KIND_DATASEC        15      /* Section      */
    #define BTF_KIND_FLOAT          16      /* Floating point       */
    #define BTF_KIND_DECL_TAG       17      /* Decl Tag     */
    #define BTF_KIND_TYPE_TAG       18      /* Type Tag     */
    #define BTF_KIND_ENUM64         19      /* Enumeration up to 64-bit values */

Note that the type section encodes debug info, not just pure types.
``BTF_KIND_FUNC`` is not a type, and it represents a defined subprogram.

Each type contains the following common data::

    struct btf_type {
        __u32 name_off;
        /* "info" bits arrangement
         * bits  0-15: vlen (e.g. # of struct's members)
         * bits 16-23: unused
         * bits 24-28: kind (e.g. int, ptr, array...etc)
         * bits 29-30: unused
         * bit     31: kind_flag, currently used by
         *             struct, union, fwd, enum and enum64.
         */
        __u32 info;
        /* "size" is used by INT, ENUM, STRUCT, UNION and ENUM64.
         * "size" tells the size of the type it is describing.
         *
         * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
         * FUNC, FUNC_PROTO, DECL_TAG and TYPE_TAG.
         * "type" is a type_id referring to another type.
         */
        union {
                __u32 size;
                __u32 type;
        };
    };

For certain kinds, the common data are followed by kind-specific data. The
``name_off`` in ``struct btf_type`` specifies the offset in the string table.
The following sections detail encoding of each kind.

2.2.1 BTF_KIND_INT
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: any valid offset
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_INT
 * ``info.vlen``: 0
 * ``size``: the size of the int type in bytes.

``btf_type`` is followed by a ``u32`` with the following bits arrangement::

  #define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
  #define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
  #define BTF_INT_BITS(VAL)       ((VAL)  & 0x000000ff)

The ``BTF_INT_ENCODING`` has the following attributes::

  #define BTF_INT_SIGNED  (1 << 0)
  #define BTF_INT_CHAR    (1 << 1)
  #define BTF_INT_BOOL    (1 << 2)

The ``BTF_INT_ENCODING()`` provides extra information: signedness, char, or
bool, for the int type. The char and bool encoding are mostly useful for
pretty print. At most one encoding can be specified for the int type.

The ``BTF_INT_BITS()`` specifies the number of actual bits held by this int
type. For example, a 4-bit bitfield encodes ``BTF_INT_BITS()`` equal to 4.
The ``btf_type.size * 8`` must be equal to or greater than ``BTF_INT_BITS()``
for the type. The maximum value of ``BTF_INT_BITS()`` is 128.

The ``BTF_INT_OFFSET()`` specifies the starting bit offset to calculate values
for this int.
For example, a bitfield struct member has:

 * btf member bit offset 100 from the start of the structure,
 * btf member pointing to an int type,
 * the int type has ``BTF_INT_OFFSET() = 2`` and ``BTF_INT_BITS() = 4``

Then in the struct memory layout, this member will occupy ``4`` bits starting
from bits ``100 + 2 = 102``.

Alternatively, the bitfield struct member can be the following to access the
same bits as the above:

 * btf member bit offset 102,
 * btf member pointing to an int type,
 * the int type has ``BTF_INT_OFFSET() = 0`` and ``BTF_INT_BITS() = 4``

The original intention of ``BTF_INT_OFFSET()`` is to provide flexibility of
bitfield encoding. Currently, both llvm and pahole generate
``BTF_INT_OFFSET() = 0`` for all int types.

2.2.2 BTF_KIND_PTR
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_PTR
  * ``info.vlen``: 0
  * ``type``: the pointee type of the pointer

No additional type data follow ``btf_type``.

2.2.3 BTF_KIND_ARRAY
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_ARRAY
  * ``info.vlen``: 0
  * ``size/type``: 0, not used

``btf_type`` is followed by one ``struct btf_array``::

    struct btf_array {
        __u32   type;
        __u32   index_type;
        __u32   nelems;
    };

The ``struct btf_array`` encoding:
  * ``type``: the element type
  * ``index_type``: the index type
  * ``nelems``: the number of elements for this array (``0`` is also allowed).

The ``index_type`` can be any regular int type (``u8``, ``u16``, ``u32``,
``u64``, ``unsigned __int128``). The original design of including
``index_type`` follows DWARF, which has an ``index_type`` for its array type.
Currently in BTF, beyond type verification, the ``index_type`` is not used.

The ``struct btf_array`` allows chaining through element type to represent
multidimensional arrays. For example, for ``int a[5][6]``, the following type
information illustrates the chaining:

  * [1]: int
  * [2]: array, ``btf_array.type = [1]``, ``btf_array.nelems = 6``
  * [3]: array, ``btf_array.type = [2]``, ``btf_array.nelems = 5``

Currently, both pahole and llvm collapse multidimensional array into
one-dimensional array, e.g., for ``a[5][6]``, the ``btf_array.nelems`` is
equal to ``30``. This is because the original use case is map pretty print
where the whole array is dumped out so one-dimensional array is enough. As
more BTF usage is explored, pahole and llvm can be changed to generate proper
chained representation for multidimensional arrays.

2.2.4 BTF_KIND_STRUCT
~~~~~~~~~~~~~~~~~~~~~
2.2.5 BTF_KIND_UNION
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0 or offset to a valid C identifier
  * ``info.kind_flag``: 0 or 1
  * ``info.kind``: BTF_KIND_STRUCT or BTF_KIND_UNION
  * ``info.vlen``: the number of struct/union members
  * ``size``: the size of the struct/union in bytes

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_member``::

    struct btf_member {
        __u32   name_off;
        __u32   type;
        __u32   offset;
    };

``struct btf_member`` encoding:
  * ``name_off``: offset to a valid C identifier
  * ``type``: the member type
  * ``offset``: <see below>

If the type info ``kind_flag`` is not set, the offset contains only bit offset
of the member. Note that the base type of the bitfield can only be int or enum
type. If the bitfield size is 32, the base type can be either int or enum
type. If the bitfield size is not 32, the base type must be int, and int type
``BTF_INT_BITS()`` encodes the bitfield size.
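
How ``btf_member.offset`` is interpreted depends on ``kind_flag``, which is
read from ``btf_type.info`` together with ``kind`` and ``vlen``. A minimal
sketch of that decoding, following the "info" bits arrangement in section
2.2; the helper names are illustrative assumptions and are not taken from any
header::

    #include <stdint.h>

    /* Hypothetical helpers; they only restate the documented bit layout. */

    /* bits 24-28: kind (e.g. 4 for BTF_KIND_STRUCT) */
    static inline uint32_t btf_info_kind(uint32_t info)
    {
        return (info >> 24) & 0x1f;
    }

    /* bits 0-15: vlen (e.g. number of struct members) */
    static inline uint32_t btf_info_vlen(uint32_t info)
    {
        return info & 0xffff;
    }

    /* bit 31: kind_flag */
    static inline uint32_t btf_info_kflag(uint32_t info)
    {
        return info >> 31;
    }

For example, an ``info`` value of ``0x84000002`` decodes to kind ``4``
(``BTF_KIND_STRUCT``), ``vlen`` ``2`` and ``kind_flag`` ``1``.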

If the ``kind_flag`` is set, the ``btf_member.offset`` contains both member
bitfield size and bit offset. The bitfield size and bit offset are calculated
as below::

  #define BTF_MEMBER_BITFIELD_SIZE(val)   ((val) >> 24)
  #define BTF_MEMBER_BIT_OFFSET(val)      ((val) & 0xffffff)

In this case, if the base type is an int type, it must be a regular int type:

  * ``BTF_INT_OFFSET()`` must be 0.
  * ``BTF_INT_BITS()`` must be equal to ``{1,2,4,8,16} * 8``.

Commit 9d5f9f701b18 introduced ``kind_flag`` and explains why both modes
exist.

2.2.6 BTF_KIND_ENUM
~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0 or offset to a valid C identifier
  * ``info.kind_flag``: 0 for unsigned, 1 for signed
  * ``info.kind``: BTF_KIND_ENUM
  * ``info.vlen``: number of enum values
  * ``size``: 1/2/4/8

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum``::

    struct btf_enum {
        __u32   name_off;
        __s32   val;
    };

The ``btf_enum`` encoding:
  * ``name_off``: offset to a valid C identifier
  * ``val``: any value

If the original enum value is signed and the size is less than 4,
that value will be sign extended into 4 bytes. If the size is 8,
the value will be truncated into 4 bytes.

2.2.7 BTF_KIND_FWD
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0 for struct, 1 for union
  * ``info.kind``: BTF_KIND_FWD
  * ``info.vlen``: 0
  * ``type``: 0

No additional type data follow ``btf_type``.
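
As a recap of the two ``btf_member.offset`` modes from sections 2.2.4 and
2.2.5, the sketch below decodes a member's position from the enclosing type's
``info`` and the member's ``offset``. The struct and function names are
hypothetical and exist only for this illustration::

    #include <stdint.h>

    /* Hypothetical result type for this sketch. */
    struct member_pos {
        uint32_t bit_offset;     /* bit offset from the start of the struct */
        uint32_t bitfield_size;  /* bitfield size taken from "offset", or 0 */
    };

    static struct member_pos decode_member_offset(uint32_t info, uint32_t offset)
    {
        struct member_pos pos;

        if (info >> 31) {
            /* kind_flag set: same arithmetic as BTF_MEMBER_BITFIELD_SIZE()
             * and BTF_MEMBER_BIT_OFFSET().
             */
            pos.bitfield_size = offset >> 24;
            pos.bit_offset = offset & 0xffffff;
        } else {
            /* kind_flag not set: "offset" holds only the bit offset. Any
             * bitfield size is encoded in BTF_INT_BITS() of the member's
             * int type, so it is reported as 0 here.
             */
            pos.bitfield_size = 0;
            pos.bit_offset = offset;
        }
        return pos;
    }

With ``kind_flag`` set, an ``offset`` of ``(4 << 24) | 102`` decodes to a
4-bit bitfield starting at bit 102, the same bits as in the
``BTF_INT_OFFSET()`` example in section 2.2.1.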

2.2.8 BTF_KIND_TYPEDEF
~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_TYPEDEF
  * ``info.vlen``: 0
  * ``type``: the type which can be referred by name at ``name_off``

No additional type data follow ``btf_type``.

2.2.9 BTF_KIND_VOLATILE
~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_VOLATILE
  * ``info.vlen``: 0
  * ``type``: the type with ``volatile`` qualifier

No additional type data follow ``btf_type``.

2.2.10 BTF_KIND_CONST
~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_CONST
  * ``info.vlen``: 0
  * ``type``: the type with ``const`` qualifier

No additional type data follow ``btf_type``.

2.2.11 BTF_KIND_RESTRICT
~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_RESTRICT
  * ``info.vlen``: 0
  * ``type``: the type with ``restrict`` qualifier

No additional type data follow ``btf_type``.

2.2.12 BTF_KIND_FUNC
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_FUNC
  * ``info.vlen``: linkage information (BTF_FUNC_STATIC, BTF_FUNC_GLOBAL
    or BTF_FUNC_EXTERN - see :ref:`BTF_Function_Linkage_Constants`)
  * ``type``: a BTF_KIND_FUNC_PROTO type

No additional type data follow ``btf_type``.

A BTF_KIND_FUNC defines not a type, but a subprogram (function) whose
signature is defined by ``type``. The subprogram is thus an instance of that
type.
The BTF_KIND_FUNC may in turn be referenced by a func_info in the
:ref:`BTF_Ext_Section` (ELF) or in the arguments to :ref:`BPF_Prog_Load`
(ABI).

Currently, only linkage values of BTF_FUNC_STATIC and BTF_FUNC_GLOBAL are
supported in the kernel.

2.2.13 BTF_KIND_FUNC_PROTO
~~~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_FUNC_PROTO
  * ``info.vlen``: # of parameters
  * ``type``: the return type

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_param``::

    struct btf_param {
        __u32   name_off;
        __u32   type;
    };

If a BTF_KIND_FUNC_PROTO type is referred by a BTF_KIND_FUNC type, then
``btf_param.name_off`` must point to a valid C identifier except for the
possible last argument representing the variable argument. The
``btf_param.type`` refers to parameter type.

If the function has variable arguments, the last parameter is encoded with
``name_off = 0`` and ``type = 0``.

2.2.14 BTF_KIND_VAR
~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_VAR
  * ``info.vlen``: 0
  * ``type``: the type of the variable

``btf_type`` is followed by a single ``struct btf_var`` with the
following data::

    struct btf_var {
        __u32   linkage;
    };

``btf_var.linkage`` may take the values: BTF_VAR_STATIC,
BTF_VAR_GLOBAL_ALLOCATED or BTF_VAR_GLOBAL_EXTERN -
see :ref:`BTF_Var_Linkage_Constants`.

Not all types of global variables are supported by LLVM at this point.
The following is currently available:

  * static variables with or without section attributes
  * global variables with section attributes

The latter is for future extraction of map key/value type id's from a
map definition.
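
A BTF_KIND_VAR record is a compact instance of the "common data followed by
kind-specific data" layout: three ``__u32`` words of ``struct btf_type``
followed by one ``__u32`` of ``struct btf_var``. The sketch below reads the
linkage from such a record. It assumes the record is already in host byte
order and is presented as an array of 32-bit words; the function name is
hypothetical::

    #include <stddef.h>
    #include <stdint.h>

    /*
     * rec[0] = name_off, rec[1] = info, rec[2] = type   (struct btf_type)
     * rec[3] = linkage                                  (struct btf_var)
     *
     * Returns the linkage value, or -1 if the record is too short or is
     * not a BTF_KIND_VAR (kind 14).
     */
    static int var_linkage(const uint32_t *rec, size_t n_words)
    {
        if (n_words < 4)
            return -1;
        if (((rec[1] >> 24) & 0x1f) != 14)
            return -1;
        return (int)rec[3];
    }

For a record ``{1, 14 << 24, 2, 1}`` the function returns ``1``, i.e.
BTF_VAR_GLOBAL_ALLOCATED.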

2.2.15 BTF_KIND_DATASEC
~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid name associated with a variable or
    one of .data/.bss/.rodata
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_DATASEC
  * ``info.vlen``: # of variables
  * ``size``: total section size in bytes (0 at compilation time, patched
    to actual size by BPF loaders such as libbpf)

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_var_secinfo``::

    struct btf_var_secinfo {
        __u32   type;
        __u32   offset;
        __u32   size;
    };

``struct btf_var_secinfo`` encoding:
  * ``type``: the type of the BTF_KIND_VAR variable
  * ``offset``: the in-section offset of the variable
  * ``size``: the size of the variable in bytes

2.2.16 BTF_KIND_FLOAT
~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: any valid offset
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_FLOAT
 * ``info.vlen``: 0
 * ``size``: the size of the float type in bytes: 2, 4, 8, 12 or 16.

No additional type data follow ``btf_type``.

2.2.17 BTF_KIND_DECL_TAG
~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: offset to a non-empty string
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_DECL_TAG
 * ``info.vlen``: 0
 * ``type``: ``struct``, ``union``, ``func``, ``var`` or ``typedef``

``btf_type`` is followed by ``struct btf_decl_tag``::

    struct btf_decl_tag {
        __u32   component_idx;
    };

The ``name_off`` encodes btf_decl_tag attribute string.
The ``type`` should be ``struct``, ``union``, ``func``, ``var`` or ``typedef``.
For ``var`` or ``typedef`` type, ``btf_decl_tag.component_idx`` must be ``-1``.
For the other three types, if the btf_decl_tag attribute is
applied to the ``struct``, ``union`` or ``func`` itself,
``btf_decl_tag.component_idx`` must be ``-1``. Otherwise,
the attribute is applied to a ``struct``/``union`` member or
a ``func`` argument, and ``btf_decl_tag.component_idx`` should be a
valid index (starting from 0) pointing to a member or an argument.

2.2.18 BTF_KIND_TYPE_TAG
~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: offset to a non-empty string
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_TYPE_TAG
 * ``info.vlen``: 0
 * ``type``: the type with ``btf_type_tag`` attribute

Currently, ``BTF_KIND_TYPE_TAG`` is only emitted for pointer types.
It has the following btf type chain:
::

  ptr -> [type_tag]*
      -> [const | volatile | restrict | typedef]*
      -> base_type

Basically, a pointer type points to zero or more
type_tag, then zero or more const/volatile/restrict/typedef
and finally the base type. The base type is one of
int, ptr, array, struct, union, enum, func_proto and float types.

2.2.19 BTF_KIND_ENUM64
~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0 or offset to a valid C identifier
  * ``info.kind_flag``: 0 for unsigned, 1 for signed
  * ``info.kind``: BTF_KIND_ENUM64
  * ``info.vlen``: number of enum values
  * ``size``: 1/2/4/8

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum64``::

    struct btf_enum64 {
        __u32   name_off;
        __u32   val_lo32;
        __u32   val_hi32;
    };

The ``btf_enum64`` encoding:
  * ``name_off``: offset to a valid C identifier
  * ``val_lo32``: lower 32-bit value for a 64-bit value
  * ``val_hi32``: high 32-bit value for a 64-bit value

If the original enum value is signed and the size is less than 8,
that value will be sign extended into 8 bytes.
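
The two halves of a ``btf_enum64`` value can be recombined as sketched below.
The function names are illustrative assumptions; whether the result should be
read as signed or unsigned is decided by ``info.kind_flag`` of the enclosing
BTF_KIND_ENUM64 type::

    #include <stdint.h>

    /* Combine val_hi32 and val_lo32 into the full 64-bit enumerator value. */
    static inline uint64_t enum64_value(uint32_t val_lo32, uint32_t val_hi32)
    {
        return ((uint64_t)val_hi32 << 32) | val_lo32;
    }

    /*
     * Signed view, for types with kind_flag == 1. The cast relies on the
     * two's complement conversion that common compilers implement.
     */
    static inline int64_t enum64_signed_value(uint32_t val_lo32,
                                              uint32_t val_hi32)
    {
        return (int64_t)enum64_value(val_lo32, val_hi32);
    }

For instance, an enumerator stored with ``val_lo32 = 0xffffffff`` and
``val_hi32 = 0xffffffff`` is ``-1`` for a signed enum and
``0xffffffffffffffff`` for an unsigned one.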

2.3 Constant Values
-------------------

.. _BTF_Function_Linkage_Constants:

2.3.1 Function Linkage Constant Values
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. table:: Function Linkage Values and Meanings

  =================== ===== ===========
  kind                value description
  =================== ===== ===========
  ``BTF_FUNC_STATIC`` 0x0   definition of subprogram not visible outside containing compilation unit
  ``BTF_FUNC_GLOBAL`` 0x1   definition of subprogram visible outside containing compilation unit
  ``BTF_FUNC_EXTERN`` 0x2   declaration of a subprogram whose definition is outside the containing compilation unit
  =================== ===== ===========


.. _BTF_Var_Linkage_Constants:

2.3.2 Variable Linkage Constant Values
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. table:: Variable Linkage Values and Meanings

  ============================ ===== ===========
  kind                         value description
  ============================ ===== ===========
  ``BTF_VAR_STATIC``           0x0   definition of global variable not visible outside containing compilation unit
  ``BTF_VAR_GLOBAL_ALLOCATED`` 0x1   definition of global variable visible outside containing compilation unit
  ``BTF_VAR_GLOBAL_EXTERN``    0x2   declaration of global variable whose definition is outside the containing compilation unit
  ============================ ===== ===========

3. BTF Kernel API
=================

The following bpf syscall commands involve BTF:
   * BPF_BTF_LOAD: load a blob of BTF data into kernel
   * BPF_MAP_CREATE: map creation with btf key and value type info.
   * BPF_PROG_LOAD: prog load with btf function and line info.
   * BPF_BTF_GET_FD_BY_ID: get a btf fd
   * BPF_OBJ_GET_INFO_BY_FD: btf, func_info, line_info
     and other btf related info are returned.

The workflow typically looks like:
::

  Application:
      BPF_BTF_LOAD
          |
          v
      BPF_MAP_CREATE and BPF_PROG_LOAD
          |
          V
      ......

  Introspection tool:
      ......
      BPF_{PROG,MAP}_GET_NEXT_ID (get prog/map id's)
          |
          V
      BPF_{PROG,MAP}_GET_FD_BY_ID (get a prog/map fd)
          |
          V
      BPF_OBJ_GET_INFO_BY_FD (get bpf_prog_info/bpf_map_info with btf_id)
          |                                     |
          V                                     |
      BPF_BTF_GET_FD_BY_ID (get btf_fd)         |
          |                                     |
          V                                     |
      BPF_OBJ_GET_INFO_BY_FD (get btf)          |
          |                                     |
          V                                     V
      pretty print types, dump func signatures and line info, etc.


3.1 BPF_BTF_LOAD
----------------

Load a blob of BTF data into kernel. A blob of data, described in
:ref:`BTF_Type_String`, can be directly loaded into the kernel. A ``btf_fd``
is returned to userspace.

3.2 BPF_MAP_CREATE
------------------

A map can be created with ``btf_fd`` and specified key/value type id::

    __u32   btf_fd;         /* fd pointing to a BTF type data */
    __u32   btf_key_type_id;        /* BTF type_id of the key */
    __u32   btf_value_type_id;      /* BTF type_id of the value */

In libbpf, the map can be defined with extra annotation like below:
::

    struct {
        __uint(type, BPF_MAP_TYPE_ARRAY);
        __type(key, int);
        __type(value, struct ipv_counts);
        __uint(max_entries, 4);
    } btf_map SEC(".maps");

During ELF parsing, libbpf is able to extract key/value type_id's and assign
them to BPF_MAP_CREATE attributes automatically.

.. _BPF_Prog_Load:

3.3 BPF_PROG_LOAD
-----------------

During prog_load, func_info and line_info can be passed to kernel with proper
values for the following attributes:
::

    __u32           insn_cnt;
    __aligned_u64   insns;
    ......
    __u32           prog_btf_fd;    /* fd pointing to BTF type data */
    __u32           func_info_rec_size;     /* userspace bpf_func_info size */
    __aligned_u64   func_info;      /* func info */
    __u32           func_info_cnt;  /* number of bpf_func_info records */
    __u32           line_info_rec_size;     /* userspace bpf_line_info size */
    __aligned_u64   line_info;      /* line info */
    __u32           line_info_cnt;  /* number of bpf_line_info records */

The func_info and line_info are an array of below, respectively::

    struct bpf_func_info {
        __u32   insn_off; /* [0, insn_cnt - 1] */
        __u32   type_id;  /* pointing to a BTF_KIND_FUNC type */
    };
    struct bpf_line_info {
        __u32   insn_off; /* [0, insn_cnt - 1] */
        __u32   file_name_off; /* offset to string table for the filename */
        __u32   line_off; /* offset to string table for the source line */
        __u32   line_col; /* line number and column number */
    };

func_info_rec_size is the size of each func_info record, and
line_info_rec_size is the size of each line_info record. Passing the record
size to kernel makes it possible to extend the record itself in the future.

Below are requirements for func_info:
  * func_info[0].insn_off must be 0.
  * the func_info insn_off is in strictly increasing order and matches
    bpf func boundaries.

Below are requirements for line_info:
  * the first insn in each func must have a line_info record pointing to it.
  * the line_info insn_off is in strictly increasing order.

For line_info, the line number and column number are defined as below:
::

  #define BPF_LINE_INFO_LINE_NUM(line_col)        ((line_col) >> 10)
  #define BPF_LINE_INFO_LINE_COL(line_col)        ((line_col) & 0x3ff)

3.4 BPF_{PROG,MAP}_GET_NEXT_ID
------------------------------

In kernel, every loaded program, map or btf has a unique id. The id won't
change during the lifetime of a program, map, or btf.

The bpf syscall command BPF_{PROG,MAP}_GET_NEXT_ID returns all id's, one for
each command, to user space, for bpf program or maps, respectively, so an
inspection tool can inspect all programs and maps.

3.5 BPF_{PROG,MAP}_GET_FD_BY_ID
-------------------------------

An introspection tool cannot use id to get details about program or maps.
A file descriptor needs to be obtained first for reference-counting purpose.

3.6 BPF_OBJ_GET_INFO_BY_FD
--------------------------

Once a program/map fd is acquired, an introspection tool can get the detailed
information from kernel about this fd, some of which are BTF-related. For
example, ``bpf_map_info`` returns ``btf_id`` and key/value type ids.
``bpf_prog_info`` returns ``btf_id``, func_info, and line info for translated
bpf byte codes, and jited_line_info.

3.7 BPF_BTF_GET_FD_BY_ID
------------------------

With ``btf_id`` obtained in ``bpf_map_info`` and ``bpf_prog_info``, bpf
syscall command BPF_BTF_GET_FD_BY_ID can retrieve a btf fd. Then, with
command BPF_OBJ_GET_INFO_BY_FD, the btf blob, originally loaded into the
kernel with BPF_BTF_LOAD, can be retrieved.

With the btf blob, ``bpf_map_info``, and ``bpf_prog_info``, an introspection
tool has full btf knowledge and is able to pretty print map key/values, dump
func signatures and line info, along with byte/jit codes.

4. ELF File Format Interface
============================

4.1 .BTF section
----------------

The .BTF section contains type and string data. The format of this section is
same as the one described in :ref:`BTF_Type_String`.

.. _BTF_Ext_Section:

4.2 .BTF.ext section
--------------------

The .BTF.ext section encodes func_info, line_info and CO-RE relocations
which need loader manipulation before loading into the kernel.

The specification for .BTF.ext section is defined at ``tools/lib/bpf/btf.h``
and ``tools/lib/bpf/btf.c``.

The current header of .BTF.ext section::

    struct btf_ext_header {
        __u16   magic;
        __u8    version;
        __u8    flags;
        __u32   hdr_len;

        /* All offsets are in bytes relative to the end of this header */
        __u32   func_info_off;
        __u32   func_info_len;
        __u32   line_info_off;
        __u32   line_info_len;

        /* optional part of .BTF.ext header */
        __u32   core_relo_off;
        __u32   core_relo_len;
    };

It is very similar to .BTF section. Instead of type/string section, it
contains func_info, line_info and core_relo sub-sections.
See :ref:`BPF_Prog_Load` for details about func_info and line_info
record format.

The func_info is organized as below::

     func_info_rec_size              /* __u32 value */
     btf_ext_info_sec for section #1 /* func_info for section #1 */
     btf_ext_info_sec for section #2 /* func_info for section #2 */
     ...

``func_info_rec_size`` specifies the size of ``bpf_func_info`` structure when
.BTF.ext is generated. ``btf_ext_info_sec``, defined below, is a collection of
func_info for each specific ELF section::

     struct btf_ext_info_sec {
        __u32   sec_name_off; /* offset to section name */
        __u32   num_info;
        /* Followed by num_info * record_size number of bytes */
        __u8    data[0];
     };

Here, num_info must be greater than 0.

The line_info is organized as below::

     line_info_rec_size              /* __u32 value */
     btf_ext_info_sec for section #1 /* line_info for section #1 */
     btf_ext_info_sec for section #2 /* line_info for section #2 */
     ...

``line_info_rec_size`` specifies the size of ``bpf_line_info`` structure when
.BTF.ext is generated.

The interpretation of ``bpf_func_info->insn_off`` and
``bpf_line_info->insn_off`` is different between kernel API and ELF API.
For kernel API, the ``insn_off`` is the instruction offset in the unit of
``struct bpf_insn``. For ELF API, the ``insn_off`` is the byte offset from the
beginning of section (``btf_ext_info_sec->sec_name_off``).

The core_relo is organized as below::

     core_relo_rec_size              /* __u32 value */
     btf_ext_info_sec for section #1 /* core_relo for section #1 */
     btf_ext_info_sec for section #2 /* core_relo for section #2 */

``core_relo_rec_size`` specifies the size of ``bpf_core_relo``
structure when .BTF.ext is generated. All ``bpf_core_relo`` structures
within a single ``btf_ext_info_sec`` describe relocations applied to
section named by ``btf_ext_info_sec->sec_name_off``.

See :ref:`Documentation/bpf/llvm_reloc.rst <btf-co-re-relocations>`
for more information on CO-RE relocations.

4.3 .BTF_ids section
--------------------

The .BTF_ids section encodes BTF ID values that are used within the kernel.

This section is created during the kernel compilation with the help of
macros defined in ``include/linux/btf_ids.h`` header file. Kernel code can
use them to create lists and sets (sorted lists) of BTF ID values.

The ``BTF_ID_LIST`` and ``BTF_ID`` macros define unsorted list of BTF ID values,
with following syntax::

  BTF_ID_LIST(list)
  BTF_ID(type1, name1)
  BTF_ID(type2, name2)

resulting in following layout in .BTF_ids section::

  __BTF_ID__type1__name1__1:
  .zero 4
  __BTF_ID__type2__name2__2:
  .zero 4

The ``u32 list[];`` variable is defined to access the list.

The ``BTF_ID_UNUSED`` macro defines 4 zero bytes.
It's used when we
want to define an unused entry in BTF_ID_LIST, like::

      BTF_ID_LIST(bpf_skb_output_btf_ids)
      BTF_ID(struct, sk_buff)
      BTF_ID_UNUSED
      BTF_ID(struct, task_struct)

The ``BTF_SET_START/END`` macro pair defines a sorted list of BTF ID values
and their count, with following syntax::

  BTF_SET_START(set)
  BTF_ID(type1, name1)
  BTF_ID(type2, name2)
  BTF_SET_END(set)

resulting in following layout in .BTF_ids section::

  __BTF_ID__set__set:
  .zero 4
  __BTF_ID__type1__name1__3:
  .zero 4
  __BTF_ID__type2__name2__4:
  .zero 4

The ``struct btf_id_set set;`` variable is defined to access the list.

The ``typeX`` name can be one of following::

   struct, union, typedef, func

and is used as a filter when resolving the BTF ID value.

All the BTF ID lists and sets are compiled in the .BTF_ids section and
resolved during the linking phase of kernel build by ``resolve_btfids`` tool.

4.4 .BTF.base section
---------------------

Split BTF - where the .BTF section only contains types not in the associated
base .BTF section - is an extremely efficient way to encode type information
for kernel modules, since they generally consist of a few module-specific
types along with a large set of shared kernel types. The former are encoded
in split BTF, while the latter are encoded in base BTF, resulting in more
compact representations. A type in split BTF that refers to a type in
base BTF refers to it using its base BTF ID, and split BTF IDs start
at last_base_BTF_ID + 1.

The downside of this approach however is that this makes the split BTF
somewhat brittle - when the base BTF changes, base BTF ID references are
no longer valid and the split BTF itself becomes useless.
The role of the
.BTF.base section is to make split BTF more resilient for cases where
the base BTF may change, as happens, for example, with kernel modules that
are not rebuilt every time the kernel is. .BTF.base contains named base types:
INTs, FLOATs, STRUCTs, UNIONs, ENUM[64]s and FWDs. INTs and FLOATs are fully
described in .BTF.base sections, while composite types like structs
and unions are not fully defined - the .BTF.base type simply serves as
a description of the type the split BTF referred to, so structs/unions
have 0 members in the .BTF.base section. ENUM[64]s are similarly recorded
with 0 members. Any other types are added to the split BTF. This
distillation process leaves us with a .BTF.base section containing
such minimal descriptions of base types and a split .BTF section which refers
to those base types. Later, we can relocate the split BTF using both the
information stored in the .BTF.base section and the new .BTF base; the type
information in the .BTF.base section allows us to update the split BTF
references to point at the corresponding new base BTF IDs.

BTF relocation happens on kernel module load when a kernel module has a
.BTF.base section, and libbpf also provides a btf__relocate() API to
accomplish this.

As an example, consider the following base BTF::

    [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
    [2] STRUCT 'foo' size=8 vlen=2
            'f1' type_id=1 bits_offset=0
            'f2' type_id=1 bits_offset=32

...and associated split BTF::

    [3] PTR '(anon)' type_id=2

i.e.
split BTF describes a pointer to struct foo { int f1; int f2 };

.BTF.base will consist of::

    [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
    [2] STRUCT 'foo' size=8 vlen=0

If we relocate the split BTF later using the following new base BTF::

    [1] INT 'long unsigned int' size=8 bits_offset=0 nr_bits=64 encoding=(none)
    [2] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
    [3] STRUCT 'foo' size=8 vlen=2
            'f1' type_id=2 bits_offset=0
            'f2' type_id=2 bits_offset=32

...we can use our .BTF.base description to know that the split BTF reference
is to struct foo, and relocation results in new split BTF::

    [4] PTR '(anon)' type_id=3

Note that both the referenced base BTF ID (2 became 3) and the start BTF ID of
the split BTF (3 became 4) had to be updated.

So we see how .BTF.base plays the role of facilitating later relocation,
leading to more resilient split BTF.

.BTF.base sections will be generated automatically for out-of-tree kernel module
builds - i.e. where KBUILD_EXTMOD is set (as it would be for "make M=path/2/mod"
cases). .BTF.base generation requires pahole support for the "distilled_base"
BTF feature; this is available in pahole v1.28 and later.

5. Using BTF
============

5.1 bpftool map pretty print
----------------------------

With BTF, the map key/value can be printed based on fields rather than simply
as raw bytes. This is especially valuable for large structures or if your data
structure has bitfields.
For example, given the following map::

    enum A { A1, A2, A3, A4, A5 };
    typedef enum A ___A;
    struct tmp_t {
         char a1:4;
         int  a2:4;
         int  :4;
         __u32 a3:4;
         int b;
         ___A b1:4;
         enum A b2:4;
    };
    struct {
         __uint(type, BPF_MAP_TYPE_ARRAY);
         __type(key, int);
         __type(value, struct tmp_t);
         __uint(max_entries, 1);
    } tmpmap SEC(".maps");

bpftool is able to pretty-print it as follows::

    [{
          "key": 0,
          "value": {
              "a1": 0x2,
              "a2": 0x4,
              "a3": 0x6,
              "b": 7,
              "b1": 0x8,
              "b2": 0xa
          }
      }
    ]

5.2 bpftool prog dump
---------------------

The following is an example showing how func_info and line_info can help prog
dump with better kernel symbol names, function prototypes and line
information::

    $ bpftool prog dump jited pinned /sys/fs/bpf/test_btf_haskv
    [...]
    int test_long_fname_2(struct dummy_tracepoint_args * arg):
    bpf_prog_44a040bf25481309_test_long_fname_2:
    ; static int test_long_fname_2(struct dummy_tracepoint_args *arg)
       0:   push   %rbp
       1:   mov    %rsp,%rbp
       4:   sub    $0x30,%rsp
       b:   sub    $0x28,%rbp
       f:   mov    %rbx,0x0(%rbp)
      13:   mov    %r13,0x8(%rbp)
      17:   mov    %r14,0x10(%rbp)
      1b:   mov    %r15,0x18(%rbp)
      1f:   xor    %eax,%eax
      21:   mov    %rax,0x20(%rbp)
      25:   xor    %esi,%esi
    ; int key = 0;
      27:   mov    %esi,-0x4(%rbp)
    ; if (!arg->sock)
      2a:   mov    0x8(%rdi),%rdi
    ; if (!arg->sock)
      2e:   cmp    $0x0,%rdi
      32:   je     0x0000000000000070
      34:   mov    %rbp,%rsi
    ; counts = bpf_map_lookup_elem(&btf_map, &key);
    [...]

5.3 Verifier Log
----------------

The following example shows how line_info can help debug a verification
failure::

       /* The code at tools/testing/selftests/bpf/test_xdp_noinline.c
        * is modified as below.
        */
       data = (void *)(long)xdp->data;
       data_end = (void *)(long)xdp->data_end;
       /*
       if (data + 4 > data_end)
               return XDP_DROP;
       */
       *(u32 *)data = dst->dst;

    $ bpftool prog load ./test_xdp_noinline.o /sys/fs/bpf/test_xdp_noinline type xdp
        ; data = (void *)(long)xdp->data;
        224: (79) r2 = *(u64 *)(r10 -112)
        225: (61) r2 = *(u32 *)(r2 +0)
        ; *(u32 *)data = dst->dst;
        226: (63) *(u32 *)(r2 +0) = r1
        invalid access to packet, off=0 size=4, R2(id=0,off=0,r=0)
        R2 offset is outside of the packet

6. BTF Generation
=================

You need the latest pahole

  https://git.kernel.org/pub/scm/devel/pahole/pahole.git/

or llvm (8.0 or later). pahole acts as a dwarf2btf converter. It doesn't
support .BTF.ext and the BTF_KIND_FUNC type yet. For example::

      -bash-4.4$ cat t.c
      struct t {
        int a:2;
        int b:3;
        int c:2;
      } g;
      -bash-4.4$ gcc -c -O2 -g t.c
      -bash-4.4$ pahole -JV t.o
      File t.o:
      [1] STRUCT t kind_flag=1 size=4 vlen=3
              a type_id=2 bitfield_size=2 bits_offset=0
              b type_id=2 bitfield_size=3 bits_offset=2
              c type_id=2 bitfield_size=2 bits_offset=5
      [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED

llvm is able to generate .BTF and .BTF.ext directly with -g, for the bpf target
only. The assembly code (-S) shows the BTF encoding in assembly format::

    -bash-4.4$ cat t2.c
    typedef int __int32;
    struct t2 {
      int a2;
      int (*f2)(char q1, __int32 q2, ...);
      int (*f3)();
    } g2;
    int main() { return 0; }
    int test() { return 0; }
    -bash-4.4$ clang -c -g -O2 --target=bpf t2.c
    -bash-4.4$ readelf -S t2.o
      ......
      [ 8] .BTF              PROGBITS         0000000000000000  00000247
           000000000000016e  0000000000000000           0     0     1
      [ 9] .BTF.ext          PROGBITS         0000000000000000  000003b5
           0000000000000060  0000000000000000           0     0     1
      [10] .rel.BTF.ext      REL              0000000000000000  000007e0
           0000000000000040  0000000000000010          16     9     8
      ......
    -bash-4.4$ clang -S -g -O2 --target=bpf t2.c
    -bash-4.4$ cat t2.s
      ......
            .section        .BTF,"",@progbits
            .short  60319                   # 0xeb9f
            .byte   1
            .byte   0
            .long   24
            .long   0
            .long   220
            .long   220
            .long   122
            .long   0                       # BTF_KIND_FUNC_PROTO(id = 1)
            .long   218103808               # 0xd000000
            .long   2
            .long   83                      # BTF_KIND_INT(id = 2)
            .long   16777216                # 0x1000000
            .long   4
            .long   16777248                # 0x1000020
      ......
            .byte   0                       # string offset=0
            .ascii  ".text"                 # string offset=1
            .byte   0
            .ascii  "/home/yhs/tmp-pahole/t2.c" # string offset=7
            .byte   0
            .ascii  "int main() { return 0; }" # string offset=33
            .byte   0
            .ascii  "int test() { return 0; }" # string offset=58
            .byte   0
            .ascii  "int"                   # string offset=83
      ......
            .section        .BTF.ext,"",@progbits
            .short  60319                   # 0xeb9f
            .byte   1
            .byte   0
            .long   24
            .long   0
            .long   28
            .long   28
            .long   44
            .long   8                       # FuncInfo
            .long   1                       # FuncInfo section string offset=1
            .long   2
            .long   .Lfunc_begin0
            .long   3
            .long   .Lfunc_begin1
            .long   5
            .long   16                      # LineInfo
            .long   1                       # LineInfo section string offset=1
            .long   2
            .long   .Ltmp0
            .long   7
            .long   33
            .long   7182                    # Line 7 Col 14
            .long   .Ltmp3
            .long   7
            .long   58
            .long   8206                    # Line 8 Col 14

7. Testing
==========

The kernel BPF selftest `tools/testing/selftests/bpf/prog_tests/btf.c`_
provides an extensive set of BTF-related tests.

.. Links
.. _tools/testing/selftests/bpf/prog_tests/btf.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/tools/testing/selftests/bpf/prog_tests/btf.c
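As a small self-check outside the selftests, the raw words in the section 6
assembly dump can be decoded by hand. The sketch below is illustrative only:
the helper names ``line_num()``, ``line_column()``, ``info_kind()`` and
``info_vlen()`` are hypothetical, and their bodies are assumed to mirror the
arithmetic of the ``BPF_LINE_INFO_LINE_NUM()`` / ``BPF_LINE_INFO_LINE_COL()``
and ``BTF_INFO_KIND()`` / ``BTF_INFO_VLEN()`` macros from the uapi headers.

```c
#include <stdint.h>

/* line_col packs the line number in the upper 22 bits ... */
static uint32_t line_num(uint32_t line_col)
{
	return line_col >> 10;
}

/* ... and the column number in the lower 10 bits. */
static uint32_t line_column(uint32_t line_col)
{
	return line_col & 0x3ff;
}

/* btf_type.info: bits 24-28 hold the kind. */
static uint32_t info_kind(uint32_t info)
{
	return (info >> 24) & 0x1f;
}

/* btf_type.info: bits 0-15 hold vlen. */
static uint32_t info_vlen(uint32_t info)
{
	return info & 0xffff;
}
```

Applied to the dump, ``7182`` decodes to line 7, column 14 and ``8206`` to
line 8, column 14, matching the assembler comments. Likewise ``218103808``
(``0xd000000``) decodes to kind 13 (``BTF_KIND_FUNC_PROTO``) with ``vlen`` 0,
and ``16777216`` (``0x1000000``) to kind 1 (``BTF_KIND_INT``).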