ctf.5 (ef36b3f75658d201edb495068db5e1be49593de5) | ctf.5 (7421ff0751fadff2b6f5154f43067b869509603f) |
---|---|
1.\" 2.\" This file and its contents are supplied under the terms of the 3.\" Common Development and Distribution License ("CDDL"), version 1.0. 4.\" You may only use this file in accordance with the terms of version 5.\" 1.0 of the CDDL. 6.\" 7.\" A full copy of the text of the CDDL should have accompanied this 8.\" source. A copy of the CDDL is also available via the Internet at --- 38 unchanged lines hidden (view full) --- 47.Lp 48Because a 49.Nm 50file is often embedded inside a file, rather than being a standalone 51file itself, it may also be referred to as a 52.Nm 53.Sy container . 54.Lp | 1.\" 2.\" This file and its contents are supplied under the terms of the 3.\" Common Development and Distribution License ("CDDL"), version 1.0. 4.\" You may only use this file in accordance with the terms of version 5.\" 1.0 of the CDDL. 6.\" 7.\" A full copy of the text of the CDDL should have accompanied this 8.\" source. A copy of the CDDL is also available via the Internet at --- 38 unchanged lines hidden (view full) --- 47.Lp 48Because a 49.Nm 50file is often embedded inside a file, rather than being a standalone 51file itself, it may also be referred to as a 52.Nm 53.Sy container . 54.Lp |
55On illumos systems, | 55On 56.Fx 57systems, |
56.Nm | 58.Nm |
57data is consumed by multiple programs. 58It can be used by the modular debugger, 59.Xr mdb 1 , 60as well as by | 59data is consumed by |
61.Xr dtrace 1 . 62Programmatic access to 63.Nm | 60.Xr dtrace 1 . 61Programmatic access to 62.Nm |
64data can be obtained through 65.Xr libctf 3LIB . | 63data can be obtained through libctf. |
66.Lp 67The 68.Nm 69file format is broken down into seven different sections. 70The first section is the 71.Sy preamble 72and 73.Sy header , 74which describes the version of the 75.Nm | 64.Lp 65The 66.Nm 67file format is broken down into seven different sections. 68The first section is the 69.Sy preamble 70and 71.Sy header , 72which describes the version of the 73.Nm |
76file, links it has to other | 74file, the links it has to other |
77.Nm 78files, and the sizes of the other sections. 79The next section is the 80.Sy label 81section, 82which provides a way of identifying similar groups of 83.Nm 84data across multiple files. 85This is followed by the 86.Sy object | 75.Nm 76files, and the sizes of the other sections. 77The next section is the 78.Sy label 79section, 80which provides a way of identifying similar groups of 81.Nm 82data across multiple files. 83This is followed by the 84.Sy object |
87information section, which describes the type of global | 85information section, which describes the types of global |
88symbols. 89The subsequent section is the 90.Sy function 91information section, which describes the return 92types and arguments of functions. 93The next section is the 94.Sy type 95information section, which describes --- 36 unchanged lines hidden (view full) --- 132This means that a module only has types that are unique to itself and the most 133common types in the kernel are not duplicated. 134.Sh FILE FORMAT 135This documents version 136.Em two 137of the 138.Nm 139file format. | 86symbols. 87The subsequent section is the 88.Sy function 89information section, which describes the return 90types and arguments of functions. 91The next section is the 92.Sy type 93information section, which describes --- 36 unchanged lines hidden (view full) --- 130This means that a module only has types that are unique to itself and the most 131common types in the kernel are not duplicated. 132.Sh FILE FORMAT 133This documents version 134.Em two 135of the 136.Nm 137file format. |
140All applications and tools currently produce and operate on this version. | 138All applications and tools on 139.Fx 140currently produce and operate on this version. |
141.Lp 142The file format can be summarized with the following image, the 143following sections will cover this in more detail. 144.Bd -literal 145 146 +-------------+ 0t0 147+--------| Preamble | 148| +-------------+ 0t4 --- 146 unchanged lines hidden (view full) --- 295always refer to the 296.Sy uncompressed 297data. 298.Lp 299In version two of the 300.Nm 301file format, the 302.Sy header | 141.Lp 142The file format can be summarized with the following image, the 143following sections will cover this in more detail. 144.Bd -literal 145 146 +-------------+ 0t0 147+--------| Preamble | 148| +-------------+ 0t4 --- 146 unchanged lines hidden (view full) --- 295always refer to the 296.Sy uncompressed 297data. 298.Lp 299In version two of the 300.Nm 301file format, the 302.Sy header |
303denotes whether whether or not this | 303denotes whether or not this |
304.Nm 305file is the child of another 306.Nm 307file and also indicates the size of the remaining sections. 308The structure for the | 304.Nm 305file is the child of another 306.Nm 307file and also indicates the size of the remaining sections. 308The structure for the |
309.Sy header , | 309.Sy header |
310logically contains a copy of the 311.Sy preamble 312and the two have a combined size of 36 bytes. 313.Bd -literal 314typedef struct ctf_header { 315 ctf_preamble_t cth_preamble; 316 uint_t cth_parlabel; /* ref to name of parent lbl uniq'd against */ 317 uint_t cth_parname; /* ref to basename of parent */ --- 211 unchanged lines hidden (view full) --- 529For example, when building illumos, there are many kernel modules that are built 530against a single collection of source code. 531A label is encoded into the 532.Nm 533files that corresponds with the particular build. 534This ensures that if files on the system were to become mixed up from multiple 535releases, that they are not used together by tools, particularly when a child 536needs to refer to a type in the parent. | 310logically contains a copy of the 311.Sy preamble 312and the two have a combined size of 36 bytes. 313.Bd -literal 314typedef struct ctf_header { 315 ctf_preamble_t cth_preamble; 316 uint_t cth_parlabel; /* ref to name of parent lbl uniq'd against */ 317 uint_t cth_parname; /* ref to basename of parent */ --- 211 unchanged lines hidden (view full) --- 529For example, when building illumos, there are many kernel modules that are built 530against a single collection of source code. 531A label is encoded into the 532.Nm 533files that corresponds with the particular build. 534This ensures that if files on the system were to become mixed up from multiple 535releases, that they are not used together by tools, particularly when a child 536needs to refer to a type in the parent. |
537Because they are linked used the type identifiers, if the wrong parent is used | 537Because they are linked using the type identifiers, if the wrong parent is used |
538then the wrong type will be encountered. 539.Lp 540Each label is encoded in the file format using the following eight byte 541structure: 542.Bd -literal 543typedef struct ctf_lblent { 544 uint_t ctl_label; /* ref to name of label */ 545 uint_t ctl_typeidx; /* last type associated with this label */ --- 10 unchanged lines hidden (view full) --- 556.Lp 557The type identifier encoded in the member 558.Em ctl_typeidx 559refers to the last type identifier that a label refers to in the current 560file. 561Labels only refer to types in the current file, if the 562.Nm 563file is a child, then it will have the same label as its parent; | 538then the wrong type will be encountered. 539.Lp 540Each label is encoded in the file format using the following eight byte 541structure: 542.Bd -literal 543typedef struct ctf_lblent { 544 uint_t ctl_label; /* ref to name of label */ 545 uint_t ctl_typeidx; /* last type associated with this label */ --- 10 unchanged lines hidden (view full) --- 556.Lp 557The type identifier encoded in the member 558.Em ctl_typeidx 559refers to the last type identifier that a label refers to in the current 560file. 561Labels only refer to types in the current file, if the 562.Nm 563file is a child, then it will have the same label as its parent; |
564however, its label will only refer to its types, not its parents. | 564however, its label will only refer to its types, not its parent's. |
565.Lp 566It is also possible, though rather uncommon, for a 567.Nm 568file to have multiple labels. 569Labels are placed one after another, every eight bytes. 570When multiple labels are present, types may only belong to a single label. 571.Ss The Object Section 572The object section provides a mapping from ELF symbols of type --- 7 unchanged lines hidden (view full) --- 580is stored for that entry. 581.Lp 582To walk the object section, you need to have a corresponding 583.Sy symbol table 584in the ELF object that contains the 585.Nm 586data. 587Not every object is included in this section. | 565.Lp 566It is also possible, though rather uncommon, for a 567.Nm 568file to have multiple labels. 569Labels are placed one after another, every eight bytes. 570When multiple labels are present, types may only belong to a single label. 571.Ss The Object Section 572The object section provides a mapping from ELF symbols of type --- 7 unchanged lines hidden (view full) --- 580is stored for that entry. 581.Lp 582To walk the object section, you need to have a corresponding 583.Sy symbol table 584in the ELF object that contains the 585.Nm 586data. 587Not every object is included in this section. |
588Specifically, when walking the symbol table. 589An entry is skipped if it matches any of the following conditions: | 588Specifically, when walking the symbol table, an entry is skipped if it matches 589any of the following conditions: |
590.Lp 591.Bl -bullet -offset indent -compact 592.It 593The symbol type is not 594.Sy STT_OBJECT 595.It 596The symbol's section index is 597.Sy SHN_UNDEF --- 54 unchanged lines hidden (view full) --- 652 653 return (0); 654} 655.Ed 656.Ss The Function Section 657The function section of the 658.Nm 659file encodes the types of both the function's arguments and the function's | 590.Lp 591.Bl -bullet -offset indent -compact 592.It 593The symbol type is not 594.Sy STT_OBJECT 595.It 596The symbol's section index is 597.Sy SHN_UNDEF --- 54 unchanged lines hidden (view full) --- 652 653 return (0); 654} 655.Ed 656.Ss The Function Section 657The function section of the 658.Nm 659file encodes the types of both the function's arguments and the function's |
660return type. | 660return value. |
661Similar to 662.Sx The Object Section , 663the function section encodes information for all symbols of type 664.Sy STT_FUNCTION , 665excepting those that fit specific criteria. 666Unlike with objects, because functions have a variable number of arguments, they 667start with a type encoding as defined in 668.Sx Type Encoding , --- 116 unchanged lines hidden (view full) --- 785child. 786The member 787.Em ctt_name 788is encoded as described in the section 789.Sx String Identifiers . 790The string that it points to is the name of the type. 791If the identifier points to an empty string (one that consists solely of a null 792terminator) then the type does not have a name, this is common with anonymous | 661Similar to 662.Sx The Object Section , 663the function section encodes information for all symbols of type 664.Sy STT_FUNCTION , 665excepting those that fit specific criteria. 666Unlike with objects, because functions have a variable number of arguments, they 667start with a type encoding as defined in 668.Sx Type Encoding , --- 116 unchanged lines hidden (view full) --- 785child. 786The member 787.Em ctt_name 788is encoded as described in the section 789.Sx String Identifiers . 790The string that it points to is the name of the type. 791If the identifier points to an empty string (one that consists solely of a null 792terminator) then the type does not have a name, this is common with anonymous |
793structures and unions that only have a typedef to name them, as well as, | 793structures and unions that only have a typedef to name them, as well as |
794pointers and qualifiers. 795.Lp 796The next member, the 797.Em ctt_info , 798is encoded as described in the section 799.Sx Type Encoding . | 794pointers and qualifiers. 795.Lp 796The next member, the 797.Em ctt_info , 798is encoded as described in the section 799.Sx Type Encoding . |
800The types kind tells us how to interpret the remaining data in the | 800The type's kind tells us how to interpret the remaining data in the |
801.Sy ctf_type_t 802and any variable length data that may exist. 803The rest of this section will be broken down into the interpretation of the 804various kinds. 805.Ss Encoding of Integers 806Integers, which are of type 807.Sy CTF_K_INTEGER , 808have no variable length arguments. --- 90 unchanged lines hidden (view full) --- 899#define CTF_FP_ENCODING(data) (((data) & 0xff000000) >> 24) 900#define CTF_FP_OFFSET(data) (((data) & 0x00ff0000) >> 16) 901#define CTF_FP_BITS(data) (((data) & 0x0000ffff)) 902 903#define CTF_FP_DATA(encoding, offset, bits) \\ 904 (((encoding) << 24) | ((offset) << 16) | (bits)) 905.Ed 906.Lp | 801.Sy ctf_type_t 802and any variable length data that may exist. 803The rest of this section will be broken down into the interpretation of the 804various kinds. 805.Ss Encoding of Integers 806Integers, which are of type 807.Sy CTF_K_INTEGER , 808have no variable length arguments. --- 90 unchanged lines hidden (view full) --- 899#define CTF_FP_ENCODING(data) (((data) & 0xff000000) >> 24) 900#define CTF_FP_OFFSET(data) (((data) & 0x00ff0000) >> 16) 901#define CTF_FP_BITS(data) (((data) & 0x0000ffff)) 902 903#define CTF_FP_DATA(encoding, offset, bits) \\ 904 (((encoding) << 24) | ((offset) << 16) | (bits)) 905.Ed 906.Lp |
907Where as the encoding for integers was a series of flags, the encoding for | 907Where as the encoding for integers is a series of flags, the encoding for |
908floats maps to a specific kind of float. 909It is not a flag-based value. 910The kinds of floats correspond to both their size, and the encoding. 911This covers all of the basic C intrinsic floating point types. 912The following are the different kinds of floats represented in the encoding: 913.Bd -literal -offset indent 914#define CTF_FP_SINGLE 1 /* IEEE 32-bit float encoding */ 915#define CTF_FP_DOUBLE 2 /* IEEE 64-bit float encoding */ --- 57 unchanged lines hidden (view full) --- 973Each one is represented by a 974.Sy uint16_t 975and encoded according to the 976.Sx Type Identifiers 977section. 978If the function's last argument is of type varargs, then it is also written out, 979but the type identifier is zero. 980This is included in the count of the function's arguments. | 908floats maps to a specific kind of float. 909It is not a flag-based value. 910The kinds of floats correspond to both their size, and the encoding. 911This covers all of the basic C intrinsic floating point types. 912The following are the different kinds of floats represented in the encoding: 913.Bd -literal -offset indent 914#define CTF_FP_SINGLE 1 /* IEEE 32-bit float encoding */ 915#define CTF_FP_DOUBLE 2 /* IEEE 64-bit float encoding */ --- 57 unchanged lines hidden (view full) --- 973Each one is represented by a 974.Sy uint16_t 975and encoded according to the 976.Sx Type Identifiers 977section. 978If the function's last argument is of type varargs, then it is also written out, 979but the type identifier is zero. 980This is included in the count of the function's arguments. |
 | 981An extra type identifier may follow the argument and return type identifiers 982in order to maintain four-byte alignment for the following type definition. 983Such a type identifier is not included in the argument count and has a value 984of zero. |
981.Ss Encoding of Structures and Unions 982Structures and Unions, which are encoded with 983.Sy CTF_K_STRUCT 984and 985.Sy CTF_K_UNION 986respectively, are very similar constructs in C. | 985.Ss Encoding of Structures and Unions 986Structures and Unions, which are encoded with 987.Sy CTF_K_STRUCT 988and 989.Sy CTF_K_UNION 990respectively, are very similar constructs in C. |
987The main difference between them is the fact that every member of a structure 988follows one another, where as in a union, all members share the same memory. | 991The main difference between them is the fact that members of a structure 992follow one another, where as in a union, all members share the same memory. |
989They are also very similar in terms of their encoding in 990.Nm . 991The variable length argument for structures and unions represents the number of 992members that they have. 993The value of the member 994.Em ctt_size 995is the size of the structure and union. 996There are two different structures which are used to encode members in the --- 30 unchanged lines hidden (view full) --- 1027.Sy ctm_type 1028and 1029.Sy ctlm_type 1030both refer to the type of the member. 1031They are encoded as per the section 1032.Sx Type Identifiers . 1033.Lp 1034The last piece of information that is present is the offset which describes the | 993They are also very similar in terms of their encoding in 994.Nm . 995The variable length argument for structures and unions represents the number of 996members that they have. 997The value of the member 998.Em ctt_size 999is the size of the structure and union. 1000There are two different structures which are used to encode members in the --- 30 unchanged lines hidden (view full) --- 1031.Sy ctm_type 1032and 1033.Sy ctlm_type 1034both refer to the type of the member. 1035They are encoded as per the section 1036.Sx Type Identifiers . 1037.Lp 1038The last piece of information that is present is the offset which describes the |
1035offset in memory that the member begins at. 1036For unions, this value will always be zero because the start of unions in memory 1037is always zero. | 1039offset in memory at which the member begins. 1040For unions, this value will always be zero because each member of a union has 1041an offset of zero. |
1038For structures, this is the offset in 1039.Sy bits | 1042For structures, this is the offset in 1043.Sy bits |
1040that the member begins at. | 1044at which the member begins. |
1041Note that a compiler may lay out a type with padding. 1042This means that the difference in offset between two consecutive members may be 1043larger than the size of the member. 1044When the size of the overall structure is strictly less than 8192 bytes, the 1045normal structure, 1046.Sy ctf_member_t , 1047is used and the offset in bits is stored in the member 1048.Em ctm_offset . --- 15 unchanged lines hidden (view full) --- 1064are similar to structures. 1065Enumerations use the variable list to note the number of values that the 1066enumeration contains, which we'll term enumerators. 1067In C, an enumeration is always equivalent to the intrinsic type 1068.Sy int , 1069thus the value of the member 1070.Em ctt_size 1071is always the size of an integer which is determined based on the current model. | 1045Note that a compiler may lay out a type with padding. 1046This means that the difference in offset between two consecutive members may be 1047larger than the size of the member. 1048When the size of the overall structure is strictly less than 8192 bytes, the 1049normal structure, 1050.Sy ctf_member_t , 1051is used and the offset in bits is stored in the member 1052.Em ctm_offset . --- 15 unchanged lines hidden (view full) --- 1068are similar to structures. 1069Enumerations use the variable list to note the number of values that the 1070enumeration contains, which we'll term enumerators. 1071In C, an enumeration is always equivalent to the intrinsic type 1072.Sy int , 1073thus the value of the member 1074.Em ctt_size 1075is always the size of an integer which is determined based on the current model. |
1072For illumos systems, this will always be 4, as an integer is always defined to | 1076For 1077.Fx 1078systems, this will always be 4, as an integer is always defined to |
1073be 4 bytes large in both 1074.Sy ILP32 1075and 1076.Sy LP64 , 1077regardless of the architecture. | 1079be 4 bytes large in both 1080.Sy ILP32 1081and 1082.Sy LP64 , 1083regardless of the architecture. |
 | 1084For further details, see 1085.Xr arch 7 . |
1078.Lp 1079The enumerators encoded in an enumeration have the following structure in the 1080variable list: 1081.Bd -literal 1082typedef struct ctf_enum { 1083 uint_t cte_name; /* reference to name in string table */ 1084 int cte_value; /* value associated with this name */ 1085} ctf_enum_t; --- 63 unchanged lines hidden (view full) --- 1149.Nm 1150file is the 1151.Sy string 1152section. 1153This section encodes all of the strings that appear throughout the other 1154sections. 1155It is laid out as a series of characters followed by a null terminator. 1156Generally, all names are written out in ASCII, as most C compilers do not allow | 1086.Lp 1087The enumerators encoded in an enumeration have the following structure in the 1088variable list: 1089.Bd -literal 1090typedef struct ctf_enum { 1091 uint_t cte_name; /* reference to name in string table */ 1092 int cte_value; /* value associated with this name */ 1093} ctf_enum_t; --- 63 unchanged lines hidden (view full) --- 1157.Nm 1158file is the 1159.Sy string 1160section. 1161This section encodes all of the strings that appear throughout the other 1162sections. 1163It is laid out as a series of characters followed by a null terminator. 1164Generally, all names are written out in ASCII, as most C compilers do not allow |
1157and characters to appear in identifiers outside of a subset of ASCII. | 1165any characters to appear in identifiers outside of a subset of ASCII. |
1158However, any extended characters sets should be written out as a series of UTF-8 1159bytes. 1160.Lp 1161The first entry in the section, at offset zero, is a single null 1162terminator to reference the empty string. 1163Following that, each C string should be written out, including the null 1164terminator. 1165Offsets that refer to something in this section should refer to the first byte 1166which begins a string. 1167Beyond the first byte in the section being the null terminator, the order of 1168strings is unimportant. | 1166However, any extended characters sets should be written out as a series of UTF-8 1167bytes. 1168.Lp 1169The first entry in the section, at offset zero, is a single null 1170terminator to reference the empty string. 1171Following that, each C string should be written out, including the null 1172terminator. 1173Offsets that refer to something in this section should refer to the first byte 1174which begins a string. 1175Beyond the first byte in the section being the null terminator, the order of 1176strings is unimportant. |
1169.Sh Data Encoding and ELF Considerations | 1177.Ss Data Encoding and ELF Considerations |
1170.Nm 1171data is generally included in ELF objects which specify information to 1172identify the architecture and endianness of the file. 1173A 1174.Nm 1175container inside such an object must match the endianness of the ELF object. 1176Aside from the question of the endian encoding of data, there should be no other 1177differences between architectures. --- 29 unchanged lines hidden (view full) --- 1207.Sy SHT_PROGBITS . 1208The section should have a link set to the symbol table and its address 1209alignment must be 4. 1210.Sh SEE ALSO 1211.Xr dtrace 1 , 1212.Xr elf 3 , 1213.Xr gelf 3 , 1214.Xr a.out 5 , | 1178.Nm 1179data is generally included in ELF objects which specify information to 1180identify the architecture and endianness of the file. 1181A 1182.Nm 1183container inside such an object must match the endianness of the ELF object. 1184Aside from the question of the endian encoding of data, there should be no other 1185differences between architectures. --- 29 unchanged lines hidden (view full) --- 1215.Sy SHT_PROGBITS . 1216The section should have a link set to the symbol table and its address 1217alignment must be 4. 1218.Sh SEE ALSO 1219.Xr dtrace 1 , 1220.Xr elf 3 , 1221.Xr gelf 3 , 1222.Xr a.out 5 , |
1215.Xr elf 5 | 1223.Xr elf 5 , 1224.Xr arch 7 |
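
The floating-point hunk above quotes the `CTF_FP_ENCODING`, `CTF_FP_OFFSET`, `CTF_FP_BITS`, and `CTF_FP_DATA` macros, which are unchanged in both revisions. As a minimal sketch of how those macros pack and unpack the 32-bit encoding word, the following C fragment round-trips a value through them. The macros and the `CTF_FP_DOUBLE` constant are copied from the manual text; the `fp_info_t` type and the `fp_encode()` and `fp_decode()` helpers are hypothetical names introduced only for illustration and are not part of any CTF header or of libctf.

```c
#include <assert.h>
#include <stdint.h>

/* Macros as quoted in the manual page (mdoc escaping removed). */
#define	CTF_FP_ENCODING(data)	(((data) & 0xff000000) >> 24)
#define	CTF_FP_OFFSET(data)	(((data) & 0x00ff0000) >> 16)
#define	CTF_FP_BITS(data)	(((data) & 0x0000ffff))

#define	CTF_FP_DATA(encoding, offset, bits) \
	(((encoding) << 24) | ((offset) << 16) | (bits))

/* Float kinds as quoted in the manual page. */
#define	CTF_FP_SINGLE	1	/* IEEE 32-bit float encoding */
#define	CTF_FP_DOUBLE	2	/* IEEE 64-bit float encoding */

/* Hypothetical container for the three decoded fields. */
typedef struct fp_info {
	uint32_t fi_encoding;	/* kind of float, not a set of flags */
	uint32_t fi_offset;	/* offset field, in the middle 8 bits */
	uint32_t fi_bits;	/* width field, in the low 16 bits */
} fp_info_t;

/* Hypothetical helper: pack the three fields into one 32-bit word. */
static uint32_t
fp_encode(uint32_t encoding, uint32_t offset, uint32_t bits)
{
	/* Each field must fit in the bits the macros reserve for it. */
	assert(encoding <= 0xff && offset <= 0xff && bits <= 0xffff);
	return (CTF_FP_DATA(encoding, offset, bits));
}

/* Hypothetical helper: split a 32-bit word back into its fields. */
static fp_info_t
fp_decode(uint32_t data)
{
	fp_info_t fi;

	fi.fi_encoding = CTF_FP_ENCODING(data);
	fi.fi_offset = CTF_FP_OFFSET(data);
	fi.fi_bits = CTF_FP_BITS(data);
	return (fi);
}
```

For instance, `fp_encode(CTF_FP_DOUBLE, 0, 64)` places the kind in the top byte and the width in the low 16 bits, and `fp_decode()` recovers the same three values. Because the encoding field names a single kind of float rather than a set of flags, a consumer would compare it for equality instead of testing individual bits.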