ctf.5 (ef36b3f75658d201edb495068db5e1be49593de5) ctf.5 (7421ff0751fadff2b6f5154f43067b869509603f)
1.\"
2.\" This file and its contents are supplied under the terms of the
3.\" Common Development and Distribution License ("CDDL"), version 1.0.
4.\" You may only use this file in accordance with the terms of version
5.\" 1.0 of the CDDL.
6.\"
7.\" A full copy of the text of the CDDL should have accompanied this
8.\" source. A copy of the CDDL is also available via the Internet at

--- 38 unchanged lines hidden (view full) ---

47.Lp
48Because a
49.Nm
50file is often embedded inside a file, rather than being a standalone
51file itself, it may also be referred to as a
52.Nm
53.Sy container .
54.Lp
55On illumos systems,
55On
56.Fx
57systems,
56.Nm
58.Nm
57data is consumed by multiple programs.
58It can be used by the modular debugger,
59.Xr mdb 1 ,
60as well as by
59data is consumed by
61.Xr dtrace 1 .
62Programmatic access to
63.Nm
60.Xr dtrace 1 .
61Programmatic access to
62.Nm
64data can be obtained through
65.Xr libctf 3LIB .
63data can be obtained through libctf.
66.Lp
67The
68.Nm
69file format is broken down into seven different sections.
70The first section is the
71.Sy preamble
72and
73.Sy header ,
74which describes the version of the
75.Nm
64.Lp
65The
66.Nm
67file format is broken down into seven different sections.
68The first section is the
69.Sy preamble
70and
71.Sy header ,
72which describes the version of the
73.Nm
76file, links it has to other
74file, the links it has to other
77.Nm
78files, and the sizes of the other sections.
79The next section is the
80.Sy label
81section,
82which provides a way of identifying similar groups of
83.Nm
84data across multiple files.
85This is followed by the
86.Sy object
75.Nm
76files, and the sizes of the other sections.
77The next section is the
78.Sy label
79section,
80which provides a way of identifying similar groups of
81.Nm
82data across multiple files.
83This is followed by the
84.Sy object
87information section, which describes the type of global
85information section, which describes the types of global
88symbols.
89The subsequent section is the
90.Sy function
91information section, which describes the return
92types and arguments of functions.
93The next section is the
94.Sy type
95information section, which describes

--- 36 unchanged lines hidden (view full) ---

132This means that a module only has types that are unique to itself and the most
133common types in the kernel are not duplicated.
134.Sh FILE FORMAT
135This documents version
136.Em two
137of the
138.Nm
139file format.
86symbols.
87The subsequent section is the
88.Sy function
89information section, which describes the return
90types and arguments of functions.
91The next section is the
92.Sy type
93information section, which describes

--- 36 unchanged lines hidden (view full) ---

130This means that a module only has types that are unique to itself and the most
131common types in the kernel are not duplicated.
132.Sh FILE FORMAT
133This documents version
134.Em two
135of the
136.Nm
137file format.
140All applications and tools currently produce and operate on this version.
138All applications and tools on
139.Fx
140currently produce and operate on this version.
141.Lp
142The file format can be summarized with the following image, the
143following sections will cover this in more detail.
144.Bd -literal
145
146 +-------------+ 0t0
147+--------| Preamble |
148| +-------------+ 0t4

--- 146 unchanged lines hidden (view full) ---

295always refer to the
296.Sy uncompressed
297data.
298.Lp
299In version two of the
300.Nm
301file format, the
302.Sy header
303denotes whether whether or not this
303denotes whether or not this
304.Nm
305file is the child of another
306.Nm
307file and also indicates the size of the remaining sections.
308The structure for the
309.Sy header ,
309.Sy header
310logically contains a copy of the
311.Sy preamble
312and the two have a combined size of 36 bytes.
313.Bd -literal
314typedef struct ctf_header {
315 ctf_preamble_t cth_preamble;
316 uint_t cth_parlabel; /* ref to name of parent lbl uniq'd against */
317 uint_t cth_parname; /* ref to basename of parent */

--- 211 unchanged lines hidden (view full) ---

529For example, when building illumos, there are many kernel modules that are built
530against a single collection of source code.
531A label is encoded into the
532.Nm
533files that corresponds with the particular build.
534This ensures that if files on the system were to become mixed up from multiple
535releases, that they are not used together by tools, particularly when a child
536needs to refer to a type in the parent.
537Because they are linked used the type identifiers, if the wrong parent is used
537Because they are linked using the type identifiers, if the wrong parent is used
538then the wrong type will be encountered.
539.Lp
540Each label is encoded in the file format using the following eight byte
541structure:
542.Bd -literal
543typedef struct ctf_lblent {
544 uint_t ctl_label; /* ref to name of label */
545 uint_t ctl_typeidx; /* last type associated with this label */

--- 10 unchanged lines hidden (view full) ---

556.Lp
557The type identifier encoded in the member
558.Em ctl_typeidx
559refers to the last type identifier that a label refers to in the current
560file.
561Labels only refer to types in the current file; if the
562.Nm
563file is a child, then it will have the same label as its parent;
564however, its label will only refer to its types, not its parents.
564however, its label will only refer to its types, not its parent's.
565.Lp
566It is also possible, though rather uncommon, for a
567.Nm
568file to have multiple labels.
569Labels are placed one after another, every eight bytes.
570When multiple labels are present, types may only belong to a single label.
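As a rough illustration of the label layout, the following C sketch treats the label section as an array of eight-byte entries.
The entry layout mirrors the structure shown earlier, with uint32_t standing in for uint_t.
The helper names and the assumption that labels are stored in ascending ctl_typeidx order are hypothetical and are not taken from libctf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the eight-byte label entry; uint32_t stands in for uint_t. */
typedef struct ctf_lblent {
    uint32_t ctl_label;   /* ref to name of label */
    uint32_t ctl_typeidx; /* last type associated with this label */
} ctf_lblent_t;

/* Number of labels held in a label section of the given size in bytes. */
static size_t
label_count(size_t section_size)
{
    return (section_size / sizeof (ctf_lblent_t));
}

/*
 * Hypothetical lookup: return the index of the first label whose
 * ctl_typeidx is at least `type', or -1 if no label covers it.
 * Assumption: labels are stored in ascending ctl_typeidx order.
 */
static int
label_for_type(const ctf_lblent_t *labels, size_t nlabels, uint32_t type)
{
    assert(labels != NULL || nlabels == 0);
    for (size_t i = 0; i < nlabels; i++) {
        if (type <= labels[i].ctl_typeidx)
            return ((int)i);
    }
    return (-1);
}
```

Under that ordering assumption, each type falls under exactly one label, which matches the rule that a type may only belong to a single label.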
571.Ss The Object Section
572The object section provides a mapping from ELF symbols of type

--- 7 unchanged lines hidden (view full) ---

580is stored for that entry.
581.Lp
582To walk the object section, you need to have a corresponding
583.Sy symbol table
584in the ELF object that contains the
585.Nm
586data.
587Not every object is included in this section.
588Specifically, when walking the symbol table.
589An entry is skipped if it matches any of the following conditions:
588Specifically, when walking the symbol table, an entry is skipped if it matches
589any of the following conditions:
590.Lp
591.Bl -bullet -offset indent -compact
592.It
593The symbol type is not
594.Sy STT_OBJECT
595.It
596The symbol's section index is
597.Sy SHN_UNDEF

--- 54 unchanged lines hidden (view full) ---

652
653 return (0);
654}
655.Ed
656.Ss The Function Section
657The function section of the
658.Nm
659file encodes the types of both the function's arguments and the function's
660return type.
660return value.
661Similar to
662.Sx The Object Section ,
663the function section encodes information for all symbols of type
664.Sy STT_FUNCTION ,
665excepting those that fit specific criteria.
666Unlike with objects, because functions have a variable number of arguments, they
667start with a type encoding as defined in
668.Sx Type Encoding ,

--- 116 unchanged lines hidden (view full) ---

785child.
786The member
787.Em ctt_name
788is encoded as described in the section
789.Sx String Identifiers .
790The string that it points to is the name of the type.
791If the identifier points to an empty string (one that consists solely of a null
792terminator) then the type does not have a name; this is common with anonymous
793structures and unions that only have a typedef to name them, as well as,
793structures and unions that only have a typedef to name them, as well as
794pointers and qualifiers.
795.Lp
796The next member, the
797.Em ctt_info ,
798is encoded as described in the section
799.Sx Type Encoding .
800The types kind tells us how to interpret the remaining data in the
800The type's kind tells us how to interpret the remaining data in the
801.Sy ctf_type_t
802and any variable length data that may exist.
803The rest of this section will be broken down into the interpretation of the
804various kinds.
805.Ss Encoding of Integers
806Integers, which are of type
807.Sy CTF_K_INTEGER ,
808have no variable length arguments.

--- 90 unchanged lines hidden (view full) ---

899#define CTF_FP_ENCODING(data) (((data) & 0xff000000) >> 24)
900#define CTF_FP_OFFSET(data) (((data) & 0x00ff0000) >> 16)
901#define CTF_FP_BITS(data) (((data) & 0x0000ffff))
902
903#define CTF_FP_DATA(encoding, offset, bits) \\
904 (((encoding) << 24) | ((offset) << 16) | (bits))
905.Ed
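The macros in this block can be exercised with a small round trip.
The sketch below packs an encoding, offset, and bit count into one 32-bit word and unpacks them again.
The fp_desc_t structure and the fp_pack and fp_unpack helpers are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define CTF_FP_ENCODING(data) (((data) & 0xff000000) >> 24)
#define CTF_FP_OFFSET(data)   (((data) & 0x00ff0000) >> 16)
#define CTF_FP_BITS(data)     (((data) & 0x0000ffff))

#define CTF_FP_DATA(encoding, offset, bits) \
    (((encoding) << 24) | ((offset) << 16) | (bits))

/* Hypothetical decoded view of the 32-bit floating point word. */
typedef struct fp_desc {
    uint32_t fd_encoding; /* kind of float, for example CTF_FP_DOUBLE */
    uint32_t fd_offset;   /* offset field of the word */
    uint32_t fd_bits;     /* size field of the word, in bits */
} fp_desc_t;

/* Pack the three fields; each must fit in its slice of the word. */
static uint32_t
fp_pack(uint32_t encoding, uint32_t offset, uint32_t bits)
{
    assert(encoding <= 0xff && offset <= 0xff && bits <= 0xffff);
    return (CTF_FP_DATA(encoding, offset, bits));
}

/* Split a packed word back into its three fields. */
static fp_desc_t
fp_unpack(uint32_t data)
{
    fp_desc_t fd;

    fd.fd_encoding = CTF_FP_ENCODING(data);
    fd.fd_offset = CTF_FP_OFFSET(data);
    fd.fd_bits = CTF_FP_BITS(data);
    return (fd);
}
```

For example, packing encoding 2 (CTF_FP_DOUBLE) with offset 0 and 64 bits yields the word 0x02000040.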
906.Lp
907Whereas the encoding for integers was a series of flags, the encoding for
907Whereas the encoding for integers is a series of flags, the encoding for
908floats maps to a specific kind of float.
909It is not a flag-based value.
910The kinds of floats correspond to both their size, and the encoding.
911This covers all of the basic C intrinsic floating point types.
912The following are the different kinds of floats represented in the encoding:
913.Bd -literal -offset indent
914#define CTF_FP_SINGLE 1 /* IEEE 32-bit float encoding */
915#define CTF_FP_DOUBLE 2 /* IEEE 64-bit float encoding */

--- 57 unchanged lines hidden (view full) ---

973Each one is represented by a
974.Sy uint16_t
975and encoded according to the
976.Sx Type Identifiers
977section.
978If the function's last argument is of type varargs, then it is also written out,
979but the type identifier is zero.
980This is included in the count of the function's arguments.
981An extra type identifier may follow the argument and return type identifiers
982in order to maintain four-byte alignment for the following type definition.
983Such a type identifier is not included in the argument count and has a value
984of zero.
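The padding rule can be sketched in C as follows.
This is only an illustration: it assumes that the argument identifiers are the only variable length data for the function, that each is a uint16_t, and that the data preceding the list already ends on a four-byte boundary.
The helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: bytes occupied by the argument list of a function
 * with `nargs' arguments, where a trailing varargs entry counts as one
 * argument.  The byte count is rounded up to a multiple of four.
 */
static size_t
func_args_bytes(unsigned int nargs)
{
    size_t nbytes = (size_t)nargs * sizeof (uint16_t);

    return ((nbytes + 3) & ~(size_t)3);
}

/* Returns 1 when a zero-valued padding identifier follows the list. */
static int
func_args_padded(unsigned int nargs)
{
    return (func_args_bytes(nargs) != (size_t)nargs * sizeof (uint16_t));
}
```

Under these assumptions, a function with three arguments occupies eight bytes of identifiers: three real entries and one padding entry that is not counted as an argument.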
981.Ss Encoding of Structures and Unions
982Structures and Unions, which are encoded with
983.Sy CTF_K_STRUCT
984and
985.Sy CTF_K_UNION
986respectively, are very similar constructs in C.
985.Ss Encoding of Structures and Unions
986Structures and Unions, which are encoded with
987.Sy CTF_K_STRUCT
988and
989.Sy CTF_K_UNION
990respectively, are very similar constructs in C.
987The main difference between them is the fact that every member of a structure
988follows one another, whereas in a union, all members share the same memory.
991The main difference between them is the fact that members of a structure
992follow one another, whereas in a union, all members share the same memory.
989They are also very similar in terms of their encoding in
990.Nm .
991The variable length argument for structures and unions represents the number of
992members that they have.
993The value of the member
994.Em ctt_size
995is the size of the structure or union.
996There are two different structures which are used to encode members in the

--- 30 unchanged lines hidden (view full) ---

1027.Sy ctm_type
1028and
1029.Sy ctlm_type
1030both refer to the type of the member.
1031They are encoded as per the section
1032.Sx Type Identifiers .
1033.Lp
1034The last piece of information that is present is the offset which describes the
993They are also very similar in terms of their encoding in
994.Nm .
995The variable length argument for structures and unions represents the number of
996members that they have.
997The value of the member
998.Em ctt_size
999is the size of the structure or union.
1000There are two different structures which are used to encode members in the

--- 30 unchanged lines hidden (view full) ---

1031.Sy ctm_type
1032and
1033.Sy ctlm_type
1034both refer to the type of the member.
1035They are encoded as per the section
1036.Sx Type Identifiers .
1037.Lp
1038The last piece of information that is present is the offset which describes the
1035offset in memory that the member begins at.
1036For unions, this value will always be zero because the start of unions in memory
1037is always zero.
1039offset in memory at which the member begins.
1040For unions, this value will always be zero because each member of a union has
1041an offset of zero.
1038For structures, this is the offset in
1039.Sy bits
1042For structures, this is the offset in
1043.Sy bits
1040that the member begins at.
1044at which the member begins.
1041Note that a compiler may lay out a type with padding.
1042This means that the difference in offset between two consecutive members may be
1043larger than the size of the member.
1044When the size of the overall structure is strictly less than 8192 bytes, the
1045normal structure,
1046.Sy ctf_member_t ,
1047is used and the offset in bits is stored in the member
1048.Em ctm_offset .

--- 15 unchanged lines hidden (view full) ---

1064are similar to structures.
1065Enumerations use the variable list to note the number of values that the
1066enumeration contains, which we'll term enumerators.
1067In C, an enumeration is always equivalent to the intrinsic type
1068.Sy int ,
1069thus the value of the member
1070.Em ctt_size
1071is always the size of an integer which is determined based on the current model.
1045Note that a compiler may lay out a type with padding.
1046This means that the difference in offset between two consecutive members may be
1047larger than the size of the member.
1048When the size of the overall structure is strictly less than 8192 bytes, the
1049normal structure,
1050.Sy ctf_member_t ,
1051is used and the offset in bits is stored in the member
1052.Em ctm_offset .
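To make the units concrete, the following C sketch computes member offsets in bits for an ordinary structure and checks the 8192-byte threshold mentioned in this paragraph.
The sample structure, the MEMBER_BIT_OFFSET macro, and the uses_small_member helper are hypothetical names used for illustration.
Because offsetof cannot be applied to bit-fields, this covers only ordinary members.

```c
#include <limits.h>
#include <stddef.h>

/* Example structure; a compiler typically pads between the two members. */
struct sample {
    char s_tag;
    int  s_value;
};

/* Offset of a member in bits, the unit used for structure member offsets. */
#define MEMBER_BIT_OFFSET(type, member) (offsetof(type, member) * CHAR_BIT)

/*
 * Hypothetical check: returns 1 when a structure of the given size in
 * bytes is strictly smaller than 8192 bytes, the case in which the
 * normal member structure is used.
 */
static int
uses_small_member(size_t struct_size)
{
    return (struct_size < 8192);
}
```

On a platform where int is aligned to four bytes, the bit offset of s_value is larger than the eight bits that s_tag occupies, which is the padding effect described in this paragraph.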

--- 15 unchanged lines hidden (view full) ---

1068are similar to structures.
1069Enumerations use the variable list to note the number of values that the
1070enumeration contains, which we'll term enumerators.
1071In C, an enumeration is always equivalent to the intrinsic type
1072.Sy int ,
1073thus the value of the member
1074.Em ctt_size
1075is always the size of an integer which is determined based on the current model.
1072For illumos systems, this will always be 4, as an integer is always defined to
1076For
1077.Fx
1078systems, this will always be 4, as an integer is always defined to
1073be 4 bytes large in both
1074.Sy ILP32
1075and
1076.Sy LP64 ,
1077regardless of the architecture.
1079be 4 bytes large in both
1080.Sy ILP32
1081and
1082.Sy LP64 ,
1083regardless of the architecture.
1084For further details, see
1085.Xr arch 7 .
1078.Lp
1079The enumerators encoded in an enumeration have the following structure in the
1080variable list:
1081.Bd -literal
1082typedef struct ctf_enum {
1083 uint_t cte_name; /* reference to name in string table */
1084 int cte_value; /* value associated with this name */
1085} ctf_enum_t;

--- 63 unchanged lines hidden (view full) ---

1149.Nm
1150file is the
1151.Sy string
1152section.
1153This section encodes all of the strings that appear throughout the other
1154sections.
1155It is laid out as a series of characters followed by a null terminator.
1156Generally, all names are written out in ASCII, as most C compilers do not allow
1086.Lp
1087The enumerators encoded in an enumeration have the following structure in the
1088variable list:
1089.Bd -literal
1090typedef struct ctf_enum {
1091 uint_t cte_name; /* reference to name in string table */
1092 int cte_value; /* value associated with this name */
1093} ctf_enum_t;

--- 63 unchanged lines hidden (view full) ---

1157.Nm
1158file is the
1159.Sy string
1160section.
1161This section encodes all of the strings that appear throughout the other
1162sections.
1163It is laid out as a series of characters followed by a null terminator.
1164Generally, all names are written out in ASCII, as most C compilers do not allow
1157and characters to appear in identifiers outside of a subset of ASCII.
1165any characters to appear in identifiers outside of a subset of ASCII.
1158However, any extended character sets should be written out as a series of UTF-8
1159bytes.
1160.Lp
1161The first entry in the section, at offset zero, is a single null
1162terminator to reference the empty string.
1163Following that, each C string should be written out, including the null
1164terminator.
1165Offsets that refer to something in this section should refer to the first byte
1166which begins a string.
1167Beyond the first byte in the section being the null terminator, the order of
1168strings is unimportant.
1166However, any extended character sets should be written out as a series of UTF-8
1167bytes.
1168.Lp
1169The first entry in the section, at offset zero, is a single null
1170terminator to reference the empty string.
1171Following that, each C string should be written out, including the null
1172terminator.
1173Offsets that refer to something in this section should refer to the first byte
1174which begins a string.
1175Beyond the first byte in the section being the null terminator, the order of
1176strings is unimportant.
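The layout rules for the string section can be sketched with a small in-memory table.
The strtab_t type, its fixed capacity, and the helper functions are hypothetical and exist only for illustration.
The sketch deals with raw byte offsets into the section and does not attempt the encoding described in the String Identifiers section.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STRTAB_MAX 256 /* arbitrary capacity for this sketch */

typedef struct strtab {
    char   st_data[STRTAB_MAX];
    size_t st_len; /* number of bytes in use */
} strtab_t;

/* Begin with the single null byte at offset zero: the empty string. */
static void
strtab_init(strtab_t *st)
{
    st->st_data[0] = '\0';
    st->st_len = 1;
}

/*
 * Append a string together with its null terminator and return the
 * offset of its first byte, or 0 (the empty string) if it does not fit.
 */
static size_t
strtab_add(strtab_t *st, const char *s)
{
    size_t need = strlen(s) + 1;
    size_t off = st->st_len;

    if (need > STRTAB_MAX - st->st_len)
        return (0);
    memcpy(st->st_data + off, s, need);
    st->st_len += need;
    return (off);
}

/* Resolve an offset to the string that begins at that byte. */
static const char *
strtab_get(const strtab_t *st, size_t off)
{
    assert(off < st->st_len);
    return (st->st_data + off);
}
```

Adding "int" and then "char" to a fresh table returns offsets 1 and 5, and offset 0 continues to name the empty string.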
1169.Sh Data Encoding and ELF Considerations
1177.Ss Data Encoding and ELF Considerations
1170.Nm
1171data is generally included in ELF objects which specify information to
1172identify the architecture and endianness of the file.
1173A
1174.Nm
1175container inside such an object must match the endianness of the ELF object.
1176Aside from the question of the endian encoding of data, there should be no other
1177differences between architectures.

--- 29 unchanged lines hidden (view full) ---

1207.Sy SHT_PROGBITS .
1208The section should have a link set to the symbol table and its address
1209alignment must be 4.
1210.Sh SEE ALSO
1211.Xr dtrace 1 ,
1212.Xr elf 3 ,
1213.Xr gelf 3 ,
1214.Xr a.out 5 ,
1178.Nm
1179data is generally included in ELF objects which specify information to
1180identify the architecture and endianness of the file.
1181A
1182.Nm
1183container inside such an object must match the endianness of the ELF object.
1184Aside from the question of the endian encoding of data, there should be no other
1185differences between architectures.

--- 29 unchanged lines hidden (view full) ---

1215.Sy SHT_PROGBITS .
1216The section should have a link set to the symbol table and its address
1217alignment must be 4.
1218.Sh SEE ALSO
1219.Xr dtrace 1 ,
1220.Xr elf 3 ,
1221.Xr gelf 3 ,
1222.Xr a.out 5 ,
1215.Xr elf 5
1223.Xr elf 5 ,
1224.Xr arch 7