.\"
.\" This file and its contents are supplied under the terms of the
.\" Common Development and Distribution License ("CDDL"), version 1.0.
.\" You may only use this file in accordance with the terms of version
.\" 1.0 of the CDDL.
.\"
.\" A full copy of the text of the CDDL should have accompanied this
.\" source.  A copy of the CDDL is also available via the Internet at
.\" http://www.illumos.org/license/CDDL.
.\"
.\"
.\" Copyright 2016 Joyent, Inc.
.\"
.Dd January 31, 2016
.Dt BYTEORDER 5
.Os
.Sh NAME
.Nm byteorder ,
.Nm endian
.Nd byte order and endianness
.Sh DESCRIPTION
Integer values which occupy more than one byte in memory can be laid out
in different ways on different platforms. In particular, there is a
major split between platforms which place the least significant byte of
an integer at the lowest address, and those which place the most
significant byte there instead. As this difference relates to which
end of the integer is found in memory first, the term
.Em endian
is used to refer to a particular byte order.
.Pp
A platform is referred to as using a
.Em big-endian
byte order when it places the most significant byte at the lowest
address, and
.Em little-endian
when it places the least significant byte there. Some platforms can also
switch between big- and little-endian modes and run code compiled for
either.
.Pp
Historically, there have also been some systems that utilized
.Em middle-endian
byte orders for integers larger than 2 bytes. Such orderings are not in
common use today.
.Pp
Endianness is also of particular importance when dealing with values
that are being read into memory from an external source. For example,
network protocols such as IP conventionally define the fields in a
packet as always being stored in big-endian byte order. This means that
a little-endian machine has to byte-swap these fields in order to
process them.
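.Pp
As a minimal sketch of such a transformation, the following C fragment
decodes big-endian fields from a raw byte buffer. The names
read_be16 and read_be32 are hypothetical helpers chosen for
illustration and are not part of any system header. Because the value is
assembled with shifts rather than by reinterpreting memory, the result
should be the same on big- and little-endian hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers (not provided by the system): decode big-endian
 * fields from a byte buffer. buf[0] is the byte at the lowest address,
 * which in big-endian order is the most significant byte.
 */
uint16_t
read_be16(const unsigned char *buf)
{
	assert(buf != NULL);
	return ((uint16_t)(((unsigned int)buf[0] << 8) | buf[1]));
}

uint32_t
read_be32(const unsigned char *buf)
{
	assert(buf != NULL);
	return (((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
	    ((uint32_t)buf[2] << 8) | (uint32_t)buf[3]);
}
```

For instance, calling read_be16 on a buffer holding the bytes 0x12 and
0x34 yields 0x1234 regardless of the host's byte order.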
.Ss Examples
To illustrate endianness in memory, let us consider the decimal integer
2864434397, which is 0xAABBCCDD in hexadecimal. This number fits in
32 bits of storage (4 bytes).
.Pp
On a big-endian system, this integer would be written into memory as
the bytes 0xAA, 0xBB, 0xCC, 0xDD, in order from lowest memory address to
highest.
.Pp
On a little-endian system, it would be written instead as the bytes
0xDD, 0xCC, 0xBB, 0xAA, in that order.
.Pp
If both the big- and little-endian systems were asked to store this
integer at address 0x100, we would see the following in each system's
memory:
.Bd -literal

                    Big-Endian

        ++------++------++------++------++
        || 0xAA || 0xBB || 0xCC || 0xDD ||
        ++------++------++------++------++
            ^^      ^^      ^^      ^^
          0x100   0x101   0x102   0x103
            vv      vv      vv      vv
        ++------++------++------++------++
        || 0xDD || 0xCC || 0xBB || 0xAA ||
        ++------++------++------++------++

                  Little-Endian
.Ed
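.Pp
The layout above can be observed from a program by copying an integer's
in-memory representation into a byte array. The following is only a
sketch: bytes_in_memory and host_is_little_endian are hypothetical names
used for illustration, and the check distinguishes only big- from
little-endian hosts.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy the in-memory representation of a 32-bit
 * value into out[0..3], where out[0] is the byte at the lowest address.
 */
void
bytes_in_memory(uint32_t value, unsigned char out[4])
{
	assert(out != NULL);
	memcpy(out, &value, sizeof (value));
}

/*
 * Hypothetical helper: return 1 when the least significant byte (0xDD)
 * of 0xAABBCCDD is stored at the lowest address, and 0 otherwise.
 */
int
host_is_little_endian(void)
{
	unsigned char b[4];

	bytes_in_memory(0xAABBCCDDu, b);
	return (b[0] == 0xDD);
}
```

On a little-endian host the array filled in by bytes_in_memory holds
0xDD, 0xCC, 0xBB, 0xAA; on a big-endian host it holds 0xAA, 0xBB, 0xCC,
0xDD.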
.Pp
It is particularly important to note that even though the byte order is
different between these two machines, the bit ordering within each byte,
by convention, is still the same.
.Pp
For example, take the decimal integer 4660 (0x1234 in hexadecimal),
which fits in 16 bits (2 bytes).
.Pp
On a big-endian system, this would be written into memory as 0x12, then
0x34.
.Pp
On a little-endian system, it would be written as 0x34, then 0x12.  Note
that this is not at all the same as seeing 0x43 then 0x21 in memory --
only the bytes are re-ordered, not any bits (or nybbles) within them.
.Pp
As before, storing this at address 0x100:
.Bd -literal
                    Big-Endian

                ++------++------++
                || 0x12 || 0x34 ||
                ++------++------++
                    ^^      ^^
                  0x100   0x101
                    vv      vv
                ++------++------++
                || 0x34 || 0x12 ||
                ++------++------++

                   Little-Endian
.Ed
.Pp
This example shows how an eight-byte number, 0xBADCAFFEDEADBEEF, is
stored in both big- and little-endian:
.Bd -literal
                        Big-Endian

    +------+------+------+------+------+------+------+------+
    | 0xBA | 0xDC | 0xAF | 0xFE | 0xDE | 0xAD | 0xBE | 0xEF |
    +------+------+------+------+------+------+------+------+
       ^^     ^^     ^^     ^^     ^^     ^^     ^^     ^^
     0x100  0x101  0x102  0x103  0x104  0x105  0x106  0x107
       vv     vv     vv     vv     vv     vv     vv     vv
    +------+------+------+------+------+------+------+------+
    | 0xEF | 0xBE | 0xAD | 0xDE | 0xFE | 0xAF | 0xDC | 0xBA |
    +------+------+------+------+------+------+------+------+

                       Little-Endian

.Ed
.Pp
The treatment of different endian values would not be complete without
discussing
.Em PDP-endian ,
which is also known as
.Em middle-endian .
While the PDP-11 was a 16-bit little-endian system, it laid out 32-bit
values in a different way from current little-endian systems. First, it
would divide a 32-bit number into two 16-bit words. Each 16-bit word
would be stored in little-endian; however, the two 16-bit words would be
stored with the most significant word appearing first in memory,
followed by the least significant word.
.Pp
The following diagram illustrates PDP-endian and compares it against
little-endian. Here, we'll start with the value 0xAABBCCDD and
show how its four bytes are laid out, starting at 0x100.
.Bd -literal
                    PDP-Endian

        ++------++------++------++------++
        || 0xBB || 0xAA || 0xDD || 0xCC ||
        ++------++------++------++------++
            ^^      ^^      ^^      ^^
          0x100   0x101   0x102   0x103
            vv      vv      vv      vv
        ++------++------++------++------++
        || 0xDD || 0xCC || 0xBB || 0xAA ||
        ++------++------++------++------++

                  Little-Endian

.Ed
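.Pp
The PDP-endian layout described above can be sketched in C as follows.
The function store_pdp32 is a hypothetical helper written only to
illustrate the ordering; it is not provided by the system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: write a 32-bit value into out[0..3] in
 * PDP-endian order. The most significant 16-bit word comes first, and
 * each 16-bit word is itself stored least significant byte first.
 */
void
store_pdp32(uint32_t value, unsigned char out[4])
{
	uint16_t high = (uint16_t)(value >> 16);
	uint16_t low = (uint16_t)(value & 0xFFFF);

	assert(out != NULL);
	out[0] = (unsigned char)(high & 0xFF);
	out[1] = (unsigned char)(high >> 8);
	out[2] = (unsigned char)(low & 0xFF);
	out[3] = (unsigned char)(low >> 8);
}
```

Given 0xAABBCCDD, this fills the array with 0xBB, 0xAA, 0xDD, 0xCC,
matching the PDP-endian row of the diagram.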
.Ss Network Byte Order
The term 'network byte order' refers to big-endian ordering, and
originates from the IEEE. Early disagreements over which byte ordering
to use for network traffic prompted RFC 1700 to define that all
IETF-specified network protocols use big-endian ordering unless
explicitly noted otherwise. The Internet protocol family (IP, and thus
TCP, UDP, etc.) in particular adheres to this convention.
.Ss Determining the System's Byte Order
The operating system supports both big-endian and little-endian CPUs. To
make it easier for programs to determine the endianness of the
platform they are being compiled for, functions and macro constants are
provided in the system header files.
.Pp
The endianness of the system can be obtained by including the header
.In sys/types.h
and using the pre-processor macros
.Sy _LITTLE_ENDIAN
and
.Sy _BIG_ENDIAN .
See
.Xr types.h 3HEAD
for more information.
.Pp
Additionally, the header
.In endian.h
defines an alternative means for determining the endianness of the
current system. See
.Xr endian.h 3HEAD
for more information.
.Pp
illumos runs on both big- and little-endian systems. When writing
software for which the endianness is important, one must always check
the byte order and convert data appropriately.
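.Pp
A compile-time check using the macros named above might look like the
following sketch. The function name compiled_byte_order is hypothetical.
The "unknown" fallback is an assumption added for systems whose headers
define neither macro, or define both as numeric constants.

```c
#include <sys/types.h>
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: report the byte order selected at compile time
 * based on _BIG_ENDIAN and _LITTLE_ENDIAN. The "unknown" branch is an
 * assumption for headers that define neither macro or both of them.
 */
const char *
compiled_byte_order(void)
{
#if defined(_BIG_ENDIAN) && !defined(_LITTLE_ENDIAN)
	return ("big");
#elif defined(_LITTLE_ENDIAN) && !defined(_BIG_ENDIAN)
	return ("little");
#else
	return ("unknown");
#endif
}
```

Code that depends on a particular layout can use the same pre-processor
conditions to select between alternative implementations.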
.Ss Converting Between Byte Orders
The system provides two different sets of functions to convert values
between big-endian and little-endian. They are defined in
.Xr byteorder 3SOCKET
and
.Xr endian 3C .
.Pp
The
.Xr byteorder 3SOCKET
family of functions converts data between the host's native byte order
and network byte order, which is big-endian.
The functions operate on 16-bit, 32-bit, or 64-bit values.
Functions that convert from network byte order to the host's byte order
start with the string
.Sy ntoh ,
while functions that convert from the host's byte order to network byte
order begin with
.Sy hton .
For example, to convert a 32-bit value (historically a long, hence the
l suffix) from network byte order to the host's, one would use the
function
.Xr ntohl 3SOCKET .
.Pp
These functions have been standardized by POSIX. However, the 64-bit variants,
.Xr ntohll 3SOCKET
and
.Xr htonll 3SOCKET ,
are not standardized and may not be found on other systems. For more
information on these functions, see
.Xr byteorder 3SOCKET .
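.Pp
As an illustrative sketch, the standardized 32-bit functions can be used
to serialize a value into a buffer in network byte order and read it
back. The helpers put_net32 and get_net32 are hypothetical names; only
htonl and ntohl come from the system, here via the POSIX header
arpa/inet.h.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: store value into out[0..3] in network
 * (big-endian) byte order using htonl().
 */
void
put_net32(uint32_t value, unsigned char out[4])
{
	uint32_t net = htonl(value);

	assert(out != NULL);
	memcpy(out, &net, sizeof (net));
}

/*
 * Hypothetical helper: read a 32-bit value stored in network byte
 * order from in[0..3] and return it in the host's byte order.
 */
uint32_t
get_net32(const unsigned char in[4])
{
	uint32_t net;

	assert(in != NULL);
	memcpy(&net, in, sizeof (net));
	return (ntohl(net));
}
```

Using memcpy rather than a pointer cast avoids alignment problems when
the buffer is an arbitrary offset into a packet.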
.Pp
The second family of functions,
.Xr endian 3C ,
provides a means to convert between the host's byte order
and big-endian and little-endian specifically. While these functions are
similar to those in
.Xr byteorder 3SOCKET ,
they more explicitly cover different data conversions. Like them, these
functions operate on 16-bit, 32-bit, or 64-bit values. When
converting from big-endian to the host's endianness, the function names
begin with
.Sy betoh .
If instead one is converting data from the host's native endianness to
big-endian, then the function names begin with
.Sy htobe .
When working with little-endian data, the prefixes
.Sy letoh
and
.Sy htole
convert little-endian data to the host's endianness and from the host's
to little-endian, respectively.
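.Pp
The following sketch shows how functions from this family might be used.
It assumes the header endian.h and the spellings htole32 and be64toh,
which is the form of the big-endian-to-host name found on GNU/Linux; the
BSDs place these functions in a different header. The _DEFAULT_SOURCE
definition is an assumption for glibc-based systems, and put_le32 and
get_be64 are hypothetical helper names.

```c
#ifndef _DEFAULT_SOURCE
#define	_DEFAULT_SOURCE	/* assumption: exposes these functions on glibc */
#endif
#include <endian.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: store value into out[0..3] in little-endian
 * order, whatever the host's byte order is.
 */
void
put_le32(uint32_t value, unsigned char out[4])
{
	uint32_t le = htole32(value);

	assert(out != NULL);
	memcpy(out, &le, sizeof (le));
}

/*
 * Hypothetical helper: read a 64-bit big-endian value from in[0..7]
 * and return it in the host's byte order.
 */
uint64_t
get_be64(const unsigned char in[8])
{
	uint64_t be;

	assert(in != NULL);
	memcpy(&be, in, sizeof (be));
	return (be64toh(be));
}
```

With these helpers, put_le32 applied to 0xAABBCCDD produces the bytes
0xDD, 0xCC, 0xBB, 0xAA on any host.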
.Pp
These functions
are not standardized and the header they appear in varies between the
BSDs and GNU/Linux. Applications that wish to be portable should
instead use the
.Xr byteorder 3SOCKET
functions.
.Pp
All of the functions in both families simply return their input when
the host's native byte order is the same as the desired order. For
example, when calling
.Xr htonl 3SOCKET
on a big-endian system, the original data is returned with no conversion
or modification.
.Sh SEE ALSO
.Xr endian 3C ,
.Xr endian.h 3HEAD ,
.Xr inet 3HEAD ,
.Xr byteorder 3SOCKET