.\"
.\" This file and its contents are supplied under the terms of the
.\" Common Development and Distribution License ("CDDL"), version 1.0.
.\" You may only use this file in accordance with the terms of version
.\" 1.0 of the CDDL.
.\"
.\" A full copy of the text of the CDDL should have accompanied this
.\" source.  A copy of the CDDL is also available via the Internet at
.\" http://www.illumos.org/license/CDDL.
.\"
.\"
.\" Copyright 2016 Joyent, Inc.
.\"
.Dd August 2, 2018
.Dt BYTEORDER 7
.Os
.Sh NAME
.Nm byteorder ,
.Nm endian
.Nd byte order and endianness
.Sh DESCRIPTION
Integer values which occupy more than 1 byte in memory can be laid out
in different ways on different platforms.
In particular, there is a major split between those which place the least
significant byte of an integer at the lowest address, and those which place the
most significant byte there instead.
As this difference relates to which end of the integer is found in memory first,
the term
.Em endian
is used to refer to a particular byte order.
.Pp
A platform is referred to as using a
.Em big-endian
byte order when it places the most significant byte at the lowest
address, and
.Em little-endian
when it places the least significant byte first.
Some platforms may also switch between big- and little-endian mode and run code
compiled for either.
.Pp
Historically, there have also been some systems that utilized
.Em middle-endian
byte orders for integers larger than 2 bytes.
Such orderings are not in common use today.
.Pp
Endianness is also of particular importance when dealing with values
that are being read into memory from an external source.
For example, network protocols such as IP conventionally define the fields in a
packet as being always stored in big-endian byte order.
This means that a little-endian machine will have to perform transformations on
these fields in order to process them.
.Ss Examples
To illustrate endianness in memory, let us consider the decimal integer
2864434397, which is 0xAABBCCDD in hexadecimal.
This number fits in 32 bits of storage (4 bytes).
.Pp
On a big-endian system, this integer would be written into memory as
the bytes 0xAA, 0xBB, 0xCC, 0xDD, in order from lowest memory address to
highest.
.Pp
On a little-endian system, it would be written instead as the bytes
0xDD, 0xCC, 0xBB, 0xAA, in that order.
.Pp
If both the big- and little-endian systems were asked to store this
integer at address 0x100, we would see the following in the memory of
each:
.Bd -literal

                    Big-Endian

        ++------++------++------++------++
        || 0xAA || 0xBB || 0xCC || 0xDD ||
        ++------++------++------++------++
            ^^      ^^      ^^      ^^
          0x100   0x101   0x102   0x103
            vv      vv      vv      vv
        ++------++------++------++------++
        || 0xDD || 0xCC || 0xBB || 0xAA ||
        ++------++------++------++------++

                  Little-Endian
.Ed
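.Pp
The two layouts above can be observed directly from C.
The following is a minimal sketch, not part of any system interface; the
function names
.Fn bytes_in_memory
and
.Fn layout_of_aabbccdd
are hypothetical and exist only for this illustration.

```c
#include <stdint.h>
#include <string.h>

/*
 * Copy the in-memory representation of a 32-bit value into out[0..3],
 * lowest address first.  memcpy() avoids any aliasing or alignment
 * assumptions.
 */
void
bytes_in_memory(uint32_t value, unsigned char out[4])
{
    memcpy(out, &value, sizeof (value));
}

/*
 * Hypothetical helper: report how this host lays out 0xAABBCCDD.
 * Returns 1 for the big-endian layout, 0 for the little-endian layout,
 * and -1 for anything else (for example, a middle-endian layout).
 */
int
layout_of_aabbccdd(void)
{
    unsigned char b[4];

    bytes_in_memory(0xAABBCCDDU, b);
    if (b[0] == 0xAA && b[1] == 0xBB && b[2] == 0xCC && b[3] == 0xDD)
        return (1);
    if (b[0] == 0xDD && b[1] == 0xCC && b[2] == 0xBB && b[3] == 0xAA)
        return (0);
    return (-1);
}
```
.Pp
Which value is returned depends on the machine the code runs on, which is
exactly the difference the diagram illustrates.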
.Pp
It is particularly important to note that even though the byte order is
different between these two machines, the bit ordering within each byte,
by convention, is still the same.
.Pp
For example, take the decimal integer 4660 (0x1234 in hexadecimal), which
occupies 16 bits (2 bytes).
.Pp
On a big-endian system, this would be written into memory as 0x12, then
0x34.
.Pp
On a little-endian system, it would be written as 0x34, then 0x12.
Note that this is not at all the same as seeing 0x43 then 0x21 in memory --
only the bytes are re-ordered, not any bits (or nybbles) within them.
.Pp
As before, storing this at address 0x100:
.Bd -literal
                    Big-Endian

                ++------++------++
                || 0x12 || 0x34 ||
                ++------++------++
                    ^^      ^^
                  0x100   0x101
                    vv      vv
                ++------++------++
                || 0x34 || 0x12 ||
                ++------++------++

                   Little-Endian
.Ed
.Pp
This example shows how an eight-byte number, 0xBADCAFFEDEADBEEF, is stored
in both big- and little-endian:
.Bd -literal
                        Big-Endian

    +------+------+------+------+------+------+------+------+
    | 0xBA | 0xDC | 0xAF | 0xFE | 0xDE | 0xAD | 0xBE | 0xEF |
    +------+------+------+------+------+------+------+------+
       ^^     ^^     ^^     ^^     ^^     ^^     ^^     ^^
     0x100  0x101  0x102  0x103  0x104  0x105  0x106  0x107
       vv     vv     vv     vv     vv     vv     vv     vv
    +------+------+------+------+------+------+------+------+
    | 0xEF | 0xBE | 0xAD | 0xDE | 0xFE | 0xAF | 0xDC | 0xBA |
    +------+------+------+------+------+------+------+------+

                       Little-Endian

.Ed
.Pp
The treatment of different endian values would not be complete without
discussing
.Em PDP-endian ,
which is also known as
.Em middle-endian .
While the PDP-11 was a 16-bit little-endian system, it laid out 32-bit
values in a different way from current little-endian systems.
First, it would divide a 32-bit number into two 16-bit numbers.
Each 16-bit number would be stored in little-endian; however, the two 16-bit
words would be stored with the more significant 16-bit word appearing first in
memory, followed by the less significant word.
.Pp
The following image illustrates PDP-endian and compares it against
little-endian values.
Here, we'll start with the value 0xAABBCCDD and show how the four bytes for it
will be laid out, starting at 0x100.
.Bd -literal
                    PDP-Endian

        ++------++------++------++------++
        || 0xBB || 0xAA || 0xDD || 0xCC ||
        ++------++------++------++------++
            ^^      ^^      ^^      ^^
          0x100   0x101   0x102   0x103
            vv      vv      vv      vv
        ++------++------++------++------++
        || 0xDD || 0xCC || 0xBB || 0xAA ||
        ++------++------++------++------++

                  Little-Endian

.Ed
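.Pp
The PDP-endian layout can be reproduced on any host using shifts and masks.
The following is a minimal sketch; the function name
.Fn store_pdp32
is hypothetical and is not provided by the system.

```c
#include <stdint.h>

/*
 * Hypothetical helper: write a 32-bit value into out[0..3] in PDP-endian
 * order.  The more significant 16-bit word is stored first, and each
 * 16-bit word is itself stored least significant byte first.
 */
void
store_pdp32(uint32_t value, unsigned char out[4])
{
    uint16_t hi = (uint16_t)(value >> 16);
    uint16_t lo = (uint16_t)(value & 0xFFFF);

    out[0] = (unsigned char)(hi & 0xFF);    /* low byte of high word */
    out[1] = (unsigned char)(hi >> 8);      /* high byte of high word */
    out[2] = (unsigned char)(lo & 0xFF);    /* low byte of low word */
    out[3] = (unsigned char)(lo >> 8);      /* high byte of low word */
}
```
.Pp
Because the function works on the numeric value and not on its in-memory
representation, it yields the same bytes regardless of the host's own byte
order.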
.Ss Network Byte Order
The term 'network byte order' refers to big-endian ordering, and
originates from the IEEE.
Early disagreements over which byte ordering to use for network traffic prompted
RFC 1700 to define that all IETF-specified network protocols use big-endian
ordering unless noted explicitly otherwise.
The Internet protocol family (IP, and thus TCP, UDP, etc.) adheres to this
convention in particular.
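.Pp
One way to produce and consume network byte order without knowing the
host's byte order is to assemble the bytes with shifts.
The following is a minimal sketch; the names
.Fn put_be32
and
.Fn get_be32
are hypothetical and are not system functions.

```c
#include <stdint.h>

/*
 * Hypothetical helper: store value into wire[0..3] in network byte
 * order, most significant byte first.
 */
void
put_be32(uint32_t value, unsigned char wire[4])
{
    wire[0] = (unsigned char)((value >> 24) & 0xFF);
    wire[1] = (unsigned char)((value >> 16) & 0xFF);
    wire[2] = (unsigned char)((value >> 8) & 0xFF);
    wire[3] = (unsigned char)(value & 0xFF);
}

/*
 * Hypothetical helper: read a 32-bit value stored in network byte
 * order from wire[0..3].
 */
uint32_t
get_be32(const unsigned char wire[4])
{
    return (((uint32_t)wire[0] << 24) | ((uint32_t)wire[1] << 16) |
        ((uint32_t)wire[2] << 8) | (uint32_t)wire[3]);
}
```
.Pp
Shifts operate on values, not on memory, so the same source behaves
identically on big- and little-endian hosts.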
.Ss Determining the System's Byte Order
The operating system supports both big-endian and little-endian CPUs.
To make it easier for programs to determine the endianness of the platform they
are being compiled for, functions and macro constants are provided in the system
header files.
.Pp
The endianness of the system can be obtained by including the header
.In sys/types.h
and checking which of the pre-processor macros
.Sy _LITTLE_ENDIAN
and
.Sy _BIG_ENDIAN
is defined.
See
.Xr types.h 3HEAD
for more information.
.Pp
Additionally, the header
.In endian.h
defines an alternative means for determining the endianness of the
current system.
See
.Xr endian.h 3HEAD
for more information.
.Pp
illumos runs on both big- and little-endian systems.
When writing software for which the endianness is important, one must always
check the byte order and convert data appropriately.
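.Pp
A compile-time check built on these macros might look like the following
minimal sketch.
The function name
.Fn host_is_little_endian
is hypothetical.
The sketch assumes that exactly one of the two macros is defined on this
system; other systems may define neither or both, so a run-time probe is
included as a fallback for them.

```c
#include <sys/types.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 on a little-endian host and 0
 * otherwise.
 */
int
host_is_little_endian(void)
{
#if defined(_LITTLE_ENDIAN) && !defined(_BIG_ENDIAN)
    return (1);
#elif defined(_BIG_ENDIAN) && !defined(_LITTLE_ENDIAN)
    return (0);
#else
    /*
     * Fallback for systems where the macros are absent or are both
     * defined: look at the first byte of a known 16-bit value.
     */
    uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return (first == 1);
#endif
}
```
.Pp
Where the macros are available, the pre-processor resolves the answer at
compile time and no run-time test is performed.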
.Ss Converting Between Byte Orders
The system provides two different sets of functions to convert values
between byte orders.
They are described in
.Xr byteorder 3C
and
.Xr endian 3C .
.Pp
The
.Xr byteorder 3C
family of functions converts data between the host's native byte order
and network byte order, which is big-endian.
The functions operate on either 16-bit, 32-bit, or 64-bit values.
Functions that convert from network byte order to the host's byte order
start with the string
.Sy ntoh ,
while functions which convert from the host's byte order to network byte
order begin with
.Sy hton .
For example, to convert a 32-bit value (historically called a long, hence the
.Sy l
suffix) from network byte order to the host's, one would use the function
.Xr ntohl 3C .
.Pp
These functions have been standardized by POSIX.
However, the 64-bit variants,
.Xr ntohll 3C
and
.Xr htonll 3C ,
are not standardized and may not be found on other systems.
For more information on these functions, see
.Xr byteorder 3C .
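.Pp
The following minimal sketch shows a round trip through
.Xr htonl 3C
and
.Xr ntohl 3C .
The wrapper names
.Fn encode_u32_net
and
.Fn decode_u32_net
are hypothetical, and the sketch assumes the functions are declared by the
POSIX header
.In arpa/inet.h .

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: convert a host-order value to network byte
 * order and copy it into a wire buffer.
 */
void
encode_u32_net(uint32_t host_value, unsigned char wire[4])
{
    uint32_t net = htonl(host_value);

    memcpy(wire, &net, sizeof (net));
}

/*
 * Hypothetical helper: copy a network-order value out of a wire
 * buffer and convert it to the host's byte order.
 */
uint32_t
decode_u32_net(const unsigned char wire[4])
{
    uint32_t net;

    memcpy(&net, wire, sizeof (net));
    return (ntohl(net));
}
```
.Pp
On any host, the buffer filled in for 0xAABBCCDD holds the bytes 0xAA, 0xBB,
0xCC, 0xDD in that order, and decoding that buffer returns the original
value.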
.Pp
The second family of functions,
.Xr endian 3C ,
provides a means to convert between the host's byte order
and big-endian and little-endian specifically.
While these functions are similar to those in
.Xr byteorder 3C ,
they more explicitly cover different data conversions.
Like them, these functions operate on either 16-bit, 32-bit, or 64-bit values.
When converting from big-endian to the host's endianness, the functions
begin with
.Sy betoh .
If, instead, one is converting data from the host's native endianness to
big-endian, then the functions begin with
.Sy htobe .
When working with little-endian data, the prefixes
.Sy letoh
and
.Sy htole
convert little-endian data to the host's endianness and from the host's
to little-endian, respectively.
.Pp
These functions are not standardized and the header they appear in varies
between the BSDs and GNU/Linux.
Applications that wish to be portable should instead use the
.Xr byteorder 3C
functions.
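.Pp
The following minimal sketch shows the second family in use.
It assumes that
.In endian.h
declares the spellings
.Fn le16toh
and
.Fn htobe64 ;
as noted above, the header and the exact names differ between systems.
The wrapper names
.Fn read_le16
and
.Fn write_be64
are hypothetical.

```c
/* Assumption: some C libraries only expose these names on request. */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif

#include <endian.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read a 16-bit little-endian value from a
 * buffer and return it in the host's byte order.
 */
uint16_t
read_le16(const unsigned char buf[2])
{
    uint16_t raw;

    memcpy(&raw, buf, sizeof (raw));
    return (le16toh(raw));
}

/*
 * Hypothetical helper: store a host-order 64-bit value into a buffer
 * in big-endian byte order.
 */
void
write_be64(uint64_t host_value, unsigned char buf[8])
{
    uint64_t raw = htobe64(host_value);

    memcpy(buf, &raw, sizeof (raw));
}
```
.Pp
Reading the bytes 0x34, 0x12 with
.Fn read_le16
yields 0x1234 on any host, and
.Fn write_be64
produces the big-endian layout shown in the eight-byte example earlier.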
.Pp
All of these functions in both families simply return their input when
the host's native byte order is the same as the desired order.
For example, when calling
.Xr htonl 3C
on a big-endian system the original data is returned with no conversion
or modification.
.Sh SEE ALSO
.Xr byteorder 3C ,
.Xr endian 3C ,
.Xr endian.h 3HEAD ,
.Xr inet 3HEAD
271