#
# This file and its contents are supplied under the terms of the
# Common Development and Distribution License ("CDDL"), version 1.0.
# You may only use this file in accordance with the terms of version
# 1.0 of the CDDL.
#
# A full copy of the text of the CDDL should have accompanied this
# source.  A copy of the CDDL is also available via the Internet at
# http://www.illumos.org/license/CDDL.
#

#
# Copyright 2020 Robert Mustacchi
#
     _      _ _
 ___| |_ __| (_) ___
/ __| __/ _` | |/ _ \
\__ \ || (_| | | (_) |
|___/\__\__,_|_|\___/

Notes on the design of stdio.

------------
File Streams
------------

At the heart of stdio is the 'FILE *'. The 'FILE *' represents a
stream that can be read, written, and seeked. The streams traditionally
refer to a file descriptor, when created by fopen(3C), or may refer to
memory, when created by open_memstream(3C) or fmemopen(3C). This
document focuses on the implementation of streams. Other miscellaneous
functions in stdio are not discussed.

------------
Organization
------------

Most functions exist in a file with the same name. When adding new
files to stdio the file name should match the primary function name.
There are a few exceptions. Almost all of the logic related to both
flushing and knowledge of how to handle the 32-bit ABI issues (described
in the next section) can be found in flush.c.

-----------------------------
struct __FILE_TAG and the ABI
-----------------------------

The definition of the 'FILE *' is a pointer to a 'struct __FILE_TAG'.
The 'struct __FILE_TAG' structure has a long history that dates back to
historical UNIX. For better or for worse, we have inherited some of the
design decisions of the past. It's important to understand what those
are, as they have a profound impact on the stdio design and serve as a
good cautionary tale for future ABI decisions.

In the original UNIX designs, the 'struct __FILE_TAG' was exposed as a
non-opaque structure. This was also true on other platforms. This had a
number of challenges:

* It meant the size of the 'struct __FILE_TAG' was part of the ABI.
* Consumers would access the members directly. You can find examples of
  this in our public headers where things like getc() are inlined in
  terms of the implementation. Various 3rd-party software that has
  existed for quite some time knows the offset of members and directly
  manipulates them. This is still true as of 2020.
* The 'struct __FILE_TAG' only used an unsigned char (uint8_t) for the
  file descriptor in the 32-bit version. Other systems used a short, so
  they were in better shape. This was changed in the 64-bit version to
  use an int.
* The main C stdio symbols 'stdin', 'stdout', and 'stderr', were (and
  still are) exposed as an array. This means that while the 64-bit
  structure is opaque, its size is actually part of the ABI.

All of these issues have been dealt with in different ways in the
system. The first thing that is a little confusing is where to find the
definitions of the actual implementation. The 32-bit 'struct __FILE_TAG'
is split into two different pieces, the part that is public and a
secondary, private part.

The public definition of the 'struct __FILE_TAG' for 32-bit code and the
opaque definition for 64-bit code may be found in
'usr/src/head/stdio_impl.h'. The actual definition of the 64-bit
structure and the 32-bit additions are all found in
'usr/src/lib/libc/inc/file64.h'.

In file64.h, one will find the 'struct xFILEdata' (extended FILE * data).
This represents all of the data that has been added to stdio that is
missing from the public structure. Whenever a 'FILE *' is allocated,
32-bit code always ensures that there is a corresponding 'struct
xFILEdata' that exists. Currently, we still have plenty of padding left
in the 64-bit version of the structure for at least 3 pointers.

To add a member to the structure, one has to add data to the structures
in 'lib/libc/inc/file64.h'. If, for some reason, all of the padding has
been used up, then you must stop. The size of the 64-bit structure
_cannot_ be extended; as noted earlier, it is part of the ABI. If we hit
this case, then one must introduce the struct xFILEdata for the LP64
environment.
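
The padding rule above can be illustrated with a small sketch. The
structures and names below (toy_file_v1, toy_file_v2,
toy_stream_offset()) are hypothetical and are not the real layout in
file64.h. They only show that a new member may consume reserved
padding, while the overall size, which fixes where each element of an
exported array lives, must not change.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "version 1" of an opaque stream structure. */
struct toy_file_v1 {
	unsigned char	*tf_ptr;	/* next byte in the buffer */
	unsigned char	*tf_base;	/* start of the buffer */
	void		*tf_pad[3];	/* reserved for future members */
};

/* Hypothetical "version 2": one padding slot becomes a new member. */
struct toy_file_v2 {
	unsigned char	*tf_ptr;
	unsigned char	*tf_base;
	void		*tf_ops;	/* new member, taken from padding */
	void		*tf_pad[2];	/* padding that remains */
};

/* The size must be identical, otherwise array indexing breaks. */
static_assert(sizeof (struct toy_file_v1) == sizeof (struct toy_file_v2),
    "adding a member must not change the structure size");

/*
 * Byte offset of element 'idx' in an exported array of streams. Old
 * binaries computed this with the old size, so it can never change.
 */
static size_t
toy_stream_offset(size_t idx)
{
	return (idx * sizeof (struct toy_file_v2));
}
```

If the padding array ever reached zero elements, this approach would
have nowhere left to go, which is the point at which a separate
extended data structure would be needed.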

--------------------------
Allocating FILE Structures
--------------------------

libc defines a number of 'FILE *' structures by default. These can all
be found in 'data.c'. The first _NFILE (20 or 60 depending on the
platform) are defined statically. In the 32-bit case, the corresponding
'struct xFILEdata' is allocated along with it.

To determine if a structure is free or not in the array, the '_flag'
member is consulted. If the flag has been set to zero, then the stream
is considered free and can be allocated. All of the allocated (whether
used or not) 'FILE *' structures are present on a linked list which is
found in 'flush.c' rooted at the symbol '__first_link'. This list is
always scanned to try and reuse an existing 'FILE *' structure before
allocating a new one. If all of the existing ones are in use, then one
will be allocated.

An important thing to understand is that once allocated, a 'FILE *' will
never be freed by libc. It will always exist on the global list of
structures to be reused.
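
The reuse policy described above can be sketched as follows. This is a
simplified model, not the code in flush.c: the names toy_stream,
toy_findiop(), and toy_release() are hypothetical, the list links
individual streams rather than blocks of them, and no locking is shown.

```c
#include <assert.h>
#include <stdlib.h>

#define	TOY_NSTATIC	3	/* stand-in for _NFILE */
#define	TOY_INUSE	0x1

struct toy_stream {
	int			ts_flag;	/* 0 means the slot is free */
	struct toy_stream	*ts_next;	/* global list of all slots */
};

/* Statically defined streams, all in use at start for this sketch. */
static struct toy_stream toy_static[TOY_NSTATIC] = {
	{ TOY_INUSE, &toy_static[1] },
	{ TOY_INUSE, &toy_static[2] },
	{ TOY_INUSE, NULL }
};

static struct toy_stream *toy_first = &toy_static[0];

/* Scan the list for a free slot; only allocate when none is free. */
static struct toy_stream *
toy_findiop(void)
{
	struct toy_stream *sp, *last = NULL;

	for (sp = toy_first; sp != NULL; sp = sp->ts_next) {
		if (sp->ts_flag == 0) {
			sp->ts_flag = TOY_INUSE;
			return (sp);
		}
		last = sp;
	}

	sp = calloc(1, sizeof (*sp));
	if (sp == NULL)
		return (NULL);
	sp->ts_flag = TOY_INUSE;
	assert(last != NULL);
	last->ts_next = sp;	/* stays on the list for good */
	return (sp);
}

/* Closing a stream only clears the flag; the memory is never freed. */
static void
toy_release(struct toy_stream *sp)
{
	sp->ts_flag = 0;
}
```

Because released streams stay on the list, a later toy_findiop() call
hands back the same memory instead of allocating again.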

---------
Buffering
---------

Every stream in stdio starts out as buffered. Buffering can be changed
by calling either setbuf(3C) or setvbuf(3C). This buffer is stored in
the '_base' member of the 'struct __FILE_TAG'. The amount of valid data
in the buffer is maintained in the '_cnt' member of the structure. By
default, there is no associated buffer with a stream. When the stream is
first used, the buffer will be assigned by a call to _findbuf() in
_findbuf.c.

There are pre-allocated buffers that exist. There are two specifically
for stdin and stdout (stderr is unbuffered). These include space for
both the buffer and the pushback buffer. The pushback buffer is used so
someone can call ungetc(3C) regardless of whether a buffering mode is
enabled or not. Characters that we 'unget' are placed on the pushback
buffer.

For other buffering modes, we'll try and allocate an appropriately sized
buffer. The buffer size defaults to BUFSIZ, but if the stream is backed
by a file descriptor, we'll use fstat() to determine the appropriate
size to use and match the file system block size. If we cannot allocate
that, we'll fall back to trying to allocate a pushback buffer.

libc defines static data for _NFILE worth of pushback buffers which are
indexed based on the underlying file descriptor. This and the stdin and
stdout buffers are all found in 'data.c' in _smbuf, _sibuf, and _sobuf
respectively.
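
From the caller's side, the pushback guarantee looks like the sketch
below. It uses only standard interfaces (fmemopen(3C), setvbuf(3C),
ungetc(3C)); the helper name toy_unget_unbuffered() is hypothetical. It
shows that a character can still be pushed back after buffering has
been disabled.

```c
#define	_POSIX_C_SOURCE	200809L

#include <assert.h>
#include <stdio.h>

/*
 * Returns 1 if a pushed-back character is read before the rest of the
 * stream on an unbuffered stream, 0 if not, and -1 on setup failure.
 */
static int
toy_unget_unbuffered(void)
{
	char data[] = "abc";
	FILE *fp = fmemopen(data, sizeof (data) - 1, "r");
	int first, pushed, next;

	if (fp == NULL)
		return (-1);

	/* Disable buffering before any I/O is done on the stream. */
	if (setvbuf(fp, NULL, _IONBF, 0) != 0) {
		(void) fclose(fp);
		return (-1);
	}

	first = fgetc(fp);		/* consumes 'a' */
	if (ungetc('z', fp) != 'z') {
		(void) fclose(fp);
		return (-1);
	}
	pushed = fgetc(fp);		/* the pushed-back character */
	next = fgetc(fp);		/* the stream then continues */

	assert(fclose(fp) == 0);
	return (first == 'a' && pushed == 'z' && next == 'b');
}
```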

------------------------------
Reading, Writing, and Flushing
------------------------------

By default, reads and writes on a stream, whether backed by a
file descriptor or not, go through the buffer described in the previous
section. If a read or write can be satisfied by the buffer, then no
underlying I/O will occur, unless buffering has been disabled.

The various function entry points that read such as fread(3C) or
fgetc(3C) will not call read() directly but will instead try to fill the
buffer, which will cause a read if required. This is centralized in
_filbuf(). When a read is required from the underlying file, it will
call _xread() in flush.c. For more on _xread() see the operations vector
section further along.

Unlike reads, writes are much less centralized and each of the main
writing entry points has reimplemented the path of writing to the buffer
and flushing it. It would be good in the future to consolidate them. In
general, data will be written directly to the stdio buffer. When that
buffer needs to be flushed either the _flsbuf() or _xflsbuf() functions
will be called to actually flush out the buffer.

When data needs to be flushed from a buffer to its underlying file
descriptor (or other backing store), all of the write family functions
ultimately call _xwrite().

Flushes can occur in a few different ways:

1. A write has filled up the buffer.
2. A new line ('\n') is written and new-line buffering is used.
3. fflush(3C) or a similar function has been called.
4. A read occurs on a buffer that has unflushed writes.
5. The stream is being closed.

Most of these methods are fairly similar; however, the fflush(3C) case
is a little different. fflush() may be asked to flush all of the streams
when it is passed a NULL stream. Even when that happens it will still
utilize the same underlying mechanism via _xflsbuf() or _flsbuf().
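
The first three flush triggers in the list above can be modeled with a
toy writer. Everything here (toy_wstream, toy_putc(), toy_flush(), the
8-byte buffer) is hypothetical and much simpler than _flsbuf() and
_xflsbuf(); the 'out' array stands in for the backing store that
_xwrite() would reach.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum toy_mode { TOY_FULL, TOY_LINE };

struct toy_wstream {
	enum toy_mode	mode;		/* full or new-line buffering */
	char		buf[8];		/* the stream buffer */
	size_t		cnt;		/* valid bytes in buf */
	char		out[64];	/* stand-in for the backing store */
	size_t		outlen;		/* bytes that reached the store */
	unsigned	flushes;	/* number of non-empty flushes */
};

/* Move buffered bytes to the backing store (the explicit flush case). */
static void
toy_flush(struct toy_wstream *s)
{
	if (s->cnt == 0)
		return;
	assert(s->outlen + s->cnt <= sizeof (s->out));
	(void) memcpy(s->out + s->outlen, s->buf, s->cnt);
	s->outlen += s->cnt;
	s->cnt = 0;
	s->flushes++;
}

/* Buffer one byte, flushing on a full buffer or a new line. */
static void
toy_putc(struct toy_wstream *s, char c)
{
	assert(s->cnt < sizeof (s->buf));
	s->buf[s->cnt++] = c;
	if (s->cnt == sizeof (s->buf) ||
	    (s->mode == TOY_LINE && c == '\n'))
		toy_flush(s);
}
```

A read on a stream with unflushed writes and the close path would both
reduce to a call to toy_flush() in this model.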

-----------
Orientation
-----------

Streams handle both wide characters and narrow characters. There is an
internal multi-byte conversion state buffer that is included with every
stream. A stream may exist in one of three modes:

1. It may have an explicit narrow orientation
2. It may have an explicit wide orientation
3. It may have no orientation

When most streams are created, they have no orientation. The orientation
can then be explicitly set by calling fwide(3C). Some streams are also
created with an explicit orientation, for example, open_wmemstream(3C)
always sets the stream to be wide.

The C standard dictates that certain operations will actually cause a
stream with no orientation to have an explicit orientation set. Calling
a narrow or wide character function, such as 'fgetc(3C)' or
'fgetwc(3C)' respectively, will cause the orientation to be set if it
has not been already. Once an orientation for a stream has been set, it
cannot be changed until the stream has been closed or it is reset by
calling freopen(3C).

There are a few functions that don't change this today. One example is
ungetc(3C). Often this isn't indicative of whether it should or
shouldn't change the orientation, but is a side effect of the history of
the stdio implementation.

-------------------------------------
Operations Vectors and Memory Streams
-------------------------------------

Traditionally, stdio streams were always backed by a file descriptor of
some kind and therefore always called out into functions like read(2),
write(2), lseek(2), and close(2) directly. A series of new functions
were introduced in POSIX 2008 that add support for streams backed by
memory in the form of fmemopen(3C), open_memstream(3C), and
open_wmemstream(3C).
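
As a reminder of what these interfaces look like to a consumer, the
sketch below builds a string with open_memstream(3C). The helper name
toy_format() is hypothetical; the stdio calls are the standard POSIX
ones.

```c
#define	_POSIX_C_SOURCE	200809L

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Format a greeting into a dynamically allocated string using a
 * memory-backed stream. The caller must free() the result.
 */
static char *
toy_format(const char *name)
{
	char *buf = NULL;
	size_t len = 0;
	FILE *fp = open_memstream(&buf, &len);

	if (fp == NULL)
		return (NULL);

	(void) fprintf(fp, "hello, %s", name);

	/* buf and len are only guaranteed current after a flush or close. */
	if (fclose(fp) != 0) {
		free(buf);
		return (NULL);
	}

	/* len does not count the terminating null byte. */
	assert(len == strlen(buf));
	return (buf);
}
```

No file descriptor is involved at any point, which is why the stream
implementation cannot simply call write(2).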

To deal with this and other possible designs, an operations vector was
added to the stream represented by the 'stdio_ops_t' structure. This is
stored in the '_ops' member of the 'struct __FILE_TAG'. For a normal
stream backed by a file descriptor, this member will be NULL.

In places where a normal system call would have been made there is now a
call to a corresponding function such as _xread(), _xwrite(), _xseek(),
_xseek64(), and _xclose(). If an operations vector is defined, it will
call into the corresponding operation vector. If not, it will perform
the traditional system call. This design choice consolidates all of the
work required to implement non-file descriptor backed streams.
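
The dispatch pattern described above can be sketched as follows. The
types and functions here (toy_ops_t, toy_stream, toy_xread(),
toy_mem_read()) are hypothetical stand-ins and do not reproduce the
real 'stdio_ops_t' or _xread() definitions. Only the read operation is
shown.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

struct toy_stream;

/* Hypothetical operations vector with a single entry point. */
typedef struct toy_ops {
	ssize_t	(*to_read)(struct toy_stream *, void *, size_t);
} toy_ops_t;

struct toy_stream {
	int		ts_fd;		/* used when ts_ops is NULL */
	const toy_ops_t	*ts_ops;	/* NULL for fd-backed streams */
	void		*ts_data;	/* private data for the ops */
};

/* Use the operations vector if present, else the system call. */
static ssize_t
toy_xread(struct toy_stream *sp, void *buf, size_t len)
{
	if (sp->ts_ops != NULL)
		return (sp->ts_ops->to_read(sp, buf, len));
	return (read(sp->ts_fd, buf, len));
}

/* A memory-backed read operation for the sketch. */
struct toy_mem {
	const char	*tm_buf;
	size_t		tm_len;
	size_t		tm_off;
};

static ssize_t
toy_mem_read(struct toy_stream *sp, void *buf, size_t len)
{
	struct toy_mem *mp = sp->ts_data;
	size_t left;

	assert(mp->tm_off <= mp->tm_len);
	left = mp->tm_len - mp->tm_off;
	if (len > left)
		len = left;
	(void) memcpy(buf, mp->tm_buf + mp->tm_off, len);
	mp->tm_off += len;
	return ((ssize_t)len);
}

static const toy_ops_t toy_mem_ops = { toy_mem_read };
```

The callers of toy_xread() do not need to know which kind of stream
they hold, which is the consolidation the design is after.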

When creating a non-file backed stream there are several expectations in
the system:

* The stream code should obtain a stream normally through a call to
  _findiop().
* If one needs to translate the normal fopen(3C) arguments, they should
  use the _stdio_flags() function. This will also construct the
  appropriate internal stdio flags for the stream.
* The stream code must call _xassoc() to set the file operations vector
  before returning a 'FILE *' out of libc.
* All of the operations vectors must be implemented.
* If the stream is seekable, it must explicitly use the SET_SEEKABLE()
  macro before returning the stream.
* If the stream is supposed to have a default orientation, it must set
  it by calling _setorientation(). Not all streams have a default
  orientation.
* In the stream's close entry point it should call _xunassoc().

--------------------------
Extended File and fileno()
--------------------------

The 32-bit libc has historically been limited to 255 open streams
because of the use of an unsigned char. This problem does not impact the
64-bit libc. To deal with this, libc uses a series of techniques which
are summarized for users in extendedFILE(7). The usage of extendedFILE
can also be enabled by passing the special 'F' character to fopen(3C).

The '_magic' member in the 32-bit 'struct __FILE_TAG' contains what used
to be the file descriptor. When extended file is not in use, the
_magic member still does contain the file descriptor. However, when
extendedFILE is enabled, then the _magic member contains a sentinel
value and the actual value is stored in the 'struct xFILEdata' _magic
member.

The act of getting the correct file descriptor has been centralized in a
function called _get_fd(). This function knows how to handle the special
32-bit case and the normal case. It also centralizes the logic of
checking for a non-file backed stream. There are many cases in libc
where we want to know the file descriptor to perform some operation;
however, non-file backed streams do not have a corresponding file
descriptor. When such a stream is detected, we will explicitly return
-1. This ensures that a bad file descriptor will be used if someone
mistakenly calls a system call. Functions like _fileno() call this
directly.
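
The lookup order described above can be sketched as below. The
structure layout, the member names, and the sentinel value
TOY_FD_SENTINEL are all hypothetical; the real definitions live in
stdio_impl.h and file64.h, and the real logic is in _get_fd().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	TOY_FD_SENTINEL	0xff	/* hypothetical sentinel value */

/* Stand-in for the private extended data. */
struct toy_xdata {
	int	tx_fd;		/* the real descriptor when extended */
};

/* Stand-in for the 32-bit public structure. */
struct toy_stream {
	uint8_t			ts_magic;	/* fd or the sentinel */
	const void		*ts_ops;	/* set if not fd-backed */
	struct toy_xdata	*ts_xdata;
};

static int
toy_get_fd(const struct toy_stream *sp)
{
	/* Streams with an operations vector have no descriptor. */
	if (sp->ts_ops != NULL)
		return (-1);

	/* Extended case: the 8-bit field only holds a marker. */
	if (sp->ts_magic == TOY_FD_SENTINEL) {
		assert(sp->ts_xdata != NULL);
		return (sp->ts_xdata->tx_fd);
	}

	/* Traditional case: the 8-bit field is the descriptor. */
	return (sp->ts_magic);
}
```

Returning -1 for memory-backed streams means a stray system call fails
with a bad descriptor rather than acting on an unrelated file.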

-------
Testing
-------

There is a burgeoning test suite for stdio in
usr/src/test/libc-tests/tests/stdio. If working in stdio (or libc more
generally) it is recommended that you run this test suite and add new
tests to it where appropriate. For most new functionality it is
encouraged that you both import test suites that may already exist and
that you also write your own test suites to properly cover a number of
error and corner cases.

Tests should also be written against libumem(3LIB), and umem debugging
should be explicitly enabled in the program. Enabling umem debugging can
catch a number of common memory usage errors. It also makes it easier to
test for memory leaks by taking a core file and using the mdb
'::findleaks' dcmd. A good starting point is to place the following in
the program:

const char *
_umem_debug_init(void)
{
	return ("default,verbose");
}

const char *
_umem_logging_init(void)
{
	return ("fail,contents");
}

For the definition of these flags, see umem_debug(3MALLOC).

In addition, by leveraging umem debugging it becomes very easy to
simulate malloc failure when required. This can be enabled by calling
umem_setmtbf(1), which ensures that any subsequent memory requests
through malloc(), including those made indirectly by libc, will fail. To
restore the behavior after a test, one can simply call umem_setmtbf(0).
328