.\" Copyright (c) 1990, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" Chris Torek and the American National Standards Committee X3,
.\" on Information Processing Systems.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)scanf.3	8.2 (Berkeley) 12/11/93
.\"
.Dd August 21, 2023
.Dt SCANF 3
.Os
.Sh NAME
.Nm scanf ,
.Nm fscanf ,
.Nm sscanf ,
.Nm vscanf ,
.Nm vsscanf ,
.Nm vfscanf
.Nd input format conversion
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In stdio.h
.Ft int
.Fn scanf "const char * restrict format" ...
.Ft int
.Fn fscanf "FILE * restrict stream" "const char * restrict format" ...
.Ft int
.Fn sscanf "const char * restrict str" "const char * restrict format" ...
.In stdarg.h
.Ft int
.Fn vscanf "const char * restrict format" "va_list ap"
.Ft int
.Fn vsscanf "const char * restrict str" "const char * restrict format" "va_list ap"
.Ft int
.Fn vfscanf "FILE * restrict stream" "const char * restrict format" "va_list ap"
.Sh DESCRIPTION
The
.Fn scanf
family of functions scans input according to a
.Fa format
as described below.
This format may contain
.Em conversion specifiers ;
the results from such conversions, if any,
are stored through the
.Em pointer
arguments.
The
.Fn scanf
function
reads input from the standard input stream
.Dv stdin ,
.Fn fscanf
reads input from the stream pointer
.Fa stream ,
and
.Fn sscanf
reads its input from the character string pointed to by
.Fa str .
The
.Fn vfscanf
function
is analogous to
.Xr vfprintf 3
and reads input from the stream pointer
.Fa stream
using a variable argument list of pointers (see
.Xr stdarg 3 ) .
The
.Fn vscanf
function reads from the standard input using a variable argument list,
and the
.Fn vsscanf
function does the same from a string;
these are analogous to
the
.Fn vprintf
and
.Fn vsprintf
functions respectively.
Each successive
.Em pointer
argument must correspond properly with
each successive conversion specifier
(but see the
.Cm *
conversion below).
All conversions are introduced by the
.Cm %
(percent sign) character.
The
.Fa format
string
may also contain other characters.
White space (such as blanks, tabs, or newlines) in the
.Fa format
string matches any amount of white space, including none, in the input.
Everything else
matches only itself.
Scanning stops
when an input character does not match such a format character.
Scanning also stops
when an input conversion cannot be made (see below).
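.Pp
The matching rules above can be illustrated with a minimal sketch.
The helper
.Fn parse_point
is hypothetical and not part of the library;
it assumes input of the form
.Ql "(3, 4)" .
.Bd -literal -offset indent
```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of libc): parse text such as "(3, 4)".
 * Returns the number of values assigned, as reported by sscanf().
 */
int
parse_point(const char *s, int *x, int *y)
{
	/*
	 * Each space in the format matches zero or more white-space
	 * characters; the parentheses and the comma must match the
	 * input exactly.
	 */
	return (sscanf(s, " ( %d , %d )", x, y));
}
```
.Ed
.Pp
With this format,
.Ql "(3,4)"
and
.Ql "  ( 10 ,  -2 )"
both yield 2,
.Ql "[3,4)"
yields 0 because the open parenthesis does not match,
and
.Ql "(7;8)"
yields 1 because scanning stops at the semicolon.
A mismatch on the trailing close parenthesis is not visible in the
return value, because no conversion follows it.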
.Sh CONVERSIONS
Following the
.Cm %
character introducing a conversion
there may be a number of
.Em flag
characters, as follows:
.Bl -tag -width ".Cm l No (ell)"
.It Cm *
Suppresses assignment.
The conversion that follows occurs as usual, but no pointer is used;
the result of the conversion is simply discarded.
.It Cm hh
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt char
(rather than
.Vt int ) .
.It Cm h
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt "short int"
(rather than
.Vt int ) .
.It Cm l No (ell)
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt "long int"
(rather than
.Vt int ) ,
that the conversion will be one of
.Cm a , e , f ,
or
.Cm g
and the next pointer is a pointer to
.Vt double
(rather than
.Vt float ) ,
or that the conversion will be one of
.Cm c ,
.Cm s
or
.Cm \&[
and the next pointer is a pointer to an array of
.Vt wchar_t
(rather than
.Vt char ) .
.It Cm ll No (ell ell)
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt "long long int"
(rather than
.Vt int ) .
.It Cm L
Indicates that the conversion will be one of
.Cm a , e , f ,
or
.Cm g
and the next pointer is a pointer to
.Vt "long double" .
.It Cm j
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt intmax_t
(rather than
.Vt int ) .
.It Cm t
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt ptrdiff_t
(rather than
.Vt int ) .
.It Cm z
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt size_t
(rather than
.Vt int ) .
.It Cm q
(deprecated.)
Indicates that the conversion will be one of
.Cm bdioux
or
.Cm n
and the next pointer is a pointer to a
.Vt "long long int"
(rather than
.Vt int ) .
.El
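.Pp
As a sketch of how these flags pair with argument types,
the hypothetical helper
.Fn parse_mixed
(not part of the library; the input layout is an assumption made for
illustration) reads five numbers, each through a pointer whose type is
selected by the flag in front of the conversion.
.Bd -literal -offset indent
```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of libc): read five numbers,
 * each stored through a pointer whose type is selected by the
 * length flag in front of the conversion.
 */
int
parse_mixed(const char *s, unsigned char *b, long *l, long long *ll,
    size_t *z, double *d)
{
	/* hh: char, l: long, ll: long long, z: size_t, l with f: double */
	return (sscanf(s, "%hhu %ld %lld %zu %lf", b, l, ll, z, d));
}
```
.Ed
.Pp
Passing a pointer of the wrong type, for example a pointer to
.Vt int
for
.Cm %ld ,
is undefined behavior;
compilers that check format strings will typically warn about it.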
.Pp
In addition to these flags,
there may be an optional maximum field width,
expressed as a decimal integer,
between the
.Cm %
and the conversion.
If no width is given,
a default of
.Dq infinity
is used (with one exception, below);
otherwise at most this many bytes are scanned
in processing the conversion.
In the case of the
.Cm lc ,
.Cm ls
and
.Cm l[
conversions, the field width specifies the maximum number
of multibyte characters that will be scanned.
Before conversion begins,
most conversions skip white space;
this white space is not counted against the field width.
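.Pp
The field width can be sketched with two hypothetical helpers
(neither is part of the library, and the buffer size is an assumption
made for illustration).
The first bounds a string conversion so that it cannot overflow its
buffer; the second splits an undelimited date by width alone.
.Bd -literal -offset indent
```c
#include <stdio.h>

#define	WORD_SIZE	8	/* assumed buffer size for this sketch */

/*
 * Hypothetical helper (not part of libc): copy the first word of s.
 * The width is one less than the buffer size to leave room for NUL.
 */
int
first_word(const char *s, char out[WORD_SIZE])
{
	return (sscanf(s, "%7s", out));
}

/* Hypothetical helper: split a date such as "20230821" by field width. */
int
split_date(const char *s, int *year, int *month, int *day)
{
	return (sscanf(s, "%4d%2d%2d", year, month, day));
}
```
.Ed
.Pp
For the input
.Ql "   abcdefghijkl" ,
.Fn first_word
stores
.Ql abcdefg :
the three leading blanks are skipped and do not count against the
width of 7.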
.Pp
The following conversions are available:
.Bl -tag -width XXXX
.It Cm %
Matches a literal
.Ql % .
That is,
.Dq Li %%
in the format string
matches a single input
.Ql %
character.
No conversion is done, and assignment does not occur.
.It Cm b , B
Matches an optionally signed binary integer;
the next pointer must be a pointer to
.Vt "unsigned int" .
.It Cm d
Matches an optionally signed decimal integer;
the next pointer must be a pointer to
.Vt int .
.It Cm i
Matches an optionally signed integer;
the next pointer must be a pointer to
.Vt int .
The integer is read
in base 2 if it begins with
.Ql 0b
or
.Ql 0B ,
in base 16 if it begins
with
.Ql 0x
or
.Ql 0X ,
in base 8 if it begins with
.Ql 0 ,
and in base 10 otherwise.
Only characters that correspond to the base are used.
.It Cm o
Matches an octal integer;
the next pointer must be a pointer to
.Vt "unsigned int" .
.It Cm u
Matches an optionally signed decimal integer;
the next pointer must be a pointer to
.Vt "unsigned int" .
.It Cm x , X
Matches an optionally signed hexadecimal integer;
the next pointer must be a pointer to
.Vt "unsigned int" .
.It Cm a , A , e , E , f , F , g , G
Matches a floating-point number in the style of
.Xr strtod 3 .
The next pointer must be a pointer to
.Vt float
(unless
.Cm l
or
.Cm L
is specified).
.It Cm s
Matches a sequence of non-white-space characters;
the next pointer must be a pointer to
.Vt char ,
and the array must be large enough to accept the entire sequence and the
terminating
.Dv NUL
character.
The input string stops at white space
or at the maximum field width, whichever occurs first.
.Pp
If an
.Cm l
qualifier is present, the next pointer must be a pointer to
.Vt wchar_t ,
into which the input will be placed after conversion by
.Xr mbrtowc 3 .
.It Cm S
The same as
.Cm ls .
.It Cm c
Matches a sequence of
.Em width
count
characters (default 1);
the next pointer must be a pointer to
.Vt char ,
and there must be enough room for all the characters
(no terminating
.Dv NUL
is added).
The usual skip of leading white space is suppressed.
To skip white space first, use an explicit space in the format.
.Pp
If an
.Cm l
qualifier is present, the next pointer must be a pointer to
.Vt wchar_t ,
into which the input will be placed after conversion by
.Xr mbrtowc 3 .
.It Cm C
The same as
.Cm lc .
.It Cm \&[
Matches a nonempty sequence of characters from the specified set
of accepted characters;
the next pointer must be a pointer to
.Vt char ,
and there must be enough room for all the characters in the string,
plus a terminating
.Dv NUL
character.
The usual skip of leading white space is suppressed.
The string is to be made up of characters in
(or not in)
a particular set;
the set is defined by the characters between the open bracket
.Cm \&[
character
and a close bracket
.Cm \&]
character.
The set
.Em excludes
those characters
if the first character after the open bracket is a circumflex
.Cm ^ .
To include a close bracket in the set,
make it the first character after the open bracket
or the circumflex;
any other position will end the set.
The hyphen character
.Cm -
is also special;
when placed between two other characters,
it adds all intervening characters to the set.
To include a hyphen,
make it the last character before the final close bracket.
For instance,
.Ql [^]0-9-]
means the set
.Dq "everything except close bracket, zero through nine, and hyphen" .
The string ends with the appearance of a character not in the
(or, with a circumflex, in) set
or when the field width runs out.
.Pp
If an
.Cm l
qualifier is present, the next pointer must be a pointer to
.Vt wchar_t ,
into which the input will be placed after conversion by
.Xr mbrtowc 3 .
.It Cm p
Matches a pointer value (as printed by
.Ql %p
in
.Xr printf 3 ) ;
the next pointer must be a pointer to
.Vt "void *" .
.It Cm n
Nothing is expected;
instead, the number of characters consumed thus far from the input
is stored through the next pointer,
which must be a pointer to
.Vt int .
This is
.Em not
a conversion, although it can be suppressed with the
.Cm *
flag.
.El
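.Pp
Several of the conversions above can be combined, as in the following
sketch.
The helper
.Fn parse_setting
is hypothetical and not part of the library;
the
.Ql key=value
layout and the buffer size are assumptions made for illustration.
.Bd -literal -offset indent
```c
#include <stdio.h>

#define	KEY_SIZE	16	/* assumed buffer size for this sketch */

/*
 * Hypothetical helper (not part of libc): parse "key=value".
 * Returns the number of assigned items; *used is set to the number
 * of characters consumed, or left at -1 if %n was never reached.
 */
int
parse_setting(const char *s, char key[KEY_SIZE], int *value, int *used)
{
	*used = -1;
	/*
	 * " "      skips leading white space (the scanset would not)
	 * %15[^=]  up to 15 characters that are not '='
	 * =        a literal '='
	 * %i       an integer whose base is chosen by its prefix
	 * %n       stores the count of characters consumed so far
	 */
	return (sscanf(s, " %15[^=]=%i%n", key, value, used));
}
```
.Ed
.Pp
For
.Ql mode=0x1f
the function returns 2, with 31 stored in
.Fa value
and 9 in
.Fa used ;
the
.Cm %n
directive does not add to the returned count.
For
.Ql =5
it returns 0, because the scanset must match at least one character.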
.Pp
The decimal point
character is defined in the program's locale (category
.Dv LC_NUMERIC ) .
.Pp
For backwards compatibility, a
.Dq conversion
of
.Ql %\e0
causes an immediate return of
.Dv EOF .
.Sh RETURN VALUES
These
functions
return
the number of input items assigned, which can be fewer than provided
for, or even zero, in the event of a matching failure.
Zero
indicates that, while there was input available,
no conversions were assigned;
typically this is due to an invalid input character,
such as an alphabetic character for a
.Ql %d
conversion.
The value
.Dv EOF
is returned if an input failure, such as an end-of-file,
occurs before any conversion.
If an error or end-of-file occurs after conversion
has begun,
the number of conversions which were successfully completed is returned.
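.Pp
The cases above can be told apart as in the following sketch.
The helper
.Fn scan_pair
and the
.Vt scan_status
names are hypothetical and not part of the library.
.Bd -literal -offset indent
```c
#include <stdio.h>

enum scan_status { SCAN_OK, SCAN_PARTIAL, SCAN_NOMATCH, SCAN_EOF };

/*
 * Hypothetical helper (not part of libc): classify the result of
 * scanning two integers from a string.
 */
enum scan_status
scan_pair(const char *s, int *a, int *b)
{
	int n;

	n = sscanf(s, "%d %d", a, b);
	if (n == EOF)
		return (SCAN_EOF);	/* input ended before any conversion */
	if (n == 0)
		return (SCAN_NOMATCH);	/* first item did not match */
	if (n == 1)
		return (SCAN_PARTIAL);	/* only the first item was assigned */
	return (SCAN_OK);
}
```
.Ed
.Pp
Here
.Ql "3 4"
is classified as
.Dv SCAN_OK ,
.Ql "3 x"
and
.Ql 3
as
.Dv SCAN_PARTIAL ,
.Ql "x 4"
as
.Dv SCAN_NOMATCH ,
and an empty or all-blank string as
.Dv SCAN_EOF .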
.Sh SEE ALSO
.Xr getc 3 ,
.Xr mbrtowc 3 ,
.Xr printf 3 ,
.Xr strtod 3 ,
.Xr strtol 3 ,
.Xr strtoul 3 ,
.Xr wscanf 3
.Sh STANDARDS
The functions
.Fn fscanf ,
.Fn scanf ,
.Fn sscanf ,
.Fn vfscanf ,
.Fn vscanf
and
.Fn vsscanf
conform to
.St -isoC-99 .
.Sh HISTORY
The functions
.Fn scanf ,
.Fn fscanf ,
and
.Fn sscanf
first appeared in
.At v7 ,
and
.Fn vscanf ,
.Fn vsscanf ,
and
.Fn vfscanf
in
.Bx 4.3 Reno .
.Sh BUGS
Earlier implementations of
.Nm
treated
.Cm \&%D , \&%E , \&%F , \&%O
and
.Cm \&%X
as their lowercase equivalents with an
.Cm l
modifier.
In addition,
.Nm
treated an unknown conversion character as
.Cm \&%d
or
.Cm \&%D ,
depending on its case.
This functionality has been removed.
.Pp
Numerical strings are truncated to 512 characters; for example,
.Cm %f
and
.Cm %d
are implicitly
.Cm %512f
and
.Cm %512d .
.Pp
The
.Cm %n$
modifiers for positional arguments are not implemented.
.Pp
The
.Nm
family of functions does not correctly handle multibyte characters in the
.Fa format
argument.