.\" Copyright (c) 2002-2004 Tim J. Robbins. All rights reserved.
.\" Copyright (c) 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" Donn Seeley of BSDI.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd September 9, 2019
.Dt MULTIBYTE 3
.Os
.Sh NAME
.Nm multibyte
.Nd multibyte and wide character manipulation functions
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In limits.h
.In stdlib.h
.In wchar.h
.Sh DESCRIPTION
The basic elements of some written natural languages, such as Chinese,
cannot be represented uniquely with single C
.Vt char Ns s .
The C standard supports two different ways of dealing with
extended natural language encodings:
wide characters and
multibyte characters.
Wide characters are an internal representation
which allows each basic element to map
to a single object of type
.Vt wchar_t .
Multibyte characters are used for input and output
and code each basic element as a sequence of C
.Vt char Ns s .
Individual basic elements may map into one or more
(up to
.Dv MB_LEN_MAX )
bytes in a multibyte character.
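.Pp
As a minimal sketch of the difference between bytes and characters,
the following hypothetical helper,
.Fn count_mb_chars ,
which is not part of the C library,
walks a multibyte string with
.Xr mbrlen 3
and counts characters rather than bytes.
It assumes that the string is null-terminated and is interpreted
in the current locale.

```c
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: count the characters (not bytes) in the
 * multibyte string s, interpreted in the current LC_CTYPE locale.
 * Returns (size_t)-1 if s contains an invalid or truncated sequence.
 */
size_t
count_mb_chars(const char *s)
{
	mbstate_t st;
	size_t count = 0, left = strlen(s);

	memset(&st, 0, sizeof(st));	/* start in the initial state */
	while (left > 0) {
		/* Each character occupies between 1 and MB_LEN_MAX bytes. */
		size_t n = mbrlen(s, left, &st);

		if (n == (size_t)-1 || n == (size_t)-2)
			return ((size_t)-1);
		if (n == 0)	/* null byte; not expected before strlen(s) */
			break;
		s += n;
		left -= n;
		count++;
	}
	return (count);
}
```

In a single-byte locale the result equals
.Fn strlen ;
in a multibyte locale it may be smaller.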
.Pp
The current locale
.Pq Xr setlocale 3
governs the interpretation of wide and multibyte characters.
The locale category
.Dv LC_CTYPE
specifically controls this interpretation.
The
.Vt wchar_t
type is wide enough to hold the largest value
in the wide character representations for all locales.
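.Pp
The effect of
.Dv LC_CTYPE
can be observed through
.Dv MB_CUR_MAX ,
which reports the longest multibyte character of the current locale.
The sketch below uses a hypothetical helper,
.Fn max_char_bytes ,
introduced only for illustration.

```c
#include <limits.h>
#include <locale.h>
#include <stdlib.h>

/*
 * Hypothetical helper: select the locale `name` for LC_CTYPE and
 * report the length in bytes of the longest multibyte character
 * in that locale.  Returns 0 if the locale cannot be selected.
 */
size_t
max_char_bytes(const char *name)
{
	if (setlocale(LC_CTYPE, name) == NULL)
		return (0);
	/* MB_CUR_MAX depends on the locale; MB_LEN_MAX is its upper bound. */
	return ((size_t)MB_CUR_MAX);
}
```

Passing an empty string as
.Fa name
selects the locale named by the environment.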
.Pp
Multibyte strings may contain
.Sq shift
indicators to switch to and from
particular modes within the given representation.
If explicit bytes are used to signal shifting,
these are not recognized as separate characters
but are lumped with a neighboring character.
There is always a distinguished
.Sq initial
shift state.
Some functions (e.g.,
.Xr mblen 3 ,
.Xr mbtowc 3
and
.Xr wctomb 3 )
maintain static shift state internally, whereas
others store it in an
.Vt mbstate_t
object passed by the caller.
Shift states are undefined after a call to
.Xr setlocale 3
with the
.Dv LC_CTYPE
or
.Dv LC_ALL
categories.
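.Pp
A caller-owned
.Vt mbstate_t
makes it possible to decode input that arrives in pieces.
The following sketch uses a hypothetical helper,
.Fn decode_chunk ,
which is not a library function.
It assumes that a null byte occupies exactly one byte of input,
and it stops without consuming the remaining input once the output
buffer is full.

```c
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: decode one chunk of multibyte input,
 * carrying any partial character over to the next call in *st.
 * Returns the number of wide characters stored in out,
 * or (size_t)-1 on an invalid sequence.
 */
size_t
decode_chunk(mbstate_t *st, const char *buf, size_t len,
    wchar_t *out, size_t outcap)
{
	size_t produced = 0;

	while (len > 0 && produced < outcap) {
		size_t n = mbrtowc(&out[produced], buf, len, st);

		if (n == (size_t)-1)
			return ((size_t)-1);	/* invalid sequence */
		if (n == (size_t)-2)
			break;	/* chunk ended mid-character; *st keeps it */
		if (n == 0)
			n = 1;	/* null byte, assumed to be one byte long */
		buf += n;
		len -= n;
		produced++;
	}
	return (produced);
}
```

The caller zero-fills the
.Vt mbstate_t
object before the first chunk and can use
.Xr mbsinit 3
after the last chunk to check that no partial character remains.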
.Pp
For convenience in processing,
the wide character with value 0
(the null wide character)
is recognized as the wide character string terminator,
and the character with value 0
(the null byte)
is recognized as the multibyte character string terminator.
Null bytes are not permitted within multibyte characters.
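.Pp
Because both kinds of string are null-terminated,
a whole string can be converted without passing an explicit length.
The sketch below uses a hypothetical helper,
.Fn to_wide ,
built on
.Xr mbsrtowcs 3 ;
the name and the allocation strategy are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: return a newly allocated wide-character copy
 * of the multibyte string s, or NULL on failure.
 * The caller frees the result.
 */
wchar_t *
to_wide(const char *s)
{
	mbstate_t st;
	const char *p = s;
	wchar_t *ws;
	size_t n;

	/* With a NULL destination, mbsrtowcs() only counts characters;
	 * the count excludes the terminating null wide character. */
	memset(&st, 0, sizeof(st));
	n = mbsrtowcs(NULL, &p, 0, &st);
	if (n == (size_t)-1)
		return (NULL);	/* invalid multibyte sequence */

	ws = malloc((n + 1) * sizeof(*ws));
	if (ws == NULL)
		return (NULL);

	p = s;
	memset(&st, 0, sizeof(st));
	mbsrtowcs(ws, &p, n + 1, &st);	/* n + 1 leaves room for the terminator */
	return (ws);
}
```

The result can be measured with
.Xr wcslen 3 ,
which stops at the null wide character.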
.Pp
The C library provides the following functions for dealing with
multibyte characters:
.Bl -column "Description"
.It Sy "Function	Description"
.It Xr mblen 3 Ta "get number of bytes in a character"
.It Xr mbrlen 3 Ta "get number of bytes in a character (restartable)"
.It Xr mbrtowc 3 Ta "convert a character to a wide-character code (restartable)"
.It Xr mbsrtowcs 3 Ta "convert a character string to a wide-character string (restartable)"
.It Xr mbstowcs 3 Ta "convert a character string to a wide-character string"
.It Xr mbtowc 3 Ta "convert a character to a wide-character code"
.It Xr wcrtomb 3 Ta "convert a wide-character code to a character (restartable)"
.It Xr wcstombs 3 Ta "convert a wide-character string to a character string"
.It Xr wcsrtombs 3 Ta "convert a wide-character string to a character string (restartable)"
.It Xr wctomb 3 Ta "convert a wide-character code to a character"
.El
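.Pp
As an illustration of how the restartable conversions pair up,
the following sketch encodes a wide character with
.Xr wcrtomb 3
and decodes the bytes again with
.Xr mbrtowc 3 .
The helper
.Fn round_trips
is hypothetical and exists only for this example.

```c
#include <limits.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: encode wc as a multibyte character, decode
 * the bytes again, and report whether the value survived.
 * Returns 1 on success and 0 otherwise.
 */
int
round_trips(wchar_t wc)
{
	char buf[MB_LEN_MAX];	/* large enough for any single character */
	mbstate_t st;
	wchar_t back;
	size_t len, n;

	memset(&st, 0, sizeof(st));
	len = wcrtomb(buf, wc, &st);
	if (len == (size_t)-1)
		return (0);	/* wc has no representation in this locale */

	memset(&st, 0, sizeof(st));
	n = mbrtowc(&back, buf, len, &st);
	if (n == (size_t)-1 || n == (size_t)-2)
		return (0);
	return (back == wc);
}
```

A return value of 0 typically means that the current locale
cannot represent the wide character.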
.Sh SEE ALSO
.Xr localedef 1 ,
.Xr setlocale 3 ,
.Xr stdio 3 ,
.Xr big5 5 ,
.Xr euc 5 ,
.Xr gb18030 5 ,
.Xr gb2312 5 ,
.Xr gbk 5 ,
.Xr mskanji 5 ,
.Xr utf8 5
.Sh STANDARDS
These functions conform to
.St -isoC-99 .