.\" Copyright (c) 2002-2004 Tim J. Robbins. All rights reserved.
.\" Copyright (c) 1993
.\"	The Regents of the University of California. All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" Donn Seeley of BSDI.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd September 9, 2019
.Dt MULTIBYTE 3
.Os
.Sh NAME
.Nm multibyte
.Nd multibyte and wide character manipulation functions
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In limits.h
.In stdlib.h
.In wchar.h
.Sh DESCRIPTION
The basic elements of some written natural languages, such as Chinese,
cannot be represented uniquely with single C
.Vt char Ns s .
The C standard supports two different ways of dealing with
extended natural language encodings:
wide characters and
multibyte characters.
Wide characters are an internal representation
which allows each basic element to map
to a single object of type
.Vt wchar_t .
Multibyte characters are used for input and output
and encode each basic element as a sequence of C
.Vt char Ns s .
Individual basic elements may map into one or more
(up to
.Dv MB_LEN_MAX )
bytes in a multibyte character.
.Pp
The current locale
.Pq Xr setlocale 3
governs the interpretation of wide and multibyte characters.
The locale category
.Dv LC_CTYPE
specifically controls this interpretation.
The
.Vt wchar_t
type is wide enough to hold the largest value
in the wide character representations for all locales.
.Pp
Multibyte strings may contain
.Sq shift
indicators to switch to and from
particular modes within the given representation.
If explicit bytes are used to signal shifting,
these are not recognized as separate characters
but are treated as part of a neighboring character.
There is always a distinguished
.Sq initial
shift state.
Some functions (e.g.,
.Xr mblen 3 ,
.Xr mbtowc 3
and
.Xr wctomb 3 )
maintain static shift state internally, whereas
others store it in an
.Vt mbstate_t
object passed by the caller.
Shift states are undefined after a call to
.Xr setlocale 3
with the
.Dv LC_CTYPE
or
.Dv LC_ALL
categories.
.Pp
For convenience in processing,
the wide character with value 0
(the null wide character)
is recognized as the wide character string terminator,
and the character with value 0
(the null byte)
is recognized as the multibyte character string terminator.
Null bytes are not permitted within multibyte characters.
.Pp
The C library provides the following functions for dealing with
multibyte characters:
.Bl -column "Description"
.It Sy Function Ta Sy Description
.It Xr mblen 3 Ta "get number of bytes in a character"
.It Xr mbrlen 3 Ta "get number of bytes in a character (restartable)"
.It Xr mbrtowc 3 Ta "convert a character to a wide-character code (restartable)"
.It Xr mbsrtowcs 3 Ta "convert a character string to a wide-character string (restartable)"
.It Xr mbstowcs 3 Ta "convert a character string to a wide-character string"
.It Xr mbtowc 3 Ta "convert a character to a wide-character code"
.It Xr wcrtomb 3 Ta "convert a wide-character code to a character (restartable)"
.It Xr wcstombs 3 Ta "convert a wide-character string to a character string"
.It Xr wcsrtombs 3 Ta "convert a wide-character string to a character string (restartable)"
.It Xr wctomb 3 Ta "convert a wide-character code to a character"
.El
.Sh SEE ALSO
.Xr localedef 1 ,
.Xr setlocale 3 ,
.Xr stdio 3 ,
.Xr big5 5 ,
.Xr euc 5 ,
.Xr gb18030 5 ,
.Xr gb2312 5 ,
.Xr gbk 5 ,
.Xr mskanji 5 ,
.Xr utf8 5
.Sh STANDARDS
These functions conform to
.St -isoC-99 .