Copyright (c) 2007, Sun Microsystems Inc. All Rights Reserved.
The contents of this file are subject to the terms of the Common Development and Distribution License (the "License"). You may not use this file except in compliance with the License.
You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE or http://www.opensolaris.org/os/licensing. See the License for the specific language governing permissions and limitations under the License.
When distributing Covered Code, include this CDDL HEADER in each file and include the License file at usr/src/OPENSOLARIS.LICENSE. If applicable, add the following below this CDDL HEADER, with the fields enclosed by brackets "[]" replaced with your own identifying information: Portions Copyright [yyyy] [name of copyright owner]
#include <sys/types.h>
#include <sys/errno.h>
#include <sys/sunddi.h>

size_t u8_textprep_str(char *inarray, size_t *inlen, char *outarray,
    size_t *outlen, int flag, size_t unicode_version, int *errno);
inarray
    A pointer to a byte array containing a sequence of UTF-8 character bytes to be prepared.
inlen
    As an input argument, the number of bytes to be prepared in inarray. As an output argument, the number of bytes in inarray not yet consumed.
outarray
    A pointer to a byte array where prepared UTF-8 character bytes can be saved.
outlen
    As an input argument, the number of bytes available at outarray for saving prepared character bytes. As an output argument, the number of bytes still available at outarray after the preparation.
flag
    The preparation options, constructed by a bitwise-inclusive OR of the following values:
    U8_TEXTPREP_IGNORE_NULL
        Normally u8_textprep_str() stops the preparation when it encounters a null byte, even if the value pointed to by inlen is still greater than zero. With this option, a null byte does not stop the preparation, which continues until all inlen bytes of inarray have been consumed or an error occurs.
    U8_TEXTPREP_IGNORE_INVALID
        Normally u8_textprep_str() stops the preparation when it encounters an illegal or incomplete character and sets the corresponding errno value. With this option, u8_textprep_str() does not stop the preparation and instead treats such characters as requiring no preparation.
    U8_TEXTPREP_TOUPPER
        Map lowercase characters to uppercase characters if applicable.
    U8_TEXTPREP_TOLOWER
        Map uppercase characters to lowercase characters if applicable.
    U8_TEXTPREP_NFD
        Apply Unicode Normalization Form D.
    U8_TEXTPREP_NFC
        Apply Unicode Normalization Form C.
    U8_TEXTPREP_NFKD
        Apply Unicode Normalization Form KD.
    U8_TEXTPREP_NFKC
        Apply Unicode Normalization Form KC.
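Because some option values can conflict, a caller may want to check a flag combination before using it. The following is a minimal userland sketch, not the kernel's actual check. It assumes that a conflict means requesting both case mappings at once or more than one normalization form at once, which this page does not spell out. The SK_* bit values and the function names are hypothetical placeholders; real code should use the U8_TEXTPREP_* constants.

```c
/*
 * Illustrative placeholder bits only. These are NOT the real
 * U8_TEXTPREP_* values; they exist so the sketch is self-contained.
 */
#define SK_TOUPPER 0x01u
#define SK_TOLOWER 0x02u
#define SK_NFD     0x04u
#define SK_NFC     0x08u
#define SK_NFKD    0x10u
#define SK_NFKC    0x20u

/* Return 1 if zero or one bit is set in v, otherwise 0. */
static int
at_most_one_bit(unsigned int v)
{
	return ((v & (v - 1)) == 0);
}

/*
 * Assumed rule: at most one case mapping and at most one
 * normalization form may be requested in a single call.
 */
static int
flags_are_compatible(unsigned int flag)
{
	unsigned int fold = flag & (SK_TOUPPER | SK_TOLOWER);
	unsigned int norm = flag & (SK_NFD | SK_NFC | SK_NFKD | SK_NFKC);

	return (at_most_one_bit(fold) && at_most_one_bit(norm));
}
```

Under the assumed rule, SK_TOUPPER | SK_NFD is accepted, while SK_TOUPPER | SK_TOLOWER is rejected.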
unicode_version
    The version of Unicode data that should be used during UTF-8 text preparation. The following values are supported:
    U8_UNICODE_320
        Use Unicode 3.2.0 data during preparation.
    U8_UNICODE_500
        Use Unicode 5.0.0 data during preparation.
    U8_UNICODE_LATEST
        Use the latest Unicode version data available, which is currently Unicode 5.0.0.
errno
    The error value when preparation is not completed or fails. The following values are supported:
    E2BIG
        Text preparation stopped due to lack of space in the output array.
    EBADF
        Specified option values are conflicting and cannot be supported.
    EILSEQ
        Text preparation stopped due to an input byte that does not belong to UTF-8.
    EINVAL
        Text preparation stopped due to an incomplete UTF-8 character at the end of the input array.
    ERANGE
        The specified Unicode version value is not a supported version.
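For logging or debugging, the error values above can be turned into short descriptions. This is a minimal sketch, assuming the five values are E2BIG, EBADF, EILSEQ, EINVAL, and ERANGE, in the order they are described. The helper name textprep_err_desc() is hypothetical and is not part of the DDI.

```c
#include <errno.h>

/*
 * Hypothetical helper: return a short description for an error value
 * reported through the last argument of u8_textprep_str().
 */
static const char *
textprep_err_desc(int err)
{
	switch (err) {
	case E2BIG:
		return ("output array too small");
	case EBADF:
		return ("conflicting option values");
	case EILSEQ:
		return ("input byte does not belong to UTF-8");
	case EINVAL:
		return ("incomplete UTF-8 character at end of input");
	case ERANGE:
		return ("unsupported Unicode version");
	default:
		return ("unknown error");
	}
}
```

A caller could pass the error value to this helper only when the return value is (size_t)-1, since the value is meaningful only on failure.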
If flag does not include U8_TEXTPREP_IGNORE_INVALID and a sequence of input bytes does not form a valid UTF-8 character, preparation stops after the previous successfully prepared character. If flag does not include U8_TEXTPREP_IGNORE_INVALID and the input array ends with an incomplete UTF-8 character, preparation stops after the previous successfully prepared bytes. If the output array is not large enough to hold the entire prepared text, preparation stops just before the input bytes that would cause the output array to overflow. The value pointed to by inlen is decremented to reflect the number of bytes not yet prepared in the input array. The value pointed to by outlen is decremented to reflect the number of bytes still available in the output array.
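Since both length arguments are decremented in place, a caller that saves their values before the call can work out how far preparation got. The following is a minimal sketch of that arithmetic. The structure and function names are hypothetical and do not appear in any system header.

```c
#include <stddef.h>

/* Hypothetical record of how far a preparation call progressed. */
struct prep_progress {
	size_t consumed;	/* input bytes consumed from inarray */
	size_t produced;	/* output bytes written to outarray */
};

/*
 * Compute progress from the length values saved before the call and
 * the values found in *inlen and *outlen after it returns.
 */
static struct prep_progress
calc_progress(size_t inlen_before, size_t inlen_after,
    size_t outlen_before, size_t outlen_after)
{
	struct prep_progress p;

	p.consumed = inlen_before - inlen_after;
	p.produced = outlen_before - outlen_after;
	return (p);
}
```

For example, if inlen went from 10 to 4 and outlen from 16 to 7, then 6 input bytes were consumed and 9 output bytes were written. After an E2BIG failure, the unprepared input would begin at inarray + consumed.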
#include <sys/types.h>
#include <sys/errno.h>
#include <sys/sunddi.h>
.
.
.
size_t ret;
char ib[MAXPATHLEN];
char ob[MAXPATHLEN];
size_t il, ol;
int err;
.
.
.
/*
 * We got a UTF-8 pathname from somewhere.
 *
 * Calculate the length of input string including the terminating
 * NULL byte and prepare other arguments.
 */
(void) strlcpy(ib, pathname, MAXPATHLEN);
il = strlen(ib) + 1;
ol = MAXPATHLEN;

/*
 * Do toupper case folding, apply Unicode Normalization Form D,
 * ignore NULL byte, and ignore any illegal/incomplete characters.
 */
ret = u8_textprep_str(ib, &il, ob, &ol,
    (U8_TEXTPREP_IGNORE_NULL|U8_TEXTPREP_IGNORE_INVALID|
    U8_TEXTPREP_TOUPPER|U8_TEXTPREP_NFD), U8_UNICODE_LATEST, &err);
if (ret == (size_t)-1) {
        if (err == E2BIG)
                return (-1);
        if (err == EBADF)
                return (-2);
        if (err == ERANGE)
                return (-3);
        return (-4);
}
ATTRIBUTE TYPE          ATTRIBUTE VALUE
Interface Stability     Committed
The Unicode Standard (http://www.unicode.org)