xref: /illumos-gate/usr/src/man/man3c/string.3c (revision c057d312c6f715bb3aeadb653466e7046f26c4af)

Sun Microsystems, Inc. gratefully acknowledges The Open Group for
permission to reproduce portions of its copyrighted documentation.
Original documentation from The Open Group can be obtained online at
http://www.opengroup.org/bookstore/.

The Institute of Electrical and Electronics Engineers and The Open
Group, have given us permission to reprint portions of their
documentation.

In the following statement, the phrase ``this text'' refers to portions
of the system documentation.

Portions of this text are reprinted and reproduced in electronic form
in the SunOS Reference Manual, from IEEE Std 1003.1, 2004 Edition,
Standard for Information Technology -- Portable Operating System
Interface (POSIX), The Open Group Base Specifications Issue 6,
Copyright (C) 2001-2004 by the Institute of Electrical and Electronics
Engineers, Inc and The Open Group. In the event of any discrepancy
between these versions and the original IEEE and The Open Group
Standard, the original IEEE and The Open Group Standard is the referee
document. The original Standard can be obtained online at
http://www.opengroup.org/unix/online.html.

This notice shall appear on any product containing this material.

The contents of this file are subject to the terms of the
Common Development and Distribution License (the "License").
You may not use this file except in compliance with the License.

You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
or http://www.opensolaris.org/os/licensing.
See the License for the specific language governing permissions
and limitations under the License.

When distributing Covered Code, include this CDDL HEADER in each
file and include the License file at usr/src/OPENSOLARIS.LICENSE.
If applicable, add the following below this CDDL HEADER, with the
fields enclosed by brackets "[]" replaced with your own identifying
information: Portions Copyright [yyyy] [name of copyright owner]


Copyright 1989 AT&T
Portions Copyright (c) 1992, X/Open Company Limited. All Rights Reserved.
Portions Copyright (c) 1994 Man-cgi 1.15, Panagiotis Christias (christia@softlab.ntua.gr)
Portions Copyright (c) 1996-2008 Modified for NetBSD by Kimmo Suominen (kimmo@suominen.com)
Copyright (c) 2008, Sun Microsystems, Inc. All Rights Reserved.
Copyright 2014 Garrett D'Amore <garrett@damore.org>
Copyright (c) 2014, Joyent, Inc.

STRING 3C "Mar 23, 2016"
NAME
string, strcasecmp, strcasecmp_l, strncasecmp, strncasecmp_l, strcat, strncat, strlcat, strchr, strchrnul, strrchr, strcmp, strncmp, stpcpy, stpncpy, strcpy, strncpy, strlcpy, strcspn, strspn, strdup, strndup, strdupa, strndupa, strlen, strnlen, strpbrk, strsep, strstr, strnstr, strcasestr, strtok, strtok_r - string operations
SYNOPSIS

#include <strings.h>

int strcasecmp(const char *s1, const char *s2);

int strcasecmp_l(const char *s1, const char *s2, locale_t loc);

int strncasecmp(const char *s1, const char *s2, size_t n);

int strncasecmp_l(const char *s1, const char *s2, size_t n, locale_t loc);

#include <string.h>

char *strcat(char *restrict s1, const char *restrict s2);

char *strncat(char *restrict s1, const char *restrict s2, size_t n);

size_t strlcat(char *dst, const char *src, size_t dstsize);

char *strchr(const char *s, int c);

char *strrchr(const char *s, int c);

int strcmp(const char *s1, const char *s2);

int strncmp(const char *s1, const char *s2, size_t n);

char *stpcpy(char *restrict s1, const char *restrict s2);

char *stpncpy(char *restrict s1, const char *restrict s2, size_t n);

char *strcpy(char *restrict s1, const char *restrict s2);

char *strncpy(char *restrict s1, const char *restrict s2, size_t n);

size_t strlcpy(char *dst, const char *src, size_t dstsize);

size_t strcspn(const char *s1, const char *s2);

size_t strspn(const char *s1, const char *s2);

char *strdup(const char *s1);

char *strndup(const char *s1, size_t n);

char *strdupa(const char *s1);

char *strndupa(const char *s1, size_t n);

size_t strlen(const char *s);

size_t strnlen(const char *s, size_t n);

char *strpbrk(const char *s1, const char *s2);

char *strsep(char **stringp, const char *delim);

char *strstr(const char *s1, const char *s2);

char *strnstr(const char *s1, const char *s2, size_t n);

char *strcasestr(const char *s1, const char *s2);

char *strcasestr_l(const char *s1, const char *s2, locale_t loc);

char *strtok(char *restrict s1, const char *restrict s2);

char *strtok_r(char *restrict s1, const char *restrict s2,
 char **restrict lasts);
"ISO C++"

#include <string.h>

const char *strchr(const char *s, int c);

const char *strchrnul(const char *s, int c);

const char *strpbrk(const char *s1, const char *s2);

const char *strrchr(const char *s, int c);

const char *strstr(const char *s1, const char *s2);

#include <cstring>

char *std::strchr(char *s, int c);

char *std::strpbrk(char *s1, const char *s2);

char *std::strrchr(char *s, int c);

char *std::strstr(char *s1, const char *s2);
DESCRIPTION

The arguments s, s1, and s2 point to strings (arrays of characters terminated by a null character). The strcat(), strncat(), strlcat(), strcpy(), stpcpy(), stpncpy(), strncpy(), strlcpy(), strsep(), strtok(), and strtok_r() functions all alter their first argument. Additionally, the strcat(), stpcpy(), and strcpy() functions do not check for overflow of the array.

"strcasecmp(), strncasecmp()"

The strcasecmp() and strncasecmp() functions are case-insensitive versions of strcmp() and strncmp() respectively, described below.

The strcasecmp() and strncasecmp() functions compare two strings byte-by-byte, after converting each upper-case character to lower-case (as determined by the LC_CTYPE category of the current locale). Note that neither the contents pointed to by s1 nor s2 are modified.

The functions return an integer greater than, equal to, or less than 0, if the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2 respectively. The sign of a non-zero return value is determined by the sign of the difference between the values of the first pair of bytes that differ in the

The strncasecmp() function examines at most n bytes from each string.

"strcasecmp_l(), strncasecmp_l()"

The strcasecmp_l() and strncasecmp_l() functions behave identically to strcasecmp() and strncasecmp(), except instead of operating in the current locale, they instead operate in the locale specified by loc.

"strcat(), strncat(), strlcat()"

The strcat() function appends a copy of string s2, including the terminating null character, to the end of string s1. The strncat() function appends at most n characters of s2 to s1, not including any terminating null character, and then appends a null character. Each returns a pointer to the null-terminated result. The initial character of s2 overrides the null character at the end of s1. If copying takes place between objects that overlap, the behavior of strcat(), strncat(), and strlcat() is undefined.

The strlcat() function appends at most (dstsize-strlen(dst)-1) characters of src to dst (dstsize being the size of the string buffer dst). If the string pointed to by dst contains a null-terminated string that fits into dstsize bytes when strlcat() is called, the string pointed to by dst will be a null-terminated string that fits in dstsize bytes (including the terminating null character) when it completes, and the initial character of src will override the null character at the end of dst. If the string pointed to by dst is longer than dstsize bytes when strlcat() is called, the string pointed to by dst will not be changed. The function returns min{dstsize,strlen(dst)}+strlen(src). Buffer overflow can be checked as follows:

if (strlcat(dst, src, dstsize) >= dstsize)
 return -1;
"strchr(), strrchr(), strchrnul()"

The strchr() function returns a pointer to the first occurrence of c (converted to a char) in string s, or a null pointer if c does not occur in the string. The strrchr() function returns a pointer to the last occurrence of c. The null character terminating a string is considered to be part of the string. The strchrnul() function behaves similarly to strchr(), except when the character c is not found, it returns a pointer to the null terminator of the string s and not a null pointer.

"strcmp(), strncmp()"

The strcmp() function compares two strings byte-by-byte, according to the ordering of your machine's character set. The function returns an integer greater than, equal to, or less than 0, if the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2 respectively. The sign of a non-zero return value is determined by the sign of the difference between the values of the first pair of bytes that differ in the strings being compared. The strncmp() function makes the same comparison but looks at a maximum of n bytes. Bytes following a null byte are not compared.

"strcpy(), strncpy(), strlcpy()"

The strcpy() function copies string s2 to s1, including the terminating null character, stopping after the null character has been copied. The strncpy() function copies exactly n bytes, truncating s2 or adding null characters to s1 if necessary. The result will not be null-terminated if the length of s2 is n or more. Both the strcpy() and strncpy() functions return s1. If copying takes place between objects that overlap, the behavior of strcpy(), strncpy(), and strlcpy() is undefined.

The strlcpy() function copies at most dstsize-1 characters (dstsize being the size of the string buffer dst) from src to dst, truncating src if necessary. The result is always null-terminated. The function returns strlen(src). Buffer overflow can be checked as follows:

if (strlcpy(dst, src, dstsize) >= dstsize)
 return -1;
"stpcpy(), stpncpy()"

The stpcpy() and stpncpy() functions behave identically to strcpy() and strncpy() respectively; however, instead of returning a pointer to the beginning of s1, they return a pointer to the terminating null character.

"strcspn(), strspn()"

The strcspn() function returns the length of the initial segment of string s1 that consists entirely of characters not from string s2. The strspn() function returns the length of the initial segment of string s1 that consists entirely of characters from string s2.

"strdup(), strndup(), strdupa(), strndupa()"

The strdup() function returns a pointer to a new string that is a duplicate of the string pointed to by s1. The returned pointer can be passed to free(). The space for the new string is obtained using malloc(3C). If the new string cannot be created, a null pointer is returned and errno may be set to ENOMEM to indicate that the storage space available is insufficient. The strndup() function is identical to strdup(), except it copies at most n bytes from s1 and ensures the copied string is always null terminated.

The functions strdupa() and strndupa() behave identically to strdup() and strndup() respectively; however, instead of allocating memory using malloc(3C), they use alloca(3C). These functions are provided for compatibility only, their use is strongly discouraged due to their use of alloca(3C).

"strlen(), strnlen()"
The strlen() function returns the number of bytes in s, not including the terminating null character.

The strnlen() function returns the smaller of n or the number of bytes in s, not including the terminating null character. The strnlen() function never examines more than n bytes of the string pointed to by s.

"strpbrk()"

The strpbrk() function returns a pointer to the first occurrence in string s1 of any character from string s2, or a null pointer if no character from s2 exists in s1.

"strsep()"

The strsep() function locates, in the null-terminated string referenced by *stringp, the first occurrence of any character in the string delim (or the terminating `\e0' character) and replaces it with a `\e0'. The location of the next character after the delimiter character (or NULL, if the end of the string was reached) is stored in *stringp. The original value of *stringp is returned.

An ``empty'' field (one caused by two adjacent delimiter characters) can be detected by comparing the location referenced by the pointer returned by strsep() to `\e0'.

If *stringp is initially NULL, strsep() returns NULL.

"strstr(), strnstr(), strcasestr(), strcasestr_l()"

The strstr() function locates the first occurrence of the string s2 (excluding the terminating null character) in string s1 and returns a pointer to the located string, or a null pointer if the string is not found. If s2 points to a string with zero length (that is, the string ""), the function returns s1. The strnstr() function performs the same search as strstr(), but only considers up to n bytes of s1. Bytes following a null byte are not compared.

The strcasestr() and strcasestr_l() functions are similar to strstr(), but both functions ignore the case of both s1 and s2. Where as the strcasestr() function operates in the current locale, the strcasestr_l() function operates in the locale specified by loc.

"strtok()"

A sequence of calls to strtok() breaks the string pointed to by s1 into a sequence of tokens, each of which is delimited by a byte from the string pointed to by s2. The first call in the sequence has s1 as its first argument, and is followed by calls with a null pointer as their first argument. The separator string pointed to by s2 can be different from call to call.

The first call in the sequence searches the string pointed to by s1 for the first byte that is not contained in the current separator string pointed to by s2. If no such byte is found, then there are no tokens in the string pointed to by s1 and strtok() returns a null pointer. If such a byte is found, it is the start of the first token.

The strtok() function then searches from there for a byte that is contained in the current separator string. If no such byte is found, the current token extends to the end of the string pointed to by s1, and subsequent searches for a token return a null pointer. If such a byte is found, it is overwritten by a null byte that terminates the current token. The strtok() function saves a pointer to the following byte in thread-specific data, from which the next search for a token starts.

Each subsequent call, with a null pointer as the value of the first argument, starts searching from the saved pointer and behaves as described above.

See Example 1, 2, and 3 in the EXAMPLES section for examples of strtok() usage and the explanation in NOTES.

"strtok_r()"

The strtok_r() function considers the null-terminated string s1 as a sequence of zero or more text tokens separated by spans of one or more characters from the separator string s2. The argument lasts points to a user-provided pointer which points to stored information necessary for strtok_r() to continue scanning the same string.

In the first call to strtok_r(), s1 points to a null-terminated string, s2 to a null-terminated string of separator characters, and the value pointed to by lasts is ignored. The strtok_r() function returns a pointer to the first character of the first token, writes a null character into s1 immediately following the returned token, and updates the pointer to which lasts points.

In subsequent calls, s1 is a null pointer and lasts is unchanged from the previous call so that subsequent calls move through the string s1, returning successive tokens until no tokens remain. The separator string s2 can be different from call to call. When no token remains in s1, a null pointer is returned.

See Example 3 in the EXAMPLES section for an example of strtok_r() usage and the explanation in NOTES.

EXAMPLES

Example 1 Search for word separators.

The following example searches for tokens separated by space characters.

#include <string.h>
...
char *token;
char line[] = "LINE TO BE SEPARATED";
char *search = " ";

/* Token will point to "LINE". */
token = strtok(line, search);

/* Token will point to "TO". */
token = strtok(NULL, search);

Example 2 Break a Line.

The following example uses strtok to break a line into two character strings separated by any combination of SPACEs, TABs, or NEWLINEs.

#include <string.h>
...
struct element {
 char *key;
 char *data;
};
...
char line[LINE_MAX];
char *key, *data;
...
key = strtok(line, " \en");
data = strtok(NULL, " \en");

Example 3 Search for tokens.

The following example uses both strtok() and strtok_r() to search for tokens separated by one or more characters from the string pointed to by the second argument, "/".

#define __EXTENSIONS__
#include <stdio.h>
#include <string.h>

int
main() {
 char *buf="5/90/45";
 char *token;
 char *lasts;

 printf("tokenizing \e"%s\e" with strtok():\en", buf);
 if ((token = strtok(buf, "/")) != NULL) {
 printf("token = "%s\e"\en", token);
 while ((token = strtok(NULL, "/")) != NULL) {
 printf("token = \e"%s\e"\en", token);
 }
 }

 buf = "//5//90//45//";
 printf("\entokenizing \e"%s\e" with strtok_r():\en", buf);
 if ((token = strtok_r(buf, "/", &lasts)) != NULL) {
 printf("token = \e"%s\e"\en", token);
 while ((token = strtok_r(NULL, "/", &lasts)) != NULL) {
 printf("token = \e"%s\e"\en", token);
 }
 }
}

When compiled and run, this example produces the following output:

tokenizing "5/90/45" with strtok():
token = "5"
token = "90"
token = "45"

tokenizing "//5//90//45//" with strtok_r():
token = "5"
token = "90"
token = "45"
ATTRIBUTES

See attributes(5) for descriptions of the following attributes:

ATTRIBUTE TYPE ATTRIBUTE VALUE
Interface Stability See below.
MT-Level See below.
Standard See below.

The strlcat(), strlcpy(), and strsep() functions are Committed. All the rest are Standard.

The strtok(), strdup(), and strndup() functions are MT-Safe. The remaining functions are Async-Signal-Safe.

For all except strlcat(), strlcpy(), and strsep(), see standards(5).

SEE ALSO

malloc(3C), newlocale(3C), setlocale(3C), strxfrm(3C), uselocale(3C), attributes(5), standards(5)

NOTES

When compiling multithreaded applications, the _REENTRANT flag must be defined on the compile line. This flag should only be used in multithreaded applications.

A single-threaded application can gain access to strtok_r() only by defining __EXTENSIONS__ or by defining _POSIX_C_SOURCE to a value greater than or equal to 199506L.

Except where noted otherwise, all of these functions assume the default locale ``C.'' For some locales, strxfrm(3C) should be applied to the strings before they are passed to the functions.

The strtok() function is safe to use in multithreaded applications because it saves its internal state in a thread-specific data area. However, its use is discouraged, even for single-threaded applications. The strtok_r() function should be used instead.

Do not pass the address of a character string literal as the argument s1 to either strtok() or strtok_r(). Similarly, do not pass a pointer to the address of a character string literal as the argument stringp to strsep(). These functions can modify the storage pointed to by s1 in the case of strtok() and strtok_r() or *stringp in the case of strsep(). The C99 standard specifies that attempting to modify the storage occupied by a string literal results in undefined behavior. This allows compilers (including gcc and the Sun Studio compilers when the -xstrconst flag is used) to place string literals in read-only memory. Note that in Example 1 above, this problem is avoided because the variable line is declared as a writable array of type char that is initialized by a string literal rather than a pointer to char that points to a string literal.