Copyright (c) 2008, Sun Microsystems, Inc. All Rights Reserved.
Copyright 1989 AT&T
Portions Copyright (c) 1994 Man-cgi 1.15, Panagiotis Christias (christia@softlab.ntua.gr)
Portions Copyright (c) 1996-2008 Modified for NetBSD by Kimmo Suominen (kimmo@suominen.com)
Portions Copyright (c) 1992, X/Open Company Limited. All Rights Reserved.
Sun Microsystems, Inc. gratefully acknowledges The Open Group for permission to reproduce portions of its copyrighted documentation. Original documentation from The Open Group can be obtained online at
http://www.opengroup.org/bookstore/.
The Institute of Electrical and Electronics Engineers and The Open Group, have given us permission to reprint portions of their documentation. In the following statement, the phrase "this text" refers to portions of the system documentation. Portions of this text are reprinted and reproduced in electronic form in the Sun OS Reference Manual, from IEEE Std 1003.1, 2004 Edition, Standard for Information Technology -- Portable Operating System Interface (POSIX), The Open Group Base Specifications Issue 6, Copyright (C) 2001-2004 by the Institute of Electrical and Electronics Engineers, Inc and The Open Group. In the event of any discrepancy between these versions and the original IEEE and The Open Group Standard, the original IEEE and The Open Group Standard is the referee document. The original Standard can be obtained online at http://www.opengroup.org/unix/online.html.
This notice shall appear on any product containing this material.
The contents of this file are subject to the terms of the Common Development and Distribution License (the "License"). You may not use this file except in compliance with the License.
You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE or http://www.opensolaris.org/os/licensing. See the License for the specific language governing permissions and limitations under the License.
When distributing Covered Code, include this CDDL HEADER in each file and include the License file at usr/src/OPENSOLARIS.LICENSE. If applicable, add the following below this CDDL HEADER, with the fields enclosed by brackets "[]" replaced with your own identifying information: Portions Copyright [yyyy] [name of copyright owner]
STRING 3C "Aug 1, 2008"
NAME
string, strcasecmp, strncasecmp, strcat, strncat, strlcat, strchr, strrchr, strcmp, strncmp, strcpy, strncpy, strlcpy, strcspn, strspn, strdup, strlen, strnlen, strpbrk, strsep, strstr, strtok, strtok_r - string operations
SYNOPSIS

#include <strings.h>

int strcasecmp(const char *s1, const char *s2);

int strncasecmp(const char *s1, const char *s2, size_t n);

#include <string.h>

char *strcat(char *restrict s1, const char *restrict s2);

char *strncat(char *restrict s1, const char *restrict s2, size_t n);

size_t strlcat(char *dst, const char *src, size_t dstsize);

char *strchr(const char *s, int c);

char *strrchr(const char *s, int c);

int strcmp(const char *s1, const char *s2);

int strncmp(const char *s1, const char *s2, size_t n);

char *strcpy(char *restrict s1, const char *restrict s2);

char *strncpy(char *restrict s1, const char *restrict s2, size_t n);

size_t strlcpy(char *dst, const char *src, size_t dstsize);

size_t strcspn(const char *s1, const char *s2);

size_t strspn(const char *s1, const char *s2);

char *strdup(const char *s1);

size_t strlen(const char *s);

size_t strnlen(const char *s, size_t n);

char *strpbrk(const char *s1, const char *s2);

char *strsep(char **stringp, const char *delim);

char *strstr(const char *s1, const char *s2);

char *strtok(char *restrict s1, const char *restrict s2);

char *strtok_r(char *s1, const char *s2, char **lasts);
"ISO C++"

#include <string.h>

const char *strchr(const char *s, int c);

const char *strpbrk(const char *s1, const char *s2);

const char *strrchr(const char *s, int c);

const char *strstr(const char *s1, const char *s2);

#include <cstring>

char *std::strchr(char *s, int c);

char *std::strpbrk(char *s1, const char *s2);

char *std::strrchr(char *s, int c);

char *std::strstr(char *s1, const char *s2);
DESCRIPTION

The arguments s, s1, and s2 point to strings (arrays of characters terminated by a null character). The strcat(), strncat(), strlcat(), strcpy(), strncpy(), strlcpy(), strsep(), strtok(), and strtok_r() functions all alter their first argument. Additionally, the strcat() and strcpy() functions do not check for overflow of the array.

"strcasecmp(), strncasecmp()"

The strcasecmp() and strncasecmp() functions are case-insensitive versions of strcmp() and strncmp() respectively, described below. They assume the ASCII character set and ignore differences in case when comparing lower and upper case characters.
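
As an illustration, a minimal sketch of case-insensitive matching follows. The helper names same_keyword() and has_prefix_nocase() are hypothetical and are not part of the library.

```c
#include <strings.h>

/* Hypothetical helper: 1 if a and b are equal ignoring ASCII case, else 0. */
int
same_keyword(const char *a, const char *b)
{
    return (strcasecmp(a, b) == 0);
}

/*
 * Hypothetical helper: 1 if the first prefixlen bytes of s match prefix,
 * ignoring ASCII case, else 0.
 */
int
has_prefix_nocase(const char *s, const char *prefix, size_t prefixlen)
{
    return (strncasecmp(s, prefix, prefixlen) == 0);
}
```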

"strcat(), strncat(), strlcat()"

The strcat() function appends a copy of string s2, including the terminating null character, to the end of string s1. The strncat() function appends at most n characters of s2, followed by a terminating null character. The strcat() and strncat() functions each return a pointer to the null-terminated result. The initial character of s2 overwrites the null character at the end of s1. If copying takes place between objects that overlap, the behavior of strcat(), strncat(), and strlcat() is undefined.

The strlcat() function appends at most (dstsize-strlen(dst)-1) characters of src to dst (dstsize being the size of the string buffer dst). If the string pointed to by dst contains a null-terminated string that fits into dstsize bytes when strlcat() is called, the string pointed to by dst will be a null-terminated string that fits in dstsize bytes (including the terminating null character) when it completes, and the initial character of src will overwrite the null character at the end of dst. If the string pointed to by dst is longer than dstsize bytes when strlcat() is called, the string pointed to by dst will not be changed. The function returns min{dstsize,strlen(dst)}+strlen(src). Buffer overflow can be checked as follows:

if (strlcat(dst, src, dstsize) >= dstsize)
 return -1;
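
Where strlcat() is not available, a similar bound can be enforced with strncat(). The following is only a sketch: append_bounded() is a hypothetical helper, and it assumes dst already holds a null-terminated string inside a buffer of dstsize bytes.

```c
#include <string.h>

/*
 * Hypothetical helper: append src to the string in dst without writing
 * past dstsize bytes. Returns 0 if all of src fit, -1 if it was truncated.
 */
int
append_bounded(char *dst, size_t dstsize, const char *src)
{
    size_t used = strlen(dst);
    size_t room;

    if (used >= dstsize)
        return (-1);            /* no space left, dst is not modified */
    room = dstsize - used - 1;  /* reserve one byte for the null */
    strncat(dst, src, room);    /* appends at most room bytes plus a null */
    return (strlen(src) > room ? -1 : 0);
}
```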
"strchr(), strrchr()"

The strchr() function returns a pointer to the first occurrence of c (converted to a char) in string s, or a null pointer if c does not occur in the string. The strrchr() function returns a pointer to the last occurrence of c. The null character terminating a string is considered to be part of the string.
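
A short sketch of both functions follows. The helper names are hypothetical. The second helper relies on the terminating null character being part of the string, so searching for '\0' always succeeds.

```c
#include <string.h>

/*
 * Hypothetical helper: return the part of path after the last '/',
 * or path itself if it contains no '/'.
 */
const char *
last_component(const char *path)
{
    const char *slash = strrchr(path, '/');

    return (slash != NULL ? slash + 1 : path);
}

/* Hypothetical helper: compute the length of s by locating its null byte. */
size_t
length_via_strchr(const char *s)
{
    return ((size_t)(strchr(s, '\0') - s));
}
```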

"strcmp(), strncmp()"

The strcmp() function compares two strings byte-by-byte, according to the ordering of your machine's character set. The function returns an integer greater than, equal to, or less than 0, if the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2 respectively. The sign of a non-zero return value is determined by the sign of the difference between the values of the first pair of bytes that differ in the strings being compared. The strncmp() function makes the same comparison but looks at a maximum of n bytes. Bytes following a null byte are not compared.
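
Because only the sign of the result is specified, callers should test it against 0 rather than against particular values. One common use is as a qsort(3C) comparator, sketched below. The names compare_strings() and sort_names() are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* qsort() comparator for an array whose elements are pointers to strings. */
static int
compare_strings(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;

    return (strcmp(*sa, *sb));
}

/* Hypothetical helper: sort count string pointers into strcmp() order. */
void
sort_names(const char **names, size_t count)
{
    qsort(names, count, sizeof (names[0]), compare_strings);
}
```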

"strcpy(), strncpy(), strlcpy()"

The strcpy() function copies string s2 to s1, including the terminating null character, stopping after the null character has been copied. The strncpy() function copies exactly n bytes, truncating s2 or adding null characters to s1 if necessary. The result will not be null-terminated if the length of s2 is n or more. Each function returns s1. If copying takes place between objects that overlap, the behavior of strcpy(), strncpy(), and strlcpy() is undefined.

The strlcpy() function copies at most dstsize-1 characters (dstsize being the size of the string buffer dst) from src to dst, truncating src if necessary. The result is always null-terminated. The function returns strlen(src). Buffer overflow can be checked as follows:

if (strlcpy(dst, src, dstsize) >= dstsize)
 return -1;
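
Since strncpy() does not null-terminate the result when s2 is n bytes or longer, callers that need a string usually add the terminator themselves. The sketch below shows one way to do so. copy_terminated() is a hypothetical helper and assumes dstsize is at least 1.

```c
#include <string.h>

/*
 * Hypothetical helper: copy src into dst (a buffer of dstsize bytes),
 * always leaving dst null-terminated. Returns 1 if src was truncated,
 * 0 otherwise.
 */
int
copy_terminated(char *dst, size_t dstsize, const char *src)
{
    strncpy(dst, src, dstsize - 1);  /* may leave dst unterminated */
    dst[dstsize - 1] = '\0';         /* so terminate explicitly */
    return (strlen(src) >= dstsize);
}
```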
"strcspn(), strspn()"

The strcspn() function returns the length of the initial segment of string s1 that consists entirely of characters not from string s2. The strspn() function returns the length of the initial segment of string s1 that consists entirely of characters from string s2.
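
Two typical uses are sketched below with hypothetical helper names: strspn() to measure leading white space, and strcspn() to find the first newline so that it can be removed.

```c
#include <string.h>

/* Hypothetical helper: number of leading SPACE and TAB characters in s. */
size_t
leading_blanks(const char *s)
{
    return (strspn(s, " \t"));
}

/*
 * Hypothetical helper: cut line at its first newline, if any. When there
 * is no newline, strcspn() returns the string length, so the existing
 * null byte is simply rewritten.
 */
void
chomp(char *line)
{
    line[strcspn(line, "\n")] = '\0';
}
```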

"strdup()"

The strdup() function returns a pointer to a new string that is a duplicate of the string pointed to by s1. The returned pointer can be passed to free(). The space for the new string is obtained using malloc(3C). If the new string cannot be created, a null pointer is returned and errno may be set to ENOMEM to indicate that the storage space available is insufficient.
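
The following sketch shows the usual ownership pattern: check the result for a null pointer, modify the private copy, and leave it to the caller to release it with free(). The helper copy_upper() is hypothetical, and the _POSIX_C_SOURCE definition is an assumption about what is needed to expose strdup() under strict compilation modes.

```c
#define _POSIX_C_SOURCE 200809L
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a newly allocated upper-case copy of s,
 * or a null pointer if strdup() fails. The caller must free() the result.
 */
char *
copy_upper(const char *s)
{
    char *copy = strdup(s);
    char *p;

    if (copy == NULL)
        return (NULL);
    for (p = copy; *p != '\0'; p++)
        *p = (char)toupper((unsigned char)*p);
    return (copy);
}
```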

"strlen(), strnlen()"

The strlen() function returns the number of bytes in s, not including the terminating null character.

The strnlen() function returns the smaller of n or the number of bytes in s, not including the terminating null character. The strnlen() function never examines more than n bytes of the string pointed to by s.
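
Because strnlen() never reads past n bytes, it can be used to check whether a fixed-size field actually contains a terminator. This is a sketch only. field_is_terminated() is a hypothetical helper, and the _POSIX_C_SOURCE definition is an assumption about what is needed to expose strnlen() under strict compilation modes.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/*
 * Hypothetical helper: 1 if the fieldsize-byte buffer at field contains
 * a null byte, 0 if it is completely filled with non-null bytes.
 */
int
field_is_terminated(const char *field, size_t fieldsize)
{
    return (strnlen(field, fieldsize) < fieldsize);
}
```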

"strpbrk()"

The strpbrk() function returns a pointer to the first occurrence in string s1 of any character from string s2, or a null pointer if no character from s2 exists in s1.
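
For example, strpbrk() can test whether a string contains any character from a set. In the sketch below, both the helper name and the set of rejected characters are assumptions chosen for illustration.

```c
#include <string.h>

/*
 * Hypothetical helper: 1 if name contains '/', '\\', or ':', else 0.
 * The character set is an example policy, not a system requirement.
 */
int
has_forbidden_char(const char *name)
{
    return (strpbrk(name, "/\\:") != NULL);
}
```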

"strsep()"

The strsep() function locates, in the null-terminated string referenced by *stringp, the first occurrence of any character in the string delim (or the terminating `\0' character) and replaces it with a `\0'. The location of the next character after the delimiter character (or NULL, if the end of the string was reached) is stored in *stringp. The original value of *stringp is returned.

An "empty" field (one caused by two adjacent delimiter characters) can be detected by comparing the location referenced by the pointer returned by strsep() to `\0'.

If *stringp is initially NULL, strsep() returns NULL.
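
The behavior described above can be sketched as a loop that splits a line in place and keeps empty fields. split_fields() is a hypothetical helper, and the feature-test macros shown are assumptions about what different systems need in order to declare strsep(), which is not part of the standards listed in standards(5).

```c
#define _DEFAULT_SOURCE 1   /* assumption: exposes strsep() on glibc */
#define __EXTENSIONS__      /* assumption: exposes it under strict modes here */
#include <string.h>

/*
 * Hypothetical helper: split line in place on ':' and store up to
 * maxfields field pointers in fields. Empty fields are kept.
 * Returns the number of fields stored.
 */
size_t
split_fields(char *line, char **fields, size_t maxfields)
{
    size_t n = 0;
    char *field;

    while (n < maxfields && (field = strsep(&line, ":")) != NULL)
        fields[n++] = field;
    return (n);
}
```

Unlike strtok(), which skips runs of separators, this loop reports the field between two adjacent ':' characters as an empty string.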

"strstr()"

The strstr() function locates the first occurrence of the string s2 (excluding the terminating null character) in string s1 and returns a pointer to the located string, or a null pointer if the string is not found. If s2 points to a string with zero length (that is, the string ""), the function returns s1.
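
A sketch of repeated searching follows. count_occurrences() is a hypothetical helper. It treats an empty s2 specially because strstr() would return its first argument every time and the loop would never advance.

```c
#include <string.h>

/*
 * Hypothetical helper: count non-overlapping occurrences of needle in
 * haystack. An empty needle is counted as zero occurrences.
 */
size_t
count_occurrences(const char *haystack, const char *needle)
{
    size_t len = strlen(needle);
    size_t count = 0;
    const char *p = haystack;

    if (len == 0)
        return (0);
    while ((p = strstr(p, needle)) != NULL) {
        count++;
        p += len;   /* resume the search after this match */
    }
    return (count);
}
```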

"strtok()"

A sequence of calls to strtok() breaks the string pointed to by s1 into a sequence of tokens, each of which is delimited by a byte from the string pointed to by s2. The first call in the sequence has s1 as its first argument, and is followed by calls with a null pointer as their first argument. The separator string pointed to by s2 can be different from call to call.

The first call in the sequence searches the string pointed to by s1 for the first byte that is not contained in the current separator string pointed to by s2. If no such byte is found, then there are no tokens in the string pointed to by s1 and strtok() returns a null pointer. If such a byte is found, it is the start of the first token.

The strtok() function then searches from there for a byte that is contained in the current separator string. If no such byte is found, the current token extends to the end of the string pointed to by s1, and subsequent searches for a token return a null pointer. If such a byte is found, it is overwritten by a null byte that terminates the current token. The strtok() function saves a pointer to the following byte in thread-specific data, from which the next search for a token starts.

Each subsequent call, with a null pointer as the value of the first argument, starts searching from the saved pointer and behaves as described above.

See Examples 1, 2, and 3 in the EXAMPLES section for examples of strtok() usage and the explanation in NOTES.

"strtok_r()"

The strtok_r() function considers the null-terminated string s1 as a sequence of zero or more text tokens separated by spans of one or more characters from the separator string s2. The argument lasts points to a user-provided pointer which points to stored information necessary for strtok_r() to continue scanning the same string.

In the first call to strtok_r(), s1 points to a null-terminated string, s2 to a null-terminated string of separator characters, and the value pointed to by lasts is ignored. The strtok_r() function returns a pointer to the first character of the first token, writes a null character into s1 immediately following the returned token, and updates the pointer to which lasts points.

In subsequent calls, s1 is a null pointer and lasts is unchanged from the previous call so that subsequent calls move through the string s1, returning successive tokens until no tokens remain. The separator string s2 can be different from call to call. When no token remains in s1, a null pointer is returned.

See Example 3 in the EXAMPLES section for an example of strtok_r() usage and the explanation in NOTES.
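
Because the scanning state lives in the caller-supplied lasts pointer, two scans can be in progress at once. The sketch below nests one scan inside another. count_words() and the choice of ';' and ' ' as separators are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/*
 * Hypothetical helper: treat text as ';'-separated records made of
 * ' '-separated words and return the total number of words.
 * The contents of text are modified.
 */
size_t
count_words(char *text)
{
    char *record, *word;
    char *record_state, *word_state;  /* one saved position per scan */
    size_t count = 0;

    for (record = strtok_r(text, ";", &record_state); record != NULL;
        record = strtok_r(NULL, ";", &record_state)) {
        for (word = strtok_r(record, " ", &word_state); word != NULL;
            word = strtok_r(NULL, " ", &word_state))
            count++;
    }
    return (count);
}
```

The same nesting written with strtok() would not work, since the inner scan would replace the single saved position used by the outer scan.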

EXAMPLES

Example 1 Search for word separators.

The following example searches for tokens separated by space characters.

#include <string.h>
...
char *token;
char line[] = "LINE TO BE SEPARATED";
char *search = " ";

/* Token will point to "LINE". */
token = strtok(line, search);

/* Token will point to "TO". */
token = strtok(NULL, search);

Example 2 Break a Line.

The following example uses strtok() to break a line into two character strings separated by any combination of SPACEs, TABs, or NEWLINEs.

#include <string.h>
...
struct element {
 char *key;
 char *data;
};
...
char line[LINE_MAX];
char *key, *data;
...
key = strtok(line, " \t\n");
data = strtok(NULL, " \t\n");

Example 3 Search for tokens.

The following example uses both strtok() and strtok_r() to search for tokens separated by one or more characters from the string pointed to by the second argument, "/".

#define __EXTENSIONS__
#include <stdio.h>
#include <string.h>

int
main(void)
{
    /* Writable arrays: strtok() and strtok_r() modify their input. */
    char buf1[] = "5/90/45";
    char buf2[] = "//5//90//45//";
    char *token;
    char *lasts;

    printf("tokenizing \"%s\" with strtok():\n", buf1);
    if ((token = strtok(buf1, "/")) != NULL) {
        printf("token = \"%s\"\n", token);
        while ((token = strtok(NULL, "/")) != NULL) {
            printf("token = \"%s\"\n", token);
        }
    }

    printf("\ntokenizing \"%s\" with strtok_r():\n", buf2);
    if ((token = strtok_r(buf2, "/", &lasts)) != NULL) {
        printf("token = \"%s\"\n", token);
        while ((token = strtok_r(NULL, "/", &lasts)) != NULL) {
            printf("token = \"%s\"\n", token);
        }
    }
    return (0);
}

When compiled and run, this example produces the following output:

tokenizing "5/90/45" with strtok():
token = "5"
token = "90"
token = "45"

tokenizing "//5//90//45//" with strtok_r():
token = "5"
token = "90"
token = "45"
ATTRIBUTES

See attributes(5) for descriptions of the following attributes:

ATTRIBUTE TYPE         ATTRIBUTE VALUE
Interface Stability    Committed
MT-Level               See below.
Standard               See below.

The strtok() and strdup() functions are MT-Safe. The remaining functions are Async-Signal-Safe.

For all except strlcat(), strlcpy(), and strsep(), see standards(5).

SEE ALSO

malloc(3C), setlocale(3C), strxfrm(3C), attributes(5), standards(5)

NOTES

When compiling multithreaded applications, the _REENTRANT flag must be defined on the compile line. This flag should only be used in multithreaded applications.

A single-threaded application can gain access to strtok_r() only by defining __EXTENSIONS__ or by defining _POSIX_C_SOURCE to a value greater than or equal to 199506L.

All of these functions assume the default locale "C". For some locales, strxfrm(3C) should be applied to the strings before they are passed to the functions.

The strtok() function is safe to use in multithreaded applications because it saves its internal state in a thread-specific data area. However, its use is discouraged, even for single-threaded applications. The strtok_r() function should be used instead.

Do not pass the address of a character string literal as the argument s1 to either strtok() or strtok_r(). Similarly, do not pass a pointer to the address of a character string literal as the argument stringp to strsep(). These functions can modify the storage pointed to by s1 in the case of strtok() and strtok_r() or *stringp in the case of strsep(). The C99 standard specifies that attempting to modify the storage occupied by a string literal results in undefined behavior. This allows compilers (including gcc and the Sun Studio compilers when the -xstrconst flag is used) to place string literals in read-only memory. Note that in Example 1 above, this problem is avoided because the variable line is declared as a writable array of type char that is initialized by a string literal rather than a pointer to char that points to a string literal.