.\" Copyright (c) 1992, 1993, 1994 Henry Spencer.
.\" Copyright (c) 1992, 1993, 1994
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" Henry Spencer.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" Sun Microsystems, Inc. gratefully acknowledges The Open Group for
.\" permission to reproduce portions of its copyrighted documentation.
.\" Original documentation from The Open Group can be obtained online at
.\" http://www.opengroup.org/bookstore/.
.\"
.\" The Institute of Electrical and Electronics Engineers and The Open
.\" Group, have given us permission to reprint portions of their
.\" documentation.
.\"
.\" In the following statement, the phrase ``this text'' refers to portions
.\" of the system documentation.
.\"
.\" Portions of this text are reprinted and reproduced in electronic form
.\" in the SunOS Reference Manual, from IEEE Std 1003.1, 2004 Edition,
.\" Standard for Information Technology -- Portable Operating System
.\" Interface (POSIX), The Open Group Base Specifications Issue 6,
.\" Copyright (C) 2001-2004 by the Institute of Electrical and Electronics
.\" Engineers, Inc and The Open Group.  In the event of any discrepancy
.\" between these versions and the original IEEE and The Open Group
.\" Standard, the original IEEE and The Open Group Standard is the referee
.\" document.  The original Standard can be obtained online at
.\" http://www.opengroup.org/unix/online.html.
.\"
.\" This notice shall appear on any product containing this material.
.\"
.\" The contents of this file are subject to the terms of the
.\" Common Development and Distribution License (the "License").
.\" You may not use this file except in compliance with the License.
.\"
.\" You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
.\" or http://www.opensolaris.org/os/licensing.
.\" See the License for the specific language governing permissions
.\" and limitations under the License.
.\"
.\" When distributing Covered Code, include this CDDL HEADER in each
.\" file and include the License file at usr/src/OPENSOLARIS.LICENSE.
.\" If applicable, add the following below this CDDL HEADER, with the
.\" fields enclosed by brackets "[]" replaced with your own identifying
.\" information: Portions Copyright [yyyy] [name of copyright owner]
.\"
.\"
.\" Copyright (c) 1992, X/Open Company Limited. All Rights Reserved.
.\" Portions Copyright (c) 2003, Sun Microsystems, Inc.  All Rights Reserved.
.\" Copyright 2017 Nexenta Systems, Inc.
.\"
.Dd June 14, 2017
.Dt REGCOMP 3C
.Os
.Sh NAME
.Nm regcomp ,
.Nm regexec ,
.Nm regerror ,
.Nm regfree
.Nd regular-expression library
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In regex.h
.Ft int
.Fo regcomp
.Fa "regex_t *restrict preg" "const char *restrict pattern" "int cflags"
.Fc
.Ft int
.Fo regexec
.Fa "const regex_t *restrict preg" "const char *restrict string"
.Fa "size_t nmatch" "regmatch_t pmatch[restrict]" "int eflags"
.Fc
.Ft size_t
.Fo regerror
.Fa "int errcode" "const regex_t *restrict preg"
.Fa "char *restrict errbuf" "size_t errbuf_size"
.Fc
.Ft void
.Fn regfree "regex_t *preg"
.Sh DESCRIPTION
These routines implement
.St -p1003.2
regular expressions; see
.Xr regex 7 .
The
.Fn regcomp
function compiles an RE written as a string into an internal form,
.Fn regexec
matches that internal form against a string and reports results,
.Fn regerror
transforms error codes from either into human-readable messages,
and
.Fn regfree
frees any dynamically-allocated storage used by the internal form
of an RE.
.Pp
The header
.In regex.h
declares two structure types,
.Ft regex_t
and
.Ft regmatch_t ,
the former for compiled internal forms and the latter for match reporting.
It also declares the four functions, a type
.Ft regoff_t ,
and a number of constants with names starting with
.Qq Dv REG_ .
.Ss Fn regcomp
The
.Fn regcomp
function compiles the regular expression contained in the
.Fa pattern
string, subject to the flags in
.Fa cflags ,
and places the results in the
.Ft regex_t
structure pointed to by
.Fa preg .
The
.Fa cflags
argument is the bitwise OR of zero or more of the following flags:
.Bl -tag -width REG_EXTENDED
.It Dv REG_EXTENDED
Compile extended regular expressions
.Pq EREs ,
rather than the basic regular expressions
.Pq BREs
that are the default.
.It Dv REG_BASIC
This is a synonym for 0, provided as a counterpart to
.Dv REG_EXTENDED
to improve readability.
.It Dv REG_NOSPEC
Compile with recognition of all special characters turned off.
All characters are thus considered ordinary, so the RE is a literal string.
This is an extension, compatible with but not specified by
.St -p1003.2 ,
and should be used with caution in software intended to be portable to other
systems.
.Dv REG_EXTENDED
and
.Dv REG_NOSPEC
may not be used in the same call to
.Fn regcomp .
.It Dv REG_ICASE
Compile for matching that ignores upper/lower case distinctions.
See
.Xr regex 7 .
.It Dv REG_NOSUB
Compile for matching that need only report success or failure,
not what was matched.
.It Dv REG_NEWLINE
Compile for newline-sensitive matching.
By default, newline is a completely ordinary character with no special
meaning in either REs or strings.
With this flag,
.Qq [^
bracket expressions and
.Qq \&.
never match newline,
a
.Qq \&^
anchor matches the null string after any newline in the string in addition to
its normal function, and the
.Qq \&$
anchor matches the null string before any newline in the string in addition to
its normal function.
.It Dv REG_PEND
The regular expression ends, not at the first NUL, but just before the character
pointed to by the
.Va re_endp
member of the structure pointed to by
.Fa preg .
The
.Va re_endp
member is of type
.Vt "const char *" .
This flag permits inclusion of NULs in the RE; they are considered ordinary
characters.
This is an extension, compatible with but not specified by
.St -p1003.2 ,
and should be used with caution in software intended to be portable to other
systems.
.El
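.Pp
As an illustration of how
.Fa cflags
changes matching, the following minimal sketch wraps
.Fn regcomp
and
.Fn regexec
in a single yes/no test.
The helper name
.Fn re_matches
is hypothetical and is not part of the library; only the standard flags are
assumed.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * re_matches() is a hypothetical helper, not part of libc.
 * It returns 1 if pattern matches anywhere in s, 0 if regexec()
 * reports anything other than a match, and -1 if pattern fails to
 * compile under the given cflags.
 */
int
re_matches(const char *pattern, const char *s, int cflags)
{
	regex_t re;
	int rc;

	assert(pattern != NULL && s != NULL);
	if (regcomp(&re, pattern, cflags) != 0)
		return (-1);
	/* nmatch is 0, so the pmatch argument is ignored and may be NULL. */
	rc = regexec(&re, s, 0, NULL, 0);
	regfree(&re);
	return (rc == 0 ? 1 : 0);
}
```

With this helper, the BRE ^b is expected to match a two-line string whose
second line starts with b only when
.Dv REG_NEWLINE
is included in the flags.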
.Pp
When successful,
.Fn regcomp
returns 0 and fills in the structure pointed to by
.Fa preg .
One member of that structure
.Po other than
.Va re_endp
.Pc
is publicized:
.Va re_nsub ,
of type
.Ft size_t ,
contains the number of parenthesized subexpressions within the RE
.Po except that the value of this member is undefined if the
.Dv REG_NOSUB
flag was used
.Pc .
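.Pp
The
.Va re_nsub
member can be read directly after a successful compilation.
The sketch below assumes an ERE and a hypothetical helper named
.Fn count_groups ;
it is one possible way to use the member, not a library interface.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * count_groups() is a hypothetical helper, not part of libc.
 * It compiles pattern as an ERE and returns the number of
 * parenthesized subexpressions, or -1 if regcomp() fails.
 */
long
count_groups(const char *pattern)
{
	regex_t re;
	long n;

	assert(pattern != NULL);
	/* REG_NOSUB is not set here: it would leave re_nsub undefined. */
	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return (-1);
	n = (long)re.re_nsub;
	regfree(&re);
	return (n);
}
```

A caller could use the result to size the
.Ft regmatch_t
array later passed to
.Fn regexec
as one more than the value returned.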
.Ss Fn regexec
The
.Fn regexec
function matches the compiled RE pointed to by
.Fa preg
against the
.Fa string ,
subject to the flags in
.Fa eflags ,
and reports results using
.Fa nmatch ,
.Fa pmatch ,
and the returned value.
The RE must have been compiled by a previous invocation of
.Fn regcomp .
The compiled form is not altered during execution of
.Fn regexec ,
so a single compiled RE can be used simultaneously by multiple threads.
.Pp
By default, the NUL-terminated string pointed to by
.Fa string
is considered to be the text of an entire line, minus any terminating
newline.
The
.Fa eflags
argument is the bitwise OR of zero or more of the following flags:
.Bl -tag -width REG_STARTEND
.It Dv REG_NOTBOL
The first character of the string is treated as the continuation
of a line.
This means that the anchors
.Qq \&^ ,
.Qq [[:<:]] ,
and
.Qq \e<
do not match before it; but see
.Dv REG_STARTEND
below.
This does not affect the behavior of newlines under
.Dv REG_NEWLINE .
.It Dv REG_NOTEOL
The NUL terminating the string does not end a line, so the
.Qq \&$
anchor does not match before it.
This does not affect the behavior of newlines under
.Dv REG_NEWLINE .
.It Dv REG_STARTEND
The string is considered to start at
.Fa string No +
.Fa pmatch Ns [0]. Ns Fa rm_so
and to end before the byte located at
.Fa string No +
.Fa pmatch Ns [0]. Ns Fa rm_eo ,
regardless of the value of
.Fa nmatch .
See below for the definition of
.Fa pmatch
and
.Fa nmatch .
This is an extension, compatible with but not specified by
.St -p1003.2 ,
and should be used with caution in software intended to be portable to other
systems.
.Pp
Without
.Dv REG_NOTBOL ,
the position
.Fa rm_so
is considered the beginning of a line, such that
.Qq \&^
matches before it, and the beginning of a word if there is a word character at
this position, such that
.Qq [[:<:]]
and
.Qq \e<
match before it.
.Pp
With
.Dv REG_NOTBOL ,
the character at position
.Fa rm_so
is treated as the continuation of a line, and if
.Fa rm_so
is greater than 0, the preceding character is taken into consideration.
If the preceding character is a newline and the regular expression was compiled
with
.Dv REG_NEWLINE ,
.Qq \&^
matches before the string; if the preceding character is not a word character
but the string starts with a word character,
.Qq [[:<:]]
and
.Qq \e<
match before the string.
.El
.Pp
See
.Xr regex 7
for a discussion of what is matched in situations where an RE or a portion
thereof could match any of several substrings of
.Fa string .
.Pp
If
.Dv REG_NOSUB
was specified in the compilation of the RE, or if
.Fa nmatch
is 0,
.Fn regexec
ignores the
.Fa pmatch
argument
.Po but see below for the case where
.Dv REG_STARTEND
is specified
.Pc .
Otherwise,
.Fa pmatch
points to an array of
.Fa nmatch
structures of type
.Ft regmatch_t .
Such a structure has at least the members
.Va rm_so
and
.Va rm_eo ,
both of type
.Ft regoff_t
.Po a signed arithmetic type at least as large as an
.Ft off_t
and a
.Ft ssize_t
.Pc ,
containing respectively the offset of the first character of a substring
and the offset of the first character after the end of the substring.
Offsets are measured from the beginning of the
.Fa string
argument given to
.Fn regexec .
An empty substring is denoted by equal offsets, both indicating the character
following the empty substring.
.Pp
The 0th member of the
.Fa pmatch
array is filled in to indicate what substring of
.Fa string
was matched by the entire RE.
Remaining members report what substring was matched by parenthesized
subexpressions within the RE; member
.Va i
reports subexpression
.Va i ,
with subexpressions counted
.Pq starting at 1
by the order of their opening parentheses in the RE, left to right.
Unused entries in the array
.Po corresponding either to subexpressions that did not participate in the match
at all, or to subexpressions that do not exist in the RE
.Po that is,
.Va i
>
.Fa preg Ns -> Ns Va re_nsub
.Pc
.Pc
have both
.Va rm_so
and
.Va rm_eo
set to -1.
If a subexpression participated in the match several times,
the reported substring is the last one it matched.
.Po Note in particular that when the RE
.Qq (b*)+
matches
.Qq bbb ,
the parenthesized subexpression matches each of the three
.So Li b Sc Ns s
and then an infinite number of empty strings following the last
.Qq b ,
so the reported substring is one of the empties.
.Pc
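.Pp
The layout of the
.Fa pmatch
array described above can be sketched as follows.
The helper
.Fn group_span
and the array size NSLOTS are hypothetical names chosen for this example; an
ERE is assumed.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

#define	NSLOTS	4	/* entry 0 plus up to three subexpressions */

/*
 * group_span() is a hypothetical helper, not part of libc.
 * It matches an ERE against s and stores the offsets of pmatch
 * entry number group in *so and *eo.  It returns 0 on a match and
 * -1 if group is out of range, the pattern does not compile, or
 * regexec() does not report a match.
 */
int
group_span(const char *pattern, const char *s, size_t group,
    long *so, long *eo)
{
	regex_t re;
	regmatch_t pm[NSLOTS];
	int rc;

	assert(pattern != NULL && s != NULL && so != NULL && eo != NULL);
	if (group >= NSLOTS || regcomp(&re, pattern, REG_EXTENDED) != 0)
		return (-1);
	rc = regexec(&re, s, NSLOTS, pm, 0);
	regfree(&re);
	if (rc != 0)
		return (-1);
	/* Entries that took no part in the match hold -1 in both members. */
	*so = (long)pm[group].rm_so;
	*eo = (long)pm[group].rm_eo;
	return (0);
}
```

Entry 0 covers the whole match, entries 1 and up cover the parenthesized
subexpressions, and the length of a reported substring is the difference
between the two offsets.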
.Pp
If
.Dv REG_STARTEND
is specified,
.Fa pmatch
must point to at least one
.Ft regmatch_t
.Po even if
.Fa nmatch
is 0 or
.Dv REG_NOSUB
was specified
.Pc ,
to hold the input offsets for
.Dv REG_STARTEND .
Use for output is still entirely controlled by
.Fa nmatch ;
if
.Fa nmatch
is 0 or
.Dv REG_NOSUB
was specified,
the value of
.Fa pmatch Ns [0]
will not be changed by a successful
.Fn regexec .
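.Pp
One common use of
.Dv REG_STARTEND
is to scan a buffer for successive matches without copying it.
The following is a minimal sketch under the assumption that the platform
defines this extension; the helper name
.Fn count_matches
is hypothetical, and the example deliberately avoids anchors, whose treatment
at the start offset is described above.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/*
 * count_matches() is a hypothetical helper, not part of libc.
 * It counts non-overlapping matches of an ERE in s, or returns -1
 * if the pattern does not compile.  It relies on the REG_STARTEND
 * extension, which is not specified by POSIX.
 */
long
count_matches(const char *pattern, const char *s)
{
	regex_t re;
	regmatch_t m[1];
	size_t pos = 0, len;
	long count = 0;

	assert(pattern != NULL && s != NULL);
	if (regcomp(&re, pattern, REG_EXTENDED) != 0)
		return (-1);
	len = strlen(s);
	while (pos < len) {
		/* Input: search only the bytes of s in [pos, len). */
		m[0].rm_so = (regoff_t)pos;
		m[0].rm_eo = (regoff_t)len;
		/* Any non-zero result, including REG_NOMATCH, ends the scan. */
		if (regexec(&re, s, 1, m, REG_STARTEND) != 0)
			break;
		count++;
		/* Output offsets are measured from s, not from s + pos. */
		if (m[0].rm_eo > m[0].rm_so)
			pos = (size_t)m[0].rm_eo;
		else
			pos = (size_t)m[0].rm_eo + 1;	/* step past an empty match */
	}
	regfree(&re);
	return (count);
}
```

Because
.Fa nmatch
is 1 here,
.Fa pmatch Ns [0]
serves as both the input range and the output location of each match.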
.Ss Fn regerror
The
.Fn regerror
function maps a non-zero
.Fa errcode
from either
.Fn regcomp
or
.Fn regexec
to a human-readable, printable message.
If
.Fa preg
is non-NULL, the error code should have arisen from use of the
.Ft regex_t
pointed to by
.Fa preg ,
and if the error code came from
.Fn regcomp ,
it should have been the result from the most recent
.Fn regcomp
using that
.Ft regex_t .
.Po
The
.Fn regerror
function may be able to supply a more detailed message using information
from the
.Ft regex_t .
.Pc
The
.Fn regerror
function places the NUL-terminated message into the buffer pointed to by
.Fa errbuf ,
limiting the length
.Pq including the NUL
to at most
.Fa errbuf_size
bytes.
If the whole message will not fit, as much of it as will fit before the
terminating NUL is supplied.
In any case, the returned value is the size of buffer needed to hold the whole
message
.Pq including terminating NUL .
If
.Fa errbuf_size
is 0,
.Fa errbuf
is ignored but the return value is still correct.
.Pp
If the
.Fa errcode
given to
.Fn regerror
is first ORed with
.Dv REG_ITOA ,
the
.Qq message
that results is the printable name of the error code, e.g.
.Qq Dv REG_NOMATCH ,
rather than an explanation thereof.
If
.Fa errcode
is
.Dv REG_ATOI ,
then
.Fa preg
shall be non-NULL and the
.Va re_endp
member of the structure it points to must point to the printable name of an
error code; in this case, the result in
.Fa errbuf
is the decimal digits of the numeric value of the error code
.Pq 0 if the name is not recognized .
.Dv REG_ITOA
and
.Dv REG_ATOI
are intended primarily as debugging facilities; they are extensions,
compatible with but not specified by
.St -p1003.2 ,
and should be used with caution in software intended to be portable to other
systems.
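.Pp
Since the return value of
.Fn regerror
is the buffer size required, a message can be fetched in two calls.
The sketch below shows one way to do so; the helper name
.Fn compile_error_message
is hypothetical and the choice of an ERE is an assumption made for the
example.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * compile_error_message() is a hypothetical helper, not part of libc.
 * If pattern fails to compile as an ERE, it returns a malloc()ed
 * copy of the regerror() message, which the caller must free().
 * It returns NULL if the pattern compiles or if no message can be
 * obtained or allocated.
 */
char *
compile_error_message(const char *pattern)
{
	regex_t re;
	char *buf;
	size_t need;
	int code;

	assert(pattern != NULL);
	code = regcomp(&re, pattern, REG_EXTENDED);
	if (code == 0) {
		regfree(&re);
		return (NULL);
	}
	/* First call: an errbuf_size of 0 only reports the size needed. */
	need = regerror(code, &re, NULL, 0);
	if (need == 0)
		return (NULL);
	buf = malloc(need);
	if (buf == NULL)
		return (NULL);
	/* Second call: the buffer now holds the whole message and its NUL. */
	(void) regerror(code, &re, buf, need);
	return (buf);
}
```

The
.Ft regex_t
is passed to
.Fn regerror
immediately after the failed
.Fn regcomp ,
which is the situation the text above describes for a non-NULL
.Fa preg .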
.Ss Fn regfree
The
.Fn regfree
function frees any dynamically-allocated storage associated with the compiled RE
pointed to by
.Fa preg .
The remaining
.Ft regex_t
is no longer a valid compiled RE and the effect of supplying it to
.Fn regexec
or
.Fn regerror
is undefined.
.Sh IMPLEMENTATION NOTES
There are a number of decisions that
.St -p1003.2
leaves up to the implementor,
either by explicitly saying
.Qq undefined
or by virtue of them being forbidden by the RE grammar.
This implementation treats them as follows.
.Pp
There is no particular limit on the length of REs, except insofar as memory is
limited.
Memory usage is approximately linear in RE size, and largely insensitive
to RE complexity, except for bounded repetitions.
.Pp
A backslashed character other than one specifically given a magic meaning by
.St -p1003.2
.Pq such magic meanings occur only in BREs
is taken as an ordinary character.
.Pp
Any unmatched
.Qq \&[
is a
.Dv REG_EBRACK
error.
.Pp
Equivalence classes cannot begin or end bracket-expression ranges.
The endpoint of one range cannot begin another.
.Pp
.Dv RE_DUP_MAX ,
the limit on repetition counts in bounded repetitions, is 255.
.Pp
A repetition operator
.Po
.Qq \&? ,
.Qq \&* ,
.Qq \&+ ,
or bounds
.Pc
cannot follow another repetition operator.
A repetition operator cannot begin an expression or subexpression
or follow
.Qq \&^
or
.Qq \&| .
.Pp
.Qq \&|
cannot appear first or last in a (sub)expression or after another
.Qq \&| ,
i.e., an operand of
.Qq \&|
cannot be an empty subexpression.
An empty parenthesized subexpression,
.Qq () ,
is legal and matches an empty (sub)string.
An empty string is not a legal RE.
.Pp
A
.Qq \&{
followed by a digit is considered the beginning of bounds for a bounded
repetition, which must then follow the syntax for bounds.
A
.Qq \&{
.Em not
followed by a digit is considered an ordinary character.
.Pp
.Qq \&^
and
.Qq \&$
beginning and ending subexpressions in BREs are anchors, not ordinary
characters.
.Sh RETURN VALUES
On successful completion, the
.Fn regcomp
function returns 0.
Otherwise, it returns an integer value indicating an error as described in
.In regex.h ,
and the content of
.Fa preg
is undefined.
.Pp
On successful completion, the
.Fn regexec
function returns 0.
Otherwise it returns
.Dv REG_NOMATCH
to indicate no match, or
.Dv REG_ENOSYS
to indicate that the function is not supported.
.Pp
Upon successful completion, the
.Fn regerror
function returns the number of bytes needed to hold the entire generated string.
Otherwise, it returns 0 to indicate that the function is not implemented.
.Pp
The
.Fn regfree
function returns no value.
.Pp
The following constants are defined as error return values:
.Pp
.Bl -tag -width "REG_ECOLLATE" -compact
.It Dv REG_NOMATCH
The
.Fn regexec
function failed to match.
.It Dv REG_BADPAT
Invalid regular expression.
.It Dv REG_ECOLLATE
Invalid collating element referenced.
.It Dv REG_ECTYPE
Invalid character class type referenced.
.It Dv REG_EESCAPE
Trailing
.Qq \&\e
in pattern.
.It Dv REG_ESUBREG
Number in
.Qq \&\e Ns Em digit
invalid or in error.
.It Dv REG_EBRACK
.Qq []
imbalance.
.It Dv REG_ENOSYS
The function is not supported.
.It Dv REG_EPAREN
.Qq \e(\e)
or
.Qq ()
imbalance.
.It Dv REG_EBRACE
.Qq \e{\e}
imbalance.
.It Dv REG_BADBR
Content of
.Qq \e{\e}
invalid: not a number, number too large, more than two
numbers, first larger than second.
.It Dv REG_ERANGE
Invalid endpoint in range expression.
.It Dv REG_ESPACE
Out of memory.
.It Dv REG_BADRPT
.Qq \&? ,
.Qq *
or
.Qq +
not preceded by valid regular expression.
.El
.Sh USAGE
An application could use:
.Bd -literal -offset Ds
regerror(code, preg, (char *)NULL, (size_t)0)
.Ed
.Pp
to find out how big a buffer is needed for the generated string,
.Fn malloc
a buffer to hold the string, and then call
.Fn regerror
again to get the string
.Po see
.Xr malloc 3C
.Pc .
Alternatively, it could allocate a fixed, static buffer that is big enough to
hold most strings, and then use
.Fn malloc
to allocate a larger buffer if it finds that this is too small.
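.Pp
The two-call approach described above can be sketched as follows.
This is an illustration under stated assumptions, not part of the library:
.Fn regerror_dup
is a hypothetical helper name, and returning
.Dv NULL
on any failure is an assumed convention.
.Bd -literal -offset Ds
```c
#include <regex.h>
#include <stdlib.h>

/*
 * Return a malloc()ed copy of the message for error code `code`,
 * or NULL on failure.  The caller frees the result.
 * regerror_dup is a hypothetical name used only for this sketch.
 */
char *
regerror_dup(int code, const regex_t *preg)
{
	size_t need;
	char *msg;

	/* first call: ask only for the size, terminator included */
	need = regerror(code, preg, (char *)NULL, (size_t)0);
	if (need == 0)
		return (NULL);
	msg = malloc(need);
	if (msg == NULL)
		return (NULL);
	/* second call: generate the string into the new buffer */
	(void) regerror(code, preg, msg, need);
	return (msg);
}
```
.Ed
.Pp
The caller passes the returned pointer to
.Fn free
once the message has been reported.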
.Sh EXAMPLES
Matching string against the extended regular expression in pattern.
.Bd -literal -offset Ds
#include <regex.h>

/*
 * Match string against the extended regular expression in
 * pattern, treating errors as no match.
 *
 * return 1 for match, 0 for no match
 */
int
match(const char *string, char *pattern)
{
	int status;
	regex_t re;

	if (regcomp(&re, pattern, REG_EXTENDED\||\|REG_NOSUB) != 0) {
		return(0);      /* report error */
	}
	status = regexec(&re, string, (size_t) 0, NULL, 0);
	regfree(&re);
	if (status != 0) {
		return(0);      /* no match, or error */
	}
	return(1);
}
.Ed
.Pp
The following demonstrates how the
.Dv REG_NOTBOL
flag could be used with
.Fn regexec
to find all substrings in a line that match a pattern supplied by a user.
.Pq For simplicity of the example, very little error checking is done.
.Bd -literal -offset Ds
(void) regcomp(&re, pattern, 0);
/* p (a char *) points at the part of the line not yet searched */
p = buffer;
/* this call to regexec() finds the first match on the line */
error = regexec(&re, p, 1, &pm, 0);
while (error == 0) {    /* while matches found */
	/* substring found between p + pm.rm_so and p + pm.rm_eo */
	p += pm.rm_eo;  /* offsets are relative to p, not to buffer */
	/* This call to regexec() finds the next match */
	error = regexec(&re, p, 1, &pm, REG_NOTBOL);
}
.Ed
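.Pp
A self-contained variant of the same loop might look like the sketch below.
It relies on the fact that the offsets reported by
.Fn regexec
are relative to the start of the string passed in that call, so it advances a
cursor rather than indexing from the start of the line each time.
The name
.Fn count_matches ,
the choice to count matches rather than report them, and the one-byte step
taken after an empty match are assumptions made for this illustration.
.Bd -literal -offset Ds
```c
#include <regex.h>

/*
 * Count the non-overlapping matches of the basic regular expression
 * `pattern` in `line`.  Returns -1 if the pattern does not compile.
 * count_matches is a hypothetical name used only for this sketch.
 */
int
count_matches(const char *pattern, const char *line)
{
	regex_t re;
	regmatch_t pm;
	const char *p = line;	/* unsearched remainder of the line */
	int eflags = 0;
	int n = 0;

	if (regcomp(&re, pattern, 0) != 0)
		return (-1);
	while (regexec(&re, p, 1, &pm, eflags) == 0) {
		n++;
		/* the match runs from p + pm.rm_so to p + pm.rm_eo */
		if (pm.rm_eo > pm.rm_so) {
			p += pm.rm_eo;
		} else {
			/* empty match: step one byte to make progress */
			if (p[pm.rm_eo] == 0)
				break;
			p += pm.rm_eo + 1;
		}
		/* p is no longer the true beginning of the line */
		eflags = REG_NOTBOL;
	}
	regfree(&re);
	return (n);
}
```
.Ed
.Pp
With this sketch, searching the line
.Qq aaa
for the pattern
.Qq ^a
counts one match, because every call after the first passes
.Dv REG_NOTBOL
and the anchor can no longer match at the cursor.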
.Sh ERRORS
No errors are defined.
.Sh CODE SET INDEPENDENCE
.Sy Enabled
.Sh INTERFACE STABILITY
.Sy Standard
.Sh MT-LEVEL
.Sy MT-Safe with exceptions
.Pp
The
.Fn regcomp
function can be used safely in a multithreaded application as long as
.Xr setlocale 3C
is not being called to change the locale.
.Sh SEE ALSO
.Xr attributes 7 ,
.Xr regex 7 ,
.Xr standards 7
.Pp
.St -p1003.2 ,
sections 2.8
.Pq Regular Expression Notation
and
B.5
.Pq C Binding for Regular Expression Matching .