.\" Copyright (c) 1992, 1993, 1994 Henry Spencer.
.\" Copyright (c) 1992, 1993, 1994
.\" The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" Henry Spencer.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.
.\" IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" Sun Microsystems, Inc. gratefully acknowledges The Open Group for
.\" permission to reproduce portions of its copyrighted documentation.
.\" Original documentation from The Open Group can be obtained online at
.\" http://www.opengroup.org/bookstore/.
.\"
.\" The Institute of Electrical and Electronics Engineers and The Open
.\" Group, have given us permission to reprint portions of their
.\" documentation.
.\"
.\" In the following statement, the phrase ``this text'' refers to portions
.\" of the system documentation.
.\"
.\" Portions of this text are reprinted and reproduced in electronic form
.\" in the SunOS Reference Manual, from IEEE Std 1003.1, 2004 Edition,
.\" Standard for Information Technology -- Portable Operating System
.\" Interface (POSIX), The Open Group Base Specifications Issue 6,
.\" Copyright (C) 2001-2004 by the Institute of Electrical and Electronics
.\" Engineers, Inc and The Open Group.  In the event of any discrepancy
.\" between these versions and the original IEEE and The Open Group
.\" Standard, the original IEEE and The Open Group Standard is the referee
.\" document.  The original Standard can be obtained online at
.\" http://www.opengroup.org/unix/online.html.
.\"
.\" This notice shall appear on any product containing this material.
.\"
.\" The contents of this file are subject to the terms of the
.\" Common Development and Distribution License (the "License").
.\" You may not use this file except in compliance with the License.
.\"
.\" You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
.\" or http://www.opensolaris.org/os/licensing.
.\" See the License for the specific language governing permissions
.\" and limitations under the License.
.\"
.\" When distributing Covered Code, include this CDDL HEADER in each
.\" file and include the License file at usr/src/OPENSOLARIS.LICENSE.
.\" If applicable, add the following below this CDDL HEADER, with the
.\" fields enclosed by brackets "[]" replaced with your own identifying
.\" information: Portions Copyright [yyyy] [name of copyright owner]
.\"
.\"
.\" Copyright (c) 1992, X/Open Company Limited.  All Rights Reserved.
.\" Portions Copyright (c) 2003, Sun Microsystems, Inc.  All Rights Reserved.
.\" Copyright 2017 Nexenta Systems, Inc.
.\"
.Dd June 14, 2017
.Dt REGCOMP 3C
.Os
.Sh NAME
.Nm regcomp ,
.Nm regexec ,
.Nm regerror ,
.Nm regfree
.Nd regular-expression library
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In regex.h
.Ft int
.Fo regcomp
.Fa "regex_t *restrict preg" "const char *restrict pattern" "int cflags"
.Fc
.Ft int
.Fo regexec
.Fa "const regex_t *restrict preg" "const char *restrict string"
.Fa "size_t nmatch" "regmatch_t pmatch[restrict]" "int eflags"
.Fc
.Ft size_t
.Fo regerror
.Fa "int errcode" "const regex_t *restrict preg"
.Fa "char *restrict errbuf" "size_t errbuf_size"
.Fc
.Ft void
.Fn regfree "regex_t *preg"
.Sh DESCRIPTION
These routines implement
.St -p1003.2
regular
expressions; see
.Xr regex 7 .
The
.Fn regcomp
function compiles an RE written as a string into an internal form,
.Fn regexec
matches that internal form against a string and reports results,
.Fn regerror
transforms error codes from either into human-readable messages,
and
.Fn regfree
frees any dynamically-allocated storage used by the internal form
of an RE.
.Pp
The header
.In regex.h
declares two structure types,
.Ft regex_t
and
.Ft regmatch_t ,
the former for compiled internal forms and the latter for match reporting.
It also declares the four functions, a type
.Ft regoff_t ,
and a number of constants with names starting with
.Qq Dv REG_ .
.Ss Fn regcomp
The
.Fn regcomp
function compiles the regular expression contained in the
.Fa pattern
string, subject to the flags in
.Fa cflags ,
and places the results in the
.Ft regex_t
structure pointed to by
.Fa preg .
The
.Fa cflags
argument is the bitwise OR of zero or more of the following flags:
.Bl -tag -width REG_EXTENDED
.It Dv REG_EXTENDED
Compile extended regular expressions
.Pq EREs ,
rather than the basic regular expressions
.Pq BREs
that are the default.
.It Dv REG_BASIC
This is a synonym for 0, provided as a counterpart to
.Dv REG_EXTENDED
to improve readability.
.It Dv REG_NOSPEC
Compile with recognition of all special characters turned off.
All characters are thus considered ordinary, so the RE is a literal string.
This is an extension, compatible with but not specified by
.St -p1003.2 ,
and should be used with caution in software intended to be portable to other
systems.
.Dv REG_EXTENDED
and
.Dv REG_NOSPEC
may not be used in the same call to
.Fn regcomp .
.It Dv REG_ICASE
Compile for matching that ignores upper/lower case distinctions.
See
.Xr regex 7 .
.It Dv REG_NOSUB
Compile for matching that need only report success or failure,
not what was matched.
.It Dv REG_NEWLINE
Compile for newline-sensitive matching.
By default, newline is a completely ordinary character with no special
meaning in either REs or strings.
With this flag,
.Qq [^
bracket expressions and
.Qq \&.
never match newline,
a
.Qq \&^
anchor matches the null string after any newline in the string in addition to
its normal function, and the
.Qq \&$
anchor matches the null string before any newline in the string in addition to
its normal function.
.It Dv REG_PEND
The regular expression ends, not at the first NUL, but just before the character
pointed to by the
.Va re_endp
member of the structure pointed to by
.Fa preg .
The
.Va re_endp
member is of type
.Vt "const char *" .
This flag permits inclusion of NULs in the RE; they are considered ordinary
characters.
This is an extension, compatible with but not specified by
.St -p1003.2 ,
and should be used with caution in software intended to be portable to other
systems.
.El
.Pp
When successful,
.Fn regcomp
returns 0 and fills in the structure pointed to by
.Fa preg .
One member of that structure
.Po other than
.Va re_endp
.Pc
is publicized:
.Va re_nsub ,
of type
.Ft size_t ,
contains the number of parenthesized subexpressions within the RE
.Po except that the value of this member is undefined if the
.Dv REG_NOSUB
flag was used
.Pc .
.Ss Fn regexec
The
.Fn regexec
function matches the compiled RE pointed to by
.Fa preg
against the
.Fa string ,
subject to the flags in
.Fa eflags ,
and reports results using
.Fa nmatch ,
.Fa pmatch ,
and the returned value.
The RE must have been compiled by a previous invocation of
.Fn regcomp .
The compiled form is not altered during execution of
.Fn regexec ,
so a single compiled RE can be used simultaneously by multiple threads.
.Pp
By default, the NUL-terminated string pointed to by
.Fa string
is considered to be the text of an entire line, minus any terminating
newline.
The
.Fa eflags
argument is the bitwise OR of zero or more of the following flags:
.Bl -tag -width REG_STARTEND
.It Dv REG_NOTBOL
The first character of the string is treated as the continuation
of a line.
This means that the anchors
.Qq \&^ ,
.Qq [[:<:]] ,
and
.Qq \e<
do not match before it; but see
.Dv REG_STARTEND
below.
This does not affect the behavior of newlines under
.Dv REG_NEWLINE .
.It Dv REG_NOTEOL
The NUL terminating the string does not end a line, so the
.Qq \&$
anchor does not match before it.
This does not affect the behavior of newlines under
.Dv REG_NEWLINE .
.It Dv REG_STARTEND
The string is considered to start at
.Fa string No +
.Fa pmatch Ns [0]. Ns Fa rm_so
and to end before the byte located at
.Fa string No +
.Fa pmatch Ns [0]. Ns Fa rm_eo ,
regardless of the value of
.Fa nmatch .
See below for the definition of
.Fa pmatch
and
.Fa nmatch .
This is an extension, compatible with but not specified by
.St -p1003.2 ,
and should be used with caution in software intended to be portable to other
systems.
.Pp
Without
.Dv REG_NOTBOL ,
the position
.Fa rm_so
is considered the beginning of a line, such that
.Qq \&^
matches before it, and the beginning of a word if there is a word character at
this position, such that
.Qq [[:<:]]
and
.Qq \e<
match before it.
.Pp
With
.Dv REG_NOTBOL ,
the character at position
.Fa rm_so
is treated as the continuation of a line, and if
.Fa rm_so
is greater than 0, the preceding character is taken into consideration.
If the preceding character is a newline and the regular expression was compiled
with
.Dv REG_NEWLINE ,
.Qq ^
matches before the string; if the preceding character is not a word character
but the string starts with a word character,
.Qq [[:<:]]
and
.Qq \e<
match before the string.
.El
.Pp
See
.Xr regex 7
for a discussion of what is matched in situations where an RE or a portion
thereof could match any of several substrings of
.Fa string .
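.Pp
The following is a minimal sketch, not taken from the original page, of how
.Dv REG_STARTEND
might be used to search only part of a buffer.
The helper name search_range and its parameters are hypothetical, and the
sketch assumes a libc that provides the
.Dv REG_STARTEND
extension.
The first element of the match array serves as input (the range to search)
and, on success, as output (the match offsets, measured from the start of
the buffer).

```c
#include <regex.h>
#include <stddef.h>

/*
 * search_range() is a hypothetical helper, not part of libc.
 * It searches only bytes [start, end) of buf for the compiled RE.
 * On a match it stores the match offsets, measured from buf, in
 * *so and *eo and returns 0; otherwise it returns the regexec()
 * error code (for example REG_NOMATCH).
 */
int
search_range(const regex_t *re, const char *buf, size_t start, size_t end,
    size_t *so, size_t *eo)
{
    regmatch_t pm[1];
    int rc;

    /* With REG_STARTEND, pm[0] is an input that delimits the range. */
    pm[0].rm_so = (regoff_t)start;
    pm[0].rm_eo = (regoff_t)end;

    rc = regexec(re, buf, 1, pm, REG_STARTEND);
    if (rc == 0) {
        /* nmatch is 1, so pm[0] now holds the match offsets. */
        *so = (size_t)pm[0].rm_so;
        *eo = (size_t)pm[0].rm_eo;
    }
    return (rc);
}
```

Because the range is given by offsets rather than by a terminating NUL, this
approach also suits buffers that are not NUL-terminated or that contain
embedded NULs.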
.Pp
If
.Dv REG_NOSUB
was specified in the compilation of the RE, or if
.Fa nmatch
is 0,
.Fn regexec
ignores the
.Fa pmatch
argument
.Po but see below for the case where
.Dv REG_STARTEND
is specified
.Pc .
Otherwise,
.Fa pmatch
points to an array of
.Fa nmatch
structures of type
.Ft regmatch_t .
Such a structure has at least the members
.Va rm_so
and
.Va rm_eo ,
both of type
.Ft regoff_t
.Po a signed arithmetic type at least as large as an
.Ft off_t
and a
.Ft ssize_t
.Pc ,
containing respectively the offset of the first character of a substring
and the offset of the first character after the end of the substring.
Offsets are measured from the beginning of the
.Fa string
argument given to
.Fn regexec .
An empty substring is denoted by equal offsets, both indicating the character
following the empty substring.
.Pp
The 0th member of the
.Fa pmatch
array is filled in to indicate what substring of
.Fa string
was matched by the entire RE.
Remaining members report what substring was matched by parenthesized
subexpressions within the RE; member
.Va i
reports subexpression
.Va i ,
with subexpressions counted
.Pq starting at 1
by the order of their opening parentheses in the RE, left to right.
Unused entries in the array
.Po corresponding either to subexpressions that did not participate in the match
at all, or to subexpressions that do not exist in the RE
.Po that is,
.Va i
>
.Fa preg Ns -> Ns Va re_nsub
.Pc
.Pc
have both
.Va rm_so
and
.Va rm_eo
set to -1.
If a subexpression participated in the match several times,
the reported substring is the last one it matched.
.Po Note, as an example in particular, that when the RE
.Qq (b*)+
matches
.Qq bbb ,
the parenthesized subexpression matches each of the three
.So Li b Sc Ns s
and then an infinite number of empty strings following the last
.Qq b ,
so the reported substring is one of the empties.
.Pc
.Pp
If
.Dv REG_STARTEND
is specified,
.Fa pmatch
must point to at least one
.Ft regmatch_t
.Po even if
.Fa nmatch
is 0 or
.Dv REG_NOSUB
was specified
.Pc ,
to hold the input offsets for
.Dv REG_STARTEND .
Use for output is still entirely controlled by
.Fa nmatch ;
if
.Fa nmatch
is 0 or
.Dv REG_NOSUB
was specified,
the value of
.Fa pmatch Ns [0]
will not be changed by a successful
.Fn regexec .
.Ss Fn regerror
The
.Fn regerror
function maps a non-zero
.Fa errcode
from either
.Fn regcomp
or
.Fn regexec
to a human-readable, printable message.
If
.Fa preg
is non-NULL, the error code should have arisen from use of the
.Ft regex_t
pointed to by
.Fa preg ,
and if the error code came from
.Fn regcomp ,
it should have been the result from the most recent
.Fn regcomp
using that
.Ft regex_t .
.Po The
.Fn regerror
function may be able to supply a more detailed message using information
from the
.Ft regex_t .
.Pc
The
.Fn regerror
function places the NUL-terminated message into the buffer pointed to by
.Fa errbuf ,
limiting the length
.Pq including the NUL
to at most
.Fa errbuf_size
bytes.
If the whole message will not fit, as much of it as will fit before the
terminating NUL is supplied.
In any case, the returned value is the size of the buffer needed to hold the
whole message
.Pq including terminating NUL .
If
.Fa errbuf_size
is 0,
.Fa errbuf
is ignored but the return value is still correct.
.Pp
If the
.Fa errcode
given to
.Fn regerror
is first ORed with
.Dv REG_ITOA ,
the
.Qq message
that results is the printable name of the error code, e.g.
.Qq Dv REG_NOMATCH ,
rather than an explanation thereof.
If
.Fa errcode
is
.Dv REG_ATOI ,
then
.Fa preg
shall be non-NULL and the
.Va re_endp
member of the structure it points to must point to the printable name of an
error code; in this case, the result in
.Fa errbuf
is the decimal digits of the numeric value of the error code
.Pq 0 if the name is not recognized .
.Dv REG_ITOA
and
.Dv REG_ATOI
are intended primarily as debugging facilities; they are extensions,
compatible with but not specified by
.St -p1003.2 ,
and should be used with caution in software intended to be portable to other
systems.
.Ss Fn regfree
The
.Fn regfree
function frees any dynamically-allocated storage associated with the compiled RE
pointed to by
.Fa preg .
The remaining
.Ft regex_t
is no longer a valid compiled RE and the effect of supplying it to
.Fn regexec
or
.Fn regerror
is undefined.
.Sh IMPLEMENTATION NOTES
There are a number of decisions that
.St -p1003.2
leaves up to the implementor,
either by explicitly saying
.Qq undefined
or by virtue of them being forbidden by the RE grammar.
This implementation treats them as follows.
.Pp
There is no particular limit on the length of REs, except insofar as memory is
limited.
Memory usage is approximately linear in RE size, and largely insensitive
to RE complexity, except for bounded repetitions.
.Pp
A backslashed character other than one specifically given a magic meaning by
.St -p1003.2
.Pq such magic meanings occur only in BREs
is taken as an ordinary character.
.Pp
Any unmatched
.Qq \&[
is a
.Dv REG_EBRACK
error.
.Pp
Equivalence classes cannot begin or end bracket-expression ranges.
The endpoint of one range cannot begin another.
.Pp
.Dv RE_DUP_MAX ,
the limit on repetition counts in bounded repetitions, is 255.
.Pp
A repetition operator
.Po
.Qq \&? ,
.Qq \&* ,
.Qq \&+ ,
or bounds
.Pc
cannot follow another repetition operator.
A repetition operator cannot begin an expression or subexpression
or follow
.Qq \&^
or
.Qq \&| .
.Pp
.Qq \&|
cannot appear first or last in a (sub)expression or after another
.Qq \&| ,
i.e., an operand of
.Qq \&|
cannot be an empty subexpression.
An empty parenthesized subexpression,
.Qq () ,
is legal and matches an empty (sub)string.
An empty string is not a legal RE.
.Pp
A
.Qq \&{
followed by a digit is considered the beginning of bounds for a bounded
repetition, which must then follow the syntax for bounds.
A
.Qq \&{
.Em not
followed by a digit is considered an ordinary character.
.Pp
.Qq \&^
and
.Qq \&$
beginning and ending subexpressions in BREs are anchors, not ordinary
characters.
.Sh RETURN VALUES
On successful completion, the
.Fn regcomp
function returns 0.
Otherwise, it returns an integer value indicating an error as described in
.In regex.h ,
and the content of
.Fa preg
is undefined.
.Pp
On successful completion, the
.Fn regexec
function returns 0.
Otherwise it returns
.Dv REG_NOMATCH
to indicate no match, or
.Dv REG_ENOSYS
to indicate that the function is not supported.
.Pp
Upon successful completion, the
.Fn regerror
function returns the number of bytes needed to hold the entire generated string.
Otherwise, it returns 0 to indicate that the function is not implemented.
.Pp
The
.Fn regfree
function returns no value.
.Pp
The following constants are defined as error return values:
.Pp
.Bl -tag -width "REG_ECOLLATE" -compact
.It Dv REG_NOMATCH
The
.Fn regexec
function failed to match.
.It Dv REG_BADPAT
Invalid regular expression.
.It Dv REG_ECOLLATE
Invalid collating element referenced.
.It Dv REG_ECTYPE
Invalid character class type referenced.
.It Dv REG_EESCAPE
Trailing
.Qq \&\e
in pattern.
.It Dv REG_ESUBREG
Number in
.Qq \&\e Ns Em digit
invalid or in error.
.It Dv REG_EBRACK
.Qq []
imbalance.
.It Dv REG_ENOSYS
The function is not supported.
.It Dv REG_EPAREN
.Qq \e(\e)
or
.Qq ()
imbalance.
.It Dv REG_EBRACE
.Qq \e{\e}
imbalance.
.It Dv REG_BADBR
Content of
.Qq \e{\e}
invalid: not a number, number too large, more than two
numbers, first larger than second.
.It Dv REG_ERANGE
Invalid endpoint in range expression.
.It Dv REG_ESPACE
Out of memory.
.It Dv REG_BADRPT
.Qq \&? ,
.Qq *
or
.Qq +
not preceded by valid regular expression.
.El
.Sh USAGE
An application could use:
.Bd -literal -offset Ds
regerror(code, preg, (char *)NULL, (size_t)0)
.Ed
.Pp
to find out how big a buffer is needed for the generated string,
.Fn malloc
a buffer to hold the string, and then call
.Fn regerror
again to get the string
.Po see
.Xr malloc 3C
.Pc .
Alternatively, it could allocate a fixed, static buffer that is big enough to
hold most strings, and then use
.Fn malloc
to allocate a larger buffer if it finds that this is too small.
.Sh EXAMPLES
Matching string against the extended regular expression in pattern.
.Bd -literal -offset Ds
#include <regex.h>

/*
 * Match string against the extended regular expression in
 * pattern, treating errors as no match.
 *
 * return 1 for match, 0 for no match
 */
int
match(const char *string, char *pattern)
{
	int status;
	regex_t re;

	if (regcomp(&re, pattern, REG_EXTENDED\||\|REG_NOSUB) != 0) {
		return(0);	/* report error */
	}
	status = regexec(&re, string, (size_t) 0, NULL, 0);
	regfree(&re);
	if (status != 0) {
		return(0);	/* report error */
	}
	return(1);
}
.Ed
.Pp
The following demonstrates how the
.Dv REG_NOTBOL
flag could be used with
.Fn regexec
to find all substrings in a line that match a pattern supplied by a user.
.Pq For simplicity of the example, very little error checking is done.
.Bd -literal -offset Ds
(void) regcomp(&re, pattern, 0);
p = buffer;	/* p: start of the part of the line not yet searched */
/* this call to regexec() finds the first match on the line */
error = regexec(&re, p, 1, &pm, 0);
while (error == 0) {	/* while matches found */
	/* substring found between p + pm.rm_so and p + pm.rm_eo */
	if (pm.rm_eo == 0)
		break;	/* empty match at p: stop rather than loop forever */
	p += pm.rm_eo;	/* offsets are relative to p, so advance past the match */
	/* This call to regexec() finds the next match */
	error = regexec(&re, p, 1, &pm, REG_NOTBOL);
}
.Ed
.Sh ERRORS
No errors are defined.
.Sh CODE SET INDEPENDENCE
.Sy Enabled
.Sh INTERFACE STABILITY
.Sy Standard
.Sh MT-LEVEL
.Sy MT-Safe with exceptions
.Pp
The
.Fn regcomp
function can be used safely in a multithreaded application as long as
.Xr setlocale 3C
is not being called to change the locale.
.Sh SEE ALSO
.Xr attributes 7 ,
.Xr regex 7 ,
.Xr standards 7
.Pp
.St -p1003.2 ,
sections 2.8
.Pq Regular Expression Notation
and
B.5
.Pq C Binding for Regular Expression Matching .