.\" $NetBSD: unvis.3,v 1.30 2019/05/08 15:37:41 bad Exp $
.\"
.\" Copyright (c) 1989, 1991, 1993
.\" The Regents of the University of California. All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.
.\" IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" @(#)unvis.3 8.2 (Berkeley) 12/11/93
.\"
.Dd May 8, 2019
.Dt UNVIS 3
.Os
.Sh NAME
.Nm unvis ,
.Nm strunvis ,
.Nm strnunvis ,
.Nm strunvisx ,
.Nm strnunvisx
.Nd decode a visual representation of characters
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In vis.h
.Ft int
.Fn unvis "char *cp" "int c" "int *astate" "int flag"
.Ft int
.Fn strunvis "char *dst" "const char *src"
.Ft int
.Fn strnunvis "char *dst" "size_t dlen" "const char *src"
.Ft int
.Fn strunvisx "char *dst" "const char *src" "int flag"
.Ft int
.Fn strnunvisx "char *dst" "size_t dlen" "const char *src" "int flag"
.Sh DESCRIPTION
The
.Fn unvis ,
.Fn strunvis
and
.Fn strunvisx
functions
are used to decode a visual representation of characters, as produced
by the
.Xr vis 3
function, back into
the original form.
.Pp
The
.Fn unvis
function is called with successive characters in
.Ar c
until a valid sequence is recognized, at which time the decoded
character is available at the character pointed to by
.Ar cp .
.Pp
The
.Fn strunvis
function decodes the characters pointed to by
.Ar src
into the buffer pointed to by
.Ar dst .
The
.Fn strunvis
function simply copies
.Ar src
to
.Ar dst ,
decoding any escape sequences along the way,
and returns the number of characters placed into
.Ar dst ,
or \-1 if an
invalid escape sequence was detected.
The size of
.Ar dst
should be equal to the size of
.Ar src
(that is, no expansion takes place during decoding).
.Pp
The
.Fn strunvisx
and
.Fn strnunvisx
functions do the same as the
.Fn strunvis
and
.Fn strnunvis
functions,
but take a flag that specifies the style the string
.Ar src
is encoded with.
The meaning of the flag is the same as explained below for
.Fn unvis .
.Pp
The
.Fn unvis
function implements a state machine that can be used to decode an
arbitrary stream of bytes.
All state associated with the bytes being decoded is stored outside the
.Fn unvis
function (that is, a pointer to the state is passed in), so
calls decoding different streams can be freely intermixed.
To start decoding a stream of bytes, first initialize an integer to zero.
Call
.Fn unvis
with each successive byte, along with a pointer
to this integer, and a pointer to a destination character.
The
.Fn unvis
function has several return codes that must be handled properly.
They are:
.Bl -tag -width UNVIS_VALIDPUSH
.It Li \&0 No (zero)
Another character is necessary; nothing has been recognized yet.
.It Dv UNVIS_VALID
A valid character has been recognized and is available at the location
pointed to by
.Fa cp .
.It Dv UNVIS_VALIDPUSH
A valid character has been recognized and is available at the location
pointed to by
.Fa cp ;
however, the character currently passed in should be passed in again.
.It Dv UNVIS_NOCHAR
A valid sequence was detected, but no character was produced.
This return code is necessary to indicate a logical break between characters.
.It Dv UNVIS_SYNBAD
An invalid escape sequence was detected, or the decoder is in an unknown state.
The decoder is placed into the starting state.
.El
.Pp
When all bytes in the stream have been processed, call
.Fn unvis
one more time with flag set to
.Dv UNVIS_END
to extract any remaining character (the character passed in is ignored).
.Pp
The
.Fa flag
argument is also used to specify the encoding style of the source.
If set to
.Dv VIS_NOESCAPE ,
.Fn unvis
will not decode backslash escapes.
If set to
.Dv VIS_HTTPSTYLE
or
.Dv VIS_HTTP1808 ,
.Fn unvis
will decode URI strings as specified in RFC 1808.
If set to
.Dv VIS_HTTP1866 ,
.Fn unvis
will decode entity references and numeric character references
as specified in RFC 1866.
If set to
.Dv VIS_MIMESTYLE ,
.Fn unvis
will decode MIME Quoted-Printable strings as specified in RFC 2045.
.Pp
The following code fragment illustrates a proper use of
.Fn unvis .
.Bd -literal -offset indent
int state = 0;
int ch;
char out;

while ((ch = getchar()) != EOF) {
again:
	switch (unvis(&out, ch, &state, 0)) {
	case 0:
	case UNVIS_NOCHAR:
		break;
	case UNVIS_VALID:
		(void)putchar(out);
		break;
	case UNVIS_VALIDPUSH:
		(void)putchar(out);
		goto again;
	case UNVIS_SYNBAD:
		errx(EXIT_FAILURE, "Bad character sequence!");
	}
}
if (unvis(&out, '\e0', &state, UNVIS_END) == UNVIS_VALID)
	(void)putchar(out);
.Ed
.Sh ERRORS
The functions
.Fn strunvis ,
.Fn strnunvis ,
.Fn strunvisx ,
and
.Fn strnunvisx
will return \-1 on error and set
.Va errno
to:
.Bl -tag -width Er
.It Bq Er EINVAL
An invalid escape sequence was detected, or the decoder is in an unknown state.
.El
.Pp
In addition, the functions
.Fn strnunvis
and
.Fn strnunvisx
can also set
.Va errno
on error to:
.Bl -tag -width Er
.It Bq Er ENOSPC
Not enough space to perform the conversion.
.El
.Sh SEE ALSO
.Xr unvis 1 ,
.Xr vis 1 ,
.Xr vis 3
.Rs
.%A R. Fielding
.%T Relative Uniform Resource Locators
.%O RFC1808
.Re
.Sh HISTORY
The
.Fn unvis
function
first appeared in
.Bx 4.4 .
The
.Fn strnunvis
and
.Fn strnunvisx
functions appeared in
.Nx 6.0
and
.Fx 9.2 .
.Sh BUGS
The names
.Dv VIS_HTTP1808
and
.Dv VIS_HTTP1866
are wrong.
Percent-encoding was defined in RFC 1738, the original RFC for URLs.
RFC 1866 defines HTML 2.0, an application of SGML, from which it
inherits concepts of numeric character references and entity
references.