.\" $NetBSD: vis.3,v 1.49 2017/08/05 20:22:29 wiz Exp $
.\" $FreeBSD$
.\"
.\" Copyright (c) 1989, 1991, 1993
.\" The Regents of the University of California. All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" @(#)vis.3 8.1 (Berkeley) 6/9/93
.\"
.Dd April 22, 2017
.Dt VIS 3
.Os
.Sh NAME
.Nm vis ,
.Nm nvis ,
.Nm strvis ,
.Nm stravis ,
.Nm strnvis ,
.Nm strvisx ,
.Nm strnvisx ,
.Nm strenvisx ,
.Nm svis ,
.Nm snvis ,
.Nm strsvis ,
.Nm strsnvis ,
.Nm strsvisx ,
.Nm strsnvisx ,
.Nm strsenvisx
.Nd visually encode characters
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In vis.h
.Ft char *
.Fn vis "char *dst" "int c" "int flag" "int nextc"
.Ft char *
.Fn nvis "char *dst" "size_t dlen" "int c" "int flag" "int nextc"
.Ft int
.Fn strvis "char *dst" "const char *src" "int flag"
.Ft int
.Fn stravis "char **dst" "const char *src" "int flag"
.Ft int
.Fn strnvis "char *dst" "size_t dlen" "const char *src" "int flag"
.Ft int
.Fn strvisx "char *dst" "const char *src" "size_t len" "int flag"
.Ft int
.Fn strnvisx "char *dst" "size_t dlen" "const char *src" "size_t len" "int flag"
.Ft int
.Fn strenvisx "char *dst" "size_t dlen" "const char *src" "size_t len" "int flag" "int *cerr_ptr"
.Ft char *
.Fn svis "char *dst" "int c" "int flag" "int nextc" "const char *extra"
.Ft char *
.Fn snvis "char *dst" "size_t dlen" "int c" "int flag" "int nextc" "const char *extra"
.Ft int
.Fn strsvis "char *dst" "const char *src" "int flag" "const char *extra"
.Ft int
.Fn strsnvis "char *dst" "size_t dlen" "const char *src" "int flag" "const char *extra"
.Ft int
.Fn strsvisx "char *dst" "const char *src" "size_t len" "int flag" "const char *extra"
.Ft int
.Fn strsnvisx "char *dst" "size_t dlen" "const char *src" "size_t len" "int flag" "const char *extra"
.Ft int
.Fn strsenvisx "char *dst" "size_t dlen" "const char *src" "size_t len" "int flag" "const char *extra" "int *cerr_ptr"
.Sh DESCRIPTION
The
.Fn vis
function
copies into
.Fa dst
a string which represents the character
.Fa c .
If
.Fa c
needs no encoding, it is copied in unaltered.
The string is null terminated, and a pointer to the end of the string is
returned.
The maximum length of any encoding is four
bytes (not including the trailing
.Dv NUL ) ;
thus, when
encoding a set of characters into a buffer, the size of the buffer should
be four times the number of bytes encoded, plus one for the trailing
.Dv NUL .
The flag parameter is used for altering the default range of
characters considered for encoding and for altering the visual
representation.
The additional character,
.Fa nextc ,
is only used when selecting the
.Dv VIS_CSTYLE
encoding format (explained below).
.Pp
The
.Fn strvis ,
.Fn stravis ,
.Fn strnvis ,
.Fn strvisx ,
and
.Fn strnvisx
functions copy into
.Fa dst
a visual representation of
the string
.Fa src .
The
.Fn strvis
and
.Fn strnvis
functions encode characters from
.Fa src
up to the
first
.Dv NUL .
The
.Fn strvisx
and
.Fn strnvisx
functions encode exactly
.Fa len
characters from
.Fa src
(this
is useful for encoding a block of data that may contain
.Dv NUL Ns 's ) .
Both forms
.Dv NUL
terminate
.Fa dst .
The size of
.Fa dst
must be four times the number
of bytes encoded from
.Fa src
(plus one for the
.Dv NUL ) .
Both
forms return the number of characters in
.Fa dst
(not including the trailing
.Dv NUL ) .
The
.Fn stravis
function allocates space dynamically to hold the string.
The
.Dq Nm n
versions of the functions also take an additional argument
.Fa dlen
that indicates the length of the
.Fa dst
buffer.
If
.Fa dlen
is not large enough to fit the converted string then the
.Fn strnvis
and
.Fn strnvisx
functions return \-1 and set
.Va errno
to
.Dv ENOSPC .
The
.Fn strenvisx
function takes an additional argument,
.Fa cerr_ptr ,
that is used to pass in and out a multibyte conversion error flag.
This is useful when processing single characters at a time when
it is possible that the locale may be set to something other
than the locale of the characters in the input data.
.Pp
The functions
.Fn svis ,
.Fn snvis ,
.Fn strsvis ,
.Fn strsnvis ,
.Fn strsvisx ,
.Fn strsnvisx ,
and
.Fn strsenvisx
correspond to
.Fn vis ,
.Fn nvis ,
.Fn strvis ,
.Fn strnvis ,
.Fn strvisx ,
.Fn strnvisx ,
and
.Fn strenvisx
but have an additional argument
.Fa extra ,
pointing to a
.Dv NUL
terminated list of characters.
These characters will be copied encoded or backslash-escaped into
.Fa dst .
These functions are useful e.g. to remove the special meaning
of certain characters to shells.
.Pp
The encoding is a unique, invertible representation composed entirely of
graphic characters; it can be decoded back into the original form using
the
.Xr unvis 3 ,
.Xr strunvis 3
or
.Xr strnunvis 3
functions.
.Pp
There are two parameters that can be controlled: the range of
characters that are encoded (applies only to
.Fn vis ,
.Fn nvis ,
.Fn strvis ,
.Fn strnvis ,
.Fn strvisx ,
and
.Fn strnvisx ) ,
and the type of representation used.
By default, all non-graphic characters,
except space, tab, and newline are encoded (see
.Xr isgraph 3 ) .
The following flags
alter this:
.Bl -tag -width VIS_WHITEX
.It Dv VIS_DQ
Also encode double quotes.
.It Dv VIS_GLOB
Also encode the magic characters
.Ql ( * ,
.Ql \&? ,
.Ql \&[ ,
and
.Ql # )
recognized by
.Xr glob 3 .
.It Dv VIS_SHELL
Also encode the meta characters used by shells (in addition to the glob
characters):
.Ql ( ' ,
.Ql ` ,
.Ql \&" ,
.Ql \&; ,
.Ql & ,
.Ql < ,
.Ql > ,
.Ql \&( ,
.Ql \&) ,
.Ql \&| ,
.Ql \&] ,
.Ql \e ,
.Ql $ ,
.Ql \&! ,
.Ql \&^ ,
and
.Ql ~ ) .
.It Dv VIS_SP
Also encode space.
.It Dv VIS_TAB
Also encode tab.
.It Dv VIS_NL
Also encode newline.
.It Dv VIS_WHITE
Synonym for
.Dv VIS_SP | VIS_TAB | VIS_NL .
.It Dv VIS_META
Synonym for
.Dv VIS_WHITE | VIS_GLOB | VIS_SHELL .
.It Dv VIS_SAFE
Only encode
.Dq unsafe
characters.
Unsafe means control characters which may cause common terminals to perform
unexpected functions.
Currently this form allows space, tab, newline, backspace, bell, and
return \(em in addition to all graphic characters \(em unencoded.
.El
.Pp
(The above flags have no effect for
.Fn svis ,
.Fn snvis ,
.Fn strsvis ,
.Fn strsnvis ,
.Fn strsvisx ,
and
.Fn strsnvisx .
When using these functions, place all graphic characters to be
encoded in an array pointed to by
.Fa extra .
In general, the backslash character should be included in this array, see the
warning on the use of the
.Dv VIS_NOSLASH
flag below).
.Pp
There are six forms of encoding.
All forms use the backslash character
.Ql \e
to introduce a special
sequence; two backslashes are used to represent a real backslash,
except
.Dv VIS_HTTPSTYLE
that uses
.Ql % ,
or
.Dv VIS_MIMESTYLE
that uses
.Ql = .
These are the visual formats:
.Bl -tag -width VIS_CSTYLE
.It (default)
Use an
.Ql M
to represent meta characters (characters with the 8th
bit set), and use caret
.Ql ^
to represent control characters (see
.Xr iscntrl 3 ) .
The following formats are used:
.Bl -tag -width xxxxx
.It Dv \e^C
Represents the control character
.Ql C .
Spans characters
.Ql \e000
through
.Ql \e037 ,
and
.Ql \e177
(as
.Ql \e^? ) .
.It Dv \eM-C
Represents character
.Ql C
with the 8th bit set.
Spans characters
.Ql \e241
through
.Ql \e376 .
.It Dv \eM^C
Represents control character
.Ql C
with the 8th bit set.
Spans characters
.Ql \e200
through
.Ql \e237 ,
and
.Ql \e377
(as
.Ql \eM^? ) .
.It Dv \e040
Represents
.Tn ASCII
space.
.It Dv \e240
Represents Meta-space.
.El
.It Dv VIS_CSTYLE
Use C-style backslash sequences to represent standard non-printable
characters.
The following sequences are used to represent the indicated characters:
.Bd -unfilled -offset indent
.Li \ea Tn \(em BEL No (007)
.Li \eb Tn \(em BS No (010)
.Li \ef Tn \(em NP No (014)
.Li \en Tn \(em NL No (012)
.Li \er Tn \(em CR No (015)
.Li \es Tn \(em SP No (040)
.Li \et Tn \(em HT No (011)
.Li \ev Tn \(em VT No (013)
.Li \e0 Tn \(em NUL No (000)
.Ed
.Pp
When using this format, the
.Fa nextc
parameter is looked at to determine if a
.Dv NUL
character can be encoded as
.Ql \e0
instead of
.Ql \e000 .
If
.Fa nextc
is an octal digit, the latter representation is used to
avoid ambiguity.
.Pp
Non-printable characters without C-style
backslash sequences use the default representation.
.It Dv VIS_OCTAL
Use a three digit octal sequence.
The form is
.Ql \eddd
where
.Em d
represents an octal digit.
.It Dv VIS_CSTYLE \&| Dv VIS_OCTAL
Same as
.Dv VIS_CSTYLE
except that non-printable characters without C-style
backslash sequences use a three digit octal sequence.
.It Dv VIS_HTTPSTYLE
Use URI encoding as described in RFC 1738.
The form is
.Ql %xx
where
.Em x
represents a lower case hexadecimal digit.
.It Dv VIS_MIMESTYLE
Use MIME Quoted-Printable encoding as described in RFC 2045, only don't
break lines and don't handle CRLF.
The form is
.Ql =XX
where
.Em X
represents an upper case hexadecimal digit.
.El
.Pp
There is one additional flag,
.Dv VIS_NOSLASH ,
which inhibits the
doubling of backslashes and the backslash before the default
format (that is, control characters are represented by
.Ql ^C
and
meta characters as
.Ql M-C ) .
With this flag set, the encoding is
ambiguous and non-invertible.
.Sh MULTIBYTE CHARACTER SUPPORT
These functions support multibyte character input.
The encoding conversion is influenced by the setting of the
.Ev LC_CTYPE
environment variable which defines the set of characters
that can be copied without encoding.
.Pp
If
.Dv VIS_NOLOCALE
is set, processing is done assuming the C locale and overriding
any other environment settings.
.Pp
When 8-bit data is present in the input,
.Ev LC_CTYPE
must be set to the correct locale or to the C locale.
If the locales of the data and the conversion are mismatched,
multibyte character recognition may fail and encoding will be performed
byte-by-byte instead.
.Pp
As noted above,
.Fa dst
must be four times the number of bytes processed from
.Fa src .
But note that each multibyte character can be up to
.Dv MB_LEN_MAX
bytes
.\" (see
.\" .Xr multibyte 3 )
so in terms of multibyte characters,
.Fa dst
must be four times
.Dv MB_LEN_MAX
times the number of characters processed from
.Fa src .
.Sh ENVIRONMENT
.Bl -tag -width ".Ev LC_CTYPE"
.It Ev LC_CTYPE
Specify the locale of the input data.
Set to C if the input data locale is unknown.
.El
.Sh ERRORS
The functions
.Fn nvis
and
.Fn snvis
will return
.Dv NULL
and the functions
.Fn strnvis ,
.Fn strnvisx ,
.Fn strsnvis ,
and
.Fn strsnvisx
will return \-1 when the
.Fa dlen
destination buffer size is not enough to perform the conversion while
setting
.Va errno
to:
.Bl -tag -width ".Bq Er ENOSPC"
.It Bq Er ENOSPC
The destination buffer size is not large enough to perform the conversion.
.El
.Sh SEE ALSO
.Xr unvis 1 ,
.Xr vis 1 ,
.Xr glob 3 ,
.\" .Xr multibyte 3 ,
.Xr unvis 3
.Rs
.%A T. Berners-Lee
.%T Uniform Resource Locators (URL)
.%O "RFC 1738"
.Re
.Rs
.%T "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies"
.%O "RFC 2045"
.Re
.Sh HISTORY
The
.Fn vis ,
.Fn strvis ,
and
.Fn strvisx
functions first appeared in
.Bx 4.4 .
The
.Fn svis ,
.Fn strsvis ,
and
.Fn strsvisx
functions appeared in
.Nx 1.5
and
.Fx 9.2 .
The buffer size limited versions of the functions
.Po Fn nvis ,
.Fn strnvis ,
.Fn strnvisx ,
.Fn snvis ,
.Fn strsnvis ,
and
.Fn strsnvisx Pc
appeared in
.Nx 6.0
and
.Fx 9.2 .
Multibyte character support was added in
.Nx 7.0
and
.Fx 9.2 .