.\" Copyright (c) 1991, 1993
.\" The Regents of the University of California. All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" the Institute of Electrical and Electronics Engineers, Inc.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" @(#)tr.1 8.1 (Berkeley) 6/6/93
.\" $FreeBSD$
.\"
.Dd October 13, 2006
.Dt TR 1
.Os
.Sh NAME
.Nm tr
.Nd translate characters
.Sh SYNOPSIS
.Nm
.Op Fl Ccsu
.Ar string1 string2
.Nm
.Op Fl Ccu
.Fl d
.Ar string1
.Nm
.Op Fl Ccu
.Fl s
.Ar string1
.Nm
.Op Fl Ccu
.Fl ds
.Ar string1 string2
.Sh DESCRIPTION
The
.Nm
utility copies the standard input to the standard output with substitution
or deletion of selected characters.
.Pp
The following options are available:
.Bl -tag -width Ds
.It Fl C
Complement the set of characters in
.Ar string1 ,
that is
.Dq Fl C Li ab
includes every character except for
.Ql a
and
.Ql b .
.It Fl c
Same as
.Fl C
but complement the set of values in
.Ar string1 .
.It Fl d
Delete characters in
.Ar string1
from the input.
.It Fl s
Squeeze multiple occurrences of the characters listed in the last
operand (either
.Ar string1
or
.Ar string2 )
in the input into a single instance of the character.
This occurs after all deletion and translation is completed.
.It Fl u
Guarantee that any output is unbuffered.
.El
.Pp
In the first synopsis form, the characters in
.Ar string1
are translated into the characters in
.Ar string2
where the first character in
.Ar string1
is translated into the first character in
.Ar string2
and so on.
If
.Ar string1
is longer than
.Ar string2 ,
the last character found in
.Ar string2
is duplicated until
.Ar string1
is exhausted.
.Pp
In the second synopsis form, the characters in
.Ar string1
are deleted from the input.
.Pp
In the third synopsis form, the characters in
.Ar string1
are compressed as described for the
.Fl s
option.
.Pp
In the fourth synopsis form, the characters in
.Ar string1
are deleted from the input, and the characters in
.Ar string2
are compressed as described for the
.Fl s
option.
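.Pp
As a minimal sketch of the four synopsis forms described above
(the input strings are illustrative only, and the results noted in the
comments assume plain
.Tn ASCII
input):
.Bd -literal -offset indent
```shell
# Form 1: translate 'l' to 'x' and 'o' to 'y'.
echo 'hello world' | tr 'lo' 'xy'      # prints: hexxy wyrxd

# Form 2: delete every 'l' and 'o'.
echo 'hello world' | tr -d 'lo'        # prints: he wrd

# Form 3: squeeze runs of 'a' and runs of 'b' to one character each.
echo 'aaabbbccc' | tr -s 'ab'          # prints: abccc

# Form 4: delete 'a', then squeeze runs of 'c'.
echo 'aaabbbccc' | tr -ds 'a' 'c'      # prints: bbbc
```
.Ed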
.Pp
The following conventions can be used in
.Ar string1
and
.Ar string2
to specify sets of characters:
.Bl -tag -width [:equiv:]
.It character
Any character not described by one of the following conventions
represents itself.
.It \eoctal
A backslash followed by 1, 2 or 3 octal digits represents a character
with that encoded value.
To follow an octal sequence with a digit as a character, left zero-pad
the octal sequence to the full 3 octal digits.
.It \echaracter
A backslash followed by certain special characters maps to special
values.
.Bl -column "\ea"
.It "\ea" Ta "<alert character>"
.It "\eb" Ta "<backspace>"
.It "\ef" Ta "<form-feed>"
.It "\en" Ta "<newline>"
.It "\er" Ta "<carriage return>"
.It "\et" Ta "<tab>"
.It "\ev" Ta "<vertical tab>"
.El
.Pp
A backslash followed by any other character maps to that character.
.It c-c
For non-octal range endpoints,
represents the range of characters between the range endpoints, inclusive,
in ascending order,
as defined by the collation sequence.
If either or both of the range endpoints are octal sequences, it
represents the range of specific coded values between the
range endpoints, inclusive.
.It [:class:]
Represents all characters belonging to the defined character class.
Class names are:
.Bl -column "phonogram"
.It "alnum" Ta "<alphanumeric characters>"
.It "alpha" Ta "<alphabetic characters>"
.It "blank" Ta "<whitespace characters>"
.It "cntrl" Ta "<control characters>"
.It "digit" Ta "<numeric characters>"
.It "graph" Ta "<graphic characters>"
.It "ideogram" Ta "<ideographic characters>"
.It "lower" Ta "<lower-case alphabetic characters>"
.It "phonogram" Ta "<phonographic characters>"
.It "print" Ta "<printable characters>"
.It "punct" Ta "<punctuation characters>"
.It "rune" Ta "<valid characters>"
.It "space" Ta "<space characters>"
.It "special" Ta "<special characters>"
.It "upper" Ta "<upper-case characters>"
.It "xdigit" Ta "<hexadecimal characters>"
.El
.Pp
.\" All classes may be used in
.\" .Ar string1 ,
.\" and in
.\" .Ar string2
.\" when both the
.\" .Fl d
.\" and
.\" .Fl s
.\" options are specified.
.\" Otherwise, only the classes ``upper'' and ``lower'' may be used in
.\" .Ar string2
.\" and then only when the corresponding class (``upper'' for ``lower''
.\" and vice-versa) is specified in the same relative position in
.\" .Ar string1 .
.\" .Pp
When
.Dq Li [:lower:]
appears in
.Ar string1
and
.Dq Li [:upper:]
appears in the same relative position in
.Ar string2 ,
it represents the character pairs from the
.Dv toupper
mapping in the
.Ev LC_CTYPE
category of the current locale.
When
.Dq Li [:upper:]
appears in
.Ar string1
and
.Dq Li [:lower:]
appears in the same relative position in
.Ar string2 ,
it represents the character pairs from the
.Dv tolower
mapping in the
.Ev LC_CTYPE
category of the current locale.
.Pp
With the exception of case conversion,
characters in the classes are in unspecified order.
.Pp
For specific information as to which
.Tn ASCII
characters are included
in these classes, see
.Xr ctype 3
and related manual pages.
.It [=equiv=]
Represents all characters belonging to the same equivalence class as
.Ar equiv ,
ordered by their encoded values.
.It [#*n]
Represents
.Ar n
repeated occurrences of the character represented by
.Ar # .
This
expression is only valid when it occurs in
.Ar string2 .
If
.Ar n
is omitted or is zero, it is interpreted as large enough to extend the
.Ar string2
sequence to the length of
.Ar string1 .
If
.Ar n
has a leading zero, it is interpreted as an octal value; otherwise,
it is interpreted as a decimal value.
.El
.Sh ENVIRONMENT
The
.Ev LANG , LC_ALL , LC_CTYPE
and
.Ev LC_COLLATE
environment variables affect the execution of
.Nm
as described in
.Xr environ 7 .
.Sh EXIT STATUS
.Ex -std
.Sh EXAMPLES
The following examples are shown as given to the shell:
.Pp
Create a list of the words in file1, one per line, where a word is taken to
be a maximal string of letters.
.Pp
.D1 Li "tr -cs \*q[:alpha:]\*q \*q\en\*q < file1"
.Pp
Translate the contents of file1 to upper-case.
.Pp
.D1 Li "tr \*q[:lower:]\*q \*q[:upper:]\*q < file1"
.Pp
(This should be preferred over the traditional
.Ux
idiom of
.Dq Li "tr a-z A-Z" ,
since it works correctly in all locales.)
.Pp
Strip out non-printable characters from file1.
.Pp
.D1 Li "tr -cd \*q[:print:]\*q < file1"
.Pp
Remove diacritical marks from all accented variants of the letter
.Ql e :
.Pp
.Dl "tr \*q[=e=]\*q \*qe\*q"
.Sh COMPATIBILITY
.Fx
implementations of
.Nm
did not order characters in range expressions according to the current
locale's collation order, making it possible to convert accented Latin
characters from upper to lower case using
the traditional
.Ux
idiom of
.Dq Li "tr A-Z a-z" .
As noted in the
.Sx EXAMPLES
section above, the character class expressions
.Dq Li [:lower:]
and
.Dq Li [:upper:]
should be used instead of explicit character ranges like
.Dq Li a-z
and
.Dq Li A-Z .
.Pp
The
.Dq Li [=equiv=]
expression is implemented for single byte locales only.
.Pp
System V has historically implemented character ranges using the syntax
.Dq Li [c-c]
instead of the
.Dq Li c-c
used by historic
.Bx
implementations and
standardized by POSIX.
System V shell scripts should work under this implementation as long as
the range is intended to map onto another range, i.e., the command
.Dq Li "tr [a-z] [A-Z]"
will work as it will map the
.Ql \&[
character in
.Ar string1
to the
.Ql \&[
character in
.Ar string2 .
However, if the shell script is deleting or squeezing characters as in
the command
.Dq Li "tr -d [a-z]" ,
the characters
.Ql \&[
and
.Ql \&]
will be
included in the deletion or compression list, which would not have happened
under a historic System V implementation.
Additionally, any scripts that depended on the sequence
.Dq Li a-z
to
represent the three characters
.Ql a ,
.Ql \-
and
.Ql z
will have to be
rewritten as
.Dq Li a\e-z .
.Pp
The
.Nm
utility has historically not permitted the manipulation of NUL bytes in
its input and, additionally, stripped NULs from its input stream.
This implementation has removed this behavior as a bug.
.Pp
The
.Nm
utility has historically been extremely forgiving of syntax errors;
for example, the
.Fl c
and
.Fl s
options were ignored unless two strings were specified.
This implementation will not permit illegal syntax.
.Sh STANDARDS
The
.Nm
utility conforms to
.St -p1003.1-2001 .
The
.Dq ideogram ,
.Dq phonogram ,
.Dq rune ,
and
.Dq special
character classes are extensions.
.Pp
It should be noted that the feature wherein the last character of
.Ar string2
is duplicated if
.Ar string2
has fewer characters than
.Ar string1
is permitted by POSIX but is not required.
Shell scripts attempting to be portable to other POSIX systems should use
the
.Dq Li [#*]
convention instead of relying on this behavior.
The
.Fl u
option is an extension to the
.St -p1003.1-2001
standard.
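.Pp
As a hedged illustration of the portable padding convention mentioned above,
the following sketch pads
.Ar string2
explicitly with
.Dq Li [y*]
rather than relying on the last character being duplicated
(the input string and the characters chosen are arbitrary):
.Bd -literal -offset indent
```shell
# Map 'a' to 'x'; '[y*]' repeats 'y' for the rest of string1 ('b', 'c', 'd').
echo 'abcd' | tr 'abcd' 'x[y*]'        # prints: xyyy
```
.Ed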