.\" Copyright (c) 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" the Institute of Electrical and Electronics Engineers, Inc.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by the University of
.\"	California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)tr.1	8.1 (Berkeley) 6/6/93
.\"
.Dd June 6, 1993
.Dt TR 1
.Os
.Sh NAME
.Nm tr
.Nd translate characters
.Sh SYNOPSIS
.Nm tr
.Op Fl cs
.Ar string1 string2
.Nm tr
.Op Fl c
.Fl d
.Ar string1
.Nm tr
.Op Fl c
.Fl s
.Ar string1
.Nm tr
.Op Fl c
.Fl ds
.Ar string1 string2
.Sh DESCRIPTION
The
.Nm tr
utility copies the standard input to the standard output with substitution
or deletion of selected characters.
.Pp
The following options are available:
.Bl -tag -width Ds
.It Fl c
Complements the set of characters in
.Ar string1 ,
that is, ``-c ab'' includes every character except ``a'' and ``b''.
.It Fl d
The
.Fl d
option causes characters to be deleted from the input.
.It Fl s
The
.Fl s
option squeezes multiple occurrences of the characters listed in the last
operand (either
.Ar string1
or
.Ar string2 )
in the input into a single instance of the character.
This occurs after all deletion and translation is completed.
.El
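.Pp
As a minimal sketch of the
.Fl c
option, assuming a POSIX shell with
.Xr printf 1 ;
the sample word and the variable names are illustrative only:

```shell
# Keep only "a", "b" and newline: every character in the complement
# of string1 is deleted.
kept=$(printf 'cabbage\n' | tr -cd 'ab\n')

# Translate every character in the complement of string1 to ".".
masked=$(printf 'cabbage\n' | tr -c 'ab\n' '.')

printf '%s\n%s\n' "$kept" "$masked"
# prints "abba" and ".abba.."
```

The newline is listed in
.Ar string1
so that the line terminator is not itself deleted or translated.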
.Pp
In the first synopsis form, the characters in
.Ar string1
are translated into the characters in
.Ar string2
where the first character in
.Ar string1
is translated into the first character in
.Ar string2
and so on.
If
.Ar string1
is longer than
.Ar string2 ,
the last character found in
.Ar string2
is duplicated until
.Ar string1
is exhausted.
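.Pp
The padding rule above can be sketched as follows, assuming a POSIX shell;
the strings are illustrative, and other implementations are not required to
pad in this way:

```shell
# string1 has five characters and string2 only two, so the last
# character of string2 ("y") is reused for "c", "d" and "e".
padded=$(printf 'abcde\n' | tr 'abcde' 'xy')

printf '%s\n' "$padded"
# prints "xyyyy"
```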
.Pp
In the second synopsis form, the characters in
.Ar string1
are deleted from the input.
.Pp
In the third synopsis form, the characters in
.Ar string1
are compressed as described for the
.Fl s
option.
.Pp
In the fourth synopsis form, the characters in
.Ar string1
are deleted from the input, and the characters in
.Ar string2
are compressed as described for the
.Fl s
option.
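.Pp
The second, third and fourth synopsis forms can be compared on one input.
This is a sketch only, assuming a POSIX shell; the input string and variable
names are made up for illustration:

```shell
input='aaa   bbb   123ccc'

# Second form: delete every digit.
deleted=$(printf '%s\n' "$input" | tr -d '[:digit:]')

# Third form: squeeze runs of "a", "b" and space.
squeezed=$(printf '%s\n' "$input" | tr -s 'ab ')

# Fourth form: delete digits, then squeeze "a", "b" and space.
both=$(printf '%s\n' "$input" | tr -ds '[:digit:]' 'ab ')

printf '%s\n' "$deleted" "$squeezed" "$both"
# prints "aaa   bbb   ccc", "a b 123ccc" and "a b ccc"
```

Note that "c" is not listed in the squeeze set, so the run "ccc" is left
alone in every case.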
.Pp
The following conventions can be used in
.Ar string1
and
.Ar string2
to specify sets of characters:
.Bl -tag -width [:equiv:]
.It character
Any character not described by one of the following conventions
represents itself.
.It \eoctal
A backslash followed by 1, 2 or 3 octal digits represents a character
with that encoded value.
To follow an octal sequence with a digit as a character, left zero-pad
the octal sequence to the full 3 octal digits.
.It \echaracter
A backslash followed by certain special characters maps to special
values.
.sp
.Bl -column
.It \ea	<alert character>
.It \eb	<backspace>
.It \ef	<form-feed>
.It \en	<newline>
.It \er	<carriage return>
.It \et	<tab>
.It \ev	<vertical tab>
.El
.sp
A backslash followed by any other character maps to that character.
.It c-c
Represents the range of characters between the range endpoints, inclusively.
.It [:class:]
Represents all characters belonging to the defined character class.
Class names are:
.sp
.Bl -column
.It alnum	<alphanumeric characters>
.It alpha	<alphabetic characters>
.It cntrl	<control characters>
.It digit	<numeric characters>
.It graph	<graphic characters>
.It lower	<lower-case alphabetic characters>
.It print	<printable characters>
.It punct	<punctuation characters>
.It space	<space characters>
.It upper	<upper-case characters>
.It xdigit	<hexadecimal characters>
.El
.Pp
.\" All classes may be used in
.\" .Ar string1 ,
.\" and in
.\" .Ar string2
.\" when both the
.\" .Fl d
.\" and
.\" .Fl s
.\" options are specified.
.\" Otherwise, only the classes ``upper'' and ``lower'' may be used in
.\" .Ar string2
.\" and then only when the corresponding class (``upper'' for ``lower''
.\" and vice-versa) is specified in the same relative position in
.\" .Ar string1 .
.\" .Pp
With the exception of the ``upper'' and ``lower'' classes, characters
in the classes are in unspecified order.
In the ``upper'' and ``lower'' classes, characters are entered in
ascending order.
.Pp
For specific information as to which ASCII characters are included
in these classes, see
.Xr ctype 3
and related manual pages.
.It [=equiv=]
Represents all characters or collating (sorting) elements belonging to
the same equivalence class as
.Ar equiv .
If
there is a secondary ordering within the equivalence class, the characters
are ordered in ascending sequence.
Otherwise, they are ordered according to their encoded values.
An example of an equivalence class might be ``c'' and ``ch'' in Spanish;
English has no equivalence classes.
.It [#*n]
Represents
.Ar n
repeated occurrences of the character represented by
.Ar # .
This
expression is only valid when it occurs in
.Ar string2 .
If
.Ar n
is omitted or is zero, it is interpreted as large enough to extend the
.Ar string2
sequence to the length of
.Ar string1 .
If
.Ar n
has a leading zero, it is interpreted as an octal value; otherwise,
it is interpreted as a decimal value.
.El
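.Pp
A few of the conventions listed above, sketched under the assumption of a
POSIX shell and an ASCII-compatible character set; the inputs and variable
names are illustrative only:

```shell
# \040 is the octal escape for a space.
octal=$(printf 'a b c\n' | tr '\040' '_')

# a-c and A-C are inclusive ranges of equal length.
range=$(printf 'cab\n' | tr 'a-c' 'A-C')

# [x*3] stands for "xxx" and is only valid in string2.
repeat=$(printf 'abc\n' | tr 'abc' '[x*3]')

printf '%s\n' "$octal" "$range" "$repeat"
# prints "a_b_c", "CAB" and "xxx"
```

The strings are single-quoted so that the shell passes the backslash and the
brackets to
.Nm tr
unchanged.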
.Pp
The
.Nm tr
utility exits 0 on success, and >0 if an error occurs.
.Sh EXAMPLES
The following examples are shown as given to the shell:
.sp
Create a list of the words in file1, one per line, where a word is taken to
be a maximal string of letters.
.sp
.D1 Li "tr -cs \*q[:alpha:]\*q \*q\en\*q < file1"
.sp
Translate the contents of file1 to upper-case.
.sp
.D1 Li "tr \*q[:lower:]\*q \*q[:upper:]\*q < file1"
.sp
Strip out non-printable characters from file1.
.sp
.D1 Li "tr -cd \*q[:print:]\*q < file1"
.Sh COMPATIBILITY
System V has historically implemented character ranges using the syntax
``[c-c]'' instead of the ``c-c'' used by historic BSD implementations and
standardized by POSIX.
System V shell scripts should work under this implementation as long as
the range is intended to map onto another range, i.e., the command
``tr [a-z] [A-Z]'' will work as it will map the ``['' character in
.Ar string1
to the ``['' character in
.Ar string2 .
However, if the shell script is deleting or squeezing characters as in
the command ``tr -d [a-z]'', the characters ``['' and ``]'' will be
included in the deletion or compression list which would not have happened
under an historic System V implementation.
Additionally, any scripts that depended on the sequence ``a-z'' to
represent the three characters ``a'', ``-'' and ``z'' will have to be
rewritten as ``a\e-z''.
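.Pp
The difference between the two range syntaxes when deleting can be sketched
as follows, assuming a POSIX shell; the input string is made up for
illustration:

```shell
input='[abc]-xyz'

# System V style: the brackets are ordinary characters here, so they
# are deleted along with the lower-case letters.
sysv_style=$(printf '%s\n' "$input" | tr -d '[a-z]')

# POSIX/BSD style: only the range itself is deleted.
posix_style=$(printf '%s\n' "$input" | tr -d 'a-z')

# An escaped "-" is literal: delete only "a", "-" and "z".
literal_dash=$(printf '%s\n' "$input" | tr -d 'a\-z')

printf '%s\n' "$sysv_style" "$posix_style" "$literal_dash"
# prints "-", "[]-" and "[bc]xy"
```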
.Pp
The
.Nm tr
utility has historically not permitted the manipulation of NUL bytes in
its input and, additionally, stripped NULs from its input stream.
This implementation has removed this behavior as a bug.
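.Pp
As a sketch of NUL handling, assuming a POSIX shell whose
.Xr printf 1
accepts octal escapes in its format string, NUL-separated fields can be
turned into lines; the field values are illustrative only:

```shell
# Translate each NUL byte to a newline, giving one field per line.
fields=$(printf 'one\000two\000three\n' | tr '\000' '\n')

# Count the resulting lines; strip the padding some wc(1) versions add.
count=$(printf '%s\n' "$fields" | wc -l | tr -d ' ')

printf '%s\n' "$fields"
# prints "one", "two" and "three" on separate lines
```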
.Pp
The
.Nm tr
utility has historically been extremely forgiving of syntax errors;
for example, the
.Fl c
and
.Fl s
options were ignored unless two strings were specified.
This implementation will not permit illegal syntax.
.Sh STANDARDS
The
.Nm tr
utility is expected to be
.St -p1003.2
compatible.
It should be noted that the feature wherein the last character of
.Ar string2
is duplicated if
.Ar string2
has fewer characters than
.Ar string1
is permitted by POSIX but is not required.
Shell scripts attempting to be portable to other POSIX systems should use
the ``[#*]'' convention instead of relying on this behavior.
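.Pp
A minimal sketch of the portable form, assuming a POSIX shell; the strings
are illustrative only:

```shell
# [y*] repeats "y" as often as needed to match the length of string1,
# so the result does not depend on implicit padding of string2.
portable=$(printf 'abcde\n' | tr 'abcde' 'x[y*]')

printf '%s\n' "$portable"
# prints "xyyyy"
```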
293