.\" Copyright (c) 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" the Institute of Electrical and Electronics Engineers, Inc.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by the University of
.\"	California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)tr.1	8.1 (Berkeley) 6/6/93
.\" $FreeBSD$
.\"
.Dd October 11, 1997
.Dt TR 1
.Os
.Sh NAME
.Nm tr
.Nd translate characters
.Sh SYNOPSIS
.Nm
.Op Fl csu
.Ar string1 string2
.Nm
.Op Fl cu
.Fl d
.Ar string1
.Nm
.Op Fl cu
.Fl s
.Ar string1
.Nm
.Op Fl cu
.Fl ds
.Ar string1 string2
.Sh DESCRIPTION
The
.Nm
utility copies the standard input to the standard output with substitution
or deletion of selected characters.
.Pp
The following options are available:
.Bl -tag -width Ds
.It Fl c
Complements the set of characters in
.Ar string1 ,
that is, ``-c ab'' includes every character except for ``a'' and ``b''.
.It Fl d
The
.Fl d
option causes characters to be deleted from the input.
.It Fl s
The
.Fl s
option squeezes multiple occurrences of the characters listed in the last
operand (either
.Ar string1
or
.Ar string2 )
in the input into a single instance of the character.
This occurs after all deletion and translation has been completed.
.It Fl u
The
.Fl u
option guarantees that any output is unbuffered.
.El
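.Pp
As an informal illustration of the
.Fl c
option, the sketch below combines it with
.Fl d
so that only the listed characters survive.
The sample input and the variable name are made up for this example.
Note that the trailing newline is deleted as well, because it belongs to
the complemented set.

```shell
# Delete every character that is NOT a, b or c (the newline included).
kept=$(echo 'a1b2c3' | tr -cd 'abc')
echo "$kept"    # prints: abc
```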
.Pp
In the first synopsis form, the characters in
.Ar string1
are translated into the characters in
.Ar string2
where the first character in
.Ar string1
is translated into the first character in
.Ar string2
and so on.
If
.Ar string1
is longer than
.Ar string2 ,
the last character found in
.Ar string2
is duplicated until
.Ar string1
is exhausted.
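.Pp
The padding rule can be seen in a small sketch such as the following.
The input string and variable name are illustrative only, and the result
shown assumes an implementation that duplicates the last character of
.Ar string2
as described above.

```shell
# string1 has five characters, string2 only two; the final "y" is
# reused for c, d and e on implementations that pad string2.
padded=$(echo 'abcde' | tr 'abcde' 'xy')
echo "$padded"    # xyyyy with the padding behavior described above
```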
.Pp
In the second synopsis form, the characters in
.Ar string1
are deleted from the input.
.Pp
In the third synopsis form, the characters in
.Ar string1
are compressed as described for the
.Fl s
option.
.Pp
In the fourth synopsis form, the characters in
.Ar string1
are deleted from the input, and the characters in
.Ar string2
are compressed as described for the
.Fl s
option.
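.Pp
The second, third and fourth synopsis forms might be exercised as in the
sketch below.
The input strings and variable names are invented for illustration.

```shell
# Second form: delete every "l" and "o".
deleted=$(echo 'hello world' | tr -d 'lo')
echo "$deleted"     # prints: he wrd

# Third form: squeeze runs of "a" and "b"; "c" is not listed, so its run stays.
squeezed=$(echo 'aaabbbccc' | tr -s 'ab')
echo "$squeezed"    # prints: abccc

# Fourth form: delete "a" first, then squeeze runs of "c".
both=$(echo 'aabbccdd' | tr -ds 'a' 'c')
echo "$both"        # prints: bbcdd
```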
.Pp
The following conventions can be used in
.Ar string1
and
.Ar string2
to specify sets of characters:
.Bl -tag -width [:equiv:]
.It character
Any character not described by one of the following conventions
represents itself.
.It \eoctal
A backslash followed by 1, 2 or 3 octal digits represents a character
with that encoded value.
To follow an octal sequence with a digit as a character, left zero-pad
the octal sequence to the full 3 octal digits.
.It \echaracter
A backslash followed by certain special characters maps to special
values.
.Pp
.Bl -column "\ea"
.It "\ea	<alert character>
.It "\eb	<backspace>
.It "\ef	<form-feed>
.It "\en	<newline>
.It "\er	<carriage return>
.It "\et	<tab>
.It "\ev	<vertical tab>
.El
.Pp
A backslash followed by any other character maps to that character.
.It c-c
Represents the range of characters between the range endpoints, inclusive.
.It [:class:]
Represents all characters belonging to the defined character class.
Class names are:
.Pp
.Bl -column "xdigit"
.It "alnum	<alphanumeric characters>
.It "alpha	<alphabetic characters>
.It "cntrl	<control characters>
.It "digit	<numeric characters>
.It "graph	<graphic characters>
.It "lower	<lower-case alphabetic characters>
.It "print	<printable characters>
.It "punct	<punctuation characters>
.It "space	<space characters>
.It "upper	<upper-case characters>
.It "xdigit	<hexadecimal characters>
.El
.Pp
.\" All classes may be used in
.\" .Ar string1 ,
.\" and in
.\" .Ar string2
.\" when both the
.\" .Fl d
.\" and
.\" .Fl s
.\" options are specified.
.\" Otherwise, only the classes ``upper'' and ``lower'' may be used in
.\" .Ar string2
.\" and then only when the corresponding class (``upper'' for ``lower''
.\" and vice-versa) is specified in the same relative position in
.\" .Ar string1 .
.\" .Pp
With the exception of the ``upper'' and ``lower'' classes, characters
in the classes are in unspecified order.
In the ``upper'' and ``lower'' classes, characters are entered in
ascending order.
.Pp
For specific information as to which ASCII characters are included
in these classes, see
.Xr ctype 3
and related manual pages.
.It [=equiv=]
Represents all characters or collating (sorting) elements belonging to
the same equivalence class as
.Ar equiv .
If
there is a secondary ordering within the equivalence class, the characters
are ordered in ascending sequence.
Otherwise, they are ordered by their encoded values.
An example of an equivalence class might be ``c'' and ``ch'' in Spanish;
English has no equivalence classes.
.It [#*n]
Represents
.Ar n
repeated occurrences of the character represented by
.Ar # .
This
expression is only valid when it occurs in
.Ar string2 .
If
.Ar n
is omitted or is zero, it is interpreted as large enough to extend the
.Ar string2
sequence to the length of
.Ar string1 .
If
.Ar n
has a leading zero, it is interpreted as an octal value; otherwise,
it is interpreted as a decimal value.
.El
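.Pp
A few of the conventions above (a range, a character class, and explicit
repeat counts) could be tried out as in this sketch.
The inputs and variable names are assumptions chosen for the example, not
part of the utility.

```shell
# c-c: map the range a through f onto A through F.
ranged=$(echo 'abcdef' | tr 'a-f' 'A-F')
echo "$ranged"      # prints: ABCDEF

# [:class:]: delete every character in the digit class.
letters=$(echo 'abc123' | tr -d '[:digit:]')
echo "$letters"     # prints: abc

# [#*n]: string2 becomes two "x" followed by three "y".
repeated=$(echo 'abcde' | tr 'abcde' '[x*2][y*3]')
echo "$repeated"    # prints: xxyyy
```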
.Sh DIAGNOSTICS
.Ex -std
.Sh EXAMPLES
The following examples are shown as given to the shell:
.Pp
Create a list of the words in file1, one per line, where a word is taken to
be a maximal string of letters.
.Pp
.D1 Li "tr -cs \*q[:alpha:]\*q \*q\en\*q < file1"
.Pp
Translate the contents of file1 to upper-case.
.Pp
.D1 Li "tr \*q[:lower:]\*q \*q[:upper:]\*q < file1"
.Pp
Strip out non-printable characters from file1.
.Pp
.D1 Li "tr -cd \*q[:print:]\*q < file1"
.Sh COMPATIBILITY
System V has historically implemented character ranges using the syntax
``[c-c]'' instead of the ``c-c'' used by historic
.Bx
implementations and
standardized by POSIX.
System V shell scripts should work under this implementation as long as
the range is intended to map onto another range, i.e., the command
``tr [a-z] [A-Z]'' will work as it will map the ``['' character in
.Ar string1
to the ``['' character in
.Ar string2 .
However, if the shell script is deleting or squeezing characters as in
the command ``tr -d [a-z]'', the characters ``['' and ``]'' will be
included in the deletion or compression list, which would not have happened
under an historic System V implementation.
Additionally, any scripts that depended on the sequence ``a-z'' to
represent the three characters ``a'', ``-'' and ``z'' will have to be
rewritten as ``a\e-z''.
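.Pp
The difference between the two bracketed cases can be observed with a
sketch like the one below.
The input string and variable names are made up; the operands are quoted
here so that the shell does not treat the brackets as a filename pattern.

```shell
# Translation: "[" maps to "[" and "]" to "]", so the brackets pass through.
mapped=$(echo 'a[b]c' | tr '[a-z]' '[A-Z]')
echo "$mapped"      # prints: A[B]C

# Deletion: "[" and "]" are part of the set, so they are removed as well.
stripped=$(echo 'a[b]c' | tr -d '[a-z]')
echo "$stripped"    # prints an empty line
```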
.Pp
The
.Nm
utility has historically not permitted the manipulation of NUL bytes in
its input and, additionally, stripped NULs from its input stream.
This implementation has removed this behavior as a bug.
.Pp
The
.Nm
utility has historically been extremely forgiving of syntax errors;
for example, the
.Fl c
and
.Fl s
options were ignored unless two strings were specified.
This implementation will not permit illegal syntax.
.Sh STANDARDS
The
.Nm
utility is expected to be
.St -p1003.2
compatible.
It should be noted that the feature wherein the last character of
.Ar string2
is duplicated if
.Ar string2
has fewer characters than
.Ar string1
is permitted by POSIX but is not required.
Shell scripts attempting to be portable to other POSIX systems should use
the ``[#*]'' convention instead of relying on this behavior.
The
.Fl u
option is an extension to the
.St -p1003.2
standard.
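.Pp
As a sketch of the portable ``[#*]'' form recommended above, the following
pads
.Ar string2
explicitly instead of relying on implicit duplication of its last
character.
The input and the variable name are illustrative assumptions.

```shell
# "[y*]" repeats "y" as often as needed to match the length of string1.
portable=$(echo 'abcde' | tr 'abcde' 'x[y*]')
echo "$portable"    # prints: xyyyy
```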
303