.\" Copyright (c) 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" the Institute of Electrical and Electronics Engineers, Inc.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     From: @(#)uniq.1	8.1 (Berkeley) 6/6/93
.\" $FreeBSD$
.\"
.Dd June 7, 2020
.Dt UNIQ 1
.Os
.Sh NAME
.Nm uniq
.Nd report or filter out repeated lines in a file
.Sh SYNOPSIS
.Nm
.Op Fl c | Fl d | Fl D | Fl u
.Op Fl i
.Op Fl f Ar num
.Op Fl s Ar chars
.Oo
.Ar input_file
.Op Ar output_file
.Oc
.Sh DESCRIPTION
The
.Nm
utility reads the specified
.Ar input_file
comparing adjacent lines, and writes a copy of each unique input line to
the
.Ar output_file .
If
.Ar input_file
is a single dash
.Pq Sq Fl
or absent, the standard input is read.
If
.Ar output_file
is absent, standard output is used for output.
The second and succeeding copies of identical adjacent input lines are
not written.
Repeated lines in the input will not be detected if they are not adjacent,
so it may be necessary to sort the files first.
.Pp
The following options are available:
.Bl -tag -width Ds
.It Fl c , Fl -count
Precede each output line with the count of the number of times the line
occurred in the input, followed by a single space.
.It Fl d , Fl -repeated
Output a single copy of each line that is repeated in the input.
.It Fl D , Fl -all-repeated Op Ar septype
Output all lines that are repeated (like
.Fl d ,
but each copy of the repeated line is written).
The optional
.Ar septype
argument controls how to separate groups of repeated lines in the output;
it must be one of the following values:
.Pp
.Bl -tag -compact -width separate
.It none
Do not separate groups of lines (this is the default).
.It prepend
Output an empty line before each group of lines.
.It separate
Output an empty line after each group of lines.
.El
.It Fl f Ar num , Fl -skip-fields Ar num
Ignore the first
.Ar num
fields in each input line when doing comparisons.
A field is a string of non-blank characters separated from adjacent fields
by blanks.
Field numbers are one-based, i.e., the first field is field one.
.It Fl i , Fl -ignore-case
Compare lines case-insensitively.
.It Fl s Ar chars , Fl -skip-chars Ar chars
Ignore the first
.Ar chars
characters in each input line when doing comparisons.
If specified in conjunction with the
.Fl f , Fl -skip-fields
option, the first
.Ar chars
characters after the first
.Ar num
fields will be ignored.
Character numbers are one-based, i.e., the first character is character one.
.It Fl u , Fl -unique
Only output lines that are not repeated in the input.
.\".It Fl Ns Ar n
.\"(Deprecated; replaced by
.\".Fl f ) .
.\"Ignore the first n
.\"fields on each input line when doing comparisons,
.\"where n is a number.
.\"A field is a string of non-blank
.\"characters separated from adjacent fields
.\"by blanks.
.\".It Cm \&\(pl Ns Ar n
.\"(Deprecated; replaced by
.\".Fl s ) .
.\"Ignore the first
.\".Ar m
.\"characters when doing comparisons, where
.\".Ar m
.\"is a
.\"number.
.El
.Sh ENVIRONMENT
The
.Ev LANG ,
.Ev LC_ALL ,
.Ev LC_COLLATE
and
.Ev LC_CTYPE
environment variables affect the execution of
.Nm
as described in
.Xr environ 7 .
.Sh EXIT STATUS
.Ex -std
.Sh EXAMPLES
Assuming a file named cities.txt with the following content:
.Bd -literal -offset indent
Madrid
Lisbon
Madrid
.Ed
.Pp
The following command reports all three lines as unique because the
identical lines are not adjacent:
.Bd -literal -offset indent
$ uniq -u cities.txt
Madrid
Lisbon
Madrid
.Ed
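.Pp
As a minimal sketch of how
.Fl d
and
.Fl u
divide the input once identical lines are adjacent, the following script
pipes illustrative inline data (not the cities.txt file above) through
.Xr sort 1
first.
The function names are hypothetical and chosen only for this example.
.Bd -literal -offset indent
```shell
# Hypothetical helpers: both sort stdin so that identical lines
# become adjacent before uniq compares them.

# Print one copy of each line that occurs more than once.
repeated_lines() { sort | uniq -d; }

# Print only the lines that occur exactly once.
single_lines() { sort | uniq -u; }

repeated_lines <<EOF
Madrid
Lisbon
Madrid
Paris
EOF

single_lines <<EOF
Madrid
Lisbon
Madrid
Paris
EOF
```
.Ed
.Pp
With this input the first call should print Madrid, and the second should
print Lisbon and Paris.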
.Pp
Sort the file and count the number of identical lines:
.Bd -literal -offset indent
$ sort cities.txt | uniq -c
	1 Lisbon
	2 Madrid
.Ed
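.Pp
The counts can also be ranked, as in the following sketch.
The function name rank_lines and the inline data are assumptions made for
illustration; the padding that
.Fl c
places before each count is stripped so the output is easier to compare.
.Bd -literal -offset indent
```shell
# rank_lines (hypothetical name): read lines on stdin and print
# "count line" pairs, most frequent first.
rank_lines() {
    # sort groups identical lines, uniq -c counts each group,
    # sort -rn orders the groups by descending count, and sed
    # removes the leading blanks that uniq -c prints before the count.
    sort | uniq -c | sort -rn | sed 's/^[[:blank:]]*//'
}

rank_lines <<EOF
Madrid
Lisbon
Madrid
Paris
Madrid
Lisbon
EOF
```
.Ed
.Pp
For this input the ranking should be Madrid (3), Lisbon (2) and Paris (1).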
.Pp
Assuming the following content for the file cities.txt:
.Bd -literal -offset indent
madrid
Madrid
Lisbon
.Ed
.Pp
Show repeated lines, ignoring case:
.Bd -literal -offset indent
$ uniq -d -i cities.txt
madrid
.Ed
.Pp
Same as above but showing the whole group of repeated lines:
.Bd -literal -offset indent
$ uniq -D -i cities.txt
madrid
Madrid
.Ed
.Pp
Report the number of identical lines ignoring the first character of every line:
.Bd -literal -offset indent
$ uniq -s 1 -c cities.txt
	2 madrid
	1 Lisbon
.Ed
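.Pp
The
.Fl f
and
.Fl s
options can be combined, as in the following sketch.
The log lines and function names are hypothetical.
The sketch assumes that the blank separating two fields counts as a
character once the preceding field has been skipped, which is how POSIX
defines a field.
.Bd -literal -offset indent
```shell
# Hypothetical log lines: time stamp, host name, message.
log_lines() {
    cat <<EOF
10:01 web1 disk full
10:02 web1 disk full
10:03 web2 disk full
EOF
}

# Skip the time stamp: lines with the same host and message collapse.
by_host_and_message() { log_lines | uniq -f 1; }

# Skip the time stamp, then five more characters (the separating
# blank plus the four-character host name): only the message is
# compared, so all three lines collapse into the first one.
by_message() { log_lines | uniq -f 1 -s 5; }

by_host_and_message
by_message
```
.Ed
.Pp
The first call should keep the 10:01 and 10:03 lines; the second should
keep only the 10:01 line.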
.Sh COMPATIBILITY
The historic
.Cm \&\(pl Ns Ar number
and
.Fl Ns Ar number
options have been deprecated but are still supported in this implementation.
.Sh SEE ALSO
.Xr sort 1
.Sh STANDARDS
The
.Nm
utility conforms to
.St -p1003.1-2001
as amended by Cor.\& 1-2002.
.Sh HISTORY
A
.Nm
command appeared in
.At v3 .