.\" Copyright (c) 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" the Institute of Electrical and Electronics Engineers, Inc.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd January 12, 2024
.Dt UNIQ 1
.Os
.Sh NAME
.Nm uniq
.Nd report or filter out repeated lines in a file
.Sh SYNOPSIS
.Nm
.Op Fl c | Fl d | Fl D | Fl u
.Op Fl i
.Op Fl f Ar num
.Op Fl s Ar chars
.Oo
.Ar input_file
.Op Ar output_file
.Oc
.Sh DESCRIPTION
The
.Nm
utility reads the specified
.Ar input_file
comparing adjacent lines, and writes a copy of each unique input line to
the
.Ar output_file .
If
.Ar input_file
is a single dash
.Pq Sq Fl
or absent, the standard input is read.
If
.Ar output_file
is absent, standard output is used for output.
The second and succeeding copies of identical adjacent input lines are
not written.
Repeated lines in the input will not be detected if they are not adjacent,
so it may be necessary to sort the files first.
.Pp
The following options are available:
.Bl -tag -width Ds
.It Fl c , Fl -count
Precede each output line with the count of the number of times the line
occurred in the input, followed by a single space.
.It Fl d , Fl -repeated
Output a single copy of each line that is repeated in the input.
Ignored if
.Fl D
is also specified.
.It Fl D , Fl -all-repeated Op Ar septype
Output all lines that are repeated (like
.Fl d ,
but each copy of the repeated line is written).
The optional
.Ar septype
argument controls how to separate groups of repeated lines in the output;
it must be one of the following values:
.Pp
.Bl -tag -compact -width separate
.It none
Do not separate groups of lines (this is the default).
.It prepend
Output an empty line before each group of lines.
.It separate
Output an empty line after each group of lines.
.El
.It Fl f Ar num , Fl -skip-fields Ar num
Ignore the first
.Ar num
fields in each input line when doing comparisons.
A field is a string of non-blank characters separated from adjacent fields
by blanks.
Field numbers are one-based, i.e., the first field is field one.
.It Fl i , Fl -ignore-case
Perform case-insensitive comparison of lines.
.It Fl s Ar chars , Fl -skip-chars Ar chars
Ignore the first
.Ar chars
characters in each input line when doing comparisons.
If specified in conjunction with the
.Fl f , Fl -skip-fields
option, the first
.Ar chars
characters after the first
.Ar num
fields will be ignored.
Character numbers are one-based, i.e., the first character is character one.
.It Fl u , Fl -unique
Only output lines that are not repeated in the input.
.\".It Fl Ns Ar n
.\"(Deprecated; replaced by
.\".Fl f ) .
.\"Ignore the first n
.\"fields on each input line when doing comparisons,
.\"where n is a number.
.\"A field is a string of non-blank
.\"characters separated from adjacent fields
.\"by blanks.
.\".It Cm \&\(pl Ns Ar n
.\"(Deprecated; replaced by
.\".Fl s ) .
.\"Ignore the first
.\".Ar m
.\"characters when doing comparisons, where
.\".Ar m
.\"is a
.\"number.
.El
.Sh ENVIRONMENT
The
.Ev LANG ,
.Ev LC_ALL ,
.Ev LC_COLLATE
and
.Ev LC_CTYPE
environment variables affect the execution of
.Nm
as described in
.Xr environ 7 .
.Sh EXIT STATUS
.Ex -std
.Sh EXAMPLES
Assuming a file named cities.txt with the following content:
.Bd -literal -offset indent
Madrid
Lisbon
Madrid
.Ed
.Pp
The following command reports three different lines since identical elements
are not adjacent:
.Bd -literal -offset indent
$ uniq -u cities.txt
Madrid
Lisbon
Madrid
.Ed
.Pp
Sort the file and count the number of identical lines:
.Bd -literal -offset indent
$ sort cities.txt | uniq -c
	1 Lisbon
	2 Madrid
.Ed
.Pp
Assuming the following content for the file cities.txt:
.Bd -literal -offset indent
madrid
Madrid
Lisbon
.Ed
.Pp
Show repeated lines, ignoring case:
.Bd -literal -offset indent
$ uniq -d -i cities.txt
madrid
.Ed
.Pp
Same as above but showing the whole group of repeated lines:
.Bd -literal -offset indent
$ uniq -D -i cities.txt
madrid
Madrid
.Ed
.Pp
Report the number of identical lines ignoring the first character of every line:
.Bd -literal -offset indent
$ uniq -s 1 -c cities.txt
	2 madrid
	1 Lisbon
.Ed
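.Pp
As a further illustration of
.Fl f ,
assuming a hypothetical file named log.txt with the following content:
.Bd -literal -offset indent
10:01 disk full
10:02 disk full
10:05 network down
.Ed
.Pp
Skipping the first field makes the first two lines compare equal, so only the
first of them should be written:
.Bd -literal -offset indent
$ uniq -f 1 log.txt
10:01 disk full
10:05 network down
.Ed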
.Sh COMPATIBILITY
The historic
.Cm \&\(pl Ns Ar number
and
.Fl Ns Ar number
options have been deprecated but are still supported in this implementation.
.Sh SEE ALSO
.Xr sort 1
.Sh STANDARDS
The
.Nm
utility conforms to
.St -p1003.1-2001
as amended by Cor.\& 1-2002.
.Sh HISTORY
A
.Nm
command appeared in
.At v3 .