.\" Copyright (c) 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" the Institute of Electrical and Electronics Engineers, Inc.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     From: @(#)uniq.1	8.1 (Berkeley) 6/6/93
.\"
.Dd June 7, 2020
.Dt UNIQ 1
.Os
.Sh NAME
.Nm uniq
.Nd report or filter out repeated lines in a file
.Sh SYNOPSIS
.Nm
.Op Fl c | Fl d | Fl D | Fl u
.Op Fl i
.Op Fl f Ar num
.Op Fl s Ar chars
.Oo
.Ar input_file
.Op Ar output_file
.Oc
.Sh DESCRIPTION
The
.Nm
utility reads the specified
.Ar input_file ,
comparing adjacent lines, and writes a copy of each unique input line to
the
.Ar output_file .
If
.Ar input_file
is a single dash
.Pq Sq Fl
or absent, the standard input is read.
If
.Ar output_file
is absent, the standard output is used.
The second and succeeding copies of identical adjacent input lines are
not written.
Repeated lines in the input will not be detected if they are not adjacent,
so it may be necessary to sort the input first.
.Pp
The following options are available:
.Bl -tag -width Ds
.It Fl c , Fl -count
Precede each output line with the count of the number of times the line
occurred in the input, followed by a single space.
.It Fl d , Fl -repeated
Output a single copy of each line that is repeated in the input.
.It Fl D , Fl -all-repeated Op Ar septype
Output all lines that are repeated (like
.Fl d ,
but each copy of the repeated line is written).
The optional
.Ar septype
argument controls how to separate groups of repeated lines in the output;
it must be one of the following values:
.Pp
.Bl -tag -compact -width separate
.It none
Do not separate groups of lines (this is the default).
.It prepend
Output an empty line before each group of lines.
.It separate
Output an empty line after each group of lines.
.El
.It Fl f Ar num , Fl -skip-fields Ar num
Ignore the first
.Ar num
fields in each input line when doing comparisons.
A field is a string of non-blank characters separated from adjacent fields
by blanks.
Field numbers are one-based, i.e., the first field is field one.
.It Fl i , Fl -ignore-case
Compare lines case-insensitively.
.It Fl s Ar chars , Fl -skip-chars Ar chars
Ignore the first
.Ar chars
characters in each input line when doing comparisons.
If specified in conjunction with the
.Fl f , Fl -skip-fields
option, the first
.Ar chars
characters after the first
.Ar num
fields will be ignored.
Character numbers are one-based, i.e., the first character is character one.
.It Fl u , Fl -unique
Only output lines that are not repeated in the input.
.\".It Fl Ns Ar n
.\"(Deprecated; replaced by
.\".Fl f ) .
.\"Ignore the first n
.\"fields on each input line when doing comparisons,
.\"where n is a number.
.\"A field is a string of non-blank
.\"characters separated from adjacent fields
.\"by blanks.
.\".It Cm \&\(pl Ns Ar n
.\"(Deprecated; replaced by
.\".Fl s ) .
.\"Ignore the first
.\".Ar m
.\"characters when doing comparisons, where
.\".Ar m
.\"is a
.\"number.
.El
.Sh ENVIRONMENT
The
.Ev LANG ,
.Ev LC_ALL ,
.Ev LC_COLLATE
and
.Ev LC_CTYPE
environment variables affect the execution of
.Nm
as described in
.Xr environ 7 .
.Sh EXIT STATUS
.Ex -std
.Sh EXAMPLES
Assuming a file named cities.txt with the following content:
.Bd -literal -offset indent
Madrid
Lisbon
Madrid
.Ed
.Pp
The following command reports three different lines since identical elements
are not adjacent:
.Bd -literal -offset indent
$ uniq -u cities.txt
Madrid
Lisbon
Madrid
.Ed
.Pp
Sort the file and count the number of identical lines:
.Bd -literal -offset indent
$ sort cities.txt | uniq -c
	1 Lisbon
	2 Madrid
.Ed
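.Pp
As a further sketch, assuming cities.txt still holds the three lines shown
above, sorting first also lets
.Fl d
and
.Fl u
separate the repeated line from the one that occurs only once:
.Bd -literal -offset indent
$ sort cities.txt | uniq -d
Madrid
$ sort cities.txt | uniq -u
Lisbon
.Ed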
.Pp
Assuming the following content for the file cities.txt:
.Bd -literal -offset indent
madrid
Madrid
Lisbon
.Ed
.Pp
Show repeated lines, ignoring case:
.Bd -literal -offset indent
$ uniq -d -i cities.txt
madrid
.Ed
.Pp
Same as above but showing the whole group of repeated lines:
.Bd -literal -offset indent
$ uniq -D -i cities.txt
madrid
Madrid
.Ed
.Pp
Report the number of identical lines ignoring the first character of every line:
.Bd -literal -offset indent
$ uniq -s 1 -c cities.txt
	2 madrid
	1 Lisbon
.Ed
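.Pp
The
.Fl f
option can be illustrated in the same way.
As a sketch, assume a hypothetical file named visits.txt (not one of the
files used above) in which the first field is a sequence number:
.Bd -literal -offset indent
1 Madrid
2 Madrid
3 Lisbon
.Ed
.Pp
Skipping the first field makes the first two lines compare equal, so only
the first of them is written:
.Bd -literal -offset indent
$ uniq -f 1 visits.txt
1 Madrid
3 Lisbon
.Ed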
.Sh COMPATIBILITY
The historic
.Cm \&\(pl Ns Ar number
and
.Fl Ns Ar number
options have been deprecated but are still supported in this implementation.
.Sh SEE ALSO
.Xr sort 1
.Sh STANDARDS
The
.Nm
utility conforms to
.St -p1003.1-2001
as amended by Cor.\& 1-2002.
.Sh HISTORY
A
.Nm
command appeared in
.At v3 .
218