.\"	$OpenBSD: sort.1,v 1.45 2015/03/19 13:51:10 jmc Exp $
.\"
.\" Copyright (c) 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" the Institute of Electrical and Electronics Engineers, Inc.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd November 30, 2023
.Dt SORT 1
.Os
.Sh NAME
.Nm sort
.Nd sort or merge records (lines) of text and binary files
.Sh SYNOPSIS
.Nm
.Bk -words
.Op Fl bcCdfghiRMmnrsuVz
.Sm off
.Op Fl k\ \& Ar field1 Op , Ar field2
.Sm on
.Op Fl S Ar memsize
.Ek
.Op Fl T Ar dir
.Op Fl t Ar char
.Op Fl o Ar output
.Op Ar file ...
.Nm
.Fl Fl help
.Nm
.Fl Fl version
.Sh DESCRIPTION
The
.Nm
utility sorts text and binary files by lines.
A line is a record separated from the subsequent record by a
newline (default) or a NUL
.Pq Ql \e0
character (see the
.Fl z
option).
A record can contain any printable or unprintable characters.
Comparisons are based on one or more sort keys extracted from
each line of input, and are performed lexicographically,
according to the current locale's collating rules and the
specified command-line options that can tune the actual
sorting behavior.
By default, if keys are not given,
.Nm
uses entire lines for comparison.
.Pp
The command line options are as follows:
.Bl -tag -width Ds
.It Fl c , Fl Fl check , Fl C , Fl Fl check=silent|quiet
Check that the single input file is sorted.
If the file is not sorted,
.Nm
produces the appropriate error messages and exits with code 1,
otherwise returns 0.
If
.Fl C
or
.Fl Fl check=silent
is specified,
.Nm
produces no output.
This is a "silent" version of
.Fl c .
.It Fl m , Fl Fl merge
Merge only.
The input files are assumed to be pre-sorted.
If they are not sorted, the output order is undefined.
.It Fl o Ar output , Fl Fl output Ns = Ns Ar output
Print the output to the
.Ar output
file instead of the standard output.
.It Fl S Ar size , Fl Fl buffer-size Ns = Ns Ar size
Use
.Ar size
for the maximum size of the memory buffer.
Size modifiers %,b,K,M,G,T,P,E,Z,Y can be used.
If a memory limit is not explicitly specified,
.Nm
takes up to about 90% of available memory.
If the file is too big to fit into the memory buffer,
temporary disk files are used to perform the sorting.
.It Fl T Ar dir , Fl Fl temporary-directory Ns = Ns Ar dir
Store temporary files in the directory
.Ar dir .
The default path is the value of the environment variable
.Ev TMPDIR
or
.Pa /var/tmp
if
.Ev TMPDIR
is not defined.
.It Fl u , Fl Fl unique
Unique keys.
Suppress all lines that have a key that is equal to an already
processed one.
This option, similarly to
.Fl s ,
implies a stable sort.
If used with
.Fl c
or
.Fl C ,
.Nm
also checks that there are no lines with duplicate keys.
.It Fl s
Stable sort.
This option maintains the original record order of records that have
an equal key.
This is a non-standard feature, but it is widely accepted and used.
.It Fl Fl version
Print the version and exit silently.
.It Fl Fl help
Print the help text and exit silently.
.El
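.Pp
As a minimal sketch of the
.Fl u
option described above, using made-up input supplied on the standard
input, duplicate lines are collapsed into a single line in the sorted
output:
.Bd -literal -offset indent
# Illustrative input only; prints "apple" and then "pear".
sort -u <<EOF
pear
apple
apple
EOF
.Ed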
.Pp
The following options override the default ordering rules.
When ordering options appear independently of key field
specifications, they apply globally to all sort keys.
When attached to a specific key (see
.Fl k ) ,
the ordering options override all global ordering options for
the key they are attached to.
.Bl -tag -width indent
.It Fl b , Fl Fl ignore-leading-blanks
Ignore leading blank characters when comparing lines.
.It Fl d , Fl Fl dictionary-order
Consider only blank spaces and alphanumeric characters in comparisons.
.It Fl f , Fl Fl ignore-case
Convert all lowercase characters to their uppercase equivalent
before comparison, that is, perform case-independent sorting.
.It Fl g , Fl Fl general-numeric-sort , Fl Fl sort=general-numeric
Sort by general numerical value.
As opposed to
.Fl n ,
this option handles general floating-point numbers.
It has a more
permissive format than that allowed by
.Fl n
but it has a significant performance drawback.
.It Fl h , Fl Fl human-numeric-sort , Fl Fl sort=human-numeric
Sort by numerical value, but take into account the SI suffix,
if present.
Sort first by numeric sign (negative, zero, or
positive); then by SI suffix (either empty, or `k' or `K', or one
of `MGTPEZY', in that order); and finally by numeric value.
The SI suffix must immediately follow the number.
For example, '12345K' sorts before '1M', because M is "larger" than K.
This sort option is useful for sorting the output of a single invocation
of the
.Xr df 1
command with the
.Fl h
or
.Fl H
options (human-readable).
.It Fl i , Fl Fl ignore-nonprinting
Ignore all non-printable characters.
.It Fl M , Fl Fl month-sort , Fl Fl sort=month
Sort by month.
Unknown strings are considered smaller than the month names.
.It Fl n , Fl Fl numeric-sort , Fl Fl sort=numeric
Sort fields numerically by arithmetic value.
Fields are expected to consist of optional leading blanks, an
optional minus sign, and zero or more digits (possibly including a
decimal point and thousands separators).
.It Fl R , Fl Fl random-sort , Fl Fl sort=random
Sort by a random order.
This is a random permutation of the inputs except that
the equal keys sort together.
It is implemented by hashing the input keys and sorting
the hash values.
The hash function is chosen randomly, using the content of
.Pa /dev/random ,
or the content of the file specified by
.Fl Fl random-source ,
as the source of randomness.
Even if multiple sort fields are specified,
the same random hash function is used for all of them.
.It Fl r , Fl Fl reverse
Sort in reverse order.
.It Fl V , Fl Fl version-sort
Sort version numbers.
The input lines are treated as file names in the form
PREFIX VERSION SUFFIX, where SUFFIX matches the regular expression
"(\e.([A-Za-z~][A-Za-z0-9~]*)?)*".
The files are compared by their prefixes and versions (leading
zeros are ignored in version numbers, see example below).
If an input string does not match the pattern, then it is compared
using the byte compare function.
All string comparisons are performed in the C locale; the locale
environment setting is ignored.
.Pp
Example:
.Bd -literal -offset indent
$ ls sort* | sort -V
sort-1.022.tgz
sort-1.23.tgz
sort-1.23.1.tgz
sort-1.024.tgz
sort-1.024.003.
sort-1.024.003.tgz
sort-1.024.07.tgz
sort-1.024.009.tgz
.Ed
.El
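.Pp
As a small sketch of how an ordering option changes the result, the same
three made-up lines are ordered differently with and without
.Fl n
(the C locale is forced here only to make the default order predictable):
.Bd -literal -offset indent
# Default lexicographic order: 10, 2, 33
LC_ALL=C sort <<EOF
10
33
2
EOF
# Numeric order: 2, 10, 33
LC_ALL=C sort -n <<EOF
10
33
2
EOF
.Ed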
.Pp
The treatment of field separators can be altered using these options:
.Bl -tag -width indent
.It Fl b , Fl Fl ignore-leading-blanks
Ignore leading blank space when determining the start
and end of a restricted sort key (see
.Fl k ) .
If
.Fl b
is specified before the first
.Fl k
option, it applies globally to all key specifications.
Otherwise,
.Fl b
can be attached independently to each
.Ar field
argument of the key specifications.
.It Xo
.Fl k Ar field1 Ns Op , Ns Ar field2 ,
.Fl Fl key Ns = Ns Ar field1 Ns Op , Ns Ar field2
.Xc
Define a restricted sort key that has the starting position
.Ar field1 ,
and optional ending position
.Ar field2
of a key field.
The
.Fl k
option may be specified multiple times,
in which case subsequent keys are compared when earlier keys compare equal.
The
.Fl k
option replaces the obsolete options
.Cm \(pl Ns Ar pos1
and
.Fl Ns Ar pos2 ,
but the old notation is also supported.
.It Fl t Ar char , Fl Fl field-separator Ns = Ns Ar char
Use
.Ar char
as a field separator character.
The initial
.Ar char
is not considered to be part of a field when determining key offsets.
Each occurrence of
.Ar char
is significant (for example,
.Dq Ar charchar
delimits an empty field).
If
.Fl t
is not specified, the default field separator is a sequence of
blank space characters, and consecutive blank spaces do
.Em not
delimit an empty field; however, the initial blank space
.Em is
considered part of a field when determining key offsets.
To use NUL as the field separator, use
.Fl t Cm \(aq\e0\(aq .
.It Fl z , Fl Fl zero-terminated
Use NUL as record separator.
By default, records in the files are expected to be separated by
newline characters.
With this option, NUL
.Pq Ql \e0
is used as the record separator character.
.El
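.Pp
To illustrate how
.Fl t
and
.Fl k
combine, the following sketch sorts made-up colon-separated records by
their second field only, comparing that field numerically:
.Bd -literal -offset indent
# Prints alice:4, bob:12, carol:30 (one per line).
sort -t : -k 2,2n <<EOF
carol:30
alice:4
bob:12
EOF
.Ed
.Pp
Ending the key at field 2 keeps any later fields from taking part in the
comparison.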
.Pp
Other options:
.Bl -tag -width indent
.It Fl Fl batch-size Ns = Ns Ar num
Specify the maximum number of files that can be opened by
.Nm
at once.
This option affects behavior when having many input files or using
temporary files.
The default value is 16.
.It Fl Fl compress-program Ns = Ns Ar PROGRAM
Use PROGRAM to compress temporary files.
PROGRAM must compress standard input to standard output when called
without arguments.
When called with argument
.Fl d
it must decompress standard input to standard output.
If PROGRAM fails,
.Nm
exits with an error.
An example of PROGRAM that can be used here is bzip2.
.It Fl Fl random-source Ns = Ns Ar filename
In random sort, the file content is used as the source of the 'seed' data
for the hash function choice.
Two invocations of random sort with the same seed data will use
the same hash function and will produce the same result if the input is
also identical.
By default, file
.Pa /dev/random
is used.
.It Fl Fl debug
Print some extra information about the sorting process to the
standard output.
%%THREADS%%.It Fl Fl parallel
%%THREADS%%Set the maximum number of execution threads.
%%THREADS%%The default equals the number of CPUs.
.It Fl Fl files0-from Ns = Ns Ar filename
Take the input file list from the file
.Ar filename .
The file names must be separated by NUL
(like the output produced by the command "find ... -print0").
.It Fl Fl radixsort
Try to use radix sort, if the sort specifications allow.
The radix sort can only be used for trivial locales (C and POSIX),
and it cannot be used for numeric or month sort.
Radix sort is very fast and stable.
.It Fl Fl mergesort
Use mergesort.
This is a universal algorithm that can always be used,
but it is not always the fastest.
.It Fl Fl qsort
Try to use quick sort, if the sort specifications allow.
This sort algorithm cannot be used with
.Fl u
and
.Fl s .
.It Fl Fl heapsort
Try to use heap sort, if the sort specifications allow.
This sort algorithm cannot be used with
.Fl u
and
.Fl s .
.It Fl Fl mmap
Try to use the file memory mapping system call.
It may increase speed in some cases.
.El
.Pp
The following operands are available:
.Bl -tag -width indent
.It Ar file
The pathname of a file to be sorted, merged, or checked.
If no
.Ar file
operands are specified, or if a
.Ar file
operand is
.Fl ,
the standard input is used.
.El
.Pp
A field is defined as a maximal sequence of characters other than the
field separator and record separator (newline by default).
Initial blank spaces are included in the field unless
.Fl b
has been specified;
the first blank space of a sequence of blank spaces acts as the field
separator and is included in the field (unless
.Fl t
is specified).
For example, all blank spaces at the beginning of a line are
considered to be part of the first field.
.Pp
Fields are specified by the
.Sm off
.Fl k\ \& Ar field1 Op , Ar field2
.Sm on
command-line option.
If
.Ar field2
is missing, the end of the key defaults to the end of the line.
.Pp
The arguments
.Ar field1
and
.Ar field2
have the form
.Em m.n
.Em (m,n > 0)
and can be followed by one or more of the modifiers
.Cm b , d , f , i ,
.Cm n , g , M
and
.Cm r ,
which correspond to the options discussed above.
When
.Cm b
is specified it applies only to
.Ar field1
or
.Ar field2
where it is specified while the rest of the modifiers
apply to the whole key field regardless of whether they are
specified only with
.Ar field1
or
.Ar field2
or both.
A
.Ar field1
position specified by
.Em m.n
is interpreted as the
.Em n Ns th
character from the beginning of the
.Em m Ns th
field.
A missing
.Em \&.n
in
.Ar field1
means
.Ql \&.1 ,
indicating the first character of the
.Em m Ns th
field; if the
.Fl b
option is in effect,
.Em n
is counted from the first non-blank character in the
.Em m Ns th
field;
.Em m Ns \&.1b
refers to the first non-blank character in the
.Em m Ns th
field.
.No 1\&. Ns Em n
refers to the
.Em n Ns th
character from the beginning of the line;
if
.Em n
is greater than the length of the line, the field is taken to be empty.
.Pp
.Em n Ns th
positions are always counted from the field beginning, even if the field
is shorter than the number of specified positions.
Thus, the key can really start from a position in a subsequent field.
.Pp
A
.Ar field2
position specified by
.Em m.n
is interpreted as the
.Em n Ns th
character (including separators) from the beginning of the
.Em m Ns th
field.
A missing
.Em \&.n
indicates the last character of the
.Em m Ns th
field;
.Em m
= \&0
designates the end of a line.
Thus the option
.Fl k Ar v.x,w.y
is synonymous with the obsolete option
.Cm \(pl Ns Ar v-\&1.x-\&1
.Fl Ns Ar w-\&1.y ;
when
.Em y
is omitted,
.Fl k Ar v.x,w
is synonymous with
.Cm \(pl Ns Ar v-\&1.x-\&1
.Fl Ns Ar w\&.0 .
The obsolete
.Cm \(pl Ns Ar pos1
.Fl Ns Ar pos2
option is still supported, except for
.Fl Ns Ar w\&.0b ,
which has no
.Fl k
equivalent.
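.Pp
To make the
.Em m.n
position syntax concrete, the following sketch (with made-up
comma-separated input) builds the key from the second and third
characters of the second field only:
.Bd -literal -offset indent
# Keys are "07", "12" and "03"; prints c,x03 then a,x07 then b,x12.
sort -t , -k 2.2,2.3 <<EOF
a,x07
b,x12
c,x03
EOF
.Ed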
.Sh ENVIRONMENT
.Bl -tag -width Fl
.It Ev LC_COLLATE
Locale settings to be used to determine the collation for
sorting records.
.It Ev LC_CTYPE
Locale settings to be used for case conversion and classification
of characters, that is, which characters are considered
whitespace, etc.
.It Ev LC_MESSAGES
Locale settings that determine the language of output messages
that
.Nm
prints out.
.It Ev LC_NUMERIC
Locale settings that determine the number format used in numeric sort.
.It Ev LC_TIME
Locale settings that determine the month format used in month sort.
.It Ev LC_ALL
Locale settings that override all of the above locale settings.
This environment variable can be used to set all these settings
to the same value at once.
.It Ev LANG
Used as a last resort to determine different kinds of locale-specific
behavior if neither the respective environment variable nor
.Ev LC_ALL
is set.
.It Ev TMPDIR
Path to the directory in which temporary files will be stored.
Note that
.Ev TMPDIR
may be overridden by the
.Fl T
option.
.It Ev GNUSORT_NUMERIC_COMPATIBILITY
If defined,
.Fl t
will not override the locale numeric symbols, that is, thousand
separators and decimal separators.
By default, if
.Fl t
is specified with the same symbol as the thousand separator or decimal point,
the symbol will be treated as the field separator.
Older behavior was less definite; the symbol was treated as both field
separator and numeric separator, simultaneously.
This environment variable enables the old behavior.
.El
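.Pp
As a brief sketch of the effect of the locale variables, forcing the C
locale with
.Ev LC_ALL
makes the comparison byte-wise, so uppercase ASCII letters sort before
lowercase ones; other locales may order the same made-up input
differently:
.Bd -literal -offset indent
# In the C locale this prints A, B, a, b (one per line).
LC_ALL=C sort <<EOF
b
A
a
B
EOF
.Ed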
.Sh FILES
.Bl -tag -width Pa -compact
.It Pa /var/tmp/.bsdsort.PID.*
Temporary files.
.It Pa /dev/random
Default seed file for the random sort.
.El
.Sh EXIT STATUS
The
.Nm
utility shall exit with one of the following values:
.Pp
.Bl -tag -width flag -compact
.It 0
Successfully sorted the input files or if used with
.Fl c
or
.Fl C ,
the input file already met the sorting criteria.
.It 1
On disorder (or non-uniqueness) with the
.Fl c
or
.Fl C
options.
.It 2
An error occurred.
.El
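.Pp
A script can branch on these values; the following is only a sketch,
using made-up numeric input that is deliberately out of order:
.Bd -literal -offset indent
# sort -c exits 1 here, so the "else" branch runs.
if sort -c -n <<EOF
1
3
2
EOF
then
    echo "sorted"
else
    echo "not sorted (exit status $?)"
fi
.Ed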
.Sh SEE ALSO
.Xr comm 1 ,
.Xr join 1 ,
.Xr uniq 1
.Sh STANDARDS
The
.Nm
utility is compliant with the
.St -p1003.1-2008
specification.
.Pp
The flags
.Op Fl ghRMSsTVz
are extensions to the POSIX specification.
.Pp
All long options are extensions to the specification; some of them are
provided for compatibility with GNU versions and some of them are
extensions specific to this implementation.
.Pp
The old key notations
.Cm \(pl Ns Ar pos1
and
.Fl Ns Ar pos2
come from older versions of
.Nm
and are still supported but their use is highly discouraged.
.Sh HISTORY
A
.Nm
command first appeared in
.At v1 .
.Sh AUTHORS
.An Gabor Kovesdan Aq Mt gabor@FreeBSD.org ,
.Pp
.An Oleg Moskalenko Aq Mt mom040267@gmail.com
.Sh NOTES
This implementation of
.Nm
has no limits on input line length (other than those imposed by available
memory) or any restrictions on bytes allowed within lines.
.Pp
The performance depends highly on locale settings,
efficient choice of sort keys and key complexity.
The fastest sort is with locale C, on whole lines,
with option
.Fl s .
In general, the C locale is the fastest, followed by single-byte
locales, with multi-byte locales being the slowest; the correct
collation order is always respected.
As for the key specification, the simpler the lines are to process,
the faster the sort will be.
.Pp
When sorting by arithmetic value, using
.Fl n
results in much better performance than
.Fl g
so its use is encouraged
whenever possible.
632c66bbc91SGabor Kovesdanwhenever possible.
633