xref: /freebsd/usr.bin/join/join.1 (revision bdcbfde31e8e9b343f113a1956384bdf30d1ed62)
10da30e9aSPeter Wemm.\" Copyright (c) 1990, 1993
20da30e9aSPeter Wemm.\"	The Regents of the University of California.  All rights reserved.
30da30e9aSPeter Wemm.\"
40da30e9aSPeter Wemm.\" This code is derived from software contributed to Berkeley by
50da30e9aSPeter Wemm.\" the Institute of Electrical and Electronics Engineers, Inc.
60da30e9aSPeter Wemm.\"
70da30e9aSPeter Wemm.\" Redistribution and use in source and binary forms, with or without
80da30e9aSPeter Wemm.\" modification, are permitted provided that the following conditions
90da30e9aSPeter Wemm.\" are met:
100da30e9aSPeter Wemm.\" 1. Redistributions of source code must retain the above copyright
110da30e9aSPeter Wemm.\"    notice, this list of conditions and the following disclaimer.
120da30e9aSPeter Wemm.\" 2. Redistributions in binary form must reproduce the above copyright
130da30e9aSPeter Wemm.\"    notice, this list of conditions and the following disclaimer in the
140da30e9aSPeter Wemm.\"    documentation and/or other materials provided with the distribution.
15fbbd9655SWarner Losh.\" 3. Neither the name of the University nor the names of its contributors
160da30e9aSPeter Wemm.\"    may be used to endorse or promote products derived from this software
170da30e9aSPeter Wemm.\"    without specific prior written permission.
180da30e9aSPeter Wemm.\"
190da30e9aSPeter Wemm.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
200da30e9aSPeter Wemm.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
210da30e9aSPeter Wemm.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
220da30e9aSPeter Wemm.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
230da30e9aSPeter Wemm.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
240da30e9aSPeter Wemm.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
250da30e9aSPeter Wemm.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
260da30e9aSPeter Wemm.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
270da30e9aSPeter Wemm.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
280da30e9aSPeter Wemm.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
290da30e9aSPeter Wemm.\" SUCH DAMAGE.
300da30e9aSPeter Wemm.\"
31*69ad6e73SFernando Apesteguía.Dd June 20, 2020
320da30e9aSPeter Wemm.Dt JOIN 1
330da30e9aSPeter Wemm.Os
340da30e9aSPeter Wemm.Sh NAME
350da30e9aSPeter Wemm.Nm join
360da30e9aSPeter Wemm.Nd relational database operator
370da30e9aSPeter Wemm.Sh SYNOPSIS
388fe908efSRuslan Ermilov.Nm
390da30e9aSPeter Wemm.Oo
400da30e9aSPeter Wemm.Fl a Ar file_number | Fl v Ar file_number
410da30e9aSPeter Wemm.Oc
420da30e9aSPeter Wemm.Op Fl e Ar string
430da30e9aSPeter Wemm.Op Fl o Ar list
440da30e9aSPeter Wemm.Op Fl t Ar char
45dce95037SRuslan Ermilov.Op Fl 1 Ar field
46dce95037SRuslan Ermilov.Op Fl 2 Ar field
470da30e9aSPeter Wemm.Ar file1
480da30e9aSPeter Wemm.Ar file2
490da30e9aSPeter Wemm.Sh DESCRIPTION
50f2941023SPhilippe CharnierThe
51f2941023SPhilippe Charnier.Nm
52ffa1b5c1SPhilippe Charnierutility performs an
53ffa1b5c1SPhilippe Charnier.Dq equality join
54ffa1b5c1SPhilippe Charnieron the specified files
550da30e9aSPeter Wemmand writes the result to the standard output.
56ffa1b5c1SPhilippe CharnierThe
57ffa1b5c1SPhilippe Charnier.Dq join field
58ffa1b5c1SPhilippe Charnieris the field in each file by which the files are compared.
590da30e9aSPeter WemmThe first field in each line is used by default.
600da30e9aSPeter WemmThere is one line in the output for each pair of lines in
610da30e9aSPeter Wemm.Ar file1
620da30e9aSPeter Wemmand
630da30e9aSPeter Wemm.Ar file2
640da30e9aSPeter Wemmwhich have identical join fields.
650da30e9aSPeter WemmEach output line consists of the join field, the remaining fields from
660da30e9aSPeter Wemm.Ar file1
670da30e9aSPeter Wemmand then the remaining fields from
680da30e9aSPeter Wemm.Ar file2 .
690da30e9aSPeter Wemm.Pp
700da30e9aSPeter WemmThe default field separators are tab and space characters.
710da30e9aSPeter WemmIn this case, multiple tabs and spaces count as a single field separator,
720da30e9aSPeter Wemmand leading tabs and spaces are ignored.
730da30e9aSPeter WemmThe default output field separator is a single space character.
740da30e9aSPeter Wemm.Pp
750da30e9aSPeter WemmMany of the options use file and field numbers.
766a3e8b0aSRuslan ErmilovBoth file numbers and field numbers are 1 based, i.e., the first file on
770da30e9aSPeter Wemmthe command line is file number 1 and the first field is field number 1.
780da30e9aSPeter WemmThe following options are available:
79f2941023SPhilippe Charnier.Bl -tag -width indent
800da30e9aSPeter Wemm.It Fl a Ar file_number
810da30e9aSPeter WemmIn addition to the default output, produce a line for each unpairable
820da30e9aSPeter Wemmline in file
830da30e9aSPeter Wemm.Ar file_number .
840da30e9aSPeter Wemm.It Fl e Ar string
850da30e9aSPeter WemmReplace empty output fields with
860da30e9aSPeter Wemm.Ar string .
870da30e9aSPeter Wemm.It Fl o Ar list
880da30e9aSPeter WemmThe
890da30e9aSPeter Wemm.Fl o
900da30e9aSPeter Wemmoption specifies the fields that will be output from each file for
910da30e9aSPeter Wemmeach line with matching join fields.
920da30e9aSPeter WemmEach element of
930da30e9aSPeter Wemm.Ar list
94dce95037SRuslan Ermilovhas either the form
95dce95037SRuslan Ermilov.Ar file_number . Ns Ar field ,
960da30e9aSPeter Wemmwhere
970da30e9aSPeter Wemm.Ar file_number
980da30e9aSPeter Wemmis a file number and
990da30e9aSPeter Wemm.Ar field
100dcd587c9SJuli Mallettis a field number, or the form
101dcd587c9SJuli Mallett.Ql 0
102dcd587c9SJuli Mallett.Pq zero ,
103dcd587c9SJuli Mallettrepresenting the join field.
104ffa1b5c1SPhilippe CharnierThe elements of list must be either comma
105bbdca214SRuslan Ermilov.Pq Ql \&,
106ffa1b5c1SPhilippe Charnieror whitespace separated.
107cbc1e6c2SGlen Barber(The latter requires quoting to protect it from the shell, or, a simpler
1080da30e9aSPeter Wemmapproach is to use multiple
1090da30e9aSPeter Wemm.Fl o
1100da30e9aSPeter Wemmoptions.)
1110da30e9aSPeter Wemm.It Fl t Ar char
1120da30e9aSPeter WemmUse character
1130da30e9aSPeter Wemm.Ar char
1140da30e9aSPeter Wemmas a field delimiter for both input and output.
1150da30e9aSPeter WemmEvery occurrence of
1160da30e9aSPeter Wemm.Ar char
1170da30e9aSPeter Wemmin a line is significant.
1180da30e9aSPeter Wemm.It Fl v Ar file_number
1190da30e9aSPeter WemmDo not display the default output, but display a line for each unpairable
1200da30e9aSPeter Wemmline in file
1210da30e9aSPeter Wemm.Ar file_number .
1220da30e9aSPeter WemmThe options
123dce95037SRuslan Ermilov.Fl v Cm 1
1240da30e9aSPeter Wemmand
125dce95037SRuslan Ermilov.Fl v Cm 2
1260da30e9aSPeter Wemmmay be specified at the same time.
1270da30e9aSPeter Wemm.It Fl 1 Ar field
1280da30e9aSPeter WemmJoin on the
1290da30e9aSPeter Wemm.Ar field Ns 'th
130dce95037SRuslan Ermilovfield of
131dce95037SRuslan Ermilov.Ar file1 .
1320da30e9aSPeter Wemm.It Fl 2 Ar field
1330da30e9aSPeter WemmJoin on the
1340da30e9aSPeter Wemm.Ar field Ns 'th
135dce95037SRuslan Ermilovfield of
136dce95037SRuslan Ermilov.Ar file2 .
1370da30e9aSPeter Wemm.El
1380da30e9aSPeter Wemm.Pp
1390da30e9aSPeter WemmWhen the default field delimiter characters are used, the files to be joined
1400da30e9aSPeter Wemmshould be ordered in the collating sequence of
1410da30e9aSPeter Wemm.Xr sort 1 ,
1420da30e9aSPeter Wemmusing the
1430da30e9aSPeter Wemm.Fl b
1440da30e9aSPeter Wemmoption, on the fields on which they are to be joined, otherwise
145f2941023SPhilippe Charnier.Nm
1460da30e9aSPeter Wemmmay not report all field matches.
1470da30e9aSPeter WemmWhen the field delimiter characters are specified by the
1480da30e9aSPeter Wemm.Fl t
1490da30e9aSPeter Wemmoption, the collating sequence should be the same as
150f2941023SPhilippe Charnier.Xr sort 1
1510da30e9aSPeter Wemmwithout the
1520da30e9aSPeter Wemm.Fl b
1530da30e9aSPeter Wemmoption.
1540da30e9aSPeter Wemm.Pp
1550da30e9aSPeter WemmIf one of the arguments
1560da30e9aSPeter Wemm.Ar file1
1570da30e9aSPeter Wemmor
1580da30e9aSPeter Wemm.Ar file2
159ffa1b5c1SPhilippe Charnieris
160dce95037SRuslan Ermilov.Sq Fl ,
161ffa1b5c1SPhilippe Charnierthe standard input is used.
162a866e170SRuslan Ermilov.Sh EXIT STATUS
163d628d776SRuslan Ermilov.Ex -std
164*69ad6e73SFernando Apesteguía.Sh EXAMPLES
165*69ad6e73SFernando ApesteguíaAssuming a file named
166*69ad6e73SFernando Apesteguía.Pa nobel_laureates.txt
167*69ad6e73SFernando Apesteguíawith information about some of the first Nobel Peace Prize laureates:
168*69ad6e73SFernando Apesteguía.Bd -literal -offset indent
169*69ad6e73SFernando Apesteguía1901,Jean Henri Dunant,M
170*69ad6e73SFernando Apesteguía1901,Frederic Passy,M
171*69ad6e73SFernando Apesteguía1902,Elie Ducommun,M
172*69ad6e73SFernando Apesteguía1905,Baroness Bertha Sophie Felicita Von Suttner,F
173*69ad6e73SFernando Apesteguía1910,Permanent International Peace Bureau,
174*69ad6e73SFernando Apesteguía.Ed
175*69ad6e73SFernando Apesteguía.Pp
176*69ad6e73SFernando Apesteguíaand a second file
177*69ad6e73SFernando Apesteguía.Pa nobel_nationalities.txt
178*69ad6e73SFernando Apesteguíawith their nationalities:
179*69ad6e73SFernando Apesteguía.Bd -literal -offset indent
180*69ad6e73SFernando ApesteguíaJean Henri Dunant,Switzerland
181*69ad6e73SFernando ApesteguíaFrederic Passy,France
182*69ad6e73SFernando ApesteguíaElie Ducommun,Switzerland
183*69ad6e73SFernando ApesteguíaBaroness Bertha Sophie Felicita Von Suttner
184*69ad6e73SFernando Apesteguía.Ed
185*69ad6e73SFernando Apesteguía.Pp
186*69ad6e73SFernando ApesteguíaJoin the two files using the second column from first file and the default first
187*69ad6e73SFernando Apesteguíacolumn from second file specifying a custom field delimiter:
188*69ad6e73SFernando Apesteguía.Bd -literal -offset indent
189*69ad6e73SFernando Apesteguía$ join -t, -1 2 nobel_laureates.txt nobel_nationalities.txt
190*69ad6e73SFernando ApesteguíaJean Henri Dunant,1901,M,Switzerland
191*69ad6e73SFernando ApesteguíaFrederic Passy,1901,M,France
192*69ad6e73SFernando ApesteguíaElie Ducommun,1902,M,Switzerland
193*69ad6e73SFernando ApesteguíaBaroness Bertha Sophie Felicita Von Suttner,1905,F
194*69ad6e73SFernando Apesteguía.Ed
195*69ad6e73SFernando Apesteguía.Pp
196*69ad6e73SFernando ApesteguíaShow only the year and the nationality of the laureate using
197*69ad6e73SFernando Apesteguía.Ql <<NULL>>
198*69ad6e73SFernando Apesteguíato replace empty fields:
199*69ad6e73SFernando Apesteguía.Bd -literal -offset indent
200*69ad6e73SFernando Apesteguía$ join -e "<<NULL>>" -t, -1 2 -o "1.1 2.2" nobel_laureates.txt nobel_nationalities.txt
201*69ad6e73SFernando Apesteguía1901,Switzerland
202*69ad6e73SFernando Apesteguía1901,France
203*69ad6e73SFernando Apesteguía1902,Switzerland
204*69ad6e73SFernando Apesteguía1905,<<NULL>>
205*69ad6e73SFernando Apesteguía.Ed
206*69ad6e73SFernando Apesteguía.Pp
207*69ad6e73SFernando ApesteguíaShow only lines from first file which do not have a match in second file:
208*69ad6e73SFernando Apesteguía.Bd -literal -offset indent
209*69ad6e73SFernando Apesteguía$ join -v1 -t, -1 2 nobel_laureates.txt nobel_nationalities.txt
210*69ad6e73SFernando ApesteguíaPermanent International Peace Bureau,1910,
211*69ad6e73SFernando Apesteguía.Ed
212*69ad6e73SFernando Apesteguía.Pp
213*69ad6e73SFernando ApesteguíaAssuming a file named
214*69ad6e73SFernando Apesteguía.Pa capitals.txt
215*69ad6e73SFernando Apesteguíawith the following content:
216*69ad6e73SFernando Apesteguía.Bd -literal -offset indent
217*69ad6e73SFernando ApesteguíaBelgium,Brussels
218*69ad6e73SFernando ApesteguíaFrance,Paris
219*69ad6e73SFernando ApesteguíaItaly,Rome
220*69ad6e73SFernando ApesteguíaSwitzerland
221*69ad6e73SFernando Apesteguía.Ed
222*69ad6e73SFernando Apesteguía.Pp
223*69ad6e73SFernando ApesteguíaShow the name and capital of the country where the laureate was born.
224*69ad6e73SFernando ApesteguíaThis example uses
225*69ad6e73SFernando Apesteguía.Pa nobel_nationalities.txt
226*69ad6e73SFernando Apesteguíaas a bridge but does not show any information from that file.
227*69ad6e73SFernando ApesteguíaAlso see the note about
228*69ad6e73SFernando Apesteguía.Xr sort 1
229*69ad6e73SFernando Apesteguíaabove to understand why we need to sort the intermediate result.
230*69ad6e73SFernando Apesteguía.Bd -literal -offset indent
231*69ad6e73SFernando Apesteguía$ join -t, -1 2 -o 1.2 2.2 nobel_laureates.txt nobel_nationalities.txt | \e
232*69ad6e73SFernando Apesteguía    sort -k2 -t, | join -t, -e "<<NULL>>" -1 2 -o 1.1 2.2 - capitals.txt
233*69ad6e73SFernando ApesteguíaElie Ducommun,<<NULL>>
234*69ad6e73SFernando ApesteguíaJean Henri Dunant,<<NULL>>
235*69ad6e73SFernando Apesteguía.Ed
2360da30e9aSPeter Wemm.Sh COMPATIBILITY
2370da30e9aSPeter WemmFor compatibility with historic versions of
2388fe908efSRuslan Ermilov.Nm ,
2390da30e9aSPeter Wemmthe following options are available:
240f2941023SPhilippe Charnier.Bl -tag -width indent
2410da30e9aSPeter Wemm.It Fl a
2420da30e9aSPeter WemmIn addition to the default output, produce a line for each unpairable line
243dce95037SRuslan Ermilovin both
244dce95037SRuslan Ermilov.Ar file1
245dce95037SRuslan Ermilovand
246dce95037SRuslan Ermilov.Ar file2 .
2470da30e9aSPeter Wemm.It Fl j1 Ar field
2480da30e9aSPeter WemmJoin on the
2490da30e9aSPeter Wemm.Ar field Ns 'th
250dce95037SRuslan Ermilovfield of
251dce95037SRuslan Ermilov.Ar file1 .
2520da30e9aSPeter Wemm.It Fl j2 Ar field
2530da30e9aSPeter WemmJoin on the
2540da30e9aSPeter Wemm.Ar field Ns 'th
255dce95037SRuslan Ermilovfield of
256dce95037SRuslan Ermilov.Ar file2 .
2570da30e9aSPeter Wemm.It Fl j Ar field
2580da30e9aSPeter WemmJoin on the
2590da30e9aSPeter Wemm.Ar field Ns 'th
260dce95037SRuslan Ermilovfield of both
261dce95037SRuslan Ermilov.Ar file1
262dce95037SRuslan Ermilovand
263dce95037SRuslan Ermilov.Ar file2 .
2640da30e9aSPeter Wemm.It Fl o Ar list ...
2650da30e9aSPeter WemmHistorical implementations of
266f2941023SPhilippe Charnier.Nm
2670da30e9aSPeter Wemmpermitted multiple arguments to the
2680da30e9aSPeter Wemm.Fl o
2690da30e9aSPeter Wemmoption.
270f2941023SPhilippe CharnierThese arguments were of the form
271dce95037SRuslan Ermilov.Ar file_number . Ns Ar field_number
272f2941023SPhilippe Charnieras described
2730da30e9aSPeter Wemmfor the current
2740da30e9aSPeter Wemm.Fl o
2750da30e9aSPeter Wemmoption.
276f2941023SPhilippe CharnierThis has obvious difficulties in the presence of files named
277dce95037SRuslan Ermilov.Pa 1.2 .
2780da30e9aSPeter Wemm.El
2790da30e9aSPeter Wemm.Pp
2800227791bSRuslan ErmilovThese options are available only so historic shell scripts do not require
2810da30e9aSPeter Wemmmodification and should not be used.
2820da30e9aSPeter Wemm.Sh SEE ALSO
2830da30e9aSPeter Wemm.Xr awk 1 ,
2840da30e9aSPeter Wemm.Xr comm 1 ,
2850da30e9aSPeter Wemm.Xr paste 1 ,
2860da30e9aSPeter Wemm.Xr sort 1 ,
2870da30e9aSPeter Wemm.Xr uniq 1
2886c7216dfSRuslan Ermilov.Sh STANDARDS
2896c7216dfSRuslan ErmilovThe
2906c7216dfSRuslan Ermilov.Nm
2916c7216dfSRuslan Ermilovcommand conforms to
2926c7216dfSRuslan Ermilov.St -p1003.1-2001 .
293