10da30e9aSPeter Wemm.\" Copyright (c) 1990, 1993 20da30e9aSPeter Wemm.\" The Regents of the University of California. All rights reserved. 30da30e9aSPeter Wemm.\" 40da30e9aSPeter Wemm.\" This code is derived from software contributed to Berkeley by 50da30e9aSPeter Wemm.\" the Institute of Electrical and Electronics Engineers, Inc. 60da30e9aSPeter Wemm.\" 70da30e9aSPeter Wemm.\" Redistribution and use in source and binary forms, with or without 80da30e9aSPeter Wemm.\" modification, are permitted provided that the following conditions 90da30e9aSPeter Wemm.\" are met: 100da30e9aSPeter Wemm.\" 1. Redistributions of source code must retain the above copyright 110da30e9aSPeter Wemm.\" notice, this list of conditions and the following disclaimer. 120da30e9aSPeter Wemm.\" 2. Redistributions in binary form must reproduce the above copyright 130da30e9aSPeter Wemm.\" notice, this list of conditions and the following disclaimer in the 140da30e9aSPeter Wemm.\" documentation and/or other materials provided with the distribution. 15fbbd9655SWarner Losh.\" 3. Neither the name of the University nor the names of its contributors 160da30e9aSPeter Wemm.\" may be used to endorse or promote products derived from this software 170da30e9aSPeter Wemm.\" without specific prior written permission. 180da30e9aSPeter Wemm.\" 190da30e9aSPeter Wemm.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND 200da30e9aSPeter Wemm.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 210da30e9aSPeter Wemm.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 220da30e9aSPeter Wemm.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE 230da30e9aSPeter Wemm.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 240da30e9aSPeter Wemm.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS 250da30e9aSPeter Wemm.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 260da30e9aSPeter Wemm.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT 270da30e9aSPeter Wemm.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY 280da30e9aSPeter Wemm.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF 290da30e9aSPeter Wemm.\" SUCH DAMAGE. 300da30e9aSPeter Wemm.\" 31*69ad6e73SFernando Apesteguía.Dd June 20, 2020 320da30e9aSPeter Wemm.Dt JOIN 1 330da30e9aSPeter Wemm.Os 340da30e9aSPeter Wemm.Sh NAME 350da30e9aSPeter Wemm.Nm join 360da30e9aSPeter Wemm.Nd relational database operator 370da30e9aSPeter Wemm.Sh SYNOPSIS 388fe908efSRuslan Ermilov.Nm 390da30e9aSPeter Wemm.Oo 400da30e9aSPeter Wemm.Fl a Ar file_number | Fl v Ar file_number 410da30e9aSPeter Wemm.Oc 420da30e9aSPeter Wemm.Op Fl e Ar string 430da30e9aSPeter Wemm.Op Fl o Ar list 440da30e9aSPeter Wemm.Op Fl t Ar char 45dce95037SRuslan Ermilov.Op Fl 1 Ar field 46dce95037SRuslan Ermilov.Op Fl 2 Ar field 470da30e9aSPeter Wemm.Ar file1 480da30e9aSPeter Wemm.Ar file2 490da30e9aSPeter Wemm.Sh DESCRIPTION 50f2941023SPhilippe CharnierThe 51f2941023SPhilippe Charnier.Nm 52ffa1b5c1SPhilippe Charnierutility performs an 53ffa1b5c1SPhilippe Charnier.Dq equality join 54ffa1b5c1SPhilippe Charnieron the specified files 550da30e9aSPeter Wemmand writes the result to the standard output. 56ffa1b5c1SPhilippe CharnierThe 57ffa1b5c1SPhilippe Charnier.Dq join field 58ffa1b5c1SPhilippe Charnieris the field in each file by which the files are compared. 590da30e9aSPeter WemmThe first field in each line is used by default. 600da30e9aSPeter WemmThere is one line in the output for each pair of lines in 610da30e9aSPeter Wemm.Ar file1 620da30e9aSPeter Wemmand 630da30e9aSPeter Wemm.Ar file2 640da30e9aSPeter Wemmwhich have identical join fields. 650da30e9aSPeter WemmEach output line consists of the join field, the remaining fields from 660da30e9aSPeter Wemm.Ar file1 670da30e9aSPeter Wemmand then the remaining fields from 680da30e9aSPeter Wemm.Ar file2 . 690da30e9aSPeter Wemm.Pp 700da30e9aSPeter WemmThe default field separators are tab and space characters. 710da30e9aSPeter WemmIn this case, multiple tabs and spaces count as a single field separator, 720da30e9aSPeter Wemmand leading tabs and spaces are ignored. 730da30e9aSPeter WemmThe default output field separator is a single space character. 740da30e9aSPeter Wemm.Pp 750da30e9aSPeter WemmMany of the options use file and field numbers. 766a3e8b0aSRuslan ErmilovBoth file numbers and field numbers are 1 based, i.e., the first file on 770da30e9aSPeter Wemmthe command line is file number 1 and the first field is field number 1. 780da30e9aSPeter WemmThe following options are available: 79f2941023SPhilippe Charnier.Bl -tag -width indent 800da30e9aSPeter Wemm.It Fl a Ar file_number 810da30e9aSPeter WemmIn addition to the default output, produce a line for each unpairable 820da30e9aSPeter Wemmline in file 830da30e9aSPeter Wemm.Ar file_number . 840da30e9aSPeter Wemm.It Fl e Ar string 850da30e9aSPeter WemmReplace empty output fields with 860da30e9aSPeter Wemm.Ar string . 870da30e9aSPeter Wemm.It Fl o Ar list 880da30e9aSPeter WemmThe 890da30e9aSPeter Wemm.Fl o 900da30e9aSPeter Wemmoption specifies the fields that will be output from each file for 910da30e9aSPeter Wemmeach line with matching join fields. 920da30e9aSPeter WemmEach element of 930da30e9aSPeter Wemm.Ar list 94dce95037SRuslan Ermilovhas either the form 95dce95037SRuslan Ermilov.Ar file_number . Ns Ar field , 960da30e9aSPeter Wemmwhere 970da30e9aSPeter Wemm.Ar file_number 980da30e9aSPeter Wemmis a file number and 990da30e9aSPeter Wemm.Ar field 100dcd587c9SJuli Mallettis a field number, or the form 101dcd587c9SJuli Mallett.Ql 0 102dcd587c9SJuli Mallett.Pq zero , 103dcd587c9SJuli Mallettrepresenting the join field. 104ffa1b5c1SPhilippe CharnierThe elements of list must be either comma 105bbdca214SRuslan Ermilov.Pq Ql \&, 106ffa1b5c1SPhilippe Charnieror whitespace separated. 107cbc1e6c2SGlen Barber(The latter requires quoting to protect it from the shell, or, a simpler 1080da30e9aSPeter Wemmapproach is to use multiple 1090da30e9aSPeter Wemm.Fl o 1100da30e9aSPeter Wemmoptions.) 1110da30e9aSPeter Wemm.It Fl t Ar char 1120da30e9aSPeter WemmUse character 1130da30e9aSPeter Wemm.Ar char 1140da30e9aSPeter Wemmas a field delimiter for both input and output. 1150da30e9aSPeter WemmEvery occurrence of 1160da30e9aSPeter Wemm.Ar char 1170da30e9aSPeter Wemmin a line is significant. 1180da30e9aSPeter Wemm.It Fl v Ar file_number 1190da30e9aSPeter WemmDo not display the default output, but display a line for each unpairable 1200da30e9aSPeter Wemmline in file 1210da30e9aSPeter Wemm.Ar file_number . 1220da30e9aSPeter WemmThe options 123dce95037SRuslan Ermilov.Fl v Cm 1 1240da30e9aSPeter Wemmand 125dce95037SRuslan Ermilov.Fl v Cm 2 1260da30e9aSPeter Wemmmay be specified at the same time. 1270da30e9aSPeter Wemm.It Fl 1 Ar field 1280da30e9aSPeter WemmJoin on the 1290da30e9aSPeter Wemm.Ar field Ns 'th 130dce95037SRuslan Ermilovfield of 131dce95037SRuslan Ermilov.Ar file1 . 1320da30e9aSPeter Wemm.It Fl 2 Ar field 1330da30e9aSPeter WemmJoin on the 1340da30e9aSPeter Wemm.Ar field Ns 'th 135dce95037SRuslan Ermilovfield of 136dce95037SRuslan Ermilov.Ar file2 . 1370da30e9aSPeter Wemm.El 1380da30e9aSPeter Wemm.Pp 1390da30e9aSPeter WemmWhen the default field delimiter characters are used, the files to be joined 1400da30e9aSPeter Wemmshould be ordered in the collating sequence of 1410da30e9aSPeter Wemm.Xr sort 1 , 1420da30e9aSPeter Wemmusing the 1430da30e9aSPeter Wemm.Fl b 1440da30e9aSPeter Wemmoption, on the fields on which they are to be joined, otherwise 145f2941023SPhilippe Charnier.Nm 1460da30e9aSPeter Wemmmay not report all field matches. 1470da30e9aSPeter WemmWhen the field delimiter characters are specified by the 1480da30e9aSPeter Wemm.Fl t 1490da30e9aSPeter Wemmoption, the collating sequence should be the same as 150f2941023SPhilippe Charnier.Xr sort 1 1510da30e9aSPeter Wemmwithout the 1520da30e9aSPeter Wemm.Fl b 1530da30e9aSPeter Wemmoption. 1540da30e9aSPeter Wemm.Pp 1550da30e9aSPeter WemmIf one of the arguments 1560da30e9aSPeter Wemm.Ar file1 1570da30e9aSPeter Wemmor 1580da30e9aSPeter Wemm.Ar file2 159ffa1b5c1SPhilippe Charnieris 160dce95037SRuslan Ermilov.Sq Fl , 161ffa1b5c1SPhilippe Charnierthe standard input is used. 162a866e170SRuslan Ermilov.Sh EXIT STATUS 163d628d776SRuslan Ermilov.Ex -std 164*69ad6e73SFernando Apesteguía.Sh EXAMPLES 165*69ad6e73SFernando ApesteguíaAssuming a file named 166*69ad6e73SFernando Apesteguía.Pa nobel_laureates.txt 167*69ad6e73SFernando Apesteguíawith information about some of the first Nobel Peace Prize laureates: 168*69ad6e73SFernando Apesteguía.Bd -literal -offset indent 169*69ad6e73SFernando Apesteguía1901,Jean Henri Dunant,M 170*69ad6e73SFernando Apesteguía1901,Frederic Passy,M 171*69ad6e73SFernando Apesteguía1902,Elie Ducommun,M 172*69ad6e73SFernando Apesteguía1905,Baroness Bertha Sophie Felicita Von Suttner,F 173*69ad6e73SFernando Apesteguía1910,Permanent International Peace Bureau, 174*69ad6e73SFernando Apesteguía.Ed 175*69ad6e73SFernando Apesteguía.Pp 176*69ad6e73SFernando Apesteguíaand a second file 177*69ad6e73SFernando Apesteguía.Pa nobel_nationalities.txt 178*69ad6e73SFernando Apesteguíawith their nationalities: 179*69ad6e73SFernando Apesteguía.Bd -literal -offset indent 180*69ad6e73SFernando ApesteguíaJean Henri Dunant,Switzerland 181*69ad6e73SFernando ApesteguíaFrederic Passy,France 182*69ad6e73SFernando ApesteguíaElie Ducommun,Switzerland 183*69ad6e73SFernando ApesteguíaBaroness Bertha Sophie Felicita Von Suttner 184*69ad6e73SFernando Apesteguía.Ed 185*69ad6e73SFernando Apesteguía.Pp 186*69ad6e73SFernando ApesteguíaJoin the two files using the second column from first file and the default first 187*69ad6e73SFernando Apesteguíacolumn from second file specifying a custom field delimiter: 188*69ad6e73SFernando Apesteguía.Bd -literal -offset indent 189*69ad6e73SFernando Apesteguía$ join -t, -1 2 nobel_laureates.txt nobel_nationalities.txt 190*69ad6e73SFernando ApesteguíaJean Henri Dunant,1901,M,Switzerland 191*69ad6e73SFernando ApesteguíaFrederic Passy,1901,M,France 192*69ad6e73SFernando ApesteguíaElie Ducommun,1902,M,Switzerland 193*69ad6e73SFernando ApesteguíaBaroness Bertha Sophie Felicita Von Suttner,1905,F 194*69ad6e73SFernando Apesteguía.Ed 195*69ad6e73SFernando Apesteguía.Pp 196*69ad6e73SFernando ApesteguíaShow only the year and the nationality of the laureate using 197*69ad6e73SFernando Apesteguía.Ql <<NULL>> 198*69ad6e73SFernando Apesteguíato replace empty fields: 199*69ad6e73SFernando Apesteguía.Bd -literal -offset indent 200*69ad6e73SFernando Apesteguía$ join -e "<<NULL>>" -t, -1 2 -o "1.1 2.2" nobel_laureates.txt nobel_nationalities.txt 201*69ad6e73SFernando Apesteguía1901,Switzerland 202*69ad6e73SFernando Apesteguía1901,France 203*69ad6e73SFernando Apesteguía1902,Switzerland 204*69ad6e73SFernando Apesteguía1905,<<NULL>> 205*69ad6e73SFernando Apesteguía.Ed 206*69ad6e73SFernando Apesteguía.Pp 207*69ad6e73SFernando ApesteguíaShow only lines from first file which do not have a match in second file: 208*69ad6e73SFernando Apesteguía.Bd -literal -offset indent 209*69ad6e73SFernando Apesteguía$ join -v1 -t, -1 2 nobel_laureates.txt nobel_nationalities.txt 210*69ad6e73SFernando ApesteguíaPermanent International Peace Bureau,1910, 211*69ad6e73SFernando Apesteguía.Ed 212*69ad6e73SFernando Apesteguía.Pp 213*69ad6e73SFernando ApesteguíaAssuming a file named 214*69ad6e73SFernando Apesteguía.Pa capitals.txt 215*69ad6e73SFernando Apesteguíawith the following content: 216*69ad6e73SFernando Apesteguía.Bd -literal -offset indent 217*69ad6e73SFernando ApesteguíaBelgium,Brussels 218*69ad6e73SFernando ApesteguíaFrance,Paris 219*69ad6e73SFernando ApesteguíaItaly,Rome 220*69ad6e73SFernando ApesteguíaSwitzerland 221*69ad6e73SFernando Apesteguía.Ed 222*69ad6e73SFernando Apesteguía.Pp 223*69ad6e73SFernando ApesteguíaShow the name and capital of the country where the laureate was born. 224*69ad6e73SFernando ApesteguíaThis example uses 225*69ad6e73SFernando Apesteguía.Pa nobel_nationalities.txt 226*69ad6e73SFernando Apesteguíaas a bridge but does not show any information from that file. 227*69ad6e73SFernando ApesteguíaAlso see the note about 228*69ad6e73SFernando Apesteguía.Xr sort 1 229*69ad6e73SFernando Apesteguíaabove to understand why we need to sort the intermediate result. 230*69ad6e73SFernando Apesteguía.Bd -literal -offset indent 231*69ad6e73SFernando Apesteguía$ join -t, -1 2 -o 1.2 2.2 nobel_laureates.txt nobel_nationalities.txt | \e 232*69ad6e73SFernando Apesteguía sort -k2 -t, | join -t, -e "<<NULL>>" -1 2 -o 1.1 2.2 - capitals.txt 233*69ad6e73SFernando ApesteguíaElie Ducommun,<<NULL>> 234*69ad6e73SFernando ApesteguíaJean Henri Dunant,<<NULL>> 235*69ad6e73SFernando Apesteguía.Ed 2360da30e9aSPeter Wemm.Sh COMPATIBILITY 2370da30e9aSPeter WemmFor compatibility with historic versions of 2388fe908efSRuslan Ermilov.Nm , 2390da30e9aSPeter Wemmthe following options are available: 240f2941023SPhilippe Charnier.Bl -tag -width indent 2410da30e9aSPeter Wemm.It Fl a 2420da30e9aSPeter WemmIn addition to the default output, produce a line for each unpairable line 243dce95037SRuslan Ermilovin both 244dce95037SRuslan Ermilov.Ar file1 245dce95037SRuslan Ermilovand 246dce95037SRuslan Ermilov.Ar file2 . 2470da30e9aSPeter Wemm.It Fl j1 Ar field 2480da30e9aSPeter WemmJoin on the 2490da30e9aSPeter Wemm.Ar field Ns 'th 250dce95037SRuslan Ermilovfield of 251dce95037SRuslan Ermilov.Ar file1 . 2520da30e9aSPeter Wemm.It Fl j2 Ar field 2530da30e9aSPeter WemmJoin on the 2540da30e9aSPeter Wemm.Ar field Ns 'th 255dce95037SRuslan Ermilovfield of 256dce95037SRuslan Ermilov.Ar file2 . 2570da30e9aSPeter Wemm.It Fl j Ar field 2580da30e9aSPeter WemmJoin on the 2590da30e9aSPeter Wemm.Ar field Ns 'th 260dce95037SRuslan Ermilovfield of both 261dce95037SRuslan Ermilov.Ar file1 262dce95037SRuslan Ermilovand 263dce95037SRuslan Ermilov.Ar file2 . 2640da30e9aSPeter Wemm.It Fl o Ar list ... 2650da30e9aSPeter WemmHistorical implementations of 266f2941023SPhilippe Charnier.Nm 2670da30e9aSPeter Wemmpermitted multiple arguments to the 2680da30e9aSPeter Wemm.Fl o 2690da30e9aSPeter Wemmoption. 270f2941023SPhilippe CharnierThese arguments were of the form 271dce95037SRuslan Ermilov.Ar file_number . Ns Ar field_number 272f2941023SPhilippe Charnieras described 2730da30e9aSPeter Wemmfor the current 2740da30e9aSPeter Wemm.Fl o 2750da30e9aSPeter Wemmoption. 276f2941023SPhilippe CharnierThis has obvious difficulties in the presence of files named 277dce95037SRuslan Ermilov.Pa 1.2 . 2780da30e9aSPeter Wemm.El 2790da30e9aSPeter Wemm.Pp 2800227791bSRuslan ErmilovThese options are available only so historic shell scripts do not require 2810da30e9aSPeter Wemmmodification and should not be used. 2820da30e9aSPeter Wemm.Sh SEE ALSO 2830da30e9aSPeter Wemm.Xr awk 1 , 2840da30e9aSPeter Wemm.Xr comm 1 , 2850da30e9aSPeter Wemm.Xr paste 1 , 2860da30e9aSPeter Wemm.Xr sort 1 , 2870da30e9aSPeter Wemm.Xr uniq 1 2886c7216dfSRuslan Ermilov.Sh STANDARDS 2896c7216dfSRuslan ErmilovThe 2906c7216dfSRuslan Ermilov.Nm 2916c7216dfSRuslan Ermilovcommand conforms to 2926c7216dfSRuslan Ermilov.St -p1003.1-2001 . 293