1.\" Copyright (c) 1990, 1993 2.\" The Regents of the University of California. All rights reserved. 3.\" 4.\" This code is derived from software contributed to Berkeley by 5.\" the Institute of Electrical and Electronics Engineers, Inc. 6.\" 7.\" Redistribution and use in source and binary forms, with or without 8.\" modification, are permitted provided that the following conditions 9.\" are met: 10.\" 1. Redistributions of source code must retain the above copyright 11.\" notice, this list of conditions and the following disclaimer. 12.\" 2. Redistributions in binary form must reproduce the above copyright 13.\" notice, this list of conditions and the following disclaimer in the 14.\" documentation and/or other materials provided with the distribution. 15.\" 3. Neither the name of the University nor the names of its contributors 16.\" may be used to endorse or promote products derived from this software 17.\" without specific prior written permission. 18.\" 19.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND 20.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 21.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 22.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE 23.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 24.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS 25.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 26.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT 27.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY 28.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF 29.\" SUCH DAMAGE. 30.\" 31.\" @(#)join.1 8.3 (Berkeley) 4/28/95 32.\" $FreeBSD$ 33.\" 34.Dd June 20, 2020 35.Dt JOIN 1 36.Os 37.Sh NAME 38.Nm join 39.Nd relational database operator 40.Sh SYNOPSIS 41.Nm 42.Oo 43.Fl a Ar file_number | Fl v Ar file_number 44.Oc 45.Op Fl e Ar string 46.Op Fl o Ar list 47.Op Fl t Ar char 48.Op Fl 1 Ar field 49.Op Fl 2 Ar field 50.Ar file1 51.Ar file2 52.Sh DESCRIPTION 53The 54.Nm 55utility performs an 56.Dq equality join 57on the specified files 58and writes the result to the standard output. 59The 60.Dq join field 61is the field in each file by which the files are compared. 62The first field in each line is used by default. 63There is one line in the output for each pair of lines in 64.Ar file1 65and 66.Ar file2 67which have identical join fields. 68Each output line consists of the join field, the remaining fields from 69.Ar file1 70and then the remaining fields from 71.Ar file2 . 72.Pp 73The default field separators are tab and space characters. 74In this case, multiple tabs and spaces count as a single field separator, 75and leading tabs and spaces are ignored. 76The default output field separator is a single space character. 77.Pp 78Many of the options use file and field numbers. 79Both file numbers and field numbers are 1 based, i.e., the first file on 80the command line is file number 1 and the first field is field number 1. 81The following options are available: 82.Bl -tag -width indent 83.It Fl a Ar file_number 84In addition to the default output, produce a line for each unpairable 85line in file 86.Ar file_number . 87.It Fl e Ar string 88Replace empty output fields with 89.Ar string . 90.It Fl o Ar list 91The 92.Fl o 93option specifies the fields that will be output from each file for 94each line with matching join fields. 95Each element of 96.Ar list 97has either the form 98.Ar file_number . Ns Ar field , 99where 100.Ar file_number 101is a file number and 102.Ar field 103is a field number, or the form 104.Ql 0 105.Pq zero , 106representing the join field. 107The elements of list must be either comma 108.Pq Ql \&, 109or whitespace separated. 110(The latter requires quoting to protect it from the shell, or, a simpler 111approach is to use multiple 112.Fl o 113options.) 114.It Fl t Ar char 115Use character 116.Ar char 117as a field delimiter for both input and output. 118Every occurrence of 119.Ar char 120in a line is significant. 121.It Fl v Ar file_number 122Do not display the default output, but display a line for each unpairable 123line in file 124.Ar file_number . 125The options 126.Fl v Cm 1 127and 128.Fl v Cm 2 129may be specified at the same time. 130.It Fl 1 Ar field 131Join on the 132.Ar field Ns 'th 133field of 134.Ar file1 . 135.It Fl 2 Ar field 136Join on the 137.Ar field Ns 'th 138field of 139.Ar file2 . 140.El 141.Pp 142When the default field delimiter characters are used, the files to be joined 143should be ordered in the collating sequence of 144.Xr sort 1 , 145using the 146.Fl b 147option, on the fields on which they are to be joined, otherwise 148.Nm 149may not report all field matches. 150When the field delimiter characters are specified by the 151.Fl t 152option, the collating sequence should be the same as 153.Xr sort 1 154without the 155.Fl b 156option. 157.Pp 158If one of the arguments 159.Ar file1 160or 161.Ar file2 162is 163.Sq Fl , 164the standard input is used. 165.Sh EXIT STATUS 166.Ex -std 167.Sh EXAMPLES 168Assuming a file named 169.Pa nobel_laureates.txt 170with information about some of the first Nobel Peace Prize laureates: 171.Bd -literal -offset indent 1721901,Jean Henri Dunant,M 1731901,Frederic Passy,M 1741902,Elie Ducommun,M 1751905,Baroness Bertha Sophie Felicita Von Suttner,F 1761910,Permanent International Peace Bureau, 177.Ed 178.Pp 179and a second file 180.Pa nobel_nationalities.txt 181with their nationalities: 182.Bd -literal -offset indent 183Jean Henri Dunant,Switzerland 184Frederic Passy,France 185Elie Ducommun,Switzerland 186Baroness Bertha Sophie Felicita Von Suttner 187.Ed 188.Pp 189Join the two files using the second column from first file and the default first 190column from second file specifying a custom field delimiter: 191.Bd -literal -offset indent 192$ join -t, -1 2 nobel_laureates.txt nobel_nationalities.txt 193Jean Henri Dunant,1901,M,Switzerland 194Frederic Passy,1901,M,France 195Elie Ducommun,1902,M,Switzerland 196Baroness Bertha Sophie Felicita Von Suttner,1905,F 197.Ed 198.Pp 199Show only the year and the nationality of the laureate using 200.Ql <<NULL>> 201to replace empty fields: 202.Bd -literal -offset indent 203$ join -e "<<NULL>>" -t, -1 2 -o "1.1 2.2" nobel_laureates.txt nobel_nationalities.txt 2041901,Switzerland 2051901,France 2061902,Switzerland 2071905,<<NULL>> 208.Ed 209.Pp 210Show only lines from first file which do not have a match in second file: 211.Bd -literal -offset indent 212$ join -v1 -t, -1 2 nobel_laureates.txt nobel_nationalities.txt 213Permanent International Peace Bureau,1910, 214.Ed 215.Pp 216Assuming a file named 217.Pa capitals.txt 218with the following content: 219.Bd -literal -offset indent 220Belgium,Brussels 221France,Paris 222Italy,Rome 223Switzerland 224.Ed 225.Pp 226Show the name and capital of the country where the laureate was born. 227This example uses 228.Pa nobel_nationalities.txt 229as a bridge but does not show any information from that file. 230Also see the note about 231.Xr sort 1 232above to understand why we need to sort the intermediate result. 233.Bd -literal -offset indent 234$ join -t, -1 2 -o 1.2 2.2 nobel_laureates.txt nobel_nationalities.txt | \e 235 sort -k2 -t, | join -t, -e "<<NULL>>" -1 2 -o 1.1 2.2 - capitals.txt 236Elie Ducommun,<<NULL>> 237Jean Henri Dunant,<<NULL>> 238.Ed 239.Sh COMPATIBILITY 240For compatibility with historic versions of 241.Nm , 242the following options are available: 243.Bl -tag -width indent 244.It Fl a 245In addition to the default output, produce a line for each unpairable line 246in both 247.Ar file1 248and 249.Ar file2 . 250.It Fl j1 Ar field 251Join on the 252.Ar field Ns 'th 253field of 254.Ar file1 . 255.It Fl j2 Ar field 256Join on the 257.Ar field Ns 'th 258field of 259.Ar file2 . 260.It Fl j Ar field 261Join on the 262.Ar field Ns 'th 263field of both 264.Ar file1 265and 266.Ar file2 . 267.It Fl o Ar list ... 268Historical implementations of 269.Nm 270permitted multiple arguments to the 271.Fl o 272option. 273These arguments were of the form 274.Ar file_number . Ns Ar field_number 275as described 276for the current 277.Fl o 278option. 279This has obvious difficulties in the presence of files named 280.Pa 1.2 . 281.El 282.Pp 283These options are available only so historic shell scripts do not require 284modification and should not be used. 285.Sh SEE ALSO 286.Xr awk 1 , 287.Xr comm 1 , 288.Xr paste 1 , 289.Xr sort 1 , 290.Xr uniq 1 291.Sh STANDARDS 292The 293.Nm 294command conforms to 295.St -p1003.1-2001 . 296