1.\" Copyright (c) 1990, 1993 2.\" The Regents of the University of California. All rights reserved. 3.\" 4.\" This code is derived from software contributed to Berkeley by 5.\" the Institute of Electrical and Electronics Engineers, Inc. 6.\" 7.\" Redistribution and use in source and binary forms, with or without 8.\" modification, are permitted provided that the following conditions 9.\" are met: 10.\" 1. Redistributions of source code must retain the above copyright 11.\" notice, this list of conditions and the following disclaimer. 12.\" 2. Redistributions in binary form must reproduce the above copyright 13.\" notice, this list of conditions and the following disclaimer in the 14.\" documentation and/or other materials provided with the distribution. 15.\" 3. Neither the name of the University nor the names of its contributors 16.\" may be used to endorse or promote products derived from this software 17.\" without specific prior written permission. 18.\" 19.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND 20.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 21.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 22.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE 23.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 24.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS 25.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 26.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT 27.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY 28.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF 29.\" SUCH DAMAGE. 30.\" 31.\" @(#)join.1 8.3 (Berkeley) 4/28/95 32.\" 33.Dd June 20, 2020 34.Dt JOIN 1 35.Os 36.Sh NAME 37.Nm join 38.Nd relational database operator 39.Sh SYNOPSIS 40.Nm 41.Oo 42.Fl a Ar file_number | Fl v Ar file_number 43.Oc 44.Op Fl e Ar string 45.Op Fl o Ar list 46.Op Fl t Ar char 47.Op Fl 1 Ar field 48.Op Fl 2 Ar field 49.Ar file1 50.Ar file2 51.Sh DESCRIPTION 52The 53.Nm 54utility performs an 55.Dq equality join 56on the specified files 57and writes the result to the standard output. 58The 59.Dq join field 60is the field in each file by which the files are compared. 61The first field in each line is used by default. 62There is one line in the output for each pair of lines in 63.Ar file1 64and 65.Ar file2 66which have identical join fields. 67Each output line consists of the join field, the remaining fields from 68.Ar file1 69and then the remaining fields from 70.Ar file2 . 71.Pp 72The default field separators are tab and space characters. 73In this case, multiple tabs and spaces count as a single field separator, 74and leading tabs and spaces are ignored. 75The default output field separator is a single space character. 76.Pp 77Many of the options use file and field numbers. 78Both file numbers and field numbers are 1 based, i.e., the first file on 79the command line is file number 1 and the first field is field number 1. 80The following options are available: 81.Bl -tag -width indent 82.It Fl a Ar file_number 83In addition to the default output, produce a line for each unpairable 84line in file 85.Ar file_number . 86.It Fl e Ar string 87Replace empty output fields with 88.Ar string . 89.It Fl o Ar list 90The 91.Fl o 92option specifies the fields that will be output from each file for 93each line with matching join fields. 94Each element of 95.Ar list 96has either the form 97.Ar file_number . Ns Ar field , 98where 99.Ar file_number 100is a file number and 101.Ar field 102is a field number, or the form 103.Ql 0 104.Pq zero , 105representing the join field. 106The elements of list must be either comma 107.Pq Ql \&, 108or whitespace separated. 109(The latter requires quoting to protect it from the shell, or, a simpler 110approach is to use multiple 111.Fl o 112options.) 113.It Fl t Ar char 114Use character 115.Ar char 116as a field delimiter for both input and output. 117Every occurrence of 118.Ar char 119in a line is significant. 120.It Fl v Ar file_number 121Do not display the default output, but display a line for each unpairable 122line in file 123.Ar file_number . 124The options 125.Fl v Cm 1 126and 127.Fl v Cm 2 128may be specified at the same time. 129.It Fl 1 Ar field 130Join on the 131.Ar field Ns 'th 132field of 133.Ar file1 . 134.It Fl 2 Ar field 135Join on the 136.Ar field Ns 'th 137field of 138.Ar file2 . 139.El 140.Pp 141When the default field delimiter characters are used, the files to be joined 142should be ordered in the collating sequence of 143.Xr sort 1 , 144using the 145.Fl b 146option, on the fields on which they are to be joined, otherwise 147.Nm 148may not report all field matches. 149When the field delimiter characters are specified by the 150.Fl t 151option, the collating sequence should be the same as 152.Xr sort 1 153without the 154.Fl b 155option. 156.Pp 157If one of the arguments 158.Ar file1 159or 160.Ar file2 161is 162.Sq Fl , 163the standard input is used. 164.Sh EXIT STATUS 165.Ex -std 166.Sh EXAMPLES 167Assuming a file named 168.Pa nobel_laureates.txt 169with information about some of the first Nobel Peace Prize laureates: 170.Bd -literal -offset indent 1711901,Jean Henri Dunant,M 1721901,Frederic Passy,M 1731902,Elie Ducommun,M 1741905,Baroness Bertha Sophie Felicita Von Suttner,F 1751910,Permanent International Peace Bureau, 176.Ed 177.Pp 178and a second file 179.Pa nobel_nationalities.txt 180with their nationalities: 181.Bd -literal -offset indent 182Jean Henri Dunant,Switzerland 183Frederic Passy,France 184Elie Ducommun,Switzerland 185Baroness Bertha Sophie Felicita Von Suttner 186.Ed 187.Pp 188Join the two files using the second column from first file and the default first 189column from second file specifying a custom field delimiter: 190.Bd -literal -offset indent 191$ join -t, -1 2 nobel_laureates.txt nobel_nationalities.txt 192Jean Henri Dunant,1901,M,Switzerland 193Frederic Passy,1901,M,France 194Elie Ducommun,1902,M,Switzerland 195Baroness Bertha Sophie Felicita Von Suttner,1905,F 196.Ed 197.Pp 198Show only the year and the nationality of the laureate using 199.Ql <<NULL>> 200to replace empty fields: 201.Bd -literal -offset indent 202$ join -e "<<NULL>>" -t, -1 2 -o "1.1 2.2" nobel_laureates.txt nobel_nationalities.txt 2031901,Switzerland 2041901,France 2051902,Switzerland 2061905,<<NULL>> 207.Ed 208.Pp 209Show only lines from first file which do not have a match in second file: 210.Bd -literal -offset indent 211$ join -v1 -t, -1 2 nobel_laureates.txt nobel_nationalities.txt 212Permanent International Peace Bureau,1910, 213.Ed 214.Pp 215Assuming a file named 216.Pa capitals.txt 217with the following content: 218.Bd -literal -offset indent 219Belgium,Brussels 220France,Paris 221Italy,Rome 222Switzerland 223.Ed 224.Pp 225Show the name and capital of the country where the laureate was born. 226This example uses 227.Pa nobel_nationalities.txt 228as a bridge but does not show any information from that file. 229Also see the note about 230.Xr sort 1 231above to understand why we need to sort the intermediate result. 232.Bd -literal -offset indent 233$ join -t, -1 2 -o 1.2 2.2 nobel_laureates.txt nobel_nationalities.txt | \e 234 sort -k2 -t, | join -t, -e "<<NULL>>" -1 2 -o 1.1 2.2 - capitals.txt 235Elie Ducommun,<<NULL>> 236Jean Henri Dunant,<<NULL>> 237.Ed 238.Sh COMPATIBILITY 239For compatibility with historic versions of 240.Nm , 241the following options are available: 242.Bl -tag -width indent 243.It Fl a 244In addition to the default output, produce a line for each unpairable line 245in both 246.Ar file1 247and 248.Ar file2 . 249.It Fl j1 Ar field 250Join on the 251.Ar field Ns 'th 252field of 253.Ar file1 . 254.It Fl j2 Ar field 255Join on the 256.Ar field Ns 'th 257field of 258.Ar file2 . 259.It Fl j Ar field 260Join on the 261.Ar field Ns 'th 262field of both 263.Ar file1 264and 265.Ar file2 . 266.It Fl o Ar list ... 267Historical implementations of 268.Nm 269permitted multiple arguments to the 270.Fl o 271option. 272These arguments were of the form 273.Ar file_number . Ns Ar field_number 274as described 275for the current 276.Fl o 277option. 278This has obvious difficulties in the presence of files named 279.Pa 1.2 . 280.El 281.Pp 282These options are available only so historic shell scripts do not require 283modification and should not be used. 284.Sh SEE ALSO 285.Xr awk 1 , 286.Xr comm 1 , 287.Xr paste 1 , 288.Xr sort 1 , 289.Xr uniq 1 290.Sh STANDARDS 291The 292.Nm 293command conforms to 294.St -p1003.1-2001 . 295