1.\" Copyright (c) 1990, 1993 2.\" The Regents of the University of California. All rights reserved. 3.\" 4.\" This code is derived from software contributed to Berkeley by 5.\" the Institute of Electrical and Electronics Engineers, Inc. 6.\" 7.\" Redistribution and use in source and binary forms, with or without 8.\" modification, are permitted provided that the following conditions 9.\" are met: 10.\" 1. Redistributions of source code must retain the above copyright 11.\" notice, this list of conditions and the following disclaimer. 12.\" 2. Redistributions in binary form must reproduce the above copyright 13.\" notice, this list of conditions and the following disclaimer in the 14.\" documentation and/or other materials provided with the distribution. 15.\" 3. Neither the name of the University nor the names of its contributors 16.\" may be used to endorse or promote products derived from this software 17.\" without specific prior written permission. 18.\" 19.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND 20.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 21.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 22.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE 23.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 24.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS 25.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 26.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT 27.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY 28.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF 29.\" SUCH DAMAGE. 30.\" 31.Dd June 20, 2020 32.Dt JOIN 1 33.Os 34.Sh NAME 35.Nm join 36.Nd relational database operator 37.Sh SYNOPSIS 38.Nm 39.Oo 40.Fl a Ar file_number | Fl v Ar file_number 41.Oc 42.Op Fl e Ar string 43.Op Fl o Ar list 44.Op Fl t Ar char 45.Op Fl 1 Ar field 46.Op Fl 2 Ar field 47.Ar file1 48.Ar file2 49.Sh DESCRIPTION 50The 51.Nm 52utility performs an 53.Dq equality join 54on the specified files 55and writes the result to the standard output. 56The 57.Dq join field 58is the field in each file by which the files are compared. 59The first field in each line is used by default. 60There is one line in the output for each pair of lines in 61.Ar file1 62and 63.Ar file2 64which have identical join fields. 65Each output line consists of the join field, the remaining fields from 66.Ar file1 67and then the remaining fields from 68.Ar file2 . 69.Pp 70The default field separators are tab and space characters. 71In this case, multiple tabs and spaces count as a single field separator, 72and leading tabs and spaces are ignored. 73The default output field separator is a single space character. 74.Pp 75Many of the options use file and field numbers. 76Both file numbers and field numbers are 1 based, i.e., the first file on 77the command line is file number 1 and the first field is field number 1. 78The following options are available: 79.Bl -tag -width indent 80.It Fl a Ar file_number 81In addition to the default output, produce a line for each unpairable 82line in file 83.Ar file_number . 84.It Fl e Ar string 85Replace empty output fields with 86.Ar string . 87.It Fl o Ar list 88The 89.Fl o 90option specifies the fields that will be output from each file for 91each line with matching join fields. 92Each element of 93.Ar list 94has either the form 95.Ar file_number . Ns Ar field , 96where 97.Ar file_number 98is a file number and 99.Ar field 100is a field number, or the form 101.Ql 0 102.Pq zero , 103representing the join field. 104The elements of list must be either comma 105.Pq Ql \&, 106or whitespace separated. 107(The latter requires quoting to protect it from the shell, or, a simpler 108approach is to use multiple 109.Fl o 110options.) 111.It Fl t Ar char 112Use character 113.Ar char 114as a field delimiter for both input and output. 115Every occurrence of 116.Ar char 117in a line is significant. 118.It Fl v Ar file_number 119Do not display the default output, but display a line for each unpairable 120line in file 121.Ar file_number . 122The options 123.Fl v Cm 1 124and 125.Fl v Cm 2 126may be specified at the same time. 127.It Fl 1 Ar field 128Join on the 129.Ar field Ns 'th 130field of 131.Ar file1 . 132.It Fl 2 Ar field 133Join on the 134.Ar field Ns 'th 135field of 136.Ar file2 . 137.El 138.Pp 139When the default field delimiter characters are used, the files to be joined 140should be ordered in the collating sequence of 141.Xr sort 1 , 142using the 143.Fl b 144option, on the fields on which they are to be joined, otherwise 145.Nm 146may not report all field matches. 147When the field delimiter characters are specified by the 148.Fl t 149option, the collating sequence should be the same as 150.Xr sort 1 151without the 152.Fl b 153option. 154.Pp 155If one of the arguments 156.Ar file1 157or 158.Ar file2 159is 160.Sq Fl , 161the standard input is used. 162.Sh EXIT STATUS 163.Ex -std 164.Sh EXAMPLES 165Assuming a file named 166.Pa nobel_laureates.txt 167with information about some of the first Nobel Peace Prize laureates: 168.Bd -literal -offset indent 1691901,Jean Henri Dunant,M 1701901,Frederic Passy,M 1711902,Elie Ducommun,M 1721905,Baroness Bertha Sophie Felicita Von Suttner,F 1731910,Permanent International Peace Bureau, 174.Ed 175.Pp 176and a second file 177.Pa nobel_nationalities.txt 178with their nationalities: 179.Bd -literal -offset indent 180Jean Henri Dunant,Switzerland 181Frederic Passy,France 182Elie Ducommun,Switzerland 183Baroness Bertha Sophie Felicita Von Suttner 184.Ed 185.Pp 186Join the two files using the second column from first file and the default first 187column from second file specifying a custom field delimiter: 188.Bd -literal -offset indent 189$ join -t, -1 2 nobel_laureates.txt nobel_nationalities.txt 190Jean Henri Dunant,1901,M,Switzerland 191Frederic Passy,1901,M,France 192Elie Ducommun,1902,M,Switzerland 193Baroness Bertha Sophie Felicita Von Suttner,1905,F 194.Ed 195.Pp 196Show only the year and the nationality of the laureate using 197.Ql <<NULL>> 198to replace empty fields: 199.Bd -literal -offset indent 200$ join -e "<<NULL>>" -t, -1 2 -o "1.1 2.2" nobel_laureates.txt nobel_nationalities.txt 2011901,Switzerland 2021901,France 2031902,Switzerland 2041905,<<NULL>> 205.Ed 206.Pp 207Show only lines from first file which do not have a match in second file: 208.Bd -literal -offset indent 209$ join -v1 -t, -1 2 nobel_laureates.txt nobel_nationalities.txt 210Permanent International Peace Bureau,1910, 211.Ed 212.Pp 213Assuming a file named 214.Pa capitals.txt 215with the following content: 216.Bd -literal -offset indent 217Belgium,Brussels 218France,Paris 219Italy,Rome 220Switzerland 221.Ed 222.Pp 223Show the name and capital of the country where the laureate was born. 224This example uses 225.Pa nobel_nationalities.txt 226as a bridge but does not show any information from that file. 227Also see the note about 228.Xr sort 1 229above to understand why we need to sort the intermediate result. 230.Bd -literal -offset indent 231$ join -t, -1 2 -o 1.2 2.2 nobel_laureates.txt nobel_nationalities.txt | \e 232 sort -k2 -t, | join -t, -e "<<NULL>>" -1 2 -o 1.1 2.2 - capitals.txt 233Elie Ducommun,<<NULL>> 234Jean Henri Dunant,<<NULL>> 235.Ed 236.Sh COMPATIBILITY 237For compatibility with historic versions of 238.Nm , 239the following options are available: 240.Bl -tag -width indent 241.It Fl a 242In addition to the default output, produce a line for each unpairable line 243in both 244.Ar file1 245and 246.Ar file2 . 247.It Fl j1 Ar field 248Join on the 249.Ar field Ns 'th 250field of 251.Ar file1 . 252.It Fl j2 Ar field 253Join on the 254.Ar field Ns 'th 255field of 256.Ar file2 . 257.It Fl j Ar field 258Join on the 259.Ar field Ns 'th 260field of both 261.Ar file1 262and 263.Ar file2 . 264.It Fl o Ar list ... 265Historical implementations of 266.Nm 267permitted multiple arguments to the 268.Fl o 269option. 270These arguments were of the form 271.Ar file_number . Ns Ar field_number 272as described 273for the current 274.Fl o 275option. 276This has obvious difficulties in the presence of files named 277.Pa 1.2 . 278.El 279.Pp 280These options are available only so historic shell scripts do not require 281modification and should not be used. 282.Sh SEE ALSO 283.Xr awk 1 , 284.Xr comm 1 , 285.Xr paste 1 , 286.Xr sort 1 , 287.Xr uniq 1 288.Sh STANDARDS 289The 290.Nm 291command conforms to 292.St -p1003.1-2001 . 293