.\" Copyright (c) 1990, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" the Institute of Electrical and Electronics Engineers, Inc.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"	This product includes software developed by the University of
.\"	California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"	@(#)join.1	8.3 (Berkeley) 4/28/95
.\" $FreeBSD$
.\"
.Dd April 28, 1995
.Dt JOIN 1
.Os
.Sh NAME
.Nm join
.Nd relational database operator
.Sh SYNOPSIS
.Nm
.Oo
.Fl a Ar file_number | Fl v Ar file_number
.Oc
.Op Fl e Ar string
.Op Fl j Ar file_number field
.Op Fl o Ar list
.Bk -words
.Ek
.Op Fl t Ar char
.Op Fl \&1 Ar field
.Op Fl \&2 Ar field
.Ar file1
.Ar file2
.Sh DESCRIPTION
The
.Nm
utility performs an
.Dq equality join
on the specified files
and writes the result to the standard output.
The
.Dq join field
is the field in each file by which the files are compared.
The first field in each line is used by default.
There is one line in the output for each pair of lines in
.Ar file1
and
.Ar file2
which have identical join fields.
Each output line consists of the join field, the remaining fields from
.Ar file1
and then the remaining fields from
.Ar file2 .
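.Pp
As a brief illustration, assume two hypothetical files, both sorted on
their first field (the names and contents are invented for this example).
The file
.Pa fruit.txt
contains:
.Bd -literal -offset indent
1 apple
2 banana
3 cherry
.Ed
.Pp
and the file
.Pa price.txt
contains:
.Bd -literal -offset indent
1 0.50
3 2.00
4 1.25
.Ed
.Pp
Joining them on the default join field:
.Bd -literal -offset indent
join fruit.txt price.txt
.Ed
.Pp
should produce one line for each key found in both files:
.Bd -literal -offset indent
1 apple 0.50
3 cherry 2.00
.Ed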
.Pp
The default field separators are tab and space characters.
In this case, multiple tabs and spaces count as a single field separator,
and leading tabs and spaces are ignored.
The default output field separator is a single space character.
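.Pp
For instance, given a hypothetical file
.Pa a.txt
whose fields are separated by runs of spaces
(the names and contents are invented for this example):
.Bd -literal -offset indent
1    apple   red
2 banana yellow
.Ed
.Pp
and a hypothetical file
.Pa b.txt :
.Bd -literal -offset indent
1 0.50
2 0.30
.Ed
.Pp
the command
.Bd -literal -offset indent
join a.txt b.txt
.Ed
.Pp
should treat each run of spaces as one separator and write single spaces
between the output fields:
.Bd -literal -offset indent
1 apple red 0.50
2 banana yellow 0.30
.Ed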
.Pp
Many of the options use file and field numbers.
Both file numbers and field numbers are 1-based, i.e., the first file on
the command line is file number 1 and the first field is field number 1.
The following options are available:
.Bl -tag -width indent
.It Fl a Ar file_number
In addition to the default output, produce a line for each unpairable
line in file
.Ar file_number .
(The argument to
.Fl a
must not be preceded by a space; see the
.Sx COMPATIBILITY
section.)
.It Fl e Ar string
Replace empty output fields with
.Ar string .
.It Fl o Ar list
The
.Fl o
option specifies the fields that will be output from each file for
each line with matching join fields.
Each element of
.Ar list
has the form
.Ql file_number.field ,
where
.Ar file_number
is a file number and
.Ar field
is a field number.
The elements of
.Ar list
must be separated by either commas
.Pf ( Dq , Ns )
or whitespace.
(Whitespace separation requires quoting to protect the list from the shell;
a simpler approach is to use multiple
.Fl o
options.)
.It Fl t Ar char
Use character
.Ar char
as a field delimiter for both input and output.
Every occurrence of
.Ar char
in a line is significant.
.It Fl v Ar file_number
Do not display the default output, but display a line for each unpairable
line in file
.Ar file_number .
The options
.Fl v Ar 1
and
.Fl v Ar 2
may be specified at the same time.
.It Fl 1 Ar field
Join on the
.Ar field Ns 'th
field of file 1.
.It Fl 2 Ar field
Join on the
.Ar field Ns 'th
field of file 2.
.El
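.Pp
The following sketch shows several of these options in use.
It assumes two hypothetical files, both sorted on their first field;
the names and contents are invented for this example.
The file
.Pa left.txt
contains:
.Bd -literal -offset indent
1 apple
2 banana
3 cherry
.Ed
.Pp
and the file
.Pa right.txt
contains:
.Bd -literal -offset indent
1 0.50
3 2.00
4 1.25
.Ed
.Pp
To also list the lines of file 1 that have no match in file 2:
.Bd -literal -offset indent
join -a1 left.txt right.txt
.Ed
.Pp
which should print:
.Bd -literal -offset indent
1 apple 0.50
2 banana
3 cherry 2.00
.Ed
.Pp
To list only the unpairable lines of file 2:
.Bd -literal -offset indent
join -v 2 left.txt right.txt
.Ed
.Pp
which should print:
.Bd -literal -offset indent
4 1.25
.Ed
.Pp
To select the output fields explicitly and fill the missing ones with the
arbitrary placeholder string NONE:
.Bd -literal -offset indent
join -a1 -e NONE -o 1.1,1.2,2.2 left.txt right.txt
.Ed
.Pp
which should print:
.Bd -literal -offset indent
1 apple 0.50
2 banana NONE
3 cherry 2.00
.Ed
.Pp
Finally, assuming two hypothetical colon-separated files,
.Pa users.txt :
.Bd -literal -offset indent
alice:1001
bob:1002
.Ed
.Pp
and
.Pa shells.txt :
.Bd -literal -offset indent
alice:/bin/sh
bob:/bin/csh
.Ed
.Pp
the command
.Bd -literal -offset indent
join -t : users.txt shells.txt
.Ed
.Pp
should use the colon as the delimiter for both input and output:
.Bd -literal -offset indent
alice:1001:/bin/sh
bob:1002:/bin/csh
.Ed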
.Pp
When the default field delimiter characters are used, the files to be joined
should be ordered in the collating sequence of
.Xr sort 1 ,
using the
.Fl b
option, on the fields on which they are to be joined; otherwise,
.Nm
may not report all field matches.
When the field delimiter characters are specified by the
.Fl t
option, the collating sequence should be the same as
.Xr sort 1
without the
.Fl b
option.
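.Pp
As a sketch of this preparation step, assume two hypothetical unsorted
files that use the default delimiters (the names and contents are invented
for this example).
The file
.Pa stock.txt
holds its key in the second field:
.Bd -literal -offset indent
12 pear
7 apple
30 fig
.Ed
.Pp
and the file
.Pa color.txt
holds its key in the first field:
.Bd -literal -offset indent
pear green
apple red
kiwi brown
.Ed
.Pp
Sorting each file on its join field before joining:
.Bd -literal -offset indent
sort -b -k 2,2 stock.txt > stock.sorted
sort -b -k 1,1 color.txt > color.sorted
join -1 2 stock.sorted color.sorted
.Ed
.Pp
should report both matching keys:
.Bd -literal -offset indent
apple 7 red
pear 12 green
.Ed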
.Pp
If one of the arguments
.Ar file1
or
.Ar file2
is
.Dq - ,
the standard input is used.
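.Pp
For example, assuming a hypothetical sorted file
.Pa fruit.txt
(the names and contents are invented for this example):
.Bd -literal -offset indent
1 apple
3 cherry
.Ed
.Pp
and a hypothetical unsorted file
.Pa price.txt :
.Bd -literal -offset indent
3 2.00
1 0.50
.Ed
.Pp
the second file can be sorted on the fly and read from the standard input:
.Bd -literal -offset indent
sort -b -k 1,1 price.txt | join fruit.txt -
.Ed
.Pp
which should print:
.Bd -literal -offset indent
1 apple 0.50
3 cherry 2.00
.Ed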
.Sh DIAGNOSTICS
The
.Nm
utility exits 0 on success, and >0 if an error occurs.
.Sh COMPATIBILITY
For compatibility with historic versions of
.Nm ,
the following options are available:
.Bl -tag -width indent
.It Fl a
In addition to the default output, produce a line for each unpairable line
in both file 1 and file 2.
(To distinguish between this and
.Fl a Ar file_number ,
.Nm
currently requires that the latter not include any white space.)
.It Fl j1 Ar field
Join on the
.Ar field Ns 'th
field of file 1.
.It Fl j2 Ar field
Join on the
.Ar field Ns 'th
field of file 2.
.It Fl j Ar field
Join on the
.Ar field Ns 'th
field of both file 1 and file 2.
.It Fl o Ar list ...
Historical implementations of
.Nm
permitted multiple arguments to the
.Fl o
option.
These arguments were of the form
.Ql file_number.field_number
as described
for the current
.Fl o
option.
This has obvious difficulties in the presence of files named
.Ql 1.2 .
.El
.Pp
These options exist only so that historic shell scripts do not require
modification; they should not be used.
.Sh STANDARDS
The
.Nm
command is expected to be
.St -p1003.2
compatible.
.Sh SEE ALSO
.Xr awk 1 ,
.Xr comm 1 ,
.Xr paste 1 ,
.Xr sort 1 ,
.Xr uniq 1