.\" Copyright (c) 1990, 1991, 1993, 1994
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"	@(#)split.1	8.3 (Berkeley) 4/16/94
.\" $FreeBSD$
.\"
.Dd May 26, 2023
.Dt SPLIT 1
.Os
.Sh NAME
.Nm split
.Nd split a file into pieces
.Sh SYNOPSIS
.Nm
.Op Fl d
.Op Fl l Ar line_count
.Op Fl a Ar suffix_length
.Op Ar file Op Ar prefix
.Nm
.Op Fl d
.Fl b Ar byte_count Ns
.Oo
.Sm off
.Cm K | k | M | m | G | g
.Sm on
.Oc
.Op Fl a Ar suffix_length
.Op Ar file Op Ar prefix
.Nm
.Op Fl d
.Fl n Ar chunk_count
.Op Fl a Ar suffix_length
.Op Ar file Op Ar prefix
.Nm
.Op Fl d
.Fl p Ar pattern
.Op Fl a Ar suffix_length
.Op Ar file Op Ar prefix
.Sh DESCRIPTION
The
.Nm
utility reads the given
.Ar file
and breaks it up into files of 1000 lines each
(if no options are specified), leaving the
.Ar file
unchanged.
If
.Ar file
is a single dash
.Pq Sq Fl
or absent,
.Nm
reads from the standard input.
.Pp
The options are as follows:
.Bl -tag -width indent
.It Fl a Ar suffix_length
Use
.Ar suffix_length
letters to form the suffix of the file name.
.It Fl b Ar byte_count Ns Oo
.Sm off
.Cm K | k | M | m | G | g
.Sm on
.Oc
Create split files
.Ar byte_count
bytes in length.
If
.Cm k
or
.Cm K
is appended to the number, the file is split into
.Ar byte_count
kilobyte pieces.
If
.Cm m
or
.Cm M
is appended to the number, the file is split into
.Ar byte_count
megabyte pieces.
If
.Cm g
or
.Cm G
is appended to the number, the file is split into
.Ar byte_count
gigabyte pieces.
.It Fl d
Use a numeric suffix instead of an alphabetic suffix.
.It Fl l Ar line_count
Create split files
.Ar line_count
lines in length.
.It Fl n Ar chunk_count
Split file into
.Ar chunk_count
smaller files.
The first n - 1 files will be of size (size of
.Ar file
/
.Ar chunk_count
)
and the last file will contain the remaining bytes.
.It Fl p Ar pattern
The file is split whenever an input line matches
.Ar pattern ,
which is interpreted as an extended regular expression.
The matching line will be the first line of the next output file.
This option is incompatible with the
.Fl b
and
.Fl l
options.
.El
.Pp
If additional arguments are specified, the first is used as the name
of the input file which is to be split.
If a second additional argument is specified, it is used as a prefix
for the names of the files into which the file is split.
In this case, each file into which the file is split is named by the
prefix followed by a lexically ordered suffix using
.Ar suffix_length
characters in the range
.Dq Li a Ns - Ns Li z .
If
.Fl a
is not specified, two letters are used as the initial suffix.
If the output does not fit into the resulting number of files and the
.Fl d
flag is not specified, then the suffix length is automatically extended as
needed such that all output files continue to sort in lexical order.
.Pp
If the
.Ar prefix
argument is not specified, the file is split into lexically ordered
files named with the prefix
.Dq Li x
and with suffixes as above.
.Sh ENVIRONMENT
The
.Ev LANG , LC_ALL , LC_CTYPE
and
.Ev LC_COLLATE
environment variables affect the execution of
.Nm
as described in
.Xr environ 7 .
.Sh EXIT STATUS
.Ex -std
.Sh EXAMPLES
Split input into as many files as needed, so that each file contains at most 2
lines:
.Bd -literal -offset indent
$ echo -e "first line\\nsecond line\\nthird line\\nfourth line" | split -l2
.Ed
.Pp
Split input in chunks of 10 bytes using numeric suffixes for file names.
This generates two files of 10 bytes (x00 and x01) and a third file (x02) with the
remaining 2 bytes:
.Bd -literal -offset indent
$ echo -e "This is 22 bytes long" | split -d -b10
.Ed
.Pp
Split input generating 6 files:
.Bd -literal -offset indent
$ echo -e "This is 22 bytes long" | split -n 6
.Ed
.Pp
Split input creating a new file every time a line matches the regular expression
for a
.Dq t
followed by either
.Dq a
or
.Dq u
thus creating two files:
.Bd -literal -offset indent
$ echo -e "stack\\nstock\\nstuck\\nanother line" | split -p 't[au]'
.Ed
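.Pp
As a further sketch of how the
.Ar prefix
argument and the
.Fl a
option combine, assume a hypothetical input file named
.Pa access.log
(the file name and the prefix are illustrative only).
The following splits it into pieces of at most 100 lines each, naming the
output files
.Pa part_aaa ,
.Pa part_aab ,
and so on:
.Bd -literal -offset indent
$ split -a 3 -l 100 access.log part_
.Ed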
.Sh SEE ALSO
.Xr csplit 1 ,
.Xr re_format 7
.Sh STANDARDS
The
.Nm
utility conforms to
.St -p1003.1-2001 .
.Sh HISTORY
A
.Nm
command appeared in
.At v3 .
.Pp
Before
.Fx 14 ,
pattern and line matching only operated on lines shorter than 65,536 bytes.
224