# The One True Awk

This is the version of `awk` described in _The AWK Programming Language_,
Second Edition, by Al Aho, Brian Kernighan, and Peter Weinberger
(Addison-Wesley, 2024, ISBN-13 978-0138269722, ISBN-10 0138269726).

## What's New? ##

This version of Awk handles UTF-8 and comma-separated values (CSV) input.

### Strings ###

Functions that process strings now count Unicode code points, not bytes;
this affects `length`, `substr`, `index`, `match`, `split`,
`sub`, `gsub`, and others.  Note that code
points are not necessarily characters.

UTF-8 sequences may appear in literal strings and regular expressions.
Arbitrary characters may be included with `\u` followed by 1 to 8 hexadecimal digits.
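As a rough illustration of the difference between bytes, code points, and characters, the sketch below pipes three UTF-8 strings through `length`. It assumes the `awk` on your `PATH` is this version; the helper name `codepoints` is made up for the example, and an older, byte-oriented awk will report larger numbers for the non-ASCII inputs.

```shell
# Hypothetical helper: report length() of a string as seen by awk.
# Assumes "awk" on PATH is this UTF-8-aware version.
codepoints() {
    printf '%s\n' "$1" | awk '{ print length($0) }'
}

# Pure ASCII: bytes, code points, and characters all agree.
codepoints 'awk'

# U+00E9 (precomposed e-acute): one code point, two bytes in UTF-8.
codepoints "$(printf '\303\251')"

# "e" followed by U+0301 (combining acute): displays as one character,
# but is two code points.
codepoints "$(printf 'e\314\201')"
```

With this version the three calls should report 3, 1, and 2; an awk that counts bytes would report 3, 2, and 3 instead.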

### Regular expressions ###

Regular expressions may include UTF-8 code points, including `\u` escapes.

### CSV ###

The option `--csv` turns on CSV processing of input:
fields are separated by commas, fields may be quoted with
double-quote (`"`) characters, and quoted fields may contain embedded newlines.
A double-quote inside a field must be doubled, and the field itself must be quoted.
In CSV mode, `FS` is ignored.

If no explicit separator argument is provided,
field-splitting in `split` follows CSV mode as well:
when `--csv` is in effect, the string is split as CSV.
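A small sketch of the CSV rules above, assuming the `awk` on your `PATH` is this version. The sample data and the helper names `sample_csv`, `csv_field`, and `csv_split_count` are invented for the example.

```shell
# Hypothetical sample data: the second record has a comma and doubled
# quotes inside quoted fields; the third has an embedded newline.
sample_csv() {
    printf '%s\n' 'id,name,note'
    printf '%s\n' '1,"Aho, Al","said ""hi"""'
    printf '%s\n' '2,Brian,"line one'
    printf '%s\n' 'line two"'
}

# Print field N of every record read from stdin, with CSV processing on.
csv_field() {
    awk --csv -v n="$1" '{ print $n }'
}

# Count the pieces split() produces when called without a separator
# argument; in CSV mode the string is expected to be split as CSV.
csv_split_count() {
    awk --csv -v s="$1" 'BEGIN { print split(s, parts) }'
}
```

For instance, `sample_csv | csv_field 2` should print `name`, `Aho, Al`, and `Brian`, one per line, and `csv_split_count 'a,"b,c",d'` should print 3 rather than 4. An awk without `--csv` support will reject the option.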

## Copyright

Copyright (C) Lucent Technologies 1997<br/>
All Rights Reserved

Permission to use, copy, modify, and distribute this software and
its documentation for any purpose and without fee is hereby
granted, provided that the above copyright notice appear in all
copies and that both that the copyright notice and this
permission notice and warranty disclaimer appear in supporting
documentation, and that the name Lucent Technologies or any of
its entities not be used in advertising or publicity pertaining
to distribution of the software without specific, written prior
permission.

LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
THIS SOFTWARE.

## Distribution and Reporting Problems

Changes, mostly bug fixes and occasional enhancements, are listed
in `FIXES`.  If you distribute this code further, please please please
distribute `FIXES` with it.

If you find errors, please report them
to the current maintainer, ozan.yigit@gmail.com.
Please _also_ open an issue in the GitHub issue tracker, to make
it easy to track issues.
Thanks.

## Submitting Pull Requests

Pull requests are welcome. Some guidelines:

* Please do not use functions or facilities that are not standard (e.g.,
`strlcpy()`, `fpurge()`).

* Please run the test suite and make sure that your changes pass before
posting the pull request. To do so:

  1. Save the previous version of `awk` somewhere in your path. Call it `nawk` (for example).
  1. Run `oldawk=nawk make check > check.out 2>&1`.
  1. Search for `BAD` or `error` in the result. In general, look over it manually to make sure there are no errors.

* Please create the pull request with a request
to merge into the `staging` branch instead of into the `master` branch.
This allows us to do testing, and to make any additional edits or changes
after the merge but before merging to `master`.
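The test-suite steps above can be strung together roughly as follows. This is only a sketch: the function names are invented, and it assumes the previous `awk` is already installed on your path as `nawk` and that you run it from the source directory.

```shell
# Hypothetical wrapper around the documented command; run it from the
# source directory, with the previous awk installed on PATH as "nawk".
run_check() {
    oldawk=nawk make check > check.out 2>&1
}

# Print any line of the given log that mentions BAD or error.
# Returns 0 when nothing suspicious is found, 1 otherwise.
scan_check_output() {
    if grep -n -E 'BAD|error' "$1"; then
        return 1
    fi
    return 0
}
```

A typical run would be `run_check; scan_check_output check.out`. A clean scan is not proof that everything passed, so it is still worth reading `check.out` by hand.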

## Building

The program itself is created by

	make

which should produce a sequence of messages roughly like this:

	bison -d  awkgram.y
	awkgram.y: warning: 44 shift/reduce conflicts [-Wconflicts-sr]
	awkgram.y: warning: 85 reduce/reduce conflicts [-Wconflicts-rr]
	awkgram.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples
	gcc -g -Wall -pedantic -Wcast-qual   -O2   -c -o awkgram.tab.o awkgram.tab.c
	gcc -g -Wall -pedantic -Wcast-qual   -O2   -c -o b.o b.c
	gcc -g -Wall -pedantic -Wcast-qual   -O2   -c -o main.o main.c
	gcc -g -Wall -pedantic -Wcast-qual   -O2   -c -o parse.o parse.c
	gcc -g -Wall -pedantic -Wcast-qual -O2 maketab.c -o maketab
	./maketab awkgram.tab.h >proctab.c
	gcc -g -Wall -pedantic -Wcast-qual   -O2   -c -o proctab.o proctab.c
	gcc -g -Wall -pedantic -Wcast-qual   -O2   -c -o tran.o tran.c
	gcc -g -Wall -pedantic -Wcast-qual   -O2   -c -o lib.o lib.c
	gcc -g -Wall -pedantic -Wcast-qual   -O2   -c -o run.o run.c
	gcc -g -Wall -pedantic -Wcast-qual   -O2   -c -o lex.o lex.c
	gcc -g -Wall -pedantic -Wcast-qual   -O2 awkgram.tab.o b.o main.o parse.o proctab.o tran.o lib.o run.o lex.o   -lm

This produces an executable `a.out`; you will eventually want to
move this to some place like `/usr/bin/awk`.

If your system does not have `yacc` or `bison` (the GNU
equivalent), you need to install one of them first.
The default in the `makefile` is `bison`; you will have
to edit the `makefile` to use `yacc`.

NOTE: This version uses ISO/IEC C99, as you should also.  We have
compiled this without any changes using `gcc -Wall` and/or local C
compilers on a variety of systems, but new systems or compilers
may raise some new complaint; reports of difficulties are
welcome.

This compiles without change on Macintosh OS X using `gcc` and
the standard developer tools.

You can also use `make CC=g++` to build with the GNU C++ compiler,
should you choose to do so.
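Put together, the build and install steps in this section might look like the sketch below. The function name `install_awk` and the use of `/usr/local/bin` in the usage note are assumptions for illustration, not something the `makefile` provides.

```shell
# Hypothetical helper: copy a freshly built a.out from the current
# directory into the given bin directory under the name "awk".
install_awk() {
    dest_dir=$1
    if [ ! -x ./a.out ]; then
        echo "a.out not found; run make first" >&2
        return 1
    fi
    cp ./a.out "$dest_dir/awk"
}
```

From the source directory, something like `make && ./a.out --version && install_awk /usr/local/bin` would build, sanity-check, and install the binary. Writing to a system directory usually needs elevated privileges.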

## A Note About Releases

We don't usually do releases.

## A Note About Maintenance

NOTICE! Maintenance of this program is on a "best effort"
basis.  We try to get to issues and pull requests as quickly
as we can.  Unfortunately, however, keeping this program going
is not at the top of our priority list.

#### Last Updated

Mon 05 Feb 2024 08:46:55 IST