# The One True Awk

This is the version of `awk` described in _The AWK Programming Language_,
Second Edition, by Al Aho, Brian Kernighan, and Peter Weinberger
(Addison-Wesley, 2024, ISBN-13 978-0138269722, ISBN-10 0138269726).

## What's New? ##

This version of Awk handles UTF-8 and comma-separated values (CSV) input.

### Strings ###

Functions that process strings now count Unicode code points, not bytes;
this affects `length`, `substr`, `index`, `match`, `split`,
`sub`, `gsub`, and others. Note that code
points are not necessarily characters.

UTF-8 sequences may appear in literal strings and regular expressions.
Arbitrary characters may be included with `\u` followed by 1 to 8 hexadecimal digits.

### Regular expressions ###

Regular expressions may include UTF-8 code points, including `\u`.

### CSV ###

The option `--csv` turns on CSV processing of input:
fields are separated by commas, fields may be quoted with
double-quote (`"`) characters, quoted fields may contain embedded newlines.
Double-quotes in fields have to be doubled and enclosed in quoted fields.
In CSV mode, `FS` is ignored.

If no explicit separator argument is provided,
field-splitting in `split` is determined by CSV mode.
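
As a rough illustration of the CSV rules above, the following shell sketch wraps `awk --csv` in a small helper that prints the number of fields in each record, followed by the fields themselves. The helper name `show_csv_fields` and the `AWK` variable are assumptions made for this example and are not part of the distribution. `AWK` should point at a binary built from this source (for example `./a.out`), since other `awk` implementations may not accept `--csv`.

```shell
# AWK is an assumption for this sketch: set it to the binary built from
# this source (for example ./a.out). It defaults to whatever "awk" is on
# the path, which may or may not support --csv.
AWK=${AWK:-awk}

# Read CSV records from standard input. For each record, print the number
# of fields, then each field on its own line prefixed with its index.
show_csv_fields() {
    "$AWK" --csv '{ print NF; for (i = 1; i <= NF; i++) print i ": " $i }'
}
```

For example, `printf '%s\n' 'a,"b,c","say ""hi"""' | show_csv_fields` should report three fields: `a`, then `b,c` (the quoted comma does not split the field), then `say "hi"` (each doubled double-quote stands for one literal double-quote).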

## Copyright

Copyright (C) Lucent Technologies 1997<br/>
All Rights Reserved

Permission to use, copy, modify, and distribute this software and
its documentation for any purpose and without fee is hereby
granted, provided that the above copyright notice appear in all
copies and that both that the copyright notice and this
permission notice and warranty disclaimer appear in supporting
documentation, and that the name Lucent Technologies or any of
its entities not be used in advertising or publicity pertaining
to distribution of the software without specific, written prior
permission.

LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
THIS SOFTWARE.

## Distribution and Reporting Problems

Changes, mostly bug fixes and occasional enhancements, are listed
in `FIXES`. If you distribute this code further, please please please
distribute `FIXES` with it.

If you find errors, please report them
to the current maintainer, ozan.yigit@gmail.com.
Please _also_ open an issue in the GitHub issue tracker, to make
it easy to track issues.
Thanks.

## Submitting Pull Requests

Pull requests are welcome. Some guidelines:

* Please do not use functions or facilities that are not standard (e.g.,
`strlcpy()`, `fpurge()`).

* Please run the test suite and make sure that your changes pass before
posting the pull request. To do so:

  1. Save the previous version of `awk` somewhere in your path. Call it `nawk` (for example).
  1. Run `oldawk=nawk make check > check.out 2>&1`.
  1. Search for `BAD` or `error` in the result. In general, look over it manually to make sure there are no errors.

* Please create the pull request with a request
to merge into the `staging` branch instead of into the `master` branch.
This allows us to do testing, and to make any additional edits or changes
after the merge but before merging to `master`.
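
The search step of the test procedure above can be scripted. The sketch below assumes the test log was written to `check.out`, as in the `make check` command above. The function name `check_log_clean` is made up for illustration. It only automates the search for `BAD` or `error`, and is not a substitute for looking over the output manually.

```shell
# Scan a test log for the markers named in the guidelines above.
# Returns 0 if neither "BAD" nor "error" appears, 1 if either does
# (matching lines are printed with their line numbers), and 2 if the
# log cannot be read.
check_log_clean() {
    log=${1:-check.out}
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    if grep -n -E 'BAD|error' "$log"; then
        return 1
    fi
    return 0
}
```

Running `check_log_clean` with no argument checks `check.out` in the current directory; a different log file can be passed as the first argument.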

## Building

The program itself is created by

    make

which should produce a sequence of messages roughly like this:

    bison -d  awkgram.y
    awkgram.y: warning: 44 shift/reduce conflicts [-Wconflicts-sr]
    awkgram.y: warning: 85 reduce/reduce conflicts [-Wconflicts-rr]
    awkgram.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples
    gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o awkgram.tab.o awkgram.tab.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o b.o b.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o main.o main.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o parse.o parse.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 maketab.c -o maketab
    ./maketab awkgram.tab.h >proctab.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o proctab.o proctab.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o tran.o tran.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o lib.o lib.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o run.o run.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 -c -o lex.o lex.c
    gcc -g -Wall -pedantic -Wcast-qual -O2 awkgram.tab.o b.o main.o parse.o proctab.o tran.o lib.o run.o lex.o -lm

This produces an executable `a.out`; you will eventually want to
move this to some place like `/usr/bin/awk`.

If your system does not have `yacc` or `bison` (the GNU
equivalent), you need to install one of them first.
The default in the `makefile` is `bison`; you will have
to edit the `makefile` to use `yacc`.

NOTE: This version uses ISO/IEC C99, as you should also. We have
compiled this without any changes using `gcc -Wall` and/or local C
compilers on a variety of systems, but new systems or compilers
may raise some new complaint; reports of difficulties are
welcome.

This compiles without change on Macintosh OS X using `gcc` and
the standard developer tools.

You can also use `make CC=g++` to build with the GNU C++ compiler,
should you choose to do so.

## A Note About Releases

We don't usually do releases.

## A Note About Maintenance

NOTICE! Maintenance of this program is on a ''best effort''
basis. We try to get to issues and pull requests as quickly
as we can. Unfortunately, however, keeping this program going
is not at the top of our priority list.

#### Last Updated

Mon 05 Feb 2024 08:46:55 IST