/****************************************************************
Copyright (C) Lucent Technologies 1997
All Rights Reserved

Permission to use, copy, modify, and distribute this software and
its documentation for any purpose and without fee is hereby
granted, provided that the above copyright notice appear in all
copies and that both that the copyright notice and this
permission notice and warranty disclaimer appear in supporting
documentation, and that the name Lucent Technologies or any of
its entities not be used in advertising or publicity pertaining
to distribution of the software without specific, written prior
permission.

LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
THIS SOFTWARE.
****************************************************************/

This file lists all bug fixes, changes, etc., made since the AWK book
was sent to the printers in August, 1987.
27
May 29, 2019:
29	Fix check for command line arguments to no longer require that
30	first character after '=' not be another '='. Reverts change of
31	August 11, 1989. Thanks to GitHub user Jamie Landeg Jones for
32	pointing out the issue; from Issue #38.
33
34Apr 7, 2019:
	Update awktest.tar (p.50) to use modern options to sort. Needed
	for Android development. Thanks to GitHub user mohd-akram (Mohamed
	Akram).  From Comment #33.
38
39Mar 12, 2019:
40	Added very simplistic support for cross-compiling in the
41	makefile.  We are NOT going to go in the direction of the
42	autotools, though.  Thanks to GitHub user nee-san for
43	the basic change. (Merged from PR #34.)
44
45Mar 5, 2019:
46	Added support for POSIX-standard interval expressions (a.k.a.
47	bounds, a.k.a. repetition expressions) in regular expressions,
48	backported (via NetBSD) from Apple awk-24 (20070501).
49	Thanks to Martijn Dekker <martijn@inlv.org> for the port.
50	(Merged from PR #30.)
51
52Mar 3, 2019:
53	Merge PRs as follows:
54	#12: Avoid undefined behaviour when using ctype(3) functions in
55	     relex(). Thanks to GitHub user iamleot.
	#31: Make getline handle numeric strings, and update FIXES. Thanks
	     to GitHub user arnoldrobbins.
58	#32: maketab: support build systems with read-only source. Thanks
59	     to GitHub user enh.
60
61Jan 25, 2019:
62	Make getline handle numeric strings properly in all cases.
63	(Thanks, Arnold.)
64
65Jan 21, 2019:
66	Merged a number of small fixes from GitHub pull requests.
67	Thanks to GitHub users Arnold Robbins (arnoldrobbins),
68	Cody Mello (melloc) and Christoph Junghans (junghans).
69	PR numbers: 13-21, 23, 24, 27.
70
71Oct 25, 2018:
72	Added test in maketab.c to prevent generating a proctab entry
73	for YYSTYPE_IS_DEFINED.  It was harmless but some gcc settings
74	generated a warning message.  Thanks to Nan Xiao for report.
75
76Aug 27, 2018:
77	Disallow '$' in printf formats; arguments evaluated in order
78	and printed in order.
79
80	Added some casts to silence warnings on debugging printfs.
81	(Thanks, Arnold.)
82
83Aug 23, 2018:
84        A long list of fixes courtesy of Arnold Robbins,
85        to whom profound thanks.
86
87        1. ofs-rebuild: OFS value used to rebuild the record was incorrect.
88        Fixed August 19, 2014. Revised fix August 2018.
89
90        2. system-status: Instead of a floating-point division by 256, use
91        the wait(2) macros to create a reasonable exit status.
92        Fixed March 12, 2016.
93
        3. space: Use provided xisblank() function instead of isspace() for
        matching [[:blank:]].
96
97        4. a-format: Add POSIX standard %a and %A to supported formats. Check
98        at runtime that this format is available.
99
100        5. decr-NF: Decrementing NF did not change $0. This is a decades-old
101        bug. There are interactions with the old and new value of OFS as well.
102        Most of the fix came from the NetBSD awk.
103
104        6. string-conv: String conversions of scalars were sticky.  Once a
105        conversion to string happened, even with OFMT, that value was used until
106        a new numeric value was assigned, even if OFMT differed from CONVFMT,
107        and also if CONVFMT changed.
108
109        7. unary-plus: Unary plus on a string constant returned the string.
110        Instead, it should convert the value to numeric and give that value.
111
112	Also added Arnold's tests for these to awktest.tar as T.arnold.
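
	As an illustration of item 2 (system-status), here is a minimal
	sketch of decoding a status with the wait(2) macros.  The helper
	name and the 256 + signal convention are assumptions made for this
	example; they are not taken from run.c.  Dividing by 256 only
	works for normal exits: a child killed by signal 9 has a raw
	status that the division turns into a small fraction.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Hypothetical helper, not the code in run.c: turn a status returned by
 * system(3) or wait(2) into a number a script can test.
 */
static double status_to_value(int status)
{
	assert(status != -1);	/* -1 means the child could not be run or reaped */

	if (WIFEXITED(status))
		return WEXITSTATUS(status);	/* normal exit: the exit code */
	if (WIFSIGNALED(status))
		return 256 + WTERMSIG(status);	/* assumed convention for signals */
	return status;	/* stopped or otherwise unusual: pass it through */
}
```

	For example, status_to_value(system("exit 3")) yields 3.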
113
114Aug 15, 2018:
115	fixed mangled awktest.tar (thanks, Arnold), posted all
116	current (very minor) fixes to github / onetrueawk
117
118Jun 7, 2018:
119	(yes, a long layoff)
120	Updated some broken tests (beebe.tar, T.lilly)
121	[thanks to Arnold Robbins]
122
123Mar 26, 2015:
124	buffer overflow in error reporting; thanks to tobias ulmer
125	and john-mark gurney for spotting it and the fix.
126
127Feb 4, 2013:
128	cleaned up a handful of tests that didn't seem to actually
129	test for correct behavior: T.latin1, T.gawk.
130
131Jan 5, 2013:
132	added ,NULL initializer to static Cells in run.c; not really
133	needed but cleaner.  Thanks to Michael Bombardieri.
134
135Dec 20, 2012:
136	fiddled makefile to get correct yacc and bison flags.  pick yacc
137	(linux) or bison (mac) as necessary.
138
139	added  __attribute__((__noreturn__)) to a couple of lines in
140	proto.h, to silence someone's enthusiastic checker.
141
142	fixed obscure call by value bug in split(a[1],a) reported on
143	9fans.  the management of temporary values is just a mess; i
144	took a shortcut by making an extra string copy.  thanks
145	to paul patience and arnold robbins for passing it on and for
146	proposed patches.
147
148	tiny fiddle in setfval to eliminate -0 results in T.expr, which
149	has irritated me for 20+ years.
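
	a minimal sketch of the -0 problem, assuming a hypothetical
	formatting helper (this is not the code in setfval): negative zero
	compares equal to zero, so storing a plain 0 whenever the value
	compares equal to 0 drops the sign before printing.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a number the way a %.6g conversion would,
 * after replacing negative zero by plain zero.
 */
static void format_value(double f, char *buf, size_t size)
{
	assert(buf != NULL && size > 0);

	if (f == 0.0)	/* true for both 0.0 and -0.0 */
		f = 0.0;	/* store a zero with the sign bit clear */
	snprintf(buf, size, "%.6g", f);
}
```

	without the test, formatting -0.0 with %.6g gives "-0".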
150
151Aug 10, 2011:
152	another fix to avoid core dump with delete(ARGV); again, many thanks
153	to ruslan ermilov.
154
155Aug 7, 2011:
156	split(s, a, //) now behaves the same as split(s, a, "")
157
158Jun 12, 2011:
159	/pat/, \n /pat/ {...} is now legal, though bad style to use.
160
161	added checks to new -v code that permits -vnospace; thanks to
162	ruslan ermilov for spotting this and providing the patch.
163
	removed fixed limit on number of open files; thanks to aleksey
	cheusov and christos zoulas.
166
167	fixed day 1 bug that resurrected deleted elements of ARGV when
168	used as filenames (in lib.c).
169
170	minor type fiddles to make gcc -Wall -pedantic happier (but not
171	totally so); turned on -fno-strict-aliasing in makefile.
172
173May 6, 2011:
174	added #ifdef for isblank.
175	now allows -ffoo as well as -f foo arguments.
176	(thanks, ruslan)
177
178May 1, 2011:
179	after advice from todd miller, kevin lo, ruslan ermilov,
180	and arnold robbins, changed srand() to return the previous
181	seed (which is 1 on the first call of srand).  the seed is
182	an Awkfloat internally though converted to unsigned int to
183	pass to the library srand().  thanks, everyone.
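
	a minimal sketch of the srand() behavior described above.  the
	function and variable names are hypothetical and the real code in
	run.c may differ; the sketch only shows the seed being kept as a
	floating-point value, cast for the library call, and the previous
	value returned.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

static double prev_seed = 1.0;	/* seed in effect before any call */

/* Hypothetical stand-in for awk's srand(): returns the previous seed. */
static double srand_returning_previous(double new_seed)
{
	double old = prev_seed;

	/* keep the cast below well-defined */
	assert(new_seed >= 0.0 && new_seed <= UINT_MAX);

	prev_seed = new_seed;
	srand((unsigned int) new_seed);	/* library wants an unsigned int */
	return old;
}
```

	the first call returns 1; each later call returns the seed given
	to the call before it.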
184
185	fixed a subtle (and i hope low-probability) overflow error
186	in fldbld, by adding space for one extra \0.  thanks to
187	robert bassett for spotting this one and providing a fix.
188
189	removed the files related to compilation on windows.  i no
190	longer have anything like a current windows environment, so
191	i can't test any of it.
192
193May 23, 2010:
194	fixed long-standing overflow bug in run.c; many thanks to
195	nelson beebe for spotting it and providing the fix.
196
197	fixed bug that didn't parse -vd=1 properly; thanks to santiago
198	vila for spotting it.
199
200Feb 8, 2010:
201	i give up.  replaced isblank with isspace in b.c; there are
202	no consistent header files.
203
204Nov 26, 2009:
205	fixed a long-standing issue with when FS takes effect.  a
206	change to FS is now noticed immediately for subsequent splits.
207
208	changed the name getline() to awkgetline() to avoid yet another
209	name conflict somewhere.
210
211Feb 11, 2009:
	temporarily defined HAS_ISBLANK, since that seems to
	be the best way through the thicket.  isblank arrived in C99,
	but seems to be arriving at different systems at different
	times.
216
217Oct 8, 2008:
218	fixed typo in b.c that set tmpvec wrongly.  no one had ever
219	run into the problem, apparently.  thanks to alistair crooks.
220
221Oct 23, 2007:
222	minor fix in lib.c: increase inputFS to 100, change malloc
223	for fields to n+1.
224
225	fixed memory fault caused by out of order test in setsval.
226
227	thanks to david o'brien, freebsd, for both fixes.
228
229May 1, 2007:
230	fiddle in makefile to fix for BSD make; thanks to igor sobrado.
231
232Mar 31, 2007:
233	fixed some null pointer refs calling adjbuf.
234
235Feb 21, 2007:
236	fixed a bug in matching the null RE in sub and gsub.  thanks to al aho
237	who actually did the fix (in b.c), and to wolfgang seeberg for finding
238	it and providing a very compact test case.
239
240	fixed quotation in b.c; thanks to Hal Pratt and the Princeton Dante
241	Project.
242
243	removed some no-effect asserts in run.c.
244
245	fiddled maketab.c to not complain about bison-generated values.
246
247	removed the obsolete -V argument; fixed --version to print the
248	version and exit.
249
250	fixed wording and an outright error in the usage message; thanks to igor
251	sobrado and jason mcintyre.
252
253	fixed a bug in -d that caused core dump if no program followed.
254
255Jan 1, 2007:
256	dropped mac.code from makefile; there are few non-MacOSX
257	mac's these days.
258
259Jan 17, 2006:
260	system() not flagged as unsafe in the unadvertised -safe option.
261	found it while enhancing tests before shipping the ;login: article.
262	practice what you preach.
263
264	removed the 9-years-obsolete -mr and -mf flags.
265
266	added -version and --version options.
267
268	core dump on linux with BEGIN {nextfile}, now fixed.
269
270	removed some #ifdef's in run.c and lex.c that appear to no
271	longer be necessary.
272
273Apr 24, 2005:
274	modified lib.c so that values of $0 et al are preserved in the END
275	block, apparently as required by posix.  thanks to havard eidnes
276	for the report and code.
277
278Jan 14, 2005:
279	fixed infinite loop in parsing, originally found by brian tsang.
280	thanks to arnold robbins for a suggestion that started me
281	rethinking it.
282
283Dec 31, 2004:
284	prevent overflow of -f array in main, head off potential error in
285	call of SYNTAX(), test malloc return in lib.c, all with thanks to
286	todd miller.
287
288Dec 22, 2004:
289	cranked up size of NCHARS; coverity thinks it can be overrun with
290	smaller size, and i think that's right.  added some assertions to b.c
291	to catch places where it might overrun.  the RE code is still fragile.
292
293Dec 5, 2004:
294	fixed a couple of overflow problems with ridiculous field numbers:
295	e.g., print $(2^32-1).  thanks to ruslan ermilov, giorgos keramidas
296	and david o'brien at freebsd.org for patches.  this really should
297	be re-done from scratch.
298
299Nov 21, 2004:
300	fixed another 25-year-old RE bug, in split.  it's another failure
301	to (re-)initialize.  thanks to steve fisher for spotting this and
302	providing a good test case.
303
304Nov 22, 2003:
305	fixed a bug in regular expressions that dates (so help me) from 1977;
306	it's been there from the beginning.  an anchored longest match that
307	was longer than the number of states triggered a failure to initialize
308	the machine properly.  many thanks to moinak ghosh for not only finding
309	this one but for providing a fix, in some of the most mysterious
310	code known to man.
311
312	fixed a storage leak in call() that appears to have been there since
313	1983 or so -- a function without an explicit return that assigns a
314	string to a parameter leaked a Cell.  thanks to moinak ghosh for
315	spotting this very subtle one.
316
317Jul 31, 2003:
318	fixed, thanks to andrey chernov and ruslan ermilov, a bug in lex.c
319	that mis-handled the character 255 in input.  (it was being compared
320	to EOF with a signed comparison.)
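
	a minimal sketch of this class of bug, with hypothetical function
	names (the real lex.c code is not reproduced here): where plain
	char is signed, byte 255 widens to a negative int, which can equal
	EOF; widening through unsigned char keeps it at 255.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* buggy shape: on signed-char systems byte 255 can compare equal to EOF */
static int looks_like_eof_buggy(char c)
{
	return c == EOF;
}

/* fixed shape: widen through unsigned char, so the result is 0..255 */
static int byte_at(const char *s, size_t i)
{
	assert(s != NULL);
	return (unsigned char) s[i];
}
```

	byte_at("\377", 0) is 255 and never equal to EOF; the result of
	looks_like_eof_buggy() depends on the signedness of char.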
321
322Jul 29, 2003:
323	fixed (i think) the long-standing botch that included the beginning of
324	line state ^ for RE's in the set of valid characters; this led to a
325	variety of odd problems, including failure to properly match certain
326	regular expressions in non-US locales.  thanks to ruslan for keeping
327	at this one.
328
329Jul 28, 2003:
330	n-th try at getting internationalization right, with thanks to volker
331	kiefel, arnold robbins and ruslan ermilov for advice, though they
332	should not be blamed for the outcome.  according to posix, "."  is the
333	radix character in programs and command line arguments regardless of
334	the locale; otherwise, the locale should prevail for input and output
335	of numbers.  so it's intended to work that way.
336
337	i have rescinded the attempt to use strcoll in expanding shorthands in
338	regular expressions (cclenter).  its properties are much too
339	surprising; for example [a-c] matches aAbBc in locale en_US but abBcC
340	in locale fr_CA.  i can see how this might arise by implementation
341	but i cannot explain it to a human user.  (this behavior can be seen
342	in gawk as well; we're leaning on the same library.)
343
344	the issue appears to be that strcoll is meant for sorting, where
345	merging upper and lower case may make sense (though note that unix
346	sort does not do this by default either).  it is not appropriate
347	for regular expressions, where the goal is to match specific
348	patterns of characters.  in any case, the notations [:lower:], etc.,
349	are available in awk, and they are more likely to work correctly in
350	most locales.
351
352	a moratorium is hereby declared on internationalization changes.
353	i apologize to friends and colleagues in other parts of the world.
354	i would truly like to get this "right", but i don't know what
355	that is, and i do not want to keep making changes until it's clear.
356
357Jul 4, 2003:
358	fixed bug that permitted non-terminated RE, as in "awk /x".
359
360Jun 1, 2003:
361	subtle change to split: if source is empty, number of elems
362	is always 0 and the array is not set.
363
364Mar 21, 2003:
365	added some parens to isblank, in another attempt to make things
366	internationally portable.
367
368Mar 14, 2003:
369	the internationalization changes, somewhat modified, are now
370	reinstated.  in theory awk will now do character comparisons
371	and case conversions in national language, but "." will always
372	be the decimal point separator on input and output regardless
373	of national language.  isblank(){} has an #ifndef.
374
375	this no longer compiles on windows: LC_MESSAGES isn't defined
376	in vc6++.
377
378	fixed subtle behavior in field and record splitting: if FS is
379	a single character and RS is not empty, \n is NOT a separator.
380	this tortuous reading is found in the awk book; behavior now
381	matches gawk and mawk.
382
383Dec 13, 2002:
384	for the moment, the internationalization changes of nov 29 are
385	rolled back -- programs like x = 1.2 don't work in some locales,
386	because the parser is expecting x = 1,2.  until i understand this
387	better, this will have to wait.
388
389Nov 29, 2002:
390	modified b.c (with tiny changes in main and run) to support
391	locales, using strcoll and iswhatever tests for posix character
392	classes.  thanks to ruslan ermilov (ru@freebsd.org) for code.
393	the function isblank doesn't seem to have propagated to any
394	header file near me, so it's there explicitly.  not properly
395	tested on non-ascii character sets by me.
396
397Jun 28, 2002:
398	modified run/format() and tran/getsval() to do a slightly better
399	job on using OFMT for output from print and CONVFMT for other
400	number->string conversions, as promised by posix and done by
401	gawk and mawk.  there are still places where it doesn't work
402	right if CONVFMT is changed; by then the STR attribute of the
403	variable has been irrevocably set.  thanks to arnold robbins for
404	code and examples.
405
406	fixed subtle bug in format that could get core dump.  thanks to
407	Jaromir Dolecek <jdolecek@NetBSD.org> for finding and fixing.
408	minor cleanup in run.c / format() at the same time.
409
410	added some tests for null pointers to debugging printf's, which
411	were never intended for external consumption.  thanks to dave
412	kerns (dkerns@lucent.com) for pointing this out.
413
414	GNU compatibility: an empty regexp matches anything (thanks to
415	dag-erling smorgrav, des@ofug.org).  subject to reversion if
416	this does more harm than good.
417
418	pervasive small changes to make things more const-correct, as
419	reported by gcc's -Wwrite-strings.  as it says in the gcc manual,
420	this may be more nuisance than useful.  provoked by a suggestion
421	and code from arnaud desitter, arnaud@nimbus.geog.ox.ac.uk
422
423	minor documentation changes to note that this now compiles out
424	of the box on Mac OS X.
425
426Feb 10, 2002:
427	changed types in posix chars structure to quiet solaris cc.
428
429Jan 1, 2002:
430	fflush() or fflush("") flushes all files and pipes.
431
432	length(arrayname) returns number of elements; thanks to
433	arnold robbins for suggestion.
434
435	added a makefile.win to make it easier to build on windows.
436	based on dan allen's buildwin.bat.
437
438Nov 16, 2001:
439	added support for posix character class names like [:digit:],
440	which are not exactly shorter than [0-9] and perhaps no more
441	portable.  thanks to dag-erling smorgrav for code.
442
443Feb 16, 2001:
444	removed -m option; no longer needed, and it was actually
445	broken (noted thanks to volker kiefel).
446
447Feb 10, 2001:
448	fixed an appalling bug in gettok: any sequence of digits, +,-, E, e,
449	and period was accepted as a valid number if it started with a period.
450	this would never have happened with the lex version.
451
452	other 1-character botches, now fixed, include a bare $ and a
453	bare " at the end of the input.
454
455Feb 7, 2001:
456	more (const char *) casts in b.c and tran.c to silence warnings.
457
458Nov 15, 2000:
459	fixed a bug introduced in august 1997 that caused expressions
460	like $f[1] to be syntax errors.  thanks to arnold robbins for
461	noticing this and providing a fix.
462
463Oct 30, 2000:
464	fixed some nextfile bugs: not handling all cases.  thanks to
465	arnold robbins for pointing this out.  new regressions added.
466
467	close() is now a function.  it returns whatever the library
468	fclose returns, and -1 for closing a file or pipe that wasn't
469	opened.
470
471Sep 24, 2000:
472	permit \n explicitly in character classes; won't work right
473	if comes in as "[\n]" but ok as /[\n]/, because of multiple
474	processing of \'s.  thanks to arnold robbins.
475
476July 5, 2000:
477	minor fiddles in tran.c to keep compilers happy about uschar.
478	thanks to norman wilson.
479
480May 25, 2000:
481	yet another attempt at making 8-bit input work, with another
482	band-aid in b.c (member()), and some (uschar) casts to head
483	off potential errors in subscripts (like isdigit).  also
484	changed HAT to NCHARS-2.  thanks again to santiago vila.
485
486	changed maketab.c to ignore apparently out of range definitions
487	instead of halting; new freeBSD generates one.  thanks to
488	jon snader <jsnader@ix.netcom.com> for pointing out the problem.
489
490May 2, 2000:
491	fixed an 8-bit problem in b.c by making several char*'s into
492	unsigned char*'s.  not clear i have them all yet.  thanks to
493	Santiago Vila <sanvila@unex.es> for the bug report.
494
495Apr 21, 2000:
496	finally found and fixed a memory leak in function call; it's
497	been there since functions were added ~1983.  thanks to
498	jon bentley for the test case that found it.
499
500	added test in envinit to catch environment "variables" with
501	names beginning with '='; thanks to Berend Hasselman.
502
503Jul 28, 1999:
504	added test in defn() to catch function foo(foo), which
505	otherwise recurses until core dump.  thanks to arnold
506	robbins for noticing this.
507
508Jun 20, 1999:
509	added *bp in gettok in lex.c; appears possible to exit function
510	without terminating the string.  thanks to russ cox.
511
512Jun 2, 1999:
513	added function stdinit() to run to initialize files[] array,
514	in case stdin, etc., are not constants; some compilers care.
515
516May 10, 1999:
517	replaced the ERROR ... FATAL, etc., macros with functions
518	based on vprintf, to avoid problems caused by overrunning
519	fixed-size errbuf array.  thanks to ralph corderoy for the
520	impetus, and for pointing out a string termination bug in
521	qstring as well.
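
	a minimal sketch of a vprintf-style reporting function, assuming a
	hypothetical name and message prefix (not the functions actually
	added): the message is formatted straight to stderr, so there is
	no fixed-size buffer to overrun.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical reporting function: prints "awk: " and the formatted
 * message to stderr, and returns the number of characters vfprintf
 * wrote for the message itself.
 */
static int report(const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(fmt != NULL);

	fputs("awk: ", stderr);
	va_start(ap, fmt);
	n = vfprintf(stderr, fmt, ap);	/* no intermediate buffer */
	va_end(ap);
	fputc('\n', stderr);
	return n;
}
```

	a call looks like report("field %d out of range in %s", 7, name).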
522
523Apr 21, 1999:
524	fixed bug that caused occasional core dumps with commandline
525	variable with value ending in \.  (thanks to nelson beebe for
526	the test case.)
527
528Apr 16, 1999:
529	with code kindly provided by Bruce Lilly, awk now parses
530	/=/ and similar constructs more sensibly in more places.
531	Bruce also provided some helpful test cases.
532
533Apr 5, 1999:
534	changed true/false to True/False in run.c to make it
535	easier to compile with C++.  Added some casts on malloc
536	and realloc to be honest about casts; ditto.  changed
537	ltype int to long in struct rrow to reduce some 64-bit
538	complaints; other changes scattered throughout for the
539	same purpose.  thanks to Nelson Beebe for these portability
540	improvements.
541
542	removed some horrible pointer-int casting in b.c and elsewhere
543	by adding ptoi and itonp to localize the casts, which are
544	all benign.  fixed one incipient bug that showed up on sgi
545	in 64-bit mode.
546
547	reset lineno for new source file; include filename in error
548	message.  also fixed line number error in continuation lines.
549	(thanks to Nelson Beebe for both of these.)
550
551Mar 24, 1999:
552	Nelson Beebe notes that irix 5.3 yacc dies with a bogus
553	error; use a newer version or switch to bison, since sgi
554	is unlikely to fix it.
555
556Mar 5, 1999:
557	changed isnumber to is_number to avoid the problem caused by
558	versions of ctype.h that include the name isnumber.
559
560	distribution now includes a script for building on a Mac,
561	thanks to Dan Allen.
562
563Feb 20, 1999:
564	fixed memory leaks in run.c (call) and tran.c (setfval).
565	thanks to Stephen Nutt for finding these and providing the fixes.
566
567Jan 13, 1999:
568	replaced srand argument by (unsigned int) in run.c;
569	avoids problem on Mac and potentially on Unix & Windows.
570	thanks to Dan Allen.
571
572	added a few (int) casts to silence useless compiler warnings.
573	e.g., errorflag= in run.c jump().
574
	added proctab.c to the bundle output; one less thing
	to have to compile out of the box.
577
578	added calls to _popen and _pclose to the win95 stub for
579	pipes (thanks to Steve Adams for this helpful suggestion).
580	seems to work, though properties are not well understood
581	by me, and it appears that under some circumstances the
582	pipe output is truncated.  Be careful.
583
584Oct 19, 1998:
585	fixed a couple of bugs in getrec: could fail to update $0
586	after a getline var; because inputFS wasn't initialized,
587	could split $0 on every character, a misleading diversion.
588
589	fixed caching bug in makedfa: LRU was actually removing
590	least often used.
591
592	thanks to ross ridge for finding these, and for providing
593	great bug reports.
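
	a minimal sketch of the difference behind the makedfa caching bug,
	using a hypothetical slot structure (not the fa structure in b.c):
	least recently used evicts the entry with the oldest last-use
	stamp, while least often used evicts the entry with the fewest
	uses, and the two can disagree.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache slot; field names are made up for this example. */
struct slot {
	unsigned long last_used;	/* clock value at the latest use */
	unsigned long hits;		/* how many times the entry was used */
};

/* index of the least recently used slot */
static size_t lru_victim(const struct slot *s, size_t n)
{
	size_t i, v = 0;

	assert(s != NULL && n > 0);
	for (i = 1; i < n; i++)
		if (s[i].last_used < s[v].last_used)
			v = i;
	return v;
}

/* index of the least often used slot */
static size_t least_often_used_victim(const struct slot *s, size_t n)
{
	size_t i, v = 0;

	assert(s != NULL && n > 0);
	for (i = 1; i < n; i++)
		if (s[i].hits < s[v].hits)
			v = i;
	return v;
}
```

	a slot used many times long ago is the LRU victim, but a slot used
	once just now is the least-often-used victim.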
594
595May 12, 1998:
596	fixed potential bug in readrec: might fail to update record
597	pointer after growing.  thanks to dan levy for spotting this
598	and suggesting the fix.
599
600Mar 12, 1998:
601	added -V to print version number and die.
602
603[notify dave kerns, dkerns@dacsoup.ih.lucent.com]
604
605Feb 11, 1998:
606	subtle silent bug in lex.c: if the program ended with a number
607	longer than 1 digit, part of the input would be pushed back and
608	parsed again because token buffer wasn't terminated right.
609	example:  awk 'length($0) > 10'.  blush.  at least i found it
610	myself.
611
612Aug 31, 1997:
613	s/adelete/awkdelete/: SGI uses this in malloc.h.
614	thanks to nelson beebe for pointing this one out.
615
616Aug 21, 1997:
617	fixed some bugs in sub and gsub when replacement includes \\.
618	this is a dark, horrible corner, but at least now i believe that
619	the behavior is the same as gawk and the intended posix standard.
620	thanks to arnold robbins for advice here.
621
622Aug 9, 1997:
	somewhat regretfully, replaced the ancient lex-based lexical
	analyzer with one written in C.  it's longer, generates less code,
	and is more portable; the old one depended too much on mysterious
	properties of lex that were not preserved in other environments.
	in theory these recognize the same language.
628
629	now using strtod to test whether a string is a number, instead of
630	the convoluted original function.  should be more portable and
631	reliable if strtod is implemented right.
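
	a minimal sketch of a strtod-based test, assuming a hypothetical
	function name and the rule that surrounding blanks are allowed;
	the actual function in the source may accept a different set of
	strings.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical check: does the whole string, apart from surrounding
 * white space, parse as a number according to strtod?
 * Note that strtod also accepts forms such as "inf", "nan" and hex
 * floats, which a real implementation may want to reject.
 */
static int looks_numeric(const char *s)
{
	char *end;

	assert(s != NULL);

	while (isspace((unsigned char) *s))
		s++;
	if (*s == '\0')
		return 0;		/* empty or all blanks */
	(void) strtod(s, &end);
	if (end == s)
		return 0;		/* nothing was converted */
	while (isspace((unsigned char) *end))
		end++;
	return *end == '\0';		/* no trailing junk */
}
```

	so "3.14" and " 1e3 " count as numbers, while "12abc" and "" do
	not.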
632
633	removed now-pointless optimization in makefile that tries to avoid
634	recompilation when awkgram.y is changed but symbols are not.
635
636	removed most fixed-size arrays, though a handful remain, some
637	of which are unchecked.  you have been warned.
638
639Aug 4, 1997:
640	with some trepidation, replaced the ancient code that managed
641	fields and $0 in fixed-size arrays with arrays that grow on
642	demand.  there is still some tension between trying to make this
643	run fast and making it clean; not sure it's right yet.
644
645	the ill-conceived -mr and -mf arguments are now useful only
646	for debugging.  previous dynamic string code removed.
647
648	numerous other minor cleanups along the way.
649
650Jul 30, 1997:
651	using code provided by dan levy (to whom profuse thanks), replaced
652	fixed-size arrays and awkward kludges by a fairly uniform mechanism
653	to grow arrays as needed for printf, sub, gsub, etc.
654
655Jul 23, 1997:
656	falling off the end of a function returns "" and 0, not 0.
657	thanks to arnold robbins.
658
659Jun 17, 1997:
660	replaced several fixed-size arrays by dynamically-created ones
661	in run.c; added overflow tests to some previously unchecked cases.
662	getline, toupper, tolower.
663
664	getline code is still broken in that recursive calls may wind
665	up using the same space.  [fixed later]
666
667	increased RECSIZE to 8192 to push problems further over the horizon.
668
669	added \r to \n as input line separator for programs, not data.
670	damn CRLFs.
671
672	modified format() to permit explicit printf("%c", 0) to include
673	a null byte in output.  thanks to ken stailey for the fix.
674
675	added a "-safe" argument that disables file output (print >,
676	print >>), process creation (cmd|getline, print |, system), and
677	access to the environment (ENVIRON).  this is a first approximation
678	to a "safe" version of awk, but don't rely on it too much.  thanks
679	to joan feigenbaum and matt blaze for the inspiration long ago.
680
681Jul 8, 1996:
682	fixed long-standing bug in sub, gsub(/a/, "\\\\&"); thanks to
683	ralph corderoy.
684
685Jun 29, 1996:
686	fixed awful bug in new field splitting; didn't get all the places
687	where input was done.
688
689Jun 28, 1996:
690	changed field-splitting to conform to posix definition: fields are
691	split using the value of FS at the time of input; it used to be
692	the value when the field or NF was first referred to, a much less
693	predictable definition.  thanks to arnold robbins for encouragement
694	to do the right thing.
695
696May 28, 1996:
697	fixed appalling but apparently unimportant bug in parsing octal
698	numbers in reg exprs.
699
700	explicit hex in reg exprs now limited to 2 chars: \xa, \xaa.
701
702May 27, 1996:
703	cleaned up some declarations so gcc -Wall is now almost silent.
704
705	makefile now includes backup copies of ytab.c and lexyy.c in case
706	one makes before looking; it also avoids recreating lexyy.c unless
707	really needed.
708
	s/aprintf/awkprintf/, s/asprintf/awksprintf/ to avoid some name clashes
	with unwisely-written header files.
711
712	thanks to jeffrey friedl for several of these.
713
714May 26, 1996:
715	an attempt to rationalize the (unsigned) char issue.  almost all
716	instances of unsigned char have been removed; the handful of places
717	in b.c where chars are used as table indices have been hand-crafted.
718	added some latin-1 tests to the regression, but i'm not confident;
719	none of my compilers seem to care much.  thanks to nelson beebe for
720	pointing out some others that do care.
721
722May 2, 1996:
723	removed all register declarations.
724
725	enhanced split(), as in gawk, etc:  split(s, a, "") splits s into
726	a[1]...a[length(s)] with each character a single element.
727
728	made the same changes for field-splitting if FS is "".
729
730	added nextfile, as in gawk: causes immediate advance to next
731	input file. (thanks to arnold robbins for inspiration and code).
732
733	small fixes to regexpr code:  can now handle []], [[], and
734	variants;  [] is now a syntax error, rather than matching
735	everything;  [z-a] is now empty, not z.  far from complete
736	or correct, however.  (thanks to jeffrey friedl for pointing out
737	some awful behaviors.)
738
739Apr 29, 1996:
740	replaced uchar by uschar everywhere; apparently some compilers
741	usurp this name and this causes conflicts.
742
743	fixed call to time in run.c (bltin); arg is time_t *.
744
745	replaced horrible pointer/long punning in b.c by a legitimate
746	union.  should be safer on 64-bit machines and cleaner everywhere.
747	(thanks to nelson beebe for pointing out some of these problems.)
748
749	replaced nested comments by #if 0...#endif in run.c, lib.c.
750
751	removed getsval, setsval, execute macros from run.c and lib.c.
752	machines are 100x faster than they were when these macros were
753	first used.
754
755	revised filenames: awk.g.y => awkgram.y, awk.lx.l => awklex.l,
756	y.tab.[ch] => ytab.[ch], lex.yy.c => lexyy.c, all in the aid of
757	portability to nameless systems.
758
759	"make bundle" now includes yacc and lex output files for recipients
760	who don't have yacc or lex.
761
762Aug 15, 1995:
763	initialized Cells in setsymtab more carefully; some fields
764	were not set.  (thanks to purify, all of whose complaints i
765	think i now understand.)
766
767	fixed at least one error in gsub that looked at -1-th element
768	of an array when substituting for a null match (e.g., $).
769
770	delete arrayname is now legal; it clears the elements but leaves
771	the array, which may not be the right behavior.
772
773	modified makefile: my current make can't cope with the test used
774	to avoid unnecessary yacc invocations.
775
776Jul 17, 1995:
777	added dynamically growing strings to awk.lx.l and b.c
778	to permit regular expressions to be much bigger.
779	the state arrays can still overflow.
780
781Aug 24, 1994:
782	detect duplicate arguments in function definitions (mdm).
783
784May 11, 1994:
785	trivial fix to printf to limit string size in sub().
786
787Apr 22, 1994:
788	fixed yet another subtle self-assignment problem:
789	$1 = $2; $1 = $1 clobbered $1.
790
791	Regression tests now use private echo, to avoid quoting problems.
792
793Feb 2, 1994:
794	changed error() to print line number as %d, not %g.
795
796Jul 23, 1993:
797	cosmetic changes: increased sizes of some arrays,
798	reworded some error messages.
799
800	added CONVFMT as in posix (just replaced OFMT in getsval)
801
802	FILENAME is now "" until the first thing that causes a file
803	to be opened.
804
805Nov 28, 1992:
806	deleted yyunput and yyoutput from proto.h;
807	different versions of lex give these different declarations.
808
809May 31, 1992:
810	added -mr N and -mf N options: more record and fields.
811	these really ought to adjust automatically.
812
813	cleaned up some error messages; "out of space" now means
814	malloc returned NULL in all cases.
815
816	changed rehash so that if it runs out, it just returns;
817	things will continue to run slow, but maybe a bit longer.
818
819Apr 24, 1992:
820	remove redundant close of stdin when using -f -.
821
822	got rid of core dump with -d; awk -d just prints date.
823
824Apr 12, 1992:
825	added explicit check for /dev/std(in,out,err) in redirection.
826	unlike gawk, no /dev/fd/n yet.
827
828	added (file/pipe) builtin.  hard to test satisfactorily.
829	not posix.
830
831Feb 20, 1992:
832	recompile after abortive changes;  should be unchanged.
833
834Dec 2, 1991:
835	die-casting time:  converted to ansi C, installed that.
836
837Nov 30, 1991:
838	fixed storage leak in freefa, failing to recover [N]CCL.
839	thanks to Bill Jones (jones@cs.usask.ca)
840
841Nov 19, 1991:
842	use RAND_MAX instead of literal in builtin().
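
	A sketch of scaling by RAND_MAX rather than a hard-coded
	constant.  unit_random is a hypothetical name, and the exact
	expression used in builtin() is not given above.

```c
#include <stdlib.h>

/* Returns a pseudo-random value in [0, 1).  Dividing by RAND_MAX + 1
 * (computed in double to avoid integer overflow) keeps the result
 * strictly below 1 whatever RAND_MAX is on this system. */
double unit_random(void)
{
	return rand() / ((double)RAND_MAX + 1.0);
}
```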
843
844Nov 12, 1991:
845	cranked up some fixed-size arrays in b.c, and added a test for
846	overflow in penter.  thanks to mark larsen.
847
848Sep 24, 1991:
849	increased buffer in gsub.  a very crude fix to a general problem.
850	and again on Sep 26.
851
852Aug 18, 1991:
853	enforce variable name syntax for commandline variables: has to
854	start with letter or _.
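
	The rule can be sketched as below.  looks_like_assignment is a
	hypothetical name, and allowing letters, digits and _ after the
	first character is an assumption; awk's own check may differ
	in detail.

```c
#include <ctype.h>

/* Returns 1 if arg looks like name=value, where name starts with a
 * letter or underscore and continues with letters, digits, or
 * underscores; returns 0 otherwise. */
int looks_like_assignment(const char *arg)
{
	const unsigned char *s = (const unsigned char *)arg;

	if (!isalpha(*s) && *s != '_')
		return 0;
	for (s++; *s != '\0' && *s != '='; s++)
		if (!isalnum(*s) && *s != '_')
			return 0;
	return *s == '=';
}
```

	With this test, x=1 and _tmp2=abc are treated as assignments,
	while 1x=1, =.1 and file.txt are left to be opened as files.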
855
856Jul 27, 1991:
857	allow newline after ; in for statements.
858
859Jul 21, 1991:
860	fixed so that in self-assignment like $1=$1, side effects
861	like recomputing $0 take place.  (this is getting subtle.)
862
863Jun 30, 1991:
864	better test for detecting too-long output record.
865
866Jun 2, 1991:
867	better defense against very long printf strings.
868	made break and continue illegal outside of loops.
869
870May 13, 1991:
871	removed extra arg on gettemp, tempfree.  minor error message rewording.
872
873May 6, 1991:
874	fixed silly bug in hex parsing in hexstr().
875	removed an apparently unnecessary test in isnumber().
876	warn about weird printf conversions.
877	fixed unchecked array overwrite in relex().
878
879	changed for (i in array) to access elements in sorted order.
880	then unchanged it -- it really does run slower in too many cases.
881	left the code in place, commented out.
882
883Feb 10, 1991:
884	check error status on all writes, to avoid banging on full disks.
885
886Jan 28, 1991:
887	awk -f - reads the program from stdin.
888
889Jan 11, 1991:
890	failed to set numeric state on $0 in cmd|getline context in run.c.
891
892Nov 2, 1990:
893	fixed sleazy test for integrality in getsval;  use modf.
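
	A sketch of an integrality test built on modf.  is_integral is
	a hypothetical name; the surrounding logic in getsval is not
	reproduced here.

```c
#include <math.h>

/* Returns 1 if x has no fractional part, 0 otherwise.
 * modf splits x into integral and fractional parts without going
 * through an integer type, so large magnitudes do not overflow. */
int is_integral(double x)
{
	double ipart;

	return modf(x, &ipart) == 0.0;
}
```

	Comparing x against (long)x is the kind of test this avoids:
	that conversion is undefined once x exceeds the range of long.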
894
895Oct 29, 1990:
896	fixed sleazy buggy code in lib.c that looked (incorrectly) for
897	too long input lines.
898
899Oct 14, 1990:
900	fixed the bug on p. 198 in which it couldn't deduce that an
901	argument was an array in some contexts.  replaced the error
902	message in intest() by code that damn well makes it an array.
903
904Oct 8, 1990:
905	fixed horrible bug:  types and values were not preserved in
906	some kinds of self-assignment. (in assign().)
907
908Aug 24, 1990:
909	changed NCHARS to 256 to handle 8-bit characters in strings
910	presented to match(), etc.
911
912Jun 26, 1990:
913	changed struct rrow (awk.h) to use long instead of int for lval,
914	since cfoll() stores a pointer in it.  now works better when int's
915	are smaller than pointers!
916
917May 6, 1990:
918	AVA fixed the grammar so that ! is uniformly of the same precedence as
919	unary + and -.  This renders illegal some constructs like !x=y, which
920	now has to be parenthesized as !(x=y), and makes others work properly:
921	!x+y is (!x)+y, and x!y is x !y, not two pattern-action statements.
922	(These problems were pointed out by Bob Lenk of Posix.)
923
924	Added \x to regular expressions (already in strings).
925	Limited octal to octal digits; \8 and \9 are not octal.
926	Centralized the code for parsing escapes in regular expressions.
927	Added a bunch of tests to T.re and T.sub to verify some of this.
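
	The octal rule can be sketched as follows.  parse_octal is a
	hypothetical name, and the three-digit limit is the usual C
	convention, assumed here rather than stated above.

```c
/* Parses up to three octal digits starting at *pp and advances *pp
 * past them.  The digits 8 and 9 stop the scan; they are not octal.
 * Returns the value, or -1 (leaving *pp alone) if *pp does not
 * start with an octal digit. */
int parse_octal(const char **pp)
{
	const char *p = *pp;
	int n = 0, digits = 0;

	while (digits < 3 && *p >= '0' && *p <= '7') {
		n = 8 * n + (*p - '0');
		p++;
		digits++;
	}
	if (digits == 0)
		return -1;
	*pp = p;
	return n;
}
```

	The caller passes a pointer to the character just after the
	backslash, so for \101 the function sees 101 and yields 65.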
928
929Feb 9, 1990:
930	fixed null pointer dereference bug in main.c:  -F[nothing].  sigh.
931
932	restored srand behavior:  it returns the current seed.
933
934Jan 18, 1990:
935	srand now returns previous seed value (0 to start).
936
937Jan 5, 1990:
938	fix potential problem in tran.c -- something was freed,
939	then used in freesymtab.
940
941Oct 18, 1989:
942	another try to get the max number of open files set with
943	relatively machine-independent code.
944
945	small fix to input() in case of multiple reads after EOF.
946
947Oct 11, 1989:
948	FILENAME is now defined in the BEGIN block -- too many old
949	programs broke.
950
951	"-" means stdin in getline as well as on the commandline.
952
953	added a bunch of casts to the code to tell the truth about
954	char * vs. unsigned char *, a right royal pain.  added a
955	setlocale call to the front of main, though probably no one
956	has it usefully implemented yet.
957
958Aug 24, 1989:
959	removed redundant relational tests against nullnode if parse
960	tree already had a relational at that point.
961
962Aug 11, 1989:
963	fixed bug:  commandline variable assignment has to look like
964	var=something.  (consider the man page for =, in file =.1)
965
966	changed number of arguments to functions to static arrays
967	to avoid repeated malloc calls.
968
969Aug 2, 1989:
970	restored -F (space) separator
971
972Jul 30, 1989:
973	added -v x=1 y=2 ... for immediate commandline variable assignment;
974	done before the BEGIN block for sure.  they have to precede the
975	program if the program is on the commandline.
976	Modified Aug 2 to require a separate -v for each assignment.
977
978Jul 10, 1989:
979	fixed ref-thru-zero bug in environment code in tran.c
980
981Jun 23, 1989:
982	add newline to usage message.
983
984Jun 14, 1989:
985	added some missing ansi printf conversion letters: %i %X %E %G.
986	no sensible meaning for h or L, so they may not do what one expects.
987
988	made %* conversions work.
989
990	changed x^y so that if y is a positive integer, it's done
991	by explicit multiplication, thus achieving maximum accuracy.
992	(this should be done by pow() but it seems not to be locally.)
993	done to x ^= y as well.
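
	A sketch of exponentiation by explicit multiplication.  The
	name x_pow_n and the square-and-multiply loop are assumptions;
	the entry above says only that multiplication replaces pow()
	for positive integer exponents.

```c
#include <assert.h>

/* Computes x raised to the non-negative integer power n using only
 * multiplications, one squaring per bit of n. */
double x_pow_n(double x, int n)
{
	double result = 1.0;

	assert(n >= 0);
	while (n > 0) {
		if (n & 1)	/* low bit set: fold this power in */
			result *= x;
		x *= x;		/* square for the next bit */
		n >>= 1;
	}
	return result;
}
```

	For small integer operands every product is exactly
	representable, so x_pow_n(2.0, 10) is exactly 1024.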
994
995Jun 4, 1989:
996	ENVIRON array contains environment: if shell variable V=thing,
997		ENVIRON["V"] is "thing"
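
	How an environment entry maps to an ENVIRON subscript and
	value can be sketched like this.  split_env_entry is a
	hypothetical helper, not the code in tran.c.

```c
#include <string.h>

/* Splits an environment entry of the form "NAME=value" in place.
 * On success, overwrites '=' with a NUL so that entry reads as the
 * name, and returns a pointer to the value.  Returns NULL and
 * leaves entry unchanged if there is no '='. */
char *split_env_entry(char *entry)
{
	char *eq = strchr(entry, '=');

	if (eq == NULL)
		return NULL;
	*eq = '\0';
	return eq + 1;
}
```

	Splitting at the first = means a value may itself contain =.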
998
999	multiple -f arguments permitted.  error reporting is naive.
1000	(they were permitted before, but only the last was used.)
1001
1002	fixed a really stupid botch in the debugging macro dprintf
1003
1004	fixed order of evaluation of commandline assignments to match
1005	what the book claims:  an argument of the form x=e is evaluated
1006	at the time it would have been opened if it were a filename (p 63).
1007	this invalidates the suggested answer to ex 4-1 (p 195).
1008
1009	removed some code that permitted -F (space) fieldseparator,
1010	since it didn't quite work right anyway.  (restored aug 2)
1011
1012Apr 27, 1989:
1013	Line number now accumulated correctly for comment lines.
1014
1015Apr 26, 1989:
1016	Debugging output now includes a version date,
1017	if one compiles it into the source each time.
1018
1019Apr 9, 1989:
1020	Changed grammar to prohibit constants as 3rd arg of sub and gsub;
1021	prevents class of overwriting-a-constant errors.  (Last one?)
1022	This invalidates the "banana" example on page 43 of the book.
1023
1024	Added \a ("alert"), \v (vertical tab), \xhhh (hexadecimal),
1025	as in ANSI, for strings.  Rescinded the sloppiness that permitted
1026	non-octal digits in \ooo.  Warning:  not all compilers and libraries
1027	will be able to deal with \x correctly.
1028
1029Jan 9, 1989:
1030	Fixed bug that caused tempcell list to contain a duplicate.
1031	The fix is kludgy.
1032
1033Dec 17, 1988:
1034	Catches some more commandline errors in main.
1035	Removed redundant decl of modf in run.c (confuses some compilers).
1036	Warning:  there's no single declaration of malloc, etc., in awk.h
1037	that seems to satisfy all compilers.
1038
1039Dec 7, 1988:
1040	Added a bit of code to error printing to avoid printing nulls.
1041	(Not clear that it actually would.)
1042
1043Nov 27, 1988:
1044	With fear and trembling, modified the grammar to permit
1045	multiple pattern-action statements on one line without
1046	an explicit separator.  By definition, this capitulation
1047	to the ghost of ancient implementations remains undefined
1048	and thus subject to change without notice or apology.
1049	DO NOT COUNT ON IT.
1050
1051Oct 30, 1988:
1052	Fixed bug in call() that failed to recover storage.
1053
1054	A warning is now generated if there are more arguments
1055	in the call than in the definition (in lieu of fixing
1056	another storage leak).
1057
1058Oct 20, 1988:
1059	Fixed %c:  if expr is numeric, use numeric value;
1060	otherwise print 1st char of string value.  still
1061	doesn't work if the value is 0 -- won't print \0.
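
	The selection rule can be sketched as below.  struct value and
	char_for_percent_c are hypothetical; awk's real Cell type
	differs.  The sketch only picks the character and does not
	model the \0 limitation noted above.

```c
/* Hypothetical stand-in for an awk value. */
struct value {
	int is_number;		/* nonzero if the value is numeric */
	double num;		/* numeric value, used when is_number */
	const char *str;	/* string value, used otherwise */
};

/* Picks the character %c would print under the rule above: the
 * character code for a number, else the first char of the string. */
int char_for_percent_c(const struct value *v)
{
	if (v->is_number)
		return (int)v->num;
	return (unsigned char)v->str[0];
}
```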
1062
1063	Added a few more checks for running out of malloc.
1064
1065Oct 12, 1988:
1066	Fixed bug in call() that freed local arrays twice.
1067
1068	Fixed to handle deletion of non-existent array right;
1069	complains about attempt to delete non-array element.
1070
1071Sep 30, 1988:
1072	Now guarantees to evaluate all arguments of built-in
1073	functions, as in C;  the appearance is that arguments
1074	are evaluated before the function is called.  Places
1075	affected are sub (gsub was ok), substr, printf, and
1076	all the built-in arithmetic functions in bltin().
1077	A warning is generated if a bltin() is called with
1078	the wrong number of arguments.
1079
1080	This requires changing makeprof on p167 of the book.
1081
1082Aug 23, 1988:
1083	setting FILENAME in BEGIN caused core dump, apparently
1084	because it was freeing space not allocated by malloc.
1085
1086July 24, 1988:
1087	fixed egregious error in toupper/tolower functions.
1088	still subject to rescinding, however.
1089
1090July 2, 1988:
1091	flush stdout before opening file or pipe
1092
1093July 2, 1988:
1094	performance bug in b.c/cgoto(): not freeing some sets of states.
1095	partial fix only right now, and the number of states increased
1096	to make it less obvious.
1097
1098June 1, 1988:
1099	check error status on close
1100
1101May 28, 1988:
1102	srand returns seed value it's using.
1103	see 1/18/90
1104
1105May 22, 1988:
1106	Removed limit on depth of function calls.
1107
1108May 10, 1988:
1109	Fixed lib.c to permit _ in commandline variable names.
1110
1111Mar 25, 1988:
1112	main.c fixed to recognize -- as terminator of command-
1113	line options.  Illegal options flagged.
1114	Error reporting slightly cleaned up.
1115
1116Dec 2, 1987:
1117	Newer C compilers apply a strict scope rule to extern
1118	declarations within functions.  Two extern declarations in
1119	lib.c and tran.c have been moved to obviate this problem.
1120
1121Oct xx, 1987:
1122	Reluctantly added toupper and tolower functions.
1123	Subject to rescinding without notice.
1124
1125Sep 17, 1987:
1126	Error-message printer had printf(s) instead of
1127	printf("%s",s);  got core dumps when the message
1128	included a %.
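
	The difference can be sketched with a small helper.
	format_message is a hypothetical name, not awk's error
	printer, and it uses snprintf so the result can be inspected.

```c
#include <assert.h>
#include <stdio.h>

/* Copies msg into out (capacity n) as data.  Because msg is passed
 * as the argument to "%s" rather than as the format itself, any '%'
 * in it is copied literally.  Returns the length snprintf reports. */
int format_message(char *out, size_t n, const char *msg)
{
	assert(out != NULL && msg != NULL);
	/* The unsafe form would be snprintf(out, n, msg), which treats
	 * "%s" or "%d" inside msg as conversions with no arguments. */
	return snprintf(out, n, "%s", msg);
}
```

	Passing a message as the format string is undefined behavior
	whenever the message contains a conversion, which is why the
	original failure showed up as a core dump.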
1129
1130Sep 12, 1987:
1131	Very long printf strings caused core dump;
1132	fixed aprintf, asprintf, format to catch them.
1133	Can still get a core dump in printf itself.
1134
1135
1136