/****************************************************************
Copyright (C) Lucent Technologies 1997
All Rights Reserved

Permission to use, copy, modify, and distribute this software and
its documentation for any purpose and without fee is hereby
granted, provided that the above copyright notice appear in all
copies and that both that the copyright notice and this
permission notice and warranty disclaimer appear in supporting
documentation, and that the name Lucent Technologies or any of
its entities not be used in advertising or publicity pertaining
to distribution of the software without specific, written prior
permission.

LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
THIS SOFTWARE.
****************************************************************/

This file lists all bug fixes, changes, etc., made since the
second edition of the AWK book was published in September 2023.

Apr 22, 2024:
	Fixed regex engine gototab reallocation issue that was
	introduced during the Nov 24 rewrite. Thanks to Arnold Robbins.
	Fixed a scan bug in split in the case where the separator is
	a single character. Thanks to Oguz Ismail for spotting the issue.

Mar 10, 2024:
	Fixed a use-after-free bug in fnematch due to adjbuf invalidating
	the pointers to buf. Thanks to GitHub user caffe3 for spotting
	the issue and providing a fix, and to Miguel Pineiro Jr.
	for the alternative fix.
	MAX_UTF_BYTES in fnematch has been replaced with awk_mb_cur_max.
	Thanks to Miguel Pineiro Jr.

Jan 22, 2024:
	Restore the ability to compile with g++. Thanks to
	Arnold Robbins.

Dec 24, 2023:
	Fix matchop dereference-after-free problem when the first
	argument is a function call. Thanks to Oguz Ismail Uysal.
	Fix inconsistent handling of --csv and FS set in the
	command line. Thanks to Wilbert van der Poel.
	Casting changes to int for is* functions.

Nov 27, 2023:
	Fix exit status of system on macOS. Update to REGRESS.
	Thanks to Arnold Robbins.
	Fix inconsistent handling of -F and --csv, and loss of csv
	mode when FS is set.

Nov 24, 2023:
	Fix issue #199: gototab improvements to dynamically resize the
	table, qsort and bsearch to improve the lookup speed as the
	table gets larger for multibyte input. Thanks to Arnold Robbins.

Nov 23, 2023:
	Fix Issue #169, related to escape sequences in strings.
	Thanks to GitHub user rajeevvp.
	Fix Issue #147, reported by GitHub user drawkula, and fixed
	by Miguel Pineiro Jr.

Nov 20, 2023:
	Rewrite of fnematch to fix a number of issues, including
	extraneous output, out-of-bounds access, number of bytes
	to push back after a failed match, etc.
	Thanks to Miguel Pineiro Jr.

Nov 15, 2023:
	Man page edit, regression test fixes. Thanks to Arnold Robbins.
	Consolidation of sub and gsub into dosub, removing duplicate
	code. Thanks to Miguel Pineiro Jr.
	gcc replaced with cc everywhere.

Oct 30, 2023:
	Multiple fixes and a minor code cleanup.
	Disabled UTF-8 for non-multibyte locales, such as C or POSIX.
	Fixed a bad char * cast that causes incorrect results on big-endian
	systems. Also fixed an out-of-bounds read for empty CCL.
	Fixed a buffer overflow in substr with UTF-8 strings.
	Many thanks to Todd C Miller.

Sep 24, 2023:
	fnematch and getrune have been overhauled to solve issues around
	Unicode FS and RS. Also fixed gsub null match issue with Unicode.
	Big thanks to Arnold Robbins.

Sep 12, 2023:
	Fixed a length error in u8_byte2char that set RSTART to
	an incorrect (cannot happen) value for EOL match(str, /$/).


-----------------------------------------------------------------

[This entry is a summary, not a precise list of changes.]

	Added --csv option to enable processing of comma-separated
	values input. When --csv is enabled, fields are separated
	by commas, fields may be quoted with " double quotes, and
	fields may contain embedded newlines.

	If no explicit separator argument is provided, split() uses
	the setting of --csv to determine how fields are split.

	Strings may now contain UTF-8 code points (not necessarily
	characters). Functions that operate on characters, like
	length, substr, index, match, etc., use UTF-8, so the length
	of a string of 3 emojis is 3, not 12 as it would be if bytes
	were counted.

	Regular expressions are processed as UTF-8.

	Unicode literals can be written as \u followed by one
	to eight hexadecimal digits. These may appear in strings and
	regular expressions.
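
The difference between counting bytes and counting UTF-8 code points,
as described in the summary above, can be illustrated with a small C
sketch. This is not the code used in awk itself; the function name
count_codepoints is hypothetical, and the sketch assumes its input is
valid UTF-8. It relies on the fact that every byte of an encoded code
point except the first has the bit pattern 10xxxxxx.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only (not taken from the awk
 * source): count the code points in a NUL-terminated string that is
 * assumed to be valid UTF-8.  Continuation bytes (10xxxxxx) are
 * skipped, so each code point is counted once, at its first byte.
 */
size_t count_codepoints(const char *s)
{
	size_t n = 0;

	for (; *s != '\0'; s++)
		if (((unsigned char)*s & 0xC0) != 0x80)
			n++;
	return n;
}
```

For a string holding three emoji such as U+1F600, U+1F601 and U+1F602,
each encoded in four bytes, strlen reports 12 while count_codepoints
reports 3, which matches the length example in the summary.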