# $NetBSD: varmod-edge.mk,v 1.16 2021/02/23 15:56:30 rillig Exp $
#
# Tests for edge cases in variable modifiers.
#
# These tests demonstrate the current implementation in small examples.
# They may contain surprising behavior.
#
# Each test consists of:
# - INP, the input to the test
# - MOD, the expression for testing the modifier
# - EXP, the expected output

TESTS+= M-paren
INP.M-paren= (parentheses) {braces} (opening closing) ()
MOD.M-paren= ${INP.M-paren:M(*)}
EXP.M-paren= (parentheses) ()

# The first closing brace matches the opening parenthesis.
# The second closing brace actually ends the variable expression.
#
# XXX: This is unexpected but rarely occurs in practice.
TESTS+= M-mixed
INP.M-mixed= (paren-brace} (
MOD.M-mixed= ${INP.M-mixed:M(*}}
EXP.M-mixed= (paren-brace}

# After the :M modifier has parsed the pattern, only the closing brace
# and the colon are unescaped. The other characters are left as-is.
# To actually see this effect, the backslashes in the :M modifier need
# to be doubled since single backslashes would simply be unescaped by
# Str_Match.
#
# XXX: This is unexpected. The opening brace should also be unescaped.
TESTS+= M-unescape
INP.M-unescape= ({}): \(\{\}\)\: \(\{}\):
MOD.M-unescape= ${INP.M-unescape:M\\(\\{\\}\\)\\:}
EXP.M-unescape= \(\{}\):

# When the :M and :N modifiers are parsed, the pattern finishes as soon
# as open_parens + open_braces == closing_parens + closing_braces. This
# means that ( and } form a matching pair.
#
# Nested variable expressions are not parsed as such. Instead, only the
# parentheses and braces are counted. This leads to a parse error since
# the nested expression is not "${:U*)}" but only "${:U*)", which is
# missing the closing brace. The expression is evaluated anyway.
# The final brace in the output comes from the end of M.nest-mix.
#
# XXX: This is unexpected but rarely occurs in practice.
TESTS+= M-nest-mix
INP.M-nest-mix= (parentheses)
MOD.M-nest-mix= ${INP.M-nest-mix:M${:U*)}}
EXP.M-nest-mix= (parentheses)}
# make: Unclosed variable expression, expecting '}' for modifier "U*)" of variable "" with value "*)"

# In contrast to parentheses and braces, the brackets are not counted
# when the :M modifier is parsed since Makefile variables only take the
# ${VAR} or $(VAR) forms, but not $[VAR].
#
# The final ] in the pattern is needed to close the character class.
TESTS+= M-nest-brk
INP.M-nest-brk= [ [[ [[[
MOD.M-nest-brk= ${INP.M-nest-brk:M${:U[[[[[]}}
EXP.M-nest-brk= [

# The pattern in the nested variable has an unclosed character class.
# No error is reported though, and the pattern is closed implicitly.
#
# XXX: It is unexpected that no error is reported.
# See str.c, function Str_Match.
#
# Before 2019-12-02, this test case triggered an out-of-bounds read
# in Str_Match.
TESTS+= M-pat-err
INP.M-pat-err= [ [[ [[[
MOD.M-pat-err= ${INP.M-pat-err:M${:U[[}}
EXP.M-pat-err= [

# The first backslash does not escape the second backslash.
# Therefore, the second backslash escapes the parenthesis.
# This means that the pattern ends there.
# The final } in the output comes from the end of MOD.M-bsbs.
#
# If the first backslash were to escape the second backslash, the first
# closing brace would match the opening parenthesis (see M-mixed), and
# the second closing brace would be needed to close the variable.
# After that, the remaining backslash would escape the parenthesis in
# the pattern, therefore (} would match.
TESTS+= M-bsbs
INP.M-bsbs= (} \( \(}
MOD.M-bsbs= ${INP.M-bsbs:M\\(}}
EXP.M-bsbs= \(}
#EXP.M-bsbs= (} # If the first backslash were to escape ...

# The backslash in \( does not escape the parenthesis, therefore it
# counts for the nesting level and matches with the first closing brace.
# The second closing brace closes the variable, and the third is copied
# literally.
#
# The second :M in the pattern is nested between ( and }, therefore it
# does not start a new modifier.
TESTS+= M-bs1-par
INP.M-bs1-par= ( (:M (:M} \( \(:M \(:M}
MOD.M-bs1-par= ${INP.M-bs1-par:M\(:M*}}}
EXP.M-bs1-par= (:M}}

# The double backslash is passed verbatim to the pattern matcher.
# The Str_Match pattern is \\(:M*}, and there the backslash is unescaped.
# Again, the ( takes place in the nesting level, and there is no way to
# prevent this, no matter how many backslashes are used.
TESTS+= M-bs2-par
INP.M-bs2-par= ( (:M (:M} \( \(:M \(:M}
MOD.M-bs2-par= ${INP.M-bs2-par:M\\(:M*}}}
EXP.M-bs2-par= \(:M}}

# Str_Match uses a recursive algorithm for matching the * patterns.
# Make sure that it survives patterns with 128 asterisks.
# That should be enough for all practical purposes.
# To produce a stack overflow, just add more :Qs below.
TESTS+= M-128
INP.M-128= ${:U\\:Q:Q:Q:Q:Q:Q:Q:S,\\,x,g}
PAT.M-128= ${:U\\:Q:Q:Q:Q:Q:Q:Q:S,\\,*,g}
MOD.M-128= ${INP.M-128:M${PAT.M-128}}
EXP.M-128= ${INP.M-128}

# This is the normal SysV substitution. Nothing surprising here.
TESTS+= eq-ext
INP.eq-ext= file.c file.cc
MOD.eq-ext= ${INP.eq-ext:%.c=%.o}
EXP.eq-ext= file.o file.cc

# The SysV := modifier is greedy and consumes all the modifier text
# up until the closing brace or parenthesis. The :Q may look like a
# modifier, but it really isn't, that's why it appears in the output.
TESTS+= eq-q
INP.eq-q= file.c file.cc
MOD.eq-q= ${INP.eq-q:%.c=%.o:Q}
EXP.eq-q= file.o:Q file.cc

# The = in the := modifier can be escaped.
TESTS+= eq-bs
INP.eq-bs= file.c file.c=%.o
MOD.eq-bs= ${INP.eq-bs:%.c\=%.o=%.ext}
EXP.eq-bs= file.c file.ext

# Having only an escaped '=' results in a parse error.
# The call to "pattern.lhs = ParseModifierPart" fails.
TESTS+= eq-esc
INP.eq-esc= file.c file...
MOD.eq-esc= ${INP.eq-esc:a\=b}
EXP.eq-esc= # empty
# make: Unfinished modifier for INP.eq-esc ('=' missing)

TESTS+= colon
INP.colon= value
MOD.colon= ${INP.colon:}
EXP.colon= value

TESTS+= colons
INP.colons= value
MOD.colons= ${INP.colons::::}
EXP.colons= # empty

.for test in ${TESTS}
.  if ${MOD.${test}} == ${EXP.${test}}
.    info ok ${test}
.  else
.    warning error in ${test}: expected "${EXP.${test}}", got "${MOD.${test}}"
.  endif
.endfor

# Even in expressions based on an unnamed variable, there may be errors.
# XXX: The error message should mention the variable name of the expression,
# even though that name is empty in this case.
.if ${:Z}
.  error
.else
.  error
.endif

# Even in expressions based on an unnamed variable, there may be errors.
#
# Before var.c 1.842 from 2021-02-23, the error message did not surround the
# variable name with quotes, leading to the rather confusing "Unfinished
# modifier for  (',' missing)", having two spaces in a row.
#
# XXX: The error message should report the filename:lineno.
.if ${:S,}
.  error
.else
.  error
.endif

all:
	@echo ok