| a23d4c2f | 13-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: fix syntax error for attribute before init-declarator
A longstanding issue with genksyms is that it has hidden syntax errors.
For example, genksyms fails to parse the following valid code:
int x, __attribute__((__section__(".init.data")))y;
Here, only 'y' is annotated by the attribute, although I am not aware of actual uses of this pattern in the kernel tree.
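As a rough illustration of this declarator pattern, the sketch below assumes a GCC- or Clang-compatible compiler. It swaps __section__ for __aligned__ so the effect can be checked at run time; the names attr_x, attr_y and attr_y_is_aligned are hypothetical and do not come from the kernel tree.

```c
#include <stdint.h>

/*
 * The attribute sits between the comma and the second declarator,
 * so it is meant to apply to attr_y only.
 * Assumption: GCC/Clang attribute syntax; __aligned__ stands in for
 * the __section__ attribute used in the commit message.
 */
int attr_x, __attribute__((__aligned__(64))) attr_y;

/* Returns 1 if attr_y starts on a 64-byte boundary, 0 otherwise. */
int attr_y_is_aligned(void)
{
	return ((uintptr_t)&attr_y % 64) == 0;
}
```

Calling attr_y_is_aligned() from a small test program is one way to confirm that the compiler accepted the attribute in this position.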
When a syntax error occurs, yyerror() is called. However, error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
$ echo 'int x, __attribute__((__section__(".init.data")))y;' | scripts/genksyms/genksyms -w
<stdin>:1: syntax error
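The silent behavior described above can be sketched as follows. This is a minimal stand-in, not the genksyms source: report_error, set_warnings, errors_seen and the two static variables are hypothetical names chosen for illustration.

```c
#include <stdarg.h>
#include <stdio.h>

static int warnings_enabled;	/* hypothetical: set when -w is given */
static int error_count;		/* hypothetical: number of reported errors */

/* Print a diagnostic only when warnings are enabled; otherwise do nothing. */
void report_error(const char *fmt, ...)
{
	va_list args;

	if (!warnings_enabled)
		return;		/* silent no-op without -w */

	va_start(args, fmt);
	vfprintf(stderr, fmt, args);
	va_end(args);
	fputc('\n', stderr);
	error_count++;
}

void set_warnings(int on)
{
	warnings_enabled = on;
}

int errors_seen(void)
{
	return error_count;
}
```

With this shape, a syntax error leaves no trace unless the flag is set, which is why such errors can stay hidden for a long time.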
This commit allows attributes to be placed between a comma and init_declarator.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
| c8258405 | 13-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: fix syntax error for builtin (u)int*x*_t types
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However, error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, genksyms fails to parse the following code in arch/arm64/lib/xor-neon.c:
static inline uint64x2_t eor3(uint64x2_t p, uint64x2_t q, uint64x2_t r)
{
        [ snip ]
}
The syntax error occurs because genksyms does not recognize the uint64x2_t keyword.
This commit adds support for builtin types described in Arm Neon Intrinsics Reference.
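To make the "(u)int*x*_t" naming pattern from the subject line concrete, here is a small sketch of a name matcher. It is an assumption for illustration only: the commit may well recognize these types through a keyword table instead, and looks_like_neon_int_type and skip_digits are hypothetical helpers.

```c
#include <ctype.h>
#include <string.h>

/* Skip a run of decimal digits; return NULL if there is none. */
static const char *skip_digits(const char *s)
{
	const char *p = s;

	while (isdigit((unsigned char)*p))
		p++;
	return p == s ? NULL : p;
}

/* Return 1 if name has the form (u)int<bits>x<lanes>_t, else 0. */
int looks_like_neon_int_type(const char *name)
{
	const char *p = name;

	if (*p == 'u')
		p++;
	if (strncmp(p, "int", 3) != 0)
		return 0;
	p = skip_digits(p + 3);		/* element width, e.g. 64 */
	if (!p || *p != 'x')
		return 0;
	p = skip_digits(p + 1);		/* lane count, e.g. 2 */
	if (!p)
		return 0;
	return strcmp(p, "_t") == 0;
}
```

Under this sketch, uint64x2_t and int8x16_t match, while an ordinary fixed-width name such as uint64_t does not.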
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
| 6494bd2d | 13-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: fix syntax error for attribute after 'union'
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However, error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ fs/lockd/svc.i
$ cat fs/lockd/svc.i | scripts/genksyms/genksyms -w
[ snip ]
./include/net/addrconf.h:35: syntax error
The syntax error occurs in the following code in include/net/addrconf.h:
union __packed {
        [ snip ]
};
The issue arises from __packed, which is defined as __attribute__((__packed__)), immediately after the 'union' keyword.
This commit allows the 'union' keyword to be followed by attributes.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
| 82db1c29 | 13-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: fix syntax error for attribute after 'struct'
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However, error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ arch/x86/kernel/cpu/mshyperv.i
$ cat arch/x86/kernel/cpu/mshyperv.i | scripts/genksyms/genksyms -w
[ snip ]
./arch/x86/include/asm/svm.h:122: syntax error
The syntax error occurs in the following code in arch/x86/include/asm/svm.h:
struct __attribute__ ((__packed__)) vmcb_control_area {
        [ snip ]
};
The issue arises from __attribute__ immediately after the 'struct' keyword.
This commit allows the 'struct' keyword to be followed by attributes.
The lexer must be adjusted because dont_want_brace_phase should not be decremented while processing attributes.
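For readers less familiar with an attribute placed directly after the 'struct' keyword, the following sketch shows the construct and its effect on layout. It assumes GCC/Clang attribute syntax; packed_pair, plain_pair and the two size helpers are hypothetical names, unrelated to vmcb_control_area.

```c
#include <stddef.h>

/* Attribute placed directly after the 'struct' keyword. */
struct __attribute__((__packed__)) packed_pair {
	char tag;
	int value;
};

/* Same members without the attribute, for comparison. */
struct plain_pair {
	char tag;
	int value;
};

size_t packed_pair_size(void)
{
	return sizeof(struct packed_pair);
}

size_t plain_pair_size(void)
{
	return sizeof(struct plain_pair);
}
```

The packed variant has no padding between its members, so its size is the sum of the member sizes, whereas the plain variant is usually larger.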
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
| 2ac068cb | 13-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: fix syntax error for attribute after abstact_declarator
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However, error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ kernel/module/main.i
$ cat kernel/module/main.i | scripts/genksyms/genksyms -w
[ snip ]
kernel/module/main.c:97: syntax error
The syntax error occurs in the following code in kernel/module/main.c:
static void __mod_update_bounds(enum mod_mem_type type __maybe_unused, void *base,
                                unsigned int size, struct mod_tree_root *tree)
{
        [ snip ]
}
The issue arises from __maybe_unused, which is defined as __attribute__((__unused__)).
This commit allows direct_abstract_declarator to be followed by attributes.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
| a8b7d066 | 13-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: fix syntax error for attribute before nested_declarator
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However, error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ drivers/acpi/prmt.i
$ cat drivers/acpi/prmt.i | scripts/genksyms/genksyms -w
[ snip ]
drivers/acpi/prmt.c:56: syntax error
The syntax error occurs in the following code in drivers/acpi/prmt.c:
struct prm_handler_info {
        [ snip ]
        efi_status_t (__efiapi *handler_addr)(u64, void *);
        [ snip ]
};
The issue arises from __efiapi, which is defined as either __attribute__((ms_abi)) or __attribute__((regparm(0))).
This commit allows nested_declarator to be prefixed with attributes.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
| 2966b66c | 13-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: fix syntax error for attribute before abstract_declarator
A longstanding issue with genksyms is that it has hidden syntax errors.
When a syntax error occurs, yyerror() is called. However, error_with_pos() is a no-op unless the -w option is provided.
You can observe syntax errors by manually passing the -w option.
For example, with CONFIG_MODVERSIONS=y on v6.13-rc1:
$ make -s KCFLAGS=-D__GENKSYMS__ init/main.i
$ cat init/main.i | scripts/genksyms/genksyms -w
[ snip ]
./include/linux/efi.h:1225: syntax error
The syntax error occurs in the following code in include/linux/efi.h:
efi_status_t efi_call_acpi_prm_handler(efi_status_t (__efiapi *handler_addr)(u64, void *), u64 param_buffer_addr, void *context);
The issue arises from __efiapi, which is defined as either __attribute__((ms_abi)) or __attribute__((regparm(0))).
This commit allows abstract_declarator to be prefixed with attributes.
To avoid conflicts, I tweaked the rule for decl_specifier_seq. Due to this change, a standalone attribute cannot become decl_specifier_seq. Otherwise, I do not know how to resolve the conflicts.
The following code, which was previously accepted by genksyms, will now result in a syntax error:
void my_func(__attribute__((unused))x);
I do not think it is a big deal because GCC also fails to parse it.
$ echo 'void my_func(__attribute__((unused))x);' | gcc -c -x c -
<stdin>:1:37: error: unknown type name 'x'
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
| ec28bfff | 13-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: decouple ATTRIBUTE_PHRASE from type-qualifier
The __attribute__ keyword can appear in more contexts than 'const' or 'volatile'.
To avoid grammatical conflicts with future changes, ATTRIBUTE_PHRASE should not be reduced into type_qualifier.
No functional changes are intended.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
| ccc11a19 | 13-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: record attributes consistently for init-declarator
I believe the missing action here is a bug.
For rules with no explicit action, the following default is used:
{ $$ = $1; }
However, in this case, $1 is the value of attribute_opt itself. As a result, the value of attribute_opt is always NULL.
The following test code demonstrates inconsistent behavior.
int x __attribute__((__aligned__(4)));
int y __attribute__((__aligned__(4))) = 0;
The attribute is recorded only when followed by an initializer.
This commit adds the correct action to propagate the value of the ATTRIBUTE_PHRASE token.
With this change, the attribute in the example above is consistently recorded for both 'x' and 'y'.
[Before]
$ cat <<EOF | scripts/genksyms/genksyms -d
int x __attribute__((__aligned__(4)));
int y __attribute__((__aligned__(4))) = 0;
EOF
Defn for type0 x == <int x >
Defn for type0 y == <int y __attribute__ ( ( __aligned__ ( 4 ) ) ) >
Hash table occupancy 2/4096 = 0.000488281
[After]
$ cat <<EOF | scripts/genksyms/genksyms -d
int x __attribute__((__aligned__(4)));
int y __attribute__((__aligned__(4))) = 0;
EOF
Defn for type0 x == <int x __attribute__ ( ( __aligned__ ( 4 ) ) ) >
Defn for type0 y == <int y __attribute__ ( ( __aligned__ ( 4 ) ) ) >
Hash table occupancy 2/4096 = 0.000488281
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
| aa710cee | 13-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: restrict direct-declarator to take one parameter-type-list
Similar to the previous commit, this change makes the parser logic a little more accurate.
Currently, genksyms accepts the following invalid code:
struct foo {
        int (*callback)(int)(int)(int);
};
A direct-declarator should not recursively absorb multiple ( parameter-type-list ) constructs.
In the example above, (*callback) should be followed by at most one (int).
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
| c2f1846b | 13-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: restrict direct-abstract-declarator to take one parameter-type-list
While there is no more grammatical ambiguity in genksyms, the parser logic is still inaccurate.
For example, genksyms accepts the following invalid C code:
void my_func(int ()(int));
This should result in a syntax error because () cannot be reduced to <direct-abstract-declarator>.
( <abstract-declarator> ) can be reduced, but <abstract-declarator> must not be empty in the following grammar from K&R [1]:
<direct-abstract-declarator> ::= ( <abstract-declarator> )
                               | {<direct-abstract-declarator>}? [ {<constant-expression>}? ]
                               | {<direct-abstract-declarator>}? ( {<parameter-type-list>}? )
Furthermore, genksyms accepts the following weird code:
void my_func(int (*callback)(int)(int)(int));
The parser allows <direct-abstract-declarator> to recursively absorb multiple ( {<parameter-type-list>}? ), but this behavior is incorrect.
In the example above, (*callback) should be followed by at most one (int).
[1]: https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
| a9529865 | 13-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: remove Makefile hack
This workaround was introduced for suppressing the reduce/reduce conflict warnings because the %expect-rr directive, which is applicable only to GLR parsers, cannot be used for genksyms.
Since there are no longer any conflicts, this Makefile hack is now unnecessary.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
| 668de2b9 | 13-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: fix last 3 shift/reduce conflicts
The genksyms parser has ambiguities in its grammar, which are currently suppressed by a workaround in scripts/genksyms/Makefile.
Building genksyms with W=1 generates the following warnings:
YACC    scripts/genksyms/parse.tab.[ch]
scripts/genksyms/parse.y: warning: 3 shift/reduce conflicts [-Wconflicts-sr]
scripts/genksyms/parse.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples
The ambiguity arises when decl_specifier_seq is followed by '(' because the following two interpretations are possible:
- decl_specifier_seq direct_abstract_declarator '(' parameter_declaration_clause ')'
- decl_specifier_seq '(' abstract_declarator ')'
This issue occurs because the current parser allows an empty string to be reduced to direct_abstract_declarator, which is incorrect.
K&R [1] explains the correct grammar:
<parameter-declaration> ::= {<declaration-specifier>}+ <declarator>
                          | {<declaration-specifier>}+ <abstract-declarator>
                          | {<declaration-specifier>}+

<abstract-declarator> ::= <pointer>
                        | <pointer> <direct-abstract-declarator>
                        | <direct-abstract-declarator>

<direct-abstract-declarator> ::= ( <abstract-declarator> )
                               | {<direct-abstract-declarator>}? [ {<constant-expression>}? ]
                               | {<direct-abstract-declarator>}? ( {<parameter-type-list>}? )
This commit resolves all remaining conflicts.
We need to consider the difference between the following two examples:
[Example 1] ( <abstract-declarator> ) can become <direct-abstract-declarator>
void my_func(int (foo));
... is equivalent to:
void my_func(int foo);
[Example 2] ( <parameter-type-list> ) can become <direct-abstract-declarator>
typedef int foo;
void my_func(int (foo));
... is equivalent to:
void my_func(int (*callback)(int));
Please note that the function declaration is identical in both examples, but the preceding typedef creates the distinction. I introduced a new term, open_paren, to enable the type lookup immediately after the '(' token. Without this, we cannot distinguish between [Example 1] and [Example 2].
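The two examples can be checked with an ordinary C compiler, because C accepts repeated declarations of a function only when their types are compatible. The sketch below relies on that rule; twice, apply, add_one and bar are hypothetical names introduced for illustration, and the behavior shown is standard C rather than anything specific to genksyms.

```c
/*
 * Example 1: 'bar' is not a typedef name here, so (bar) is just a
 * parenthesized parameter name. Both lines declare the same function.
 */
int twice(int (bar));
int twice(int bar)
{
	return 2 * bar;
}

/*
 * Example 2: once 'foo' is a typedef name, (foo) is read as a
 * parameter-type-list, so the parameter is a function taking a foo
 * and returning int, which is adjusted to a function pointer.
 */
typedef int foo;
int apply(int (foo));
int apply(int (*callback)(int))
{
	return callback(20);
}

/* Helper with the signature expected by apply(). */
static int add_one(int v)
{
	return v + 1;
}
```

If the compiler did not treat the two declarations of apply() as compatible, the translation unit would fail to build, so a successful build already demonstrates the distinction.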
[1]: https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
| 3ccda63a | 13-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: fix 6 shift/reduce conflicts and 5 reduce/reduce conflicts
The genksyms parser has ambiguities in its grammar, which are currently suppressed by a workaround in scripts/genksyms/Makefile.
Building genksyms with W=1 generates the following warnings:
YACC    scripts/genksyms/parse.tab.[ch]
scripts/genksyms/parse.y: warning: 9 shift/reduce conflicts [-Wconflicts-sr]
scripts/genksyms/parse.y: warning: 5 reduce/reduce conflicts [-Wconflicts-rr]
scripts/genksyms/parse.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples
The comment in the parser describes the current problem:
/* This wasn't really a typedef name but an identifier that shadows one. */
Consider the following simple C code:
typedef int foo;
void my_func(foo foo) {}
In the function parameter list (foo foo), the first 'foo' is a type specifier (typedef'ed as 'int'), while the second 'foo' is an identifier.
However, the lexer cannot distinguish between the two. Since 'foo' is already typedef'ed, the lexer returns TYPE for both instances, instead of returning IDENT for the second one.
To support shadowed identifiers, TYPE can be reduced to either a simple_type_specifier or a direct_abstract_declarator, which creates a grammatical ambiguity.
Without analyzing the grammar context, it is very difficult to resolve this correctly.
This commit introduces a flag, dont_want_type_specifier, which allows the parser to inform the lexer whether an identifier is expected. When dont_want_type_specifier is true, the type lookup is suppressed, and the lexer returns IDENT regardless of any preceding typedef.
After this commit, only 3 shift/reduce conflicts will remain.
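The role of the flag can be sketched as a tiny token classifier. This is only an illustration of the idea under stated assumptions: the fixed typedef table, the token names TOK_IDENT and TOK_TYPE, and the functions is_typedef_name and classify_word are hypothetical, and the real lexer tracks typedefs while parsing.

```c
#include <string.h>

enum token { TOK_IDENT, TOK_TYPE };

/* Hypothetical typedef table standing in for the symbol lookup. */
static const char *const known_typedefs[] = { "foo", "u64" };

static int is_typedef_name(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(known_typedefs) / sizeof(known_typedefs[0]); i++) {
		if (strcmp(known_typedefs[i], name) == 0)
			return 1;
	}
	return 0;
}

/*
 * Classify a word. When the parser signals that a type specifier is not
 * wanted, the typedef lookup is skipped and the word is an identifier.
 */
enum token classify_word(const char *name, int dont_want_type_specifier)
{
	if (!dont_want_type_specifier && is_typedef_name(name))
		return TOK_TYPE;
	return TOK_IDENT;
}
```

In the (foo foo) parameter list, the first word would be classified with the flag cleared and the second with the flag set, which yields a type followed by an identifier.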
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
| bc3a812b | 13-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: reduce type_qualifier directly to decl_specifier
A type_qualifier (const, volatile, etc.) is not a type_specifier.
According to K&R [1], a type-qualifier should be directly reduced to a declaration-specifier.
<declaration-specifier> ::= <storage-class-specifier>
                          | <type-specifier>
                          | <type-qualifier>
[1]: https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
| f33bfbd1 | 13-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: rename cvar_qualifier to type_qualifier
I believe "cvar" stands for "Const, Volatile, Attribute, or Restrict".
This is called "type-qualifier" in K&R. [1]
Adopt this more generic naming.
No functional changes are intended.
[1] https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
|
| a56fece7 | 03-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: use uint32_t instead of unsigned long for calculating CRC
Currently, 'unsigned long' is used for intermediate variables when calculating CRCs.
The size of 'long' differs depending on the architecture: it is 32 bits on 32-bit architectures and 64 bits on 64-bit architectures.
The CRC values generated by genksyms represent the compatibility of exported symbols. Therefore, reproducibility is important. In other words, we need to ensure that the output is the same when the kernel source is identical, regardless of whether genksyms is running on a 32-bit or 64-bit build machine.
Fortunately, the output from genksyms is not affected by the build machine's architecture because only the lower 32 bits of the 'unsigned long' variables are used.
To make it even clearer that the CRC calculation is independent of the build machine's architecture, this commit explicitly uses the fixed-width type, uint32_t.
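The reasoning about the lower 32 bits can be illustrated with a small CRC-32 routine written twice, once with uint32_t and once with unsigned long. This is a sketch under assumptions: it uses the common reflected polynomial 0xedb88320 in a bitwise loop, whereas the genksyms implementation is not shown in this message, and crc32_bytes and crc32_bytes_ulong are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC-32 with a fixed-width accumulator. */
uint32_t crc32_bytes(const void *data, size_t len)
{
	const unsigned char *p = data;
	uint32_t crc = 0xffffffffu;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		crc ^= p[i];
		for (bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320u : 0);
	}
	return crc ^ 0xffffffffu;
}

/*
 * Same algorithm with 'unsigned long'. The value never leaves the
 * lower 32 bits, so the result matches on 32-bit and 64-bit hosts.
 */
unsigned long crc32_bytes_ulong(const void *data, size_t len)
{
	const unsigned char *p = data;
	unsigned long crc = 0xffffffffUL;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		crc ^= p[i];
		for (bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ ((crc & 1) ? 0xedb88320UL : 0);
	}
	return (crc ^ 0xffffffffUL) & 0xffffffffUL;
}
```

Both variants produce the same value for the same input; the fixed-width version simply makes that property visible in the types.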
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
| 2480f53f | 03-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: refactor the return points in the for-loop in __add_symbol()
free_list() must be called before returning from this for-loop.
Swap 'break' and the combination of free_list() and 'return'.
This reduces the code and minimizes the risk of introducing memory leaks in future changes.
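The general pattern of leaving a loop with 'break' so that cleanup happens in one place can be sketched as below. This is a generic illustration, not the __add_symbol() code: find_index, release and frees_done are hypothetical, and the scratch buffer merely stands in for a resource that must be freed on every path.

```c
#include <stdlib.h>
#include <string.h>

static int frees_done;	/* hypothetical counter, only for demonstration */

static void release(char *scratch)
{
	free(scratch);
	frees_done++;
}

/* Return the index of key in names, or -1 if it is absent. */
int find_index(const char *const *names, int count, const char *key)
{
	char *scratch = malloc(strlen(key) + 1);
	int found = -1;
	int i;

	if (!scratch)
		return -1;
	strcpy(scratch, key);

	for (i = 0; i < count; i++) {
		if (strcmp(names[i], scratch) != 0)
			continue;	/* skip non-matching entries early */
		found = i;
		break;			/* leave the loop instead of returning */
	}

	release(scratch);		/* single cleanup point for all paths */
	return found;
}
```

Because no path returns from inside the loop, adding another exit condition later cannot skip the cleanup call.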
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
| f034d186 | 03-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: reduce the indentation in the for-loop in __add_symbol()
To improve readability, reduce the indentation as follows:
- Use 'continue' earlier when the symbol does not match
- Flip !sym->is_declared to flatten the if-else chain
No functional changes are intended.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
| be2fa44b | 03-Jan-2025 |
Masahiro Yamada <masahiroy@kernel.org> |
genksyms: fix memory leak when the same symbol is read from *.symref file
When a symbol that is already registered is read again from *.symref file, __add_symbol() removes the previous one from the hash table without freeing it.
[Test Case]
$ cat foo.c
#include <linux/export.h>
void foo(void);
void foo(void) {}
EXPORT_SYMBOL(foo);

$ cat foo.symref
foo void foo ( void )
foo void foo ( void )
When a symbol is removed from the hash table, it must be freed along with its ->name and ->defn members. However, sym->name cannot be freed because it is sometimes shared with node->string, but not always. If sym->name and node->string share the same memory, free(sym->name) could lead to a double-free bug.
To resolve this issue, always assign a strdup'ed string to sym->name.
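The ownership rule behind this fix can be sketched as follows. It is a simplified model, not the genksyms data structure: struct symbol here has only a name, and dup_string (a portable stand-in for strdup), symbol_new and symbol_free are hypothetical helpers.

```c
#include <stdlib.h>
#include <string.h>

struct symbol {
	char *name;	/* always a private copy owned by the symbol */
};

/* Portable stand-in for strdup(): return a heap copy of s, or NULL. */
static char *dup_string(const char *s)
{
	size_t n = strlen(s) + 1;
	char *copy = malloc(n);

	if (copy)
		memcpy(copy, s, n);
	return copy;
}

/* Create a symbol that never shares its name with the caller's buffer. */
struct symbol *symbol_new(const char *name)
{
	struct symbol *sym = malloc(sizeof(*sym));

	if (!sym)
		return NULL;
	sym->name = dup_string(name);
	if (!sym->name) {
		free(sym);
		return NULL;
	}
	return sym;
}

/* Safe to call unconditionally because the name is never shared. */
void symbol_free(struct symbol *sym)
{
	if (!sym)
		return;
	free(sym->name);
	free(sym);
}
```

Since every symbol owns its name, freeing a replaced symbol can neither leak the string nor free memory that another structure still uses.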
Fixes: 64e6c1e12372 ("genksyms: track symbol checksum changes")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|