xdrgen - Linux Kernel XDR code generator

Introduction
------------

SunRPC programs are typically specified using a language defined by
RFC 4506. In fact, all IETF-published NFS specifications provide a
description of the specified protocol using this language.

Since the 1990s, user space consumers of SunRPC have had access to
a tool that could read such XDR specifications and then generate C
code that implements the RPC portions of that protocol. This tool is
called rpcgen.

This RPC-level code handles input directly from the network, and
thus a high degree of memory safety and sanity checking is needed
to help ensure proper levels of security. Bugs in this code can
have significant impact on security and performance.

However, this code is repetitive and tedious to write by hand.

The C code generated by rpcgen makes extensive use of the facilities
of the user space TI-RPC library and libc. Furthermore, the dialect
of the generated code is very traditional K&R C.

The Linux kernel's implementations of SunRPC-based protocols
hand-roll their XDR code. There are two main reasons for this:

1. libtirpc (and its predecessors) operate only in user space. The
   kernel's RPC implementation and its API are significantly
   different from those of libtirpc.

2. rpcgen-generated code is believed to be less efficient than code
   that is hand-written.

These days, gcc and its kin are capable of optimizing code better
than human authors. There are only a few instances where writing
XDR code by hand will make a measurable performance difference.

In addition, the current hand-written code in the Linux kernel is
difficult to audit, and it is difficult to prove that it implements
exactly what is in the protocol specification.

In order to accrue the benefits of machine-generated XDR code in the
kernel, a tool is needed that will output C code that works against
the kernel's SunRPC implementation rather than libtirpc.

Enter xdrgen.


Dependencies
------------

These dependencies are typically packaged by Linux distributions:

- python3
- python3-lark
- python3-jinja2

These dependencies are available via PyPI:

- pip install 'lark[interegular]'


XDR Specifications
------------------

When adding a new protocol implementation to the kernel, the XDR
specification can be derived by feeding a .txt copy of the RFC to
the script located in tools/net/sunrpc/extract.sh.

   $ extract.sh < rfc0001.txt > new2.x


Operation
---------

Once a .x file is available, use xdrgen to generate source and
header files containing an implementation of XDR encoding and
decoding functions for the specified protocol.

   $ ./xdrgen definitions new2.x > include/linux/sunrpc/xdrgen/new2.h
   $ ./xdrgen declarations new2.x > new2xdr_gen.h

and

   $ ./xdrgen source new2.x > new2xdr_gen.c

The files are ready to use for a server-side protocol implementation,
or may be used as a guide for implementing these routines by hand.

By default, the only comments added to this code are kdoc comments
that appear directly in front of the public per-procedure APIs. For
deeper introspection, specifying the "--annotate" flag will insert
additional comments in the generated code to help readers match the
generated code to specific parts of the XDR specification.

Because the generated code is targeted for the Linux kernel, it
is tagged with a GPLv2-only license.

The xdrgen tool can also provide lexical and syntax checking of
an XDR specification:

   $ ./xdrgen lint xdr/new.x


How It Works
------------

xdrgen does not use machine learning to generate source code. The
translation is entirely deterministic.

RFC 4506 Section 6 contains a BNF grammar of the XDR specification
language. The grammar has been adapted for use by the Python Lark
module.

The xdr.ebnf file in this directory contains the grammar used to
parse XDR specifications. xdrgen configures Lark using the grammar
in xdr.ebnf. Lark parses the target XDR specification using this
grammar, creating a parse tree.

xdrgen then transforms the parse tree into an abstract syntax tree.
This tree is passed to a series of code generators.

The generators are implemented as Python classes residing in the
generators/ directory. Each generator emits code created from Jinja2
templates stored in the templates/ directory.

Source code for XDR items is generated in the same order in which
the items appear in the specification, to ensure the generated code
compiles. This conforms with the behavior of rpcgen.

xdrgen assumes that the generated source code is further compiled by
a compiler that can optimize in a number of ways, including:

 - Unused functions are discarded (i.e., not added to the executable)

 - Aggressive function inlining removes unnecessary stack frames

 - Single-arm switch statements are replaced by a single conditional
   branch

And so on.


Pragmas
-------

Pragma directives specify exceptions to the normal generation of
encoding and decoding functions. Currently four directives are
implemented: "big_endian", "exclude", "header", and "public".

Pragma big_endian
------ ----------

  pragma big_endian <enum> ;

For variables that might contain only a small number of values, it
is more efficient to avoid the byte-swap when encoding or decoding
on little-endian machines. Such is often the case with error status
codes. For example:

  pragma big_endian nfsstat3;

In this case, when generating an XDR struct or union containing a
field of type "nfsstat3", xdrgen will make the type of that field
"__be32" instead of "enum nfsstat3". XDR unions then switch on the
non-byte-swapped value of that field.
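
The idea can be sketched in user-space C. This is an illustration
only, not xdrgen output: demo_be32 stands in for the kernel's __be32,
DEMO_CPU_TO_BE32() is a hypothetical compile-time stand-in for the
kernel's cpu_to_be32(), the enum lists only three nfsstat3 values,
and the __BYTE_ORDER__ test assumes gcc or clang.

```c
#include <stdint.h>
#include <string.h>

/* Small subset of nfsstat3 (RFC 1813), for illustration only. */
enum nfsstat3 {
	NFS3_OK = 0,
	NFS3ERR_NOENT = 2,
	NFS3ERR_STALE = 70,
};

/* Assumption: user-space stand-in for the kernel's __be32. */
typedef uint32_t demo_be32;

/* Hypothetical helper: convert a constant to big-endian at compile
 * time, so the result can be used as a case label. */
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define DEMO_CPU_TO_BE32(x) \
	((demo_be32)((((uint32_t)(x) & 0x000000ffu) << 24) | \
		     (((uint32_t)(x) & 0x0000ff00u) << 8) | \
		     (((uint32_t)(x) & 0x00ff0000u) >> 8) | \
		     (((uint32_t)(x) & 0xff000000u) >> 24)))
#else
#define DEMO_CPU_TO_BE32(x) ((demo_be32)(x))
#endif

/* Copy a status word off the wire without byte-swapping it. */
static demo_be32 demo_peek_status(const unsigned char *wire)
{
	demo_be32 status;

	memcpy(&status, wire, sizeof(status));
	return status;
}

/* Switch on the wire-order value. The byte swaps are folded into
 * the case labels, so none is performed at run time. */
static const char *demo_status_name(demo_be32 status)
{
	switch (status) {
	case DEMO_CPU_TO_BE32(NFS3_OK):
		return "NFS3_OK";
	case DEMO_CPU_TO_BE32(NFS3ERR_NOENT):
		return "NFS3ERR_NOENT";
	case DEMO_CPU_TO_BE32(NFS3ERR_STALE):
		return "NFS3ERR_STALE";
	default:
		return "unknown";
	}
}
```

Passing the four wire bytes 00 00 00 46 through demo_peek_status()
and then demo_status_name() selects the NFS3ERR_STALE arm on both
little-endian and big-endian hosts.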

Pragma exclude
------ -------

  pragma exclude <RPC procedure> ;

In some cases, a procedure encoder or decoder function might need
special processing that cannot be automatically generated. The
automatically-generated functions might conflict or interfere with
the hand-rolled function. To avoid editing the generated source code
by hand, a pragma can specify that the procedure's encoder and
decoder functions are not included in the generated header and
source.

For example:

  pragma exclude NFSPROC3_READDIRPLUS;

Excludes the decoder function for the READDIRPLUS argument and the
encoder function for the READDIRPLUS result.

Note that because data item encoder and decoder functions are
defined "static __maybe_unused", subsequent compilation
automatically excludes data item encoder and decoder functions that
are used only by an excluded procedure.
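
A minimal user-space sketch of why this works, not xdrgen output:
the kernel defines __maybe_unused as the compiler's "unused"
attribute, which the sketch reproduces locally. The function names
are hypothetical. demo_decode_cookie() has no caller, as if the only
procedure that used it had been excluded, yet the file still builds
without an unused-function warning, and the compiler is free to drop
the function from the object file.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copy of the kernel macro so the sketch builds in user space. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((__unused__))
#endif

/* Read one big-endian 32-bit word from a wire buffer. */
static uint32_t demo_get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Hypothetical data item decoder with no remaining caller. */
static __maybe_unused bool
demo_decode_cookie(const unsigned char *wire, uint32_t *cookie)
{
	*cookie = demo_get_be32(wire);
	return true;
}

/* Hypothetical data item decoder that is still in use. */
static __maybe_unused bool
demo_decode_count(const unsigned char *wire, uint32_t *count)
{
	*count = demo_get_be32(wire);
	return true;
}

/* Stand-in for a procedure decoder that was not excluded. */
bool demo_decode_read_args(const unsigned char *wire, uint32_t *count)
{
	return demo_decode_count(wire, count);
}
```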

Pragma header
------ ------

  pragma header <string> ;

Provide a name to use for the header file. For example:

  pragma header nlm4;

Adds

  #include "nlm4xdr_gen.h"

to the generated source file.

Pragma public
------ ------

  pragma public <XDR data item> ;

Normally XDR encoder and decoder functions are "static". In case an
implementer wants to call these functions from other source code,
they can add a public pragma in the input .x file to indicate a set
of functions that should get a prototype in the generated header,
and the function definitions will not be declared static.

For example:

  pragma public nfsstat3;

Adds these prototypes in the generated header:

  bool xdrgen_decode_nfsstat3(struct xdr_stream *xdr, enum nfsstat3 *ptr);
  bool xdrgen_encode_nfsstat3(struct xdr_stream *xdr, enum nfsstat3 value);

And, in the generated source code, both of these functions appear
without the "static __maybe_unused" modifiers.
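
To show the shape of such a non-static pair, here is a user-space
sketch under stated assumptions. It is not xdrgen output and does
not use the kernel's struct xdr_stream: struct demo_stream and the
demo_* function names are hypothetical stand-ins, and the enum lists
only three nfsstat3 values. The encoding itself follows RFC 4506,
which represents an enum as a 32-bit big-endian integer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Small subset of nfsstat3 (RFC 1813), for illustration only. */
enum nfsstat3 {
	NFS3_OK = 0,
	NFS3ERR_NOENT = 2,
	NFS3ERR_STALE = 70,
};

/* Hypothetical stand-in for the kernel's struct xdr_stream. */
struct demo_stream {
	unsigned char *buf;
	size_t len;	/* total bytes available in buf */
	size_t pos;	/* offset of the next byte to read or write */
};

/* Prototypes of the kind a generated header would carry. */
bool demo_decode_nfsstat3(struct demo_stream *xdr, enum nfsstat3 *ptr);
bool demo_encode_nfsstat3(struct demo_stream *xdr, enum nfsstat3 value);

/* Not static: other source files may call this function. */
bool demo_encode_nfsstat3(struct demo_stream *xdr, enum nfsstat3 value)
{
	uint32_t v = (uint32_t)value;

	if (xdr->len - xdr->pos < 4)
		return false;	/* no room left in the buffer */
	xdr->buf[xdr->pos++] = (unsigned char)(v >> 24);
	xdr->buf[xdr->pos++] = (unsigned char)(v >> 16);
	xdr->buf[xdr->pos++] = (unsigned char)(v >> 8);
	xdr->buf[xdr->pos++] = (unsigned char)v;
	return true;
}

/* Not static: other source files may call this function. */
bool demo_decode_nfsstat3(struct demo_stream *xdr, enum nfsstat3 *ptr)
{
	const unsigned char *p;

	if (xdr->len - xdr->pos < 4)
		return false;	/* truncated input */
	p = xdr->buf + xdr->pos;
	*ptr = (enum nfsstat3)(((uint32_t)p[0] << 24) |
			       ((uint32_t)p[1] << 16) |
			       ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
	xdr->pos += 4;
	return true;
}
```

Both functions return false rather than touching memory outside the
buffer, which mirrors the bool return type of the prototypes above.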


Future Work
-----------

Finish implementing XDR pointer and list types.

Generate client-side procedure functions

Expand the README into a user guide similar to rpcgen(1)

Add more pragma directives:

  * @pages -- use xdr_read/write_pages() for the specified opaque
    field
  * @skip -- do not decode, but rather skip, the specified argument
    field

Enable something like a #include to dynamically insert the content
of other specification files

Properly support line-by-line pass-through via the "%" decorator

Build a unit test suite for verifying translation of XDR language
into compilable code

Add a command-line option to insert trace_printk call sites in the
generated source code, for improved (temporary) observability

Generate kernel Rust code as well as C code