xdrgen - Linux Kernel XDR code generator

Introduction
------------

SunRPC programs are typically specified using a language defined by
RFC 4506. In fact, all IETF-published NFS specifications provide a
description of the specified protocol using this language.

Since the 1990s, user space consumers of SunRPC have had access to
a tool that could read such XDR specifications and then generate C
code that implements the RPC portions of that protocol. This tool is
called rpcgen.

This RPC-level code handles input directly from the network, and
thus a high degree of memory safety and sanity checking is needed
to help ensure proper levels of security. Bugs in this code can
have significant impact on security and performance.

However, it is code that is repetitive and tedious to write by hand.

The C code generated by rpcgen makes extensive use of the facilities
of the user space TI-RPC library and libc. Furthermore, the dialect
of the generated code is very traditional K&R C.

The Linux kernel's implementations of SunRPC-based protocols hand-roll
their XDR code. There are two main reasons for this:

1. libtirpc (and its predecessors) operate only in user space. The
   kernel's RPC implementation and its API are significantly
   different from libtirpc.

2. rpcgen-generated code is believed to be less efficient than code
   that is hand-written.

These days, gcc and its kin are capable of optimizing code better
than human authors. There are only a few instances where writing
XDR code by hand will make a measurable performance difference.

In addition, the current hand-written code in the Linux kernel is
difficult to audit, and it is hard to prove that it implements
exactly what is in the protocol specification.

In order to accrue the benefits of machine-generated XDR code in the
kernel, a tool is needed that will output C code that works against
the kernel's SunRPC implementation rather than libtirpc.

Enter xdrgen.


Dependencies
------------

These dependencies are typically packaged by Linux distributions:

- python3
- python3-lark
- python3-jinja2

These dependencies are available via PyPI:

- pip install 'lark[interegular]'


XDR Specifications
------------------

When adding a new protocol implementation to the kernel, the XDR
specification can be derived by feeding a .txt copy of the RFC to
the script located in tools/net/sunrpc/extract.sh.

        $ extract.sh < rfc0001.txt > new2.x


Operation
---------

Once a .x file is available, use xdrgen to generate source and
header files containing an implementation of XDR encoding and
decoding functions for the specified protocol.

        $ ./xdrgen definitions new2.x > include/linux/sunrpc/xdrgen/new2.h
        $ ./xdrgen declarations new2.x > new2xdr_gen.h

and

        $ ./xdrgen source new2.x > new2xdr_gen.c

The files are ready to use for a server-side protocol implementation,
or may be used as a guide for implementing these routines by hand.

By default, the only comments added to this code are kdoc comments
that appear directly in front of the public per-procedure APIs. For
deeper introspection, specifying the "--annotate" flag will insert
additional comments in the generated code to help readers match the
generated code to specific parts of the XDR specification.

Because the generated code is targeted for the Linux kernel, it
is tagged with a GPLv2-only license.

The xdrgen tool can also provide lexical and syntax checking of
an XDR specification:

        $ ./xdrgen lint xdr/new.x


How It Works
------------

xdrgen does not use machine learning to generate source code. The
translation is entirely deterministic.

RFC 4506 Section 6 contains a BNF grammar of the XDR specification
language. The grammar has been adapted for use by the Python Lark
module.

The xdr.ebnf file in this directory contains the grammar used to
parse XDR specifications. xdrgen configures Lark using the grammar
in xdr.ebnf. Lark parses the target XDR specification using this
grammar, creating a parse tree.

xdrgen then transforms the parse tree into an abstract syntax tree.
This tree is passed to a series of code generators.

The generators are implemented as Python classes residing in the
generators/ directory. Each generator emits code created from Jinja2
templates stored in the templates/ directory.

Source code for data items is generated in the same order in which
the items appear in the specification to ensure the generated code
compiles. This conforms with the behavior of rpcgen.
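
The stages described above (parse the specification, build a tree,
render templates in specification order) can be illustrated with a
toy sketch. This is not xdrgen's code: it uses the standard library's
re and string.Template as stand-ins for Lark and Jinja2, it handles
only a trivial enum syntax, and the names EnumNode, parse_enums, and
render_prototypes are hypothetical. The emitted prototypes follow the
xdrgen_decode_/xdrgen_encode_ shape shown elsewhere in this README.

```python
import re
from dataclasses import dataclass
from string import Template


# Hypothetical AST node for one XDR enum; xdrgen's real AST classes differ.
@dataclass
class EnumNode:
    name: str
    members: list  # (identifier, value) tuples, in source order


# Stand-in for the Lark grammar: matches "enum NAME { A = 0, B = 1 };" only.
ENUM_RE = re.compile(r"enum\s+(\w+)\s*\{([^}]*)\}\s*;")
MEMBER_RE = re.compile(r"(\w+)\s*=\s*(-?\d+)")

# Stand-in for a Jinja2 template.
PROTO_TEMPLATE = Template(
    "bool xdrgen_decode_$name(struct xdr_stream *xdr, enum $name *ptr);\n"
    "bool xdrgen_encode_$name(struct xdr_stream *xdr, enum $name value);\n"
)


def parse_enums(spec: str) -> list:
    """Turn specification text into AST nodes, preserving source order."""
    nodes = []
    for match in ENUM_RE.finditer(spec):
        members = [
            (m.group(1), int(m.group(2)))
            for m in MEMBER_RE.finditer(match.group(2))
        ]
        nodes.append(EnumNode(match.group(1), members))
    return nodes


def render_prototypes(nodes: list) -> str:
    """Emit one pair of prototypes per node, in specification order."""
    return "".join(PROTO_TEMPLATE.substitute(name=n.name) for n in nodes)


SPEC = """
enum nfsstat3 { NFS3_OK = 0, NFS3ERR_PERM = 1 };
enum ftype3 { NF3REG = 1, NF3DIR = 2 };
"""

tree = parse_enums(SPEC)
header = render_prototypes(tree)
print(header, end="")
```

Because every step is a pure function of the input text, running the
sketch twice on the same specification yields identical output, which
is the sense in which the translation is deterministic.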

xdrgen assumes that the generated source code is further compiled by
a compiler that can optimize in a number of ways, including:

 - Unused functions are discarded (i.e., not added to the executable)

 - Aggressive function inlining removes unnecessary stack frames

 - Single-arm switch statements are replaced by a single conditional
   branch

And so on.


Pragmas
-------

Pragma directives specify exceptions to the normal generation of
encoding and decoding functions. Currently three directives are
implemented: "exclude", "header", and "public".

Pragma exclude
------ -------

        pragma exclude <RPC procedure> ;

In some cases, a procedure encoder or decoder function might need
special processing that cannot be automatically generated. The
automatically-generated functions might conflict or interfere with
the hand-rolled function. To avoid editing the generated source code
by hand, a pragma can specify that the procedure's encoder and
decoder functions are not included in the generated header and
source.

For example:

        pragma exclude NFSPROC3_READDIRPLUS;

Excludes the decoder function for the READDIRPLUS argument and the
encoder function for the READDIRPLUS result.

Note that because data item encoder and decoder functions are
defined "static __maybe_unused", subsequent compilation
automatically excludes data item encoder and decoder functions that
are used only by excluded procedures.

Pragma header
------ ------

        pragma header <string> ;

Provide a name to use for the header file. For example:

        pragma header nlm4;

Adds

        #include "nlm4xdr_gen.h"

to the generated source file.

Pragma public
------ ------

        pragma public <XDR data item> ;

Normally XDR encoder and decoder functions are "static". In case an
implementer wants to call these functions from other source code,
they can add a public pragma in the input .x file to indicate a set
of functions that should get a prototype in the generated header,
and the function definitions will not be declared static.

For example:

        pragma public nfsstat3;

Adds these prototypes in the generated header:

        bool xdrgen_decode_nfsstat3(struct xdr_stream *xdr, enum nfsstat3 *ptr);
        bool xdrgen_encode_nfsstat3(struct xdr_stream *xdr, enum nfsstat3 value);

And, in the generated source code, both of these functions appear
without the "static __maybe_unused" modifiers.


Future Work
-----------

Finish implementing XDR pointer and list types.

Generate client-side procedure functions

Expand the README into a user guide similar to rpcgen(1)

Add more pragma directives:

  * @pages -- use xdr_read/write_pages() for the specified opaque
              field
  * @skip -- do not decode, but rather skip, the specified argument
             field

Enable something like a #include to dynamically insert the content
of other specification files

Properly support line-by-line pass-through via the "%" decorator

Build a unit test suite for verifying translation of XDR language
into compilable code

Add a command-line option to insert trace_printk call sites in the
generated source
code, for improved (temporary) observability

Generate kernel Rust code as well as C code