===================
Classic BPF vs eBPF
===================

eBPF is designed to be JITed with one to one mapping, which can also open up
the possibility for GCC/LLVM compilers to generate optimized eBPF code through
an eBPF backend that performs almost as fast as natively compiled code.

Some core changes of the eBPF format from classic BPF:

- Number of registers increases from 2 to 10:

  The old format had two registers A and X, and a hidden frame pointer. The
  new layout extends this to be 10 internal registers and a read-only frame
  pointer. Since 64-bit CPUs pass arguments to functions via registers,
  the number of args from an eBPF program to an in-kernel function is
  restricted to 5 and one register is used to accept the return value from
  an in-kernel function. Natively, x86_64 passes the first 6 arguments in
  registers, aarch64/sparcv9/mips64 have 7 - 8 registers for arguments;
  x86_64 has 6 callee saved registers, and aarch64/sparcv9/mips64 have 11
  or more callee saved registers.

  Thus, all eBPF registers map one to one to HW registers on x86_64, aarch64,
  etc, and the eBPF calling convention maps directly to ABIs used by the
  kernel on 64-bit architectures.

  On 32-bit architectures a JIT may map programs that use only 32-bit
  arithmetic and may let more complex programs be interpreted.

  R0 - R5 are scratch registers and an eBPF program needs to spill/fill them
  if necessary across calls. Note that there is only one eBPF program (== one
  eBPF main routine) and it cannot call other eBPF functions; it can only
  call predefined in-kernel functions.

- Register width increases from 32-bit to 64-bit:

  Still, the semantics of the original 32-bit ALU operations are preserved
  via 32-bit subregisters. All eBPF registers are 64-bit with 32-bit lower
  subregisters that zero-extend into 64-bit if they are being written to.
  That behavior maps directly to the x86_64 and arm64 subregister definition,
  but makes other JITs more difficult.

  32-bit architectures run 64-bit eBPF programs via the interpreter.
  Their JITs may convert BPF programs that only use 32-bit subregisters into
  the native instruction set and let the rest be interpreted.

  Operation is 64-bit, because on 64-bit architectures, pointers are also
  64-bit wide, and we want to pass 64-bit values in/out of kernel functions,
  so 32-bit eBPF registers would otherwise require defining a register-pair
  ABI; thus, it would not be possible to use a direct eBPF register to HW
  register mapping, and the JIT would need to do combine/split/move operations
  for every register in and out of the function, which is complex, bug prone
  and slow. Another reason is the use of atomic 64-bit counters.

- Conditional jt/jf targets replaced with jt/fall-through:

  While the original design has constructs such as ``if (cond) jump_true;
  else jump_false;``, they are being replaced with alternative constructs like
  ``if (cond) jump_true; /* else fall-through */``.

- Introduces bpf_call insn and register passing convention for zero overhead
  calls from/to other kernel functions:

  Before an in-kernel function call, the eBPF program needs to
  place function arguments into R1 to R5 registers to satisfy the calling
  convention, then the interpreter will take them from registers and pass
  them to the in-kernel function. If R1 - R5 registers are mapped to CPU
  registers that are used for argument passing on the given architecture,
  the JIT compiler doesn't need to emit extra moves.
  Function arguments will be in the correct
  registers and the BPF_CALL instruction will be JITed as a single 'call' HW
  instruction. This calling convention was picked to cover common call
  situations without performance penalty.

  After an in-kernel function call, R1 - R5 are reset to unreadable and R0 has
  the return value of the function. Since R6 - R9 are callee saved, their
  state is preserved across the call.

  For example, consider three C functions::

    u64 f1() { return (*_f2)(1); }
    u64 f2(u64 a) { return f3(a + 1, a); }
    u64 f3(u64 a, u64 b) { return a - b; }

  GCC can compile f1, f3 into x86_64::

    f1:
        movl $1, %edi
        movq _f2(%rip), %rax
        jmp *%rax
    f3:
        movq %rdi, %rax
        subq %rsi, %rax
        ret

  Function f2 in eBPF may look like::

    f2:
        bpf_mov R2, R1
        bpf_add R1, 1
        bpf_call f3
        bpf_exit

  If f2 is JITed and the pointer stored to ``_f2``, the calls f1 -> f2 -> f3
  and returns will be seamless.
  Without JIT, the __bpf_prog_run()
  interpreter needs to be used to call into f2.

  For practical reasons all eBPF programs have only one argument 'ctx' which
  is already placed into R1 (e.g. on __bpf_prog_run() startup) and the
  programs can call kernel functions with up to 5 arguments. Calls with 6 or
  more arguments are currently not supported, but these restrictions can be
  lifted if necessary in the future.

  On 64-bit architectures all registers map to HW registers one to one. For
  example, the x86_64 JIT compiler can map them as ...

  ::

    R0  - rax
    R1  - rdi
    R2  - rsi
    R3  - rdx
    R4  - rcx
    R5  - r8
    R6  - rbx
    R7  - r13
    R8  - r14
    R9  - r15
    R10 - rbp

  ... since the x86_64 ABI mandates rdi, rsi, rdx, rcx, r8, r9 for argument
  passing and rbx, r12 - r15 are callee saved.

  Then the following eBPF pseudo-program::

    bpf_mov R6, R1 /* save ctx */
    bpf_mov R2, 2
    bpf_mov R3, 3
    bpf_mov R4, 4
    bpf_mov R5, 5
    bpf_call foo
    bpf_mov R7, R0 /* save foo() return value */
    bpf_mov R1, R6 /* restore ctx for next call */
    bpf_mov R2, 6
    bpf_mov R3, 7
    bpf_mov R4, 8
    bpf_mov R5, 9
    bpf_call bar
    bpf_add R0, R7
    bpf_exit

  after JIT to x86_64 may look like::

    push %rbp
    mov %rsp,%rbp
    sub $0x228,%rsp
    mov %rbx,-0x228(%rbp)
    mov %r13,-0x220(%rbp)
    mov %rdi,%rbx
    mov $0x2,%esi
    mov $0x3,%edx
    mov $0x4,%ecx
    mov $0x5,%r8d
    callq foo
    mov %rax,%r13
    mov %rbx,%rdi
    mov $0x6,%esi
    mov $0x7,%edx
    mov $0x8,%ecx
    mov $0x9,%r8d
    callq bar
    add %r13,%rax
    mov -0x228(%rbp),%rbx
    mov -0x220(%rbp),%r13
    leaveq
    retq

  which in this example is equivalent in C to::

    u64 bpf_filter(u64 ctx)
    {
        return foo(ctx, 2, 3, 4, 5) + bar(ctx, 6, 7, 8, 9);
    }

  In-kernel functions foo() and bar() with prototype: u64 (*)(u64 arg1, u64
  arg2, u64 arg3, u64 arg4, u64 arg5); will receive arguments in the proper
  registers and place their return value into ``%rax`` which is R0 in eBPF.
  Prologue and epilogue are emitted by the JIT and are implicit in the
  interpreter. R0 - R5 are scratch registers, so an eBPF program needs to
  preserve them across calls itself, as defined by the calling convention.

  For example the following program is invalid::

    bpf_mov R1, 1
    bpf_call foo
    bpf_mov R0, R1
    bpf_exit

  After the call the registers R1 - R5 contain junk values and cannot be
  read. An in-kernel verifier (see verifier.rst) is used to validate eBPF
  programs.

Also in the new design, eBPF is limited to 4096 insns, which means that any
program will terminate quickly and will only call a fixed number of kernel
functions.
Original BPF and eBPF are both two-operand instruction sets,
which helps to do one-to-one mapping between an eBPF insn and an x86 insn
during JIT.

The input context pointer for invoking the interpreter function is generic;
its content is defined by a specific use case. For seccomp, register R1
points to seccomp_data; for converted BPF filters, R1 points to a skb.

A program that is translated internally consists of the following elements::

  op:16, jt:8, jf:8, k:32    ==>    op:8, dst_reg:4, src_reg:4, off:16, imm:32

So far 87 eBPF instructions were implemented. The 8-bit 'op' opcode field
has room for new instructions. Some of them may use 16/24/32 byte encoding.
New instructions must be a multiple of 8 bytes to preserve backward
compatibility.

eBPF is a general purpose RISC instruction set. Not every register and
every instruction are used during translation from original BPF to eBPF.
For example, socket filters are not using the ``exclusive add`` instruction,
but tracing filters may do so to maintain counters of events. Register R9
is not used by socket filters either, but more complex filters may be running
out of registers and would have to resort to spill/fill to stack.
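The 8/4/4/16/32 bit layout above can be sketched as a C struct. This is a
minimal, self-contained sketch: the struct mirrors the shape of
``struct bpf_insn`` from the kernel's UAPI headers, redeclared here (with the
three opcode macros it uses) so the example stands alone::

    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Mirrors the op:8, dst_reg:4, src_reg:4, off:16, imm:32 layout
     * described above (same shape as struct bpf_insn in
     * include/uapi/linux/bpf.h on little-endian hosts). */
    struct bpf_insn {
            uint8_t code;           /* opcode */
            uint8_t dst_reg:4;      /* destination register */
            uint8_t src_reg:4;      /* source register */
            int16_t off;            /* signed offset */
            int32_t imm;            /* signed immediate */
    };

    #define BPF_ALU64       0x07    /* instruction class */
    #define BPF_X           0x08    /* register source operand */
    #define BPF_ADD         0x00    /* ALU operation */

    int main(void)
    {
            /* encode "bpf_add R0, R7": BPF_ADD | BPF_X | BPF_ALU64 */
            struct bpf_insn insn = {
                    .code = BPF_ADD | BPF_X | BPF_ALU64,
                    .dst_reg = 0,
                    .src_reg = 7,
            };

            /* every eBPF instruction occupies exactly 8 bytes */
            assert(sizeof(insn) == 8);
            printf("code=0x%02x dst=r%u src=r%u\n",
                   insn.code, insn.dst_reg, insn.src_reg);
            return 0;
    }

Here the combined opcode byte comes out as 0x0f, which matches the field
breakdown given in the opcode encoding section below.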

eBPF can be used as a generic assembler for last step performance
optimizations; socket filters and seccomp are using it as an assembler.
Tracing filters may use it as an assembler to generate code from the kernel.
In-kernel usage may not be bounded by security considerations, since
generated eBPF code may be optimizing an internal code path and not be
exposed to user space. Safety of eBPF can come from the verifier (see
verifier.rst). In such use cases as described, it may be used as a safe
instruction set.

Just like the original BPF, eBPF runs within a controlled environment,
is deterministic and the kernel can easily prove that. The safety of the
program can be determined in two steps: the first step does a
depth-first-search to disallow loops and performs other CFG validation;
the second step starts from the first insn and descends all possible paths.
It simulates execution of every insn and observes the state change of
registers and stack.

opcode encoding
===============

eBPF reuses most of the opcode encoding from classic BPF to simplify
conversion of classic BPF to eBPF.

For arithmetic and jump instructions the 8-bit 'code' field is divided into
three parts::

  +----------------+--------+--------------------+
  |   4 bits       |  1 bit |   3 bits           |
  | operation code | source | instruction class  |
  +----------------+--------+--------------------+
  (MSB)                                      (LSB)

Three LSB bits store the instruction class which is one of:

  ===================     ===============
  Classic BPF classes     eBPF classes
  ===================     ===============
  BPF_LD    0x00          BPF_LD    0x00
  BPF_LDX   0x01          BPF_LDX   0x01
  BPF_ST    0x02          BPF_ST    0x02
  BPF_STX   0x03          BPF_STX   0x03
  BPF_ALU   0x04          BPF_ALU   0x04
  BPF_JMP   0x05          BPF_JMP   0x05
  BPF_RET   0x06          BPF_JMP32 0x06
  BPF_MISC  0x07          BPF_ALU64 0x07
  ===================     ===============

The 4th bit encodes the source operand ...

::

    BPF_K     0x00
    BPF_X     0x08

* in classic BPF, this means::

    BPF_SRC(code) == BPF_X - use register X as source operand
    BPF_SRC(code) == BPF_K - use 32-bit immediate as source operand

* in eBPF, this means::

    BPF_SRC(code) == BPF_X - use 'src_reg' register as source operand
    BPF_SRC(code) == BPF_K - use 32-bit immediate as source operand

... and the four MSB bits store the operation code.

If BPF_CLASS(code) == BPF_ALU or BPF_ALU64 [ in eBPF ], BPF_OP(code) is one of::

  BPF_ADD   0x00
  BPF_SUB   0x10
  BPF_MUL   0x20
  BPF_DIV   0x30
  BPF_OR    0x40
  BPF_AND   0x50
  BPF_LSH   0x60
  BPF_RSH   0x70
  BPF_NEG   0x80
  BPF_MOD   0x90
  BPF_XOR   0xa0
  BPF_MOV   0xb0  /* eBPF only: mov reg to reg */
  BPF_ARSH  0xc0  /* eBPF only: sign extending shift right */
  BPF_END   0xd0  /* eBPF only: endianness conversion */

If BPF_CLASS(code) == BPF_JMP or BPF_JMP32 [ in eBPF ], BPF_OP(code) is one of::

  BPF_JA    0x00  /* BPF_JMP only */
  BPF_JEQ   0x10
  BPF_JGT   0x20
  BPF_JGE   0x30
  BPF_JSET  0x40
  BPF_JNE   0x50  /* eBPF only: jump != */
  BPF_JSGT  0x60  /* eBPF only: signed '>' */
  BPF_JSGE  0x70  /* eBPF only: signed '>=' */
  BPF_CALL  0x80  /* eBPF BPF_JMP only: function call */
  BPF_EXIT  0x90  /* eBPF BPF_JMP only: function return */
  BPF_JLT   0xa0  /* eBPF only: unsigned '<' */
  BPF_JLE   0xb0  /* eBPF only: unsigned '<=' */
  BPF_JSLT  0xc0  /* eBPF only: signed '<' */
  BPF_JSLE  0xd0  /* eBPF only: signed '<=' */

So BPF_ADD | BPF_X | BPF_ALU means 32-bit addition in both classic BPF
and eBPF. There are only two registers in classic BPF, so it means A += X.
In eBPF it means dst_reg = (u32) dst_reg + (u32) src_reg; similarly,
BPF_XOR | BPF_K | BPF_ALU means A ^= imm32 in classic BPF and the analogous
dst_reg = (u32) dst_reg ^ (u32) imm32 in eBPF.

Classic BPF uses the BPF_MISC class to represent A = X and X = A moves.
eBPF uses the BPF_MOV | BPF_X | BPF_ALU code instead. Since there are no
BPF_MISC operations in eBPF, class 7 is used as BPF_ALU64 to mean
exactly the same operations as BPF_ALU, but with 64-bit wide operands
instead.
So BPF_ADD | BPF_X | BPF_ALU64 means 64-bit addition, i.e.:
dst_reg = dst_reg + src_reg

Classic BPF wastes the whole BPF_RET class to represent a single ``ret``
operation. Classic BPF_RET | BPF_K means copy imm32 into the return register
and perform function exit. eBPF is modeled to match the CPU, so
BPF_JMP | BPF_EXIT in eBPF means function exit only. The eBPF program needs
to store the return value into register R0 before doing a BPF_EXIT. Class 6
in eBPF is used as BPF_JMP32 to mean exactly the same operations as BPF_JMP,
but with 32-bit wide operands for the comparisons instead.

For load and store instructions the 8-bit 'code' field is divided as::

  +--------+--------+-------------------+
  | 3 bits | 2 bits |   3 bits          |
  |  mode  |  size  | instruction class |
  +--------+--------+-------------------+
  (MSB)                             (LSB)

Size modifier is one of ...

::

  BPF_W   0x00    /* word */
  BPF_H   0x08    /* half word */
  BPF_B   0x10    /* byte */
  BPF_DW  0x18    /* eBPF only, double word */

...
which encodes the size of the load/store operation::

  B  - 1 byte
  H  - 2 byte
  W  - 4 byte
  DW - 8 byte (eBPF only)

Mode modifier is one of::

  BPF_IMM     0x00  /* used for 32-bit mov in classic BPF and 64-bit in eBPF */
  BPF_ABS     0x20
  BPF_IND     0x40
  BPF_MEM     0x60
  BPF_LEN     0x80  /* classic BPF only, reserved in eBPF */
  BPF_MSH     0xa0  /* classic BPF only, reserved in eBPF */
  BPF_ATOMIC  0xc0  /* eBPF only, atomic operations */
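Putting the field layouts together, the classification helpers can be
sketched as plain C macros. This is a self-contained sketch mirroring the
BPF_CLASS/BPF_OP/BPF_SRC/BPF_SIZE/BPF_MODE helpers found in the kernel's
linux/bpf_common.h, redeclared here so the example compiles on its own::

    #include <assert.h>
    #include <stdio.h>

    /* Field extractors for the 8-bit 'code' byte, per the tables above. */
    #define BPF_CLASS(code) ((code) & 0x07) /* 3 LSB: instruction class */
    #define BPF_OP(code)    ((code) & 0xf0) /* 4 MSB: ALU/JMP operation */
    #define BPF_SRC(code)   ((code) & 0x08) /* 4th bit: BPF_K or BPF_X */
    #define BPF_SIZE(code)  ((code) & 0x18) /* ld/st: W/H/B/DW */
    #define BPF_MODE(code)  ((code) & 0xe0) /* ld/st: IMM/ABS/IND/MEM/... */

    int main(void)
    {
            /* 0x0f = BPF_ADD | BPF_X | BPF_ALU64:
             * 64-bit dst_reg += src_reg */
            unsigned char alu = 0x0f;
            assert(BPF_CLASS(alu) == 0x07); /* BPF_ALU64 */
            assert(BPF_OP(alu) == 0x00);    /* BPF_ADD */
            assert(BPF_SRC(alu) == 0x08);   /* BPF_X: register source */

            /* 0x61 = BPF_MEM | BPF_W | BPF_LDX:
             * load a 4-byte word from memory */
            unsigned char ldx = 0x61;
            assert(BPF_CLASS(ldx) == 0x01); /* BPF_LDX */
            assert(BPF_MODE(ldx) == 0x60);  /* BPF_MEM */
            assert(BPF_SIZE(ldx) == 0x00);  /* BPF_W */

            printf("ok\n");
            return 0;
    }

Note how the same 3 LSB/4 MSB split applies to both opcode byte layouts;
only the middle bits are interpreted differently for ALU/JMP (source bit)
versus load/store (size bits) classes.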