.. SPDX-License-Identifier: GPL-2.0

==============================
Falcon (FAst Logic Controller)
==============================
The following sections describe the Falcon core and the ucode running on it.
The descriptions are based on the Ampere GPU or earlier designs. They should
mostly apply to future designs as well, but everything is subject to change.
The overview provided here is mainly tailored towards understanding how the
nova-core driver interacts with the Falcon.

NVIDIA GPUs embed small RISC-like microcontrollers called Falcon cores, which
handle secure firmware tasks, initialization, and power management. Modern
NVIDIA GPUs may have multiple such Falcon instances (e.g., GSP (the GPU system
processor) and SEC2 (the security engine)) and also may integrate a RISC-V core.
This core is capable of running both RISC-V and Falcon code.

The code running on the Falcon cores is also called 'ucode', and will be
referred to as such in the following sections.

Falcons have separate instruction and data memories (IMEM/DMEM) and provide a
small DMA engine (via the FBIF - "Frame Buffer Interface") to load code from
system memory. The nova-core driver must reset and configure the Falcon, load
its firmware via DMA, and start its CPU.
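
The bring-up order in the previous paragraph (reset, configure, load firmware
via DMA, start the CPU) can be pictured as a small state machine. The sketch
below is illustrative only: ``BootStep`` and ``advance`` are hypothetical names
that do not come from nova-core, and no hardware registers are touched.

```rust
/// Hypothetical model of the driver-side bring-up order described above.
/// None of these names are taken from nova-core.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BootStep {
    PoweredOn,
    Reset,
    Configured,
    FirmwareLoaded,
    Running,
}

impl BootStep {
    /// Returns the step that must follow `self`, or `None` once running.
    pub fn next(self) -> Option<BootStep> {
        match self {
            BootStep::PoweredOn => Some(BootStep::Reset),
            BootStep::Reset => Some(BootStep::Configured),
            BootStep::Configured => Some(BootStep::FirmwareLoaded),
            BootStep::FirmwareLoaded => Some(BootStep::Running),
            BootStep::Running => None,
        }
    }
}

/// Moves from `from` to `to`, rejecting any attempt to skip or reorder steps.
pub fn advance(from: BootStep, to: BootStep) -> Result<BootStep, String> {
    if from.next() == Some(to) {
        Ok(to)
    } else {
        Err(format!("cannot go from {:?} to {:?}", from, to))
    }
}
```

With this model, ``advance(BootStep::Reset, BootStep::Configured)`` succeeds,
while trying to start the core straight after reset returns an error.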

Falcon security levels
======================
Falcons can run in Non-secure (NS), Light Secure (LS), or Heavy Secure (HS)
modes.

Heavy Secured (HS) also known as Privilege Level 3 (PL3)
--------------------------------------------------------
HS ucode is the most trusted code and has access to nearly everything on the
chip. The HS binary includes a signature which is verified at boot. This
signature verification is done by the hardware itself, thus establishing a
root of trust. For example, the FWSEC-FRTS command (see fwsec.rst) runs on the
GSP in HS mode. FRTS, which involves setting up and loading content into the WPR
(Write Protect Region), has to be done by the HS ucode and cannot be done by the
host CPU or LS ucode.

Light Secured (LS or PL2) and Non Secured (NS or PL0)
-----------------------------------------------------
These modes are less secure than HS. Like HS, the LS or NS ucode binary also
typically includes a signature. To load firmware in LS or NS mode onto a
Falcon, another Falcon needs to be running in HS mode, which also establishes the
root of trust. For example, in the case of an Ampere GPU, the CPU runs the "Booter"
ucode in HS mode on the SEC2 Falcon, which then authenticates and runs the
run-time GSP binary (GSP-RM) in LS mode on the GSP Falcon. Similarly, after
reset on an Ampere GPU, FWSEC runs on the GSP, which then loads the devinit
engine onto the PMU in LS mode.

Root of trust establishment
---------------------------
To establish a root of trust, the code running on a Falcon must be immutable and
hardwired into a read-only memory (ROM). This follows industry norms for
verification of firmware. This code is called the Boot ROM (BROM). The nova-core
driver on the CPU communicates with the Falcon's Boot ROM through various Falcon
registers prefixed with "BROM" (see regs.rs).

After the nova-core driver reads the necessary ucode from VBIOS, it programs the
BROM and DMA registers to trigger the Falcon to load the HS ucode from system
memory into the Falcon's IMEM/DMEM. Once the HS ucode is loaded, it is verified
by the Falcon's Boot ROM.

Once the verified HS code is running on a Falcon, it can verify and load other
LS/NS ucode binaries onto other Falcons and start them. Signature verification
works the same way as for HS ucode, except that the HS ucode, rather than the
hardware (BROM), computes the signature.

The root of trust is therefore established as follows:
     Hardware (Boot ROM running on the Falcon) -> HS ucode -> LS/NS ucode.
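
The mapping between security modes, privilege levels, and the component that
verifies each kind of ucode can be summarized in a few lines of Rust. This is
a sketch for orientation only: ``SecurityLevel``, ``Verifier``, and
``verifier_for`` are hypothetical names, not nova-core types.

```rust
/// Hypothetical enum for the three Falcon security modes named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SecurityLevel {
    NonSecure,
    LightSecure,
    HeavySecure,
}

impl SecurityLevel {
    /// Privilege level as given in the section headings (PL0, PL2, PL3).
    pub fn privilege_level(self) -> u8 {
        match self {
            SecurityLevel::NonSecure => 0,
            SecurityLevel::LightSecure => 2,
            SecurityLevel::HeavySecure => 3,
        }
    }
}

/// Who checks the signature of a ucode binary before it runs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verifier {
    /// The Falcon's hardware Boot ROM.
    BootRom,
    /// HS ucode already running on a Falcon.
    HsUcode,
}

/// Encodes the chain: Boot ROM verifies HS ucode, HS ucode verifies LS/NS ucode.
pub fn verifier_for(level: SecurityLevel) -> Verifier {
    match level {
        SecurityLevel::HeavySecure => Verifier::BootRom,
        SecurityLevel::LightSecure | SecurityLevel::NonSecure => Verifier::HsUcode,
    }
}
```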

On an Ampere GPU, for example, the boot verification flow is:
     Hardware (Boot ROM running on the SEC2) ->
          HS ucode (Booter running on the SEC2) ->
               LS ucode (GSP-RM running on the GSP)

.. note::
     While the CPU can load HS ucode onto a Falcon microcontroller and have it
     verified by the hardware and run, the CPU itself typically does not load
     LS or NS ucode and run it. Loading of LS or NS ucode is done mainly by the
     HS ucode. For example, on an Ampere GPU, after the Booter ucode runs on the
     SEC2 in HS mode and loads the GSP-RM binary onto the GSP, the SEC2 needs to
     run the "SEC2-RTOS" ucode at runtime. This presents a problem: there is no
     component to load the SEC2-RTOS ucode onto the SEC2. The CPU cannot load
     LS code, and GSP-RM must run in LS mode. To overcome this, the GSP is
     temporarily made to run HS ucode (which is itself loaded by the CPU via
     the nova-core driver using a "GSP-provided sequencer") which then loads
     the SEC2-RTOS ucode onto the SEC2 in LS mode. The GSP then resumes
     running its own GSP-RM LS ucode.
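
The loading rules from the note can be checked against the Ampere sequence
with a small model. Everything here is a hypothetical illustration (``Mode``,
``Loader``, ``Step``, ``may_load``, and ``ampere_sequence`` are not nova-core
names). The sketch is deliberately strict: it encodes only the cases the text
states and treats every other combination as disallowed, which is an
assumption rather than a documented hardware rule.

```rust
/// Security mode of a ucode image (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Ns,
    Ls,
    Hs,
}

/// The agent that loads a ucode image onto a Falcon (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Loader {
    /// The host CPU, i.e. the nova-core driver.
    Cpu,
    /// Another Falcon, tagged with the mode of the ucode it is running.
    Falcon(Mode),
}

/// One load operation in a boot flow.
#[derive(Debug, Clone, Copy)]
pub struct Step {
    pub loader: Loader,
    pub target: &'static str,
    pub ucode: &'static str,
    pub mode: Mode,
}

/// Assumed rule set: the CPU loads only HS ucode (the Boot ROM verifies it),
/// and only a Falcon running HS ucode loads LS or NS ucode.
pub fn may_load(loader: Loader, mode: Mode) -> bool {
    match (loader, mode) {
        (Loader::Cpu, Mode::Hs) => true,
        (Loader::Falcon(Mode::Hs), Mode::Ls | Mode::Ns) => true,
        _ => false,
    }
}

/// The Ampere flow as described in the note, in order.
pub fn ampere_sequence() -> Vec<Step> {
    vec![
        Step { loader: Loader::Cpu, target: "SEC2", ucode: "Booter", mode: Mode::Hs },
        Step { loader: Loader::Falcon(Mode::Hs), target: "GSP", ucode: "GSP-RM", mode: Mode::Ls },
        // The text does not name this temporary HS image, so it is left generic.
        Step { loader: Loader::Cpu, target: "GSP", ucode: "HS ucode", mode: Mode::Hs },
        Step { loader: Loader::Falcon(Mode::Hs), target: "SEC2", ucode: "SEC2-RTOS", mode: Mode::Ls },
    ]
}
```

Every step of ``ampere_sequence()`` satisfies ``may_load``, whereas a direct
``may_load(Loader::Cpu, Mode::Ls)`` is rejected, which is exactly the gap the
temporary HS ucode on the GSP fills.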

Falcon memory subsystem and DMA engine
======================================
Falcons have separate instruction and data memories (IMEM/DMEM) and contain a
small DMA engine called FBDMA (Framebuffer DMA), which performs DMA transfers
between the IMEM/DMEM inside the Falcon and external memory via the FBIF
(Framebuffer Interface).

DMA transfers are possible from the Falcon's memory to both the system memory
and the framebuffer memory (VRAM).

To perform a DMA via the FBDMA, the FBIF is configured to decide how the memory
is accessed (also known as aperture type). In the nova-core driver, this is
determined by the `FalconFbifTarget` enum.

The IO-PMP (Input/Output Physical Memory Protection) unit in the Falcon
controls access by the FBDMA to the external memory.
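
To make the roles of the aperture and the IO-PMP concrete, the sketch below
models a DMA request and a gate that only lets it through when it falls inside
a permitted window of external memory. This is a conceptual illustration under
stated assumptions: ``ExternalTarget`` is a stand-in and does not reproduce the
variants of `FalconFbifTarget`, and the window-based check is an assumed
simplification, since the real IO-PMP programming model is not described here.

```rust
/// Which Falcon-internal memory a transfer touches.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FalconMem {
    Imem,
    Dmem,
}

/// Stand-in for the aperture type; not the real `FalconFbifTarget` variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExternalTarget {
    Vram,
    SystemMemory,
}

/// A hypothetical FBDMA request between Falcon memory and external memory.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DmaRequest {
    pub falcon_mem: FalconMem,
    pub target: ExternalTarget,
    pub ext_addr: u64,
    pub len: u64,
}

/// A permitted range of external memory; `end` is exclusive.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AllowedWindow {
    pub target: ExternalTarget,
    pub start: u64,
    pub end: u64,
}

/// Assumed IO-PMP behaviour: allow the request only if it lies entirely
/// inside one window for the same target. Address overflow is rejected.
pub fn io_pmp_allows(windows: &[AllowedWindow], req: &DmaRequest) -> bool {
    let req_end = match req.ext_addr.checked_add(req.len) {
        Some(end) => end,
        None => return false,
    };
    windows
        .iter()
        .any(|w| w.target == req.target && req.ext_addr >= w.start && req_end <= w.end)
}
```

A request that straddles the end of a window, or that names a different
target than the window, is refused by ``io_pmp_allows``.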

Conceptual diagram (not exact) of the Falcon and its memory subsystem is as follows::

               External Memory (Framebuffer / System DRAM)
                              ^  |
                              |  |
                              |  v
     +-----------------------------------------------------+
     |                           |                         |
     |   +---------------+       |                         |
     |   |     FBIF      |-------+                         |  FALCON
     |   | (FrameBuffer  |   Memory Interface              |  PROCESSOR
     |   |  InterFace)   |                                 |
     |   |  Apertures    |                                 |
     |   |  Configures   |                                 |
     |   |  mem access   |                                 |
     |   +-------^-------+                                 |
     |           |                                         |
     |           | FBDMA uses configured FBIF apertures    |
     |           | to access External Memory
     |           |
     |   +-------v--------+      +---------------+
     |   |    FBDMA       |  cfg |     RISC      |
     |   | (FrameBuffer   |<---->|     CORE      |----->. Direct Core Access
     |   |  DMA Engine)   |      |               |      |
     |   | - Master dev.  |      | (can run both |      |
     |   +-------^--------+      | Falcon and    |      |
     |           |        cfg--->| RISC-V code)  |      |
     |           |        /      |               |      |
     |           |        |      +---------------+      |    +------------+
     |           |        |                             |    |   BROM     |
     |           |        |                             <--->| (Boot ROM) |
     |           |       /                              |    +------------+
     |           |      v                               |
     |   +---------------+                              |
     |   |    IO-PMP     | Controls access by FBDMA     |
     |   | (IO Physical  | and other IO Masters         |
     |   | Memory Protect)                              |
     |   +-------^-------+                              |
     |           |                                      |
     |           | Protected Access Path for FBDMA      |
     |           v                                      |
     |   +---------------------------------------+      |
     |   |       Memory                          |      |
     |   |   +---------------+  +------------+   |      |
     |   |   |    IMEM       |  |    DMEM    |   |<-----+
     |   |   | (Instruction  |  |   (Data    |   |
     |   |   |  Memory)      |  |   Memory)  |   |
     |   |   +---------------+  +------------+   |
     |   +---------------------------------------+
     +-----------------------------------------------------+