.. _dcn_overview:

=======================
Display Core Next (DCN)
=======================

To equip our readers with the basic knowledge of how AMD Display Core Next
(DCN) works, we need to start with an overview of the hardware pipeline. Below
you can see a diagram that provides a DCN overview; keep in mind that this is
a generic diagram, and there are variations per ASIC.

.. kernel-figure:: dc_pipeline_overview.svg

Based on this diagram, we can go through each block and briefly describe it:

* **Display Controller Hub (DCHUB)**: This is the gateway between the Scalable
  Data Port (SDP) and DCN. This component has multiple features, such as memory
  arbitration, rotation, and cursor manipulation.

* **Display Pipe and Plane (DPP)**: This block provides pre-blend pixel
  processing such as color space conversion, linearization of pixel data, tone
  mapping, and gamut mapping.

* **Multiple Pipe/Plane Combined (MPC)**: This component performs blending of
  multiple planes, using global or per-pixel alpha.

* **Output Pixel Processing (OPP)**: Processes and formats pixels to be sent
  to the display.

* **Output Pipe Timing Combiner (OPTC)**: It generates the timing output used
  to combine streams or divide capabilities. CRC values are generated in this
  block.

* **Display Output (DIO)**: Encodes the output for the display connected to
  our GPU.

* **Display Writeback (DWB)**: It provides the ability to write the output of
  the display pipe back to memory as video frames.

* **Multi-Media HUB (MMHUBBUB)**: Memory controller interface for DMCUB and
  DWB (note that DWB is not hooked up yet).

* **DCN Management Unit (DMU)**: It provides register access control and
  routes display controller interrupts to the SOC host interrupt unit. This
  block includes the Display Micro-Controller Unit - version B (DMCUB), which
  is handled via firmware.

* **DCN Clock Generator Block (DCCG)**: It provides the clocks and resets
  for all of the display controller clock domains.

* **Azalia (AZ)**: Audio engine.

The above diagram is an architecture generalization of DCN, which means that
every ASIC has variations around this base model. Notice that the display
pipeline is connected to the Scalable Data Port (SDP) via DCHUB; you can see
the SDP as the element of our Data Fabric that feeds the display pipe.

Always approach the DCN architecture as something flexible that can be
configured and reconfigured in multiple ways; in other words, each block can
be set up or bypassed according to userspace demands. For example, if we want
to drive an 8K@60Hz display with DSC enabled, our DCN may require 4 DPPs and 2
OPPs. It is DC's responsibility to drive the best configuration for each
specific scenario. Orchestrating all of these components requires a
sophisticated communication interface, which is highlighted in the diagram by
the edges that connect each block; in the chart, each connection between
blocks represents:

1. Pixel data interface (red): Represents the pixel data flow;
2. Global sync signals (green): A set of synchronization signals composed of
   VStartup, VUpdate, and VReady;
3. Config interface: Responsible for configuring blocks;
4. Sideband signals: All other signals that do not fit the previous
   categories.

These signals are essential and play an important role in DCN. Nevertheless,
Global Sync deserves an extra level of detail, which is provided in a later
section.

All of these components are represented by a data structure named dc_state.
From DCHUB to MPC, we have a representation called dc_plane; from MPC to OPTC,
we have dc_stream, and the output (DIO) is handled by dc_link. Keep in mind
that HUBP accesses a surface using a specific format read from memory, and our
dc_plane should work to convert all pixels in the plane to something that can
be sent to the display via dc_stream and dc_link.
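
To make these relationships more concrete, here is a deliberately simplified
sketch. The type and field names below are hypothetical illustrations only;
the real definitions (struct dc_state, struct dc_plane_state,
struct dc_stream_state, and struct dc_link) live under
drivers/gpu/drm/amd/display/dc and carry far more detail::

  /* Illustrative sketch only; not the driver's actual definitions. */

  struct example_plane {                   /* ~ dc_plane: DCHUB to MPC */
          int surface_format;              /* format HUBP reads from memory */
          /* rotation, scaling, pre-blend color processing, ... */
  };

  struct example_stream {                  /* ~ dc_stream: MPC to OPTC */
          struct example_plane *planes[4]; /* planes blended into the stream */
          /* display timing, pixel clock, ... */
  };

  struct example_link {                    /* ~ dc_link: the output (DIO) */
          int connector_type;              /* DP, HDMI, eDP, ... */
  };

  struct example_state {                   /* ~ dc_state: the whole state */
          struct example_stream *streams[6];
          struct example_link *links[6];
  };
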
Front End and Back End
----------------------

The display pipeline can be broken down into two components that are usually
referred to as **Front End (FE)** and **Back End (BE)**, where FE consists of:

* DCHUB (mainly referring to a subcomponent named HUBP)
* DPP
* MPC

On the other hand, BE consists of:

* OPP
* OPTC
* DIO (DP/HDMI stream encoder and link encoder)

OPP and OPTC are the two joining blocks between FE and BE. On a side note,
there is normally a one-to-one mapping between a link encoder and a PHY, but
we can configure DCN to choose which link encoder connects to which PHY. FE's
main responsibility is to change, blend, and compose pixel data, while BE's
job is to frame a generic pixel stream into a specific display's pixel
stream.

Data Flow
---------

Initially, data is passed in from VRAM through the Data Fabric (DF) in native
pixel formats. The data stays in this format until it reaches HUBP in DCHUB,
where HUBP unpacks the different pixel formats and outputs them to DPP in
uniform streams through 4 channels (1 for alpha + 3 for colors).

The Converter and Cursor (CNVC) block in DPP then normalizes the data
representation and converts it to a DCN-specific floating-point format (i.e.,
different from the IEEE floating-point format). In the process, CNVC also
applies a degamma function to transform the data from non-linear to linear
space, which simplifies the floating-point calculations that follow. Data
stays in this floating-point format from DPP to OPP.

Starting at OPP, because color transformation and blending have been completed
(i.e., alpha can be dropped), and the end sinks do not require the precision
and dynamic range that floating point provides (i.e., all displays are in
integer depth format), bit-depth reduction/dithering kicks in. In OPP, we also
apply a regamma function to reintroduce the gamma that was removed earlier.
Eventually, we output data in integer format at DIO.

AMD Hardware Pipeline
---------------------

When discussing graphics on Linux, the term **pipeline** can sometimes be
overloaded with multiple meanings, so it is important to define what we mean
when we say **pipeline**. In the DCN driver, we use the term **hardware
pipeline**, or **pipeline**, or just **pipe** as an abstraction to indicate a
sequence of DCN blocks instantiated to address some specific configuration. DC
core treats DCN blocks as individual resources, meaning we can build a
pipeline by taking resources for all of the individual hardware blocks needed
to compose one pipeline. In actuality, we can't connect an arbitrary block
from one pipe to a block from another pipe; they are routed linearly, except
for DSC, which can be arbitrarily assigned as needed. We have this pipeline
concept in order to optimize bandwidth utilization.
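
As a rough illustration of this resource model, consider the sketch below. It
is hypothetical code, not the driver's actual representation (for the real
thing, see struct pipe_ctx and the resource handling code under
drivers/gpu/drm/amd/display/dc); it only conveys the idea that a pipe is a
linear composition of block instances taken from a shared pool::

  /* Hypothetical sketch only; not the driver's actual code or types. */

  #define NUM_INSTANCES 4

  struct example_block {
          int inst;
          bool in_use;
  };

  /* A pipe composes one instance of each block type, routed linearly:
   * hubp -> dpp -> mpc -> opp -> optc. */
  struct example_pipe {
          struct example_block *hubp;     /* FE: fetch/unpack from memory */
          struct example_block *dpp;      /* FE: pre-blend pixel processing */
          struct example_block *mpc;      /* FE: plane blending */
          struct example_block *opp;      /* BE: dithering and formatting */
          struct example_block *optc;     /* BE: display timing generation */
  };

  /* Take the first free instance of a block type from the pool. */
  static struct example_block *take_free(struct example_block *pool)
  {
          int i;

          for (i = 0; i < NUM_INSTANCES; i++) {
                  if (!pool[i].in_use) {
                          pool[i].in_use = true;
                          return &pool[i];
                  }
          }
          return NULL;    /* no free instance: the pipe cannot be built */
  }
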
.. kernel-figure:: pipeline_4k_no_split.svg

Additionally, let's take a look at parts of the DTN log (see
'Documentation/gpu/amdgpu/display/dc-debug.rst' for more information), since
this log can help us see part of this pipeline behavior in real time::

  HUBP:  format  addr_hi  width  height  ...
  [ 0]:      8h      81h   3840    2160
  [ 1]:      0h       0h      0       0
  [ 2]:      0h       0h      0       0
  [ 3]:      0h       0h      0       0
  [ 4]:      0h       0h      0       0
  ...
  MPCC:  OPP  DPP  ...
  [ 0]:   0h   0h  ...

The first thing to notice from the diagram and the DTN log is that we have
different clock domains for each part of the DCN blocks. In this example, we
have just a single **pipeline**, where the data flows from DCHUB to DIO, as we
intuitively expect. Nonetheless, DCN is flexible, as mentioned before, and we
can split this single pipe differently, as described in the diagram below:

.. kernel-figure:: pipeline_4k_split.svg

Now, if we inspect the DTN log again, we can see some interesting changes::

  HUBP:  format  addr_hi  width  height  ...
  [ 0]:      8h      81h   1920    2160  ...
  ...
  [ 4]:      0h       0h      0       0  ...
  [ 5]:      8h      81h   1920    2160  ...
  ...
  MPCC:  OPP  DPP  ...
  [ 0]:   0h   0h  ...
  [ 5]:   0h   5h  ...

In the above example, we split the display pipeline into two vertical parts of
1920x2160 (i.e., the two halves of the 3840x2160 image), and as a result, we
could reduce the clock frequency in the DPP part. This is not only useful for
saving power but also for better handling the required throughput. The idea to
keep in mind here is that the pipe configuration can vary a lot according to
the display configuration, and it is the DML's responsibility to set up all of
the required configuration parameters for the multiple scenarios supported by
our hardware.

Global Sync
-----------

Many DCN registers are double buffered, most importantly the surface address.
This allows us to update DCN hardware atomically for page flips, as well as
for most other updates that don't require enabling or disabling of new pipes.

(Note: there are many scenarios in which DC will decide to reserve extra pipes
in order to support outputs that need a very high pixel clock, or for power
saving purposes.)

These atomic register updates are driven by global sync signals in DCN. In
order to understand how atomic updates interact with DCN hardware, and how DCN
signals page flip and vblank events, it is helpful to understand how global
sync is programmed.

Global sync consists of three signals, VSTARTUP, VUPDATE, and VREADY. These are
calculated by the Display Mode Library - DML (drivers/gpu/drm/amd/display/dc/dml)
based on a large number of parameters, and they ensure our hardware is able to
feed the DCN pipeline without underflows or hangs in any given system
configuration. The global sync signals always happen during VBlank, are
independent of the VSync signal, and do not overlap each other.

VUPDATE is the only signal that is of interest to the rest of the driver stack
or userspace clients, as it signals the point at which hardware latches
atomically programmed (i.e., double buffered) registers. Even though it is
independent of the VSync signal, we use VUPDATE to signal the VSync event, as
it provides the best indication of how atomic commits and hardware interact.

Since DCN hardware is double buffered, the DC driver is able to program the
hardware at any point during the frame.

The picture below illustrates the global sync signals:

.. kernel-figure:: global_sync_vblank.svg
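
To make the latching model more tangible, the hypothetical snippet below
spells it out; write_pending() and the register names are made up for
illustration and are not real driver functions or symbols::

  /* Hypothetical sketch; none of these helpers exist in the driver. */

  /* Anywhere in the frame: writes land in the pending copy of the
   * double-buffered registers and have no visible effect yet. */
  write_pending(pipe, SURFACE_ADDRESS, new_scanout_address);
  write_pending(pipe, SURFACE_PITCH, new_pitch);

  /* At VUPDATE (inside VBlank), the hardware latches all pending values
   * at once, so the new address and pitch always take effect together,
   * never half-applied. Since VUPDATE is the latch point, it is also the
   * moment the driver uses to send the VSync event. */
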
These signals affect core DCN behavior. Programming them incorrectly will lead
to a number of negative consequences, most of them quite catastrophic.

The following picture shows how global sync allows for a mailbox style of
updates, i.e., it allows for multiple re-configurations between VUpdate
events, where only the last configuration programmed before the VUpdate
signal becomes effective.

.. kernel-figure:: config_example.svg
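
As a final illustration, the hypothetical sequence below spells out the
mailbox semantics; program_config() is not a real driver function, it merely
stands in for a batch of double-buffered register writes::

  /* Hypothetical sketch of mailbox-style updates. */

  program_config(pipe, &config_a);        /* pending */
  program_config(pipe, &config_b);        /* replaces pending config_a */
  program_config(pipe, &config_c);        /* replaces pending config_b */

  /* At the next VUpdate, only config_c is latched by the hardware;
   * config_a and config_b never reach the screen. */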