=============
 Ring Buffer
=============

To handle communication between user space and kernel space, AMD GPUs use a
ring buffer design to feed the engines (GFX, Compute, SDMA, UVD, VCE, VCN, VPE,
etc.). See the figure below that illustrates how this communication works:

.. kernel-figure:: ring_buffers.svg

Ring buffers in the amdgpu driver follow a producer-consumer model, where
userspace acts as the producer, constantly filling the ring buffer with GPU
commands to be executed. Meanwhile, the GPU retrieves the information from the
ring, parses it, and distributes the specific set of instructions between the
different amdgpu blocks.

Notice from the diagram that the ring has a Read Pointer (rptr), which
indicates where the engine is currently reading packets from the ring, and a
Write Pointer (wptr), which indicates how many packets software has added to
the ring. When the rptr and wptr are equal, the ring is idle. When software
adds packets to the ring, it updates the wptr, which causes the engine to start
fetching and processing packets. As the engine processes packets, the rptr gets
updated until it catches up to the wptr and they are equal again.

Usually, ring buffers in the driver have a limited size (search for occurrences
of `amdgpu_ring_init()`). One of the reasons for the small ring buffer size is
that the CP (Command Processor) is capable of following addresses inserted into
the ring; this is illustrated in the image by the reference to the IB (Indirect
Buffer). The IB gives userspace an area in memory that the CP can read in order
to feed the hardware with extra instructions.

All ASICs pre-GFX11 use what is called a kernel queue, which means the ring is
allocated in kernel space and has some restrictions, such as not being able to
be :ref:`preempted directly by the scheduler<amdgpu-mes>`.
GFX11 and newer support kernel queues, but also provide a new mechanism named
:ref:`user queues<amdgpu-userq>`, where the queue is moved to user space and
can be mapped and unmapped via the scheduler. In practice, both types of queue
insert user-space-generated GPU commands from different jobs into the requested
component ring.

Enforce Isolation
=================

.. note:: After reading this section, you might want to check the
   :ref:`Process Isolation<amdgpu-process-isolation>` page for more details.

Before examining the Enforce Isolation mechanism in the ring buffer context, it
is helpful to briefly discuss how instructions from the ring buffer are
processed in the graphics pipeline. Let's expand on this topic by checking the
diagram below that illustrates the graphics pipeline:

.. kernel-figure:: gfx_pipeline_seq.svg

In terms of executing instructions, the GFX pipeline follows the sequence:
Shader Export (SX), Geometry Engine (GE), Shader Processor Input (SPI), Scan
Converter (SC), Primitive Assembler (PA), and cache manipulation (which may
vary across ASICs). Another common way to describe the pipeline is to use Pixel
Shader (PS), raster, and Vertex Shader (VS) to represent the two shader stages.
Now, with this pipeline in mind, suppose that Job B causes a hang while Job C's
instructions are already executing; developers may then incorrectly identify
Job C as the problematic one. This problem can be mitigated on multiple levels;
the diagram below illustrates how to minimize part of it:

.. kernel-figure:: no_enforce_isolation.svg

Note from the diagram that there is no guarantee of order or a clear separation
between instructions, which is not a problem most of the time and is also good
for performance. Also notice the circles between jobs in the diagram, which
represent a **fence wait** used to avoid overlapping work in the ring.
At the end of the fence, a cache flush occurs, ensuring that when the next job
starts, it begins in a clean state and, if issues arise, the developer can
pinpoint the problematic process more precisely.

To increase the level of isolation between jobs, there is the "Enforce
Isolation" method described in the picture below:

.. kernel-figure:: enforce_isolation.svg

As shown in the diagram, enforcing isolation introduces ordering between
submissions, since access to GFX/Compute is serialized; think of it as a
single-process-at-a-time mode for GFX/Compute. Notice that this approach has a
significant performance impact, as it allows only one job to submit commands at
a time. However, it can help pinpoint the job that caused the problem. Although
enforcing isolation improves the situation, it does not fully resolve the issue
of precisely pinpointing bad jobs, since the isolation itself might mask the
problem. In summary, identifying which job caused the issue may not be precise,
but enforcing isolation can help with the debugging.

Ring Operations
===============

.. kernel-doc:: drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
   :internal: