
Buffer Sharing and Synchronization (dma-buf)
============================================

The dma-buf subsystem provides the framework for sharing buffers for
hardware (DMA) access across multiple device drivers and subsystems, and
for synchronizing asynchronous hardware access.

Device drivers and userspace interact with the three main primitives
offered by dma-buf:

- dma-buf, representing a sg_table and exposed to userspace as a file
  descriptor to allow passing between devices,
- dma-fence, providing a mechanism to signal when an asynchronous
  hardware operation has completed, and
- dma-resv, which manages a set of dma-fences for a particular dma-buf,
  allowing implicit (kernel-ordered) synchronization of work to
  preserve the illusion of coherent access.

Userspace API principles and use
--------------------------------

For more details on how to design your subsystem's API for dma-buf use, please
see Documentation/userspace-api/dma-buf-alloc-exchange.rst.

Shared DMA Buffers
------------------

This document serves as a guide for device-driver writers: it describes the
dma-buf buffer sharing API and how to use it for exporting and importing
shared buffers.

Any device driver which wishes to be a part of DMA buffer sharing can do so
as either the 'exporter' of buffers or the 'user' or 'importer' of buffers.

Say a driver A wants to use buffers created by driver B; then we call B the
exporter, and A the buffer-user/importer.

The exporter

- implements and manages the operations in :c:type:`struct dma_buf_ops
  <dma_buf_ops>` for the buffer,
- allows other users to share the buffer by using the dma_buf sharing APIs,
- manages the details of buffer allocation, wrapped in a :c:type:`struct
  dma_buf <dma_buf>`,
- decides about the actual backing storage where this allocation happens,
- and takes care of any migration of the scatterlist for all (shared) users
  of this buffer.

The buffer-user

- is one of (many) sharing users of the buffer.
- doesn't need to worry about how the buffer is allocated, or where.
- and needs a mechanism to get access to the scatterlist that makes up this
  buffer in memory, mapped into its own address space, so it can access the
  same area of memory. This interface is provided by :c:type:`struct
  dma_buf_attachment <dma_buf_attachment>`.

Any exporters or users of the dma-buf buffer sharing framework must have a
'select DMA_SHARED_BUFFER' in their respective Kconfigs.

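To make the two roles concrete, below is a minimal sketch of an exporter and
an importer. The dma_buf_*() and DEFINE_DMA_BUF_EXPORT_INFO() calls are the
actual dma-buf API; everything prefixed ``my_`` (the buffer wrapper, the ops,
the helpers) is a hypothetical stand-in for driver-specific code, with error
handling reduced to the essentials:

.. code-block:: c

   #include <linux/dma-buf.h>
   #include <linux/dma-direction.h>
   #include <linux/err.h>

   /* Hypothetical driver-private wrapper around the backing storage. */
   struct my_buffer {
           struct sg_table sgt;
           size_t size;
   };

   /*
    * Exporter side: wrap the allocation in a dma_buf. The callbacks in
    * struct dma_buf_ops are where the exporter manages the backing
    * storage and any scatterlist migration; they are not shown here.
    */
   static struct dma_buf *my_export(struct my_buffer *buf,
                                    const struct dma_buf_ops *ops)
   {
           DEFINE_DMA_BUF_EXPORT_INFO(exp_info);

           exp_info.ops = ops;
           exp_info.size = buf->size;
           exp_info.flags = O_RDWR | O_CLOEXEC;
           exp_info.priv = buf;

           return dma_buf_export(&exp_info);
   }

   /*
    * Importer side: attach our device and map the buffer to get at the
    * scatterlist, without knowing how or where it was allocated.
    */
   static struct sg_table *my_import(struct dma_buf *dmabuf,
                                     struct device *dev)
   {
           struct dma_buf_attachment *attach;
           struct sg_table *sgt;

           attach = dma_buf_attach(dmabuf, dev);
           if (IS_ERR(attach))
                   return ERR_CAST(attach);

           sgt = dma_buf_map_attachment_unlocked(attach, DMA_BIDIRECTIONAL);
           if (IS_ERR(sgt))
                   dma_buf_detach(dmabuf, attach);

           return sgt;
   }
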
Userspace Interface Notes
~~~~~~~~~~~~~~~~~~~~~~~~~

Mostly a DMA buffer file descriptor is simply an opaque object for userspace,
and hence the generic interface exposed is very minimal. There are a few
things to consider though:

- Since kernel 3.12 the dma-buf FD supports the llseek system call, but only
  with offset=0 and whence=SEEK_END|SEEK_SET. SEEK_SET is supported to allow
  the usual size-discovery pattern size = SEEK_END(0); SEEK_SET(0); every
  other llseek operation will report -EINVAL. A sketch of this pattern
  follows after this list.

  If llseek on dma-buf FDs isn't supported the kernel will report -ESPIPE in
  all cases. Userspace can use this to detect whether discovering the dma-buf
  size via llseek is supported.

- In order to avoid fd leaks on exec, the FD_CLOEXEC flag must be set
  on the file descriptor. This is not just a resource leak, but a
  potential security hole: it could give the newly exec'd application
  access, via the leaked fd, to buffers it should otherwise not be
  permitted to access.

  The problem with doing this via a separate fcntl() call, versus doing it
  atomically when the fd is created, is that this is inherently racy in a
  multi-threaded app[3]. The issue is made worse when it is library code
  opening/creating the file descriptor, as the application may not even be
  aware of the fds.

  To avoid this problem, userspace must have a way to request that the
  O_CLOEXEC flag be set when the dma-buf fd is created. So any API provided
  by the exporting driver to create a dma-buf fd must provide a way to let
  userspace control setting of the O_CLOEXEC flag passed in to dma_buf_fd()
  (see the export-side sketch after this list).

- Memory mapping the contents of the DMA buffer is also supported. See the
  discussion below on `CPU Access to DMA Buffer Objects`_ for the full
  details.

- The DMA buffer FD is also pollable, see `Implicit Fence Poll Support`_
  below for details.

- The DMA buffer FD also supports a few dma-buf-specific ioctls, see
  `DMA Buffer ioctls`_ below for details. A combined mmap-plus-sync sketch
  also follows this list.

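As a concrete illustration of the llseek-based size discovery noted above,
here is a minimal userspace sketch; dmabuf_size() is a hypothetical helper,
while the lseek() behaviour is exactly as documented:

.. code-block:: c

   #include <errno.h>
   #include <sys/types.h>
   #include <unistd.h>

   /*
    * Discover the size of a dma-buf using the documented
    * SEEK_END(0); SEEK_SET(0) pattern. Returns the size, or -1 with
    * errno == ESPIPE when the kernel predates llseek support.
    */
   static off_t dmabuf_size(int dmabuf_fd)
   {
           off_t size = lseek(dmabuf_fd, 0, SEEK_END);

           if (size < 0)
                   return -1;

           /* Reset the file position for any later users of the fd. */
           if (lseek(dmabuf_fd, 0, SEEK_SET) < 0)
                   return -1;

           return size;
   }
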
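On the creation side, forwarding the close-on-exec request boils down to a
single dma_buf_fd() call. A sketch of a hypothetical driver's export path
(my_dmabuf_to_fd() is made up; for a real-world equivalent, DRM's PRIME
handle-to-fd ioctl exposes such a flag as DRM_CLOEXEC):

.. code-block:: c

   #include <linux/dma-buf.h>
   #include <linux/fcntl.h>

   /*
    * Userspace passes a flags word in the ioctl argument; the driver
    * forwards O_CLOEXEC to dma_buf_fd() so close-on-exec is set
    * atomically with fd creation, instead of via a racy separate
    * fcntl() call.
    */
   static int my_dmabuf_to_fd(struct dma_buf *dmabuf, u32 flags)
   {
           return dma_buf_fd(dmabuf, flags & O_CLOEXEC);
   }
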
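For the memory mapping and ioctl notes, here is a userspace sketch of CPU
access bracketed with DMA_BUF_IOCTL_SYNC; struct dma_buf_sync and the
DMA_BUF_SYNC_* flags are the actual uAPI from include/uapi/linux/dma-buf.h,
while cpu_fill() is a hypothetical helper:

.. code-block:: c

   #include <string.h>
   #include <sys/ioctl.h>
   #include <sys/mman.h>
   #include <linux/dma-buf.h>

   /*
    * CPU-write to a mapped dma-buf, bracketed by the SYNC ioctl so the
    * kernel can manage cache coherency with any device access.
    */
   static int cpu_fill(int dmabuf_fd, size_t size, unsigned char value)
   {
           struct dma_buf_sync sync = { 0 };
           void *map = mmap(NULL, size, PROT_READ | PROT_WRITE,
                            MAP_SHARED, dmabuf_fd, 0);

           if (map == MAP_FAILED)
                   return -1;

           sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_WRITE;
           ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);

           memset(map, value, size);

           sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_WRITE;
           ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);

           return munmap(map, size);
   }
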
Basic Operation and Device DMA Access
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-buf.c
   :doc: dma buf device access

CPU Access to DMA Buffer Objects
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-buf.c
   :doc: cpu access

Implicit Fence Poll Support
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-buf.c
   :doc: implicit fence polling

DMA-BUF statistics
~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-buf-sysfs-stats.c
   :doc: overview

DMA Buffer ioctls
~~~~~~~~~~~~~~~~~

.. kernel-doc:: include/uapi/linux/dma-buf.h

DMA-BUF locking convention
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-buf.c
   :doc: locking convention

Kernel Functions and Structures Reference
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-buf.c
   :export:

.. kernel-doc:: include/linux/dma-buf.h
   :internal:

Reservation Objects
-------------------

.. kernel-doc:: drivers/dma-buf/dma-resv.c
   :doc: Reservation Object Overview

.. kernel-doc:: drivers/dma-buf/dma-resv.c
   :export:

.. kernel-doc:: include/linux/dma-resv.h
   :internal:

DMA Fences
----------

.. kernel-doc:: drivers/dma-buf/dma-fence.c
   :doc: DMA fences overview

DMA Fence Cross-Driver Contract
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-fence.c
   :doc: fence cross-driver contract

DMA Fence Signalling Annotations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-fence.c
   :doc: fence signalling annotation

DMA Fence Deadline Hints
~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-fence.c
   :doc: deadline hints

DMA Fences Functions Reference
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-fence.c
   :export:

.. kernel-doc:: include/linux/dma-fence.h
   :internal:

DMA Fence Array
~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-fence-array.c
   :export:

.. kernel-doc:: include/linux/dma-fence-array.h
   :internal:

DMA Fence Chain
~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/dma-fence-chain.c
   :export:

.. kernel-doc:: include/linux/dma-fence-chain.h
   :internal:

DMA Fence unwrap
~~~~~~~~~~~~~~~~

.. kernel-doc:: include/linux/dma-fence-unwrap.h
   :internal:

DMA Fence Sync Files
~~~~~~~~~~~~~~~~~~~~

.. kernel-doc:: drivers/dma-buf/sync_file.c
   :export:

.. kernel-doc:: include/linux/sync_file.h
   :internal:

Sync File uABI
~~~~~~~~~~~~~~

.. kernel-doc:: include/uapi/linux/sync_file.h
   :internal:

Indefinite DMA Fences
~~~~~~~~~~~~~~~~~~~~~

At various times struct dma_fence with an indefinite time until
dma_fence_wait() finishes have been proposed. Examples include:

* Future fences, used in HWC1 to signal when a buffer isn't used by the
  display any longer, and created with the screen update that makes the
  buffer visible. The time this fence completes is entirely under
  userspace's control.

* Proxy fences, proposed to handle &drm_syncobj for which the fence has not
  yet been set, used to asynchronously delay command submission.

* Userspace fences or gpu futexes, fine-grained locking within a command
  buffer that userspace uses for synchronization across engines or with the
  CPU, which are then imported as a DMA fence for integration into existing
  winsys protocols.

* Long-running compute command buffers, while still using traditional
  end-of-batch DMA fences for memory management instead of context
  preemption DMA fences which get reattached when the compute job is
  rescheduled.

Common to all these schemes is that userspace controls the dependencies of
these fences and controls when they fire. Mixing indefinite fences with normal
in-kernel DMA fences does not work, even when a fallback timeout is included
to protect against malicious userspace:

* Only the kernel knows about all DMA fence dependencies; userspace is not
  aware of dependencies injected due to memory management or scheduler
  decisions.

* Only userspace knows about all dependencies in indefinite fences and when
  exactly they will complete; the kernel has no visibility.

Furthermore the kernel has to be able to hold up userspace command submission
for memory management needs, which means we must support indefinite fences
being dependent upon DMA fences. If the kernel also supported indefinite
fences, like any of the above proposals would, there is the potential for a
deadlock:

.. kernel-render:: DOT
   :alt: Indefinite Fencing Dependency Cycle
   :caption: Indefinite Fencing Dependency Cycle

   digraph "Fencing Cycle" {
      node [shape=box bgcolor=grey style=filled]
      kernel [label="Kernel DMA Fences"]
      userspace [label="userspace controlled fences"]
      kernel -> userspace [label="memory management"]
      userspace -> kernel [label="Future fence, fence proxy, ..."]

      { rank=same; kernel userspace }
   }

This means that the kernel might accidentally create deadlocks through memory
management dependencies which userspace is unaware of, randomly hanging
workloads until the timeout kicks in, workloads which, from userspace's
perspective, do not contain a deadlock. In such a mixed fencing architecture
there is no single entity with knowledge of all dependencies, therefore
preventing such deadlocks from within the kernel is not possible.

The only solution to avoid dependency loops is by not allowing indefinite
fences in the kernel. This means:

* No future fences, proxy fences or userspace fences imported as DMA fences,
  with or without a timeout.

* No DMA fences that signal end of batchbuffer for command submission where
  userspace is allowed to use userspace fencing or long-running compute
  workloads. This also means no implicit fencing for shared buffers in these
  cases.

Recoverable Hardware Page Faults Implications
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Modern hardware supports recoverable page faults, which has a lot of
implications for DMA fences.

First, a pending page fault obviously holds up the work that's running on the
accelerator, and a memory allocation is usually required to resolve the
fault. But memory allocations are not allowed to gate completion of DMA
fences, which means any workload using recoverable page faults cannot use DMA
fences for synchronization. Synchronization fences controlled by userspace
must be used instead.

On GPUs this poses a problem, because current desktop compositor protocols on
Linux rely on DMA fences, which means without an entirely new userspace stack
built on top of userspace fences, they cannot benefit from recoverable page
faults. Specifically this means implicit synchronization will not be
possible. The exception is when page faults are only used as migration hints
and never to on-demand fill a memory request. For now this means recoverable
page faults on GPUs are limited to pure compute workloads.

Furthermore GPUs usually have shared resources between the 3D rendering and
compute side, like compute units or command submission engines. If both a 3D
job with a DMA fence and a compute workload using recoverable page faults are
pending they could deadlock:

- The 3D workload might need to wait for the compute job to finish and
  release hardware resources first.

- The compute workload might be stuck in a page fault, because the memory
  allocation is waiting for the DMA fence of the 3D workload to complete.

There are a few options to prevent this problem, one of which drivers need to
ensure:

- Compute workloads can always be preempted, even when a page fault is
  pending and not yet repaired. Not all hardware supports this.

- DMA fence workloads and workloads which need page fault handling have
  independent hardware resources to guarantee forward progress. This could be
  achieved e.g. through dedicated engines and minimal compute unit
  reservations for DMA fence workloads.

- The reservation approach could be further refined by only reserving the
  hardware resources for DMA fence workloads when they are in-flight. This
  must cover the time from when the DMA fence is visible to other threads up
  to the moment it is completed.

- As a last resort, if the hardware provides no useful reservation mechanics,
  all workloads must be flushed from the GPU when switching between jobs
  requiring DMA fences and jobs requiring page fault handling: all DMA
  fences must complete before a compute job with page fault handling can be
  inserted into the scheduler queue, and vice versa, before a DMA fence can be
  made visible anywhere in the system, all compute workloads must be preempted
  to guarantee all pending GPU page faults are flushed.

- Only a fairly theoretical option would be to untangle these dependencies
  when allocating memory to repair hardware page faults, either through
  separate memory blocks or runtime tracking of the interdependencies of all
  fences.

Note that workloads that run on independent hardware like copy engines or
other GPUs do not have any impact. This allows us to keep using DMA fences
internally in the kernel even for resolving hardware page faults, e.g. by
using copy engines to clear or copy memory needed to resolve the fault.

In some ways this page fault problem is a special case of the `Indefinite DMA
Fences` discussions: indefinite fences from compute workloads are allowed to
depend on DMA fences, but not the other way around. And not even the page
fault problem is new, because some other CPU thread in userspace might hit a
page fault which holds up a userspace fence - supporting page faults on GPUs
is doable, if rather tricky.

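To make the allowed dependency direction concrete, here is a userspace sketch
under stated assumptions: a hypothetical compute runtime implements its
userspace fence as a shared atomic, and the kernel DMA fence it depends on
has been exported as a sync_file fd, which is pollable.
signal_after_dma_fence() is made up for illustration:

.. code-block:: c

   #include <poll.h>
   #include <stdatomic.h>

   /*
    * Wait for a kernel DMA fence (a pollable sync_file fd) and only
    * then signal the runtime's own userspace fence, modelled here as a
    * shared atomic counter. The indefinite userspace fence depends on
    * the DMA fence; the kernel never waits on the atomic, so the
    * dependency never points the other way.
    */
   static void signal_after_dma_fence(int sync_file_fd, atomic_uint *ufence,
                                      unsigned int value)
   {
           struct pollfd pfd = { .fd = sync_file_fd, .events = POLLIN };
           int ret;

           do {
                   ret = poll(&pfd, 1, -1);
           } while (ret < 0); /* retry on EINTR */

           atomic_store_explicit(ufence, value, memory_order_release);
   }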