.. SPDX-License-Identifier: GPL-2.0
.. Copyright 2021-2023 Collabora Ltd.

========================
Exchanging pixel buffers
========================

As originally designed, the Linux graphics subsystem had extremely limited
support for sharing pixel-buffer allocations between processes, devices, and
subsystems. Modern systems require extensive integration between all three
classes; this document details how applications and kernel subsystems should
approach this sharing for two-dimensional image data.

It is written with reference to the DRM subsystem for GPU and display devices,
V4L2 for media devices, and also to Vulkan, EGL and Wayland for userspace
support. However, any other subsystems should also follow this design and
advice.


Glossary of terms
=================

.. glossary::

    image:
      Conceptually a two-dimensional array of pixels. The pixels may be stored
      in one or more memory buffers. Has width and height in pixels, pixel
      format and modifier (implicit or explicit).

    row:
      A span along a single y-axis value, e.g. from co-ordinates (0,100) to
      (200,100).

    scanline:
      Synonym for row.

    column:
      A span along a single x-axis value, e.g. from co-ordinates (100,0) to
      (100,100).

    memory buffer:
      A piece of memory for storing (parts of) pixel data. Has stride and size
      in bytes and at least one handle in some API. May contain one or more
      planes.

    plane:
      A two-dimensional array of some or all of an image's color and alpha
      channel values.

    pixel:
      A picture element. Has a single color value which is defined by one or
      more color channel values, e.g. R, G and B, or Y, Cb and Cr. May also
      have an alpha value as an additional channel.

    pixel data:
      Bytes or bits that represent some or all of the color/alpha channel values
      of a pixel or an image. The data for one pixel may be spread over several
      planes or memory buffers depending on format and modifier.

    color value:
      A tuple of numbers, representing a color. Each element in the tuple is a
      color channel value.

    color channel:
      One of the dimensions in a color model. For example, the RGB model has
      channels R, G, and B. The alpha channel is sometimes counted as a color
      channel as well.

    pixel format:
      A description of how pixel data represents the pixel's color and alpha
      values.

    modifier:
      A description of how pixel data is laid out in memory buffers.

    alpha:
      A value that denotes the color coverage in a pixel. Sometimes used for
      translucency instead.

    stride:
      A value that denotes the relationship between pixel-location co-ordinates
      and byte-offset values. Typically used as the byte offset between two
      pixels at the start of vertically-consecutive tiling blocks. For linear
      layouts, the byte offset between two vertically-adjacent pixels. For
      non-linear formats the stride must be computed in a consistent way, which
      is usually done as if the layout were linear.

    pitch:
      Synonym for stride.


Formats and modifiers
=====================

Each buffer must have an underlying format. This format describes the color
values provided for each pixel. Although each subsystem has its own format
descriptions (e.g. V4L2 and fbdev), the ``DRM_FORMAT_*`` tokens should be reused
wherever possible, as they are the standard descriptions used for interchange.
These tokens are described in the ``drm_fourcc.h`` file, which is a part of
DRM's uAPI.

Each ``DRM_FORMAT_*`` token describes the translation between a pixel
co-ordinate in an image, and the color values for that pixel contained within
its memory buffers. The number and type of color channels are described:
whether they are RGB or YUV, integer or floating-point, the size of each channel
and their locations within the pixel memory, and the relationship between color
planes.

For example, ``DRM_FORMAT_ARGB8888`` describes a format in which each pixel has
a single 32-bit value in memory. The alpha, red, green, and blue color channels
are available at 8-bit precision per channel, ordered respectively from most to
least significant bits in little-endian storage. ``DRM_FORMAT_*`` is not
affected by either CPU or device endianness; the byte pattern in memory is
always as described in the format definition, which is usually little-endian.
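
As a minimal sketch of the ``DRM_FORMAT_ARGB8888`` layout described above, the
following helpers pack four 8-bit channels into the 32-bit pixel value and store
it byte by byte in little-endian order, so the result does not depend on host
endianness. The function names are hypothetical and are not part of any DRM
API.

.. code-block:: c

    #include <stdint.h>

    /* Pack channels with A in bits 31:24, R in 23:16, G in 15:8, B in 7:0. */
    static uint32_t pack_argb8888(uint8_t a, uint8_t r, uint8_t g, uint8_t b)
    {
        return ((uint32_t)a << 24) | ((uint32_t)r << 16) |
               ((uint32_t)g << 8) | (uint32_t)b;
    }

    /* Store the pixel in little-endian byte order: B, G, R, A in memory. */
    static void store_argb8888(uint8_t *dst, uint32_t px)
    {
        dst[0] = px & 0xff;          /* blue */
        dst[1] = (px >> 8) & 0xff;   /* green */
        dst[2] = (px >> 16) & 0xff;  /* red */
        dst[3] = (px >> 24) & 0xff;  /* alpha */
    }

Writing the bytes individually, rather than storing a native ``uint32_t``, is
one way to keep the in-memory pattern identical on big-endian hosts.
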

As a more complex example, ``DRM_FORMAT_NV12`` describes a format in which luma
and chroma YUV samples are stored in separate planes, where the chroma plane is
stored at half the resolution in both dimensions (i.e. one U/V chroma
sample is stored for each 2x2 pixel grouping).
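
The subsampling relationship can be sketched as follows. The struct and
function names are hypothetical, and rounding odd sizes up is an assumption of
this sketch rather than something the format definition above states.

.. code-block:: c

    #include <stdint.h>

    struct plane_dims {
        uint32_t width;
        uint32_t height;
    };

    /*
     * Number of U/V sample pairs in the chroma plane of an NV12 image:
     * one pair per 2x2 block of pixels. Odd sizes are rounded up so that
     * the last column and row still have a chroma sample (assumption).
     */
    static struct plane_dims nv12_chroma_dims(uint32_t width, uint32_t height)
    {
        struct plane_dims d;

        d.width = (width + 1) / 2;
        d.height = (height + 1) / 2;
        return d;
    }
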

Format modifiers describe a translation mechanism between these per-pixel memory
samples and the actual memory storage for the buffer. The most straightforward
modifier is ``DRM_FORMAT_MOD_LINEAR``, describing a scheme in which each plane
is laid out row-sequentially, from the top-left to the bottom-right corner.
This is considered the baseline interchange format, and is the most convenient
for CPU access.

Modern hardware employs much more sophisticated access mechanisms, typically
making use of tiled access and possibly also compression. For example, the
``DRM_FORMAT_MOD_VIVANTE_TILED`` modifier describes memory storage where pixels
are stored in 4x4 blocks arranged in row-major ordering, i.e. the first tile in
a plane stores pixels (0,0) to (3,3) inclusive, and the second tile in a plane
stores pixels (4,0) to (7,3) inclusive.
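
To illustrate the tile ordering only, the following sketch computes which 4x4
tile contains a given pixel when tiles are counted in row-major order. It does
not model the ordering of pixels inside a tile, and deriving the number of
tiles per row from the image width is an assumption; real hardware may pad
rows differently. All names are hypothetical.

.. code-block:: c

    #include <stdint.h>

    #define SKETCH_TILE_W 4u
    #define SKETCH_TILE_H 4u

    /* Index of the tile containing pixel (x, y), tiles in row-major order. */
    static uint32_t tile_index(uint32_t x, uint32_t y, uint32_t width)
    {
        /* Assumption: each row of tiles covers the width rounded up. */
        uint32_t tiles_per_row = (width + SKETCH_TILE_W - 1) / SKETCH_TILE_W;

        return (y / SKETCH_TILE_H) * tiles_per_row + (x / SKETCH_TILE_W);
    }
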

Some modifiers may modify the number of planes required for an image; for
example, the ``I915_FORMAT_MOD_Y_TILED_CCS`` modifier adds a second plane to RGB
formats in which it stores data about the status of every tile, notably
including whether the tile is fully populated with pixel data, or can be
expanded from a single solid color.

These extended layouts are highly vendor-specific, and even specific to
particular generations or configurations of devices per-vendor. For this reason,
support of modifiers must be explicitly enumerated and negotiated by all users
in order to ensure a compatible and optimal pipeline, as discussed below.


Dimensions and size
===================

Each pixel buffer must be accompanied by logical pixel dimensions. This refers
to the number of unique samples which can be extracted from, or stored to, the
underlying memory storage. For example, even though a 1920x1080
``DRM_FORMAT_NV12`` buffer has a luma plane containing 1920x1080 samples for the
Y component, and a chroma plane containing 960x540 samples for the U and V
components, the overall buffer is still described as having dimensions of
1920x1080.

The in-memory storage of a buffer is not guaranteed to begin immediately at the
base address of the underlying memory, nor is it guaranteed that the memory
storage is tightly clipped to either dimension.

Each plane must therefore be described with an ``offset`` in bytes, which will be
added to the base address of the memory storage before performing any per-pixel
calculations. This may be used to combine multiple planes into a single memory
buffer; for example, ``DRM_FORMAT_NV12`` may be stored in a single memory buffer
where the luma plane's storage begins immediately at the start of the buffer
with an offset of 0, and the chroma plane's storage follows within the same buffer
beginning from the byte offset for that plane.

Each plane must also have a ``stride`` in bytes, expressing the offset in memory
between two contiguous rows. For example, a ``DRM_FORMAT_MOD_LINEAR`` buffer
with dimensions of 1000x1000 may have been allocated as if it were 1024x1000, in
order to allow for aligned access patterns. In this case, the buffer will still
be described with a width of 1000; however, the stride will be ``1024 * bpp``
(where ``bpp`` is the number of bytes per pixel), indicating that there are 24
pixels at the positive extreme of the x axis whose values are not significant.
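
For a linear layout, the offset and stride rules above reduce to simple
arithmetic. The sketch below assumes a single-plane format with a whole number
of bytes per pixel (``cpp``); the helper names are hypothetical.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Byte position of pixel (x, y) in the memory buffer of a linear plane. */
    static size_t linear_pixel_offset(size_t plane_offset, size_t stride,
                                      uint32_t x, uint32_t y, uint32_t cpp)
    {
        return plane_offset + (size_t)y * stride + (size_t)x * cpp;
    }

    /* Pixels at the end of each row whose values are not significant. */
    static uint32_t row_padding_pixels(size_t stride, uint32_t width,
                                       uint32_t cpp)
    {
        return (uint32_t)(stride / cpp) - width;
    }

With a 32-bit format (``cpp`` of 4), a width of 1000 and a stride of
``1024 * 4`` bytes, ``row_padding_pixels()`` yields the 24 padding pixels from
the example above.
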

Buffers may also be padded further in the y dimension, simply by allocating a
larger area than would ordinarily be required. For example, many media decoders
are not able to natively output buffers of height 1080, but instead require an
effective height of 1088 pixels. In this case, the buffer continues to be
described as having a height of 1080, with the memory allocation for each buffer
being increased to account for the extra padding.
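
Putting offset, stride, and height padding together, the following sketch
computes one possible single-buffer ``DRM_FORMAT_NV12`` layout. The alignment
parameters, the decision to place the chroma plane directly after the padded
luma rows, and all names are assumptions made for illustration; a real
allocator is free to choose differently. The sketch also assumes an even
width and an even ``height_align``.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    struct nv12_layout {
        size_t luma_offset;
        size_t luma_stride;
        size_t chroma_offset;
        size_t chroma_stride;
        size_t size;            /* total bytes needed for both planes */
    };

    /* Round v up to the next multiple of a; a must be non-zero. */
    static size_t align_up(size_t v, size_t a)
    {
        return (v + a - 1) / a * a;
    }

    static struct nv12_layout nv12_single_buffer(uint32_t width,
                                                 uint32_t height,
                                                 size_t stride_align,
                                                 size_t height_align)
    {
        struct nv12_layout l;
        size_t rows = align_up(height, height_align);

        l.luma_offset = 0;
        /* One byte per Y sample. */
        l.luma_stride = align_up(width, stride_align);
        /* Chroma follows the padded luma rows within the same buffer. */
        l.chroma_offset = l.luma_offset + l.luma_stride * rows;
        /* width / 2 interleaved U/V pairs of two bytes each. */
        l.chroma_stride = l.luma_stride;
        l.size = l.chroma_offset + l.chroma_stride * (rows / 2);
        return l;
    }

For a 1920x1080 image with rows padded to a multiple of 16, the image is still
described as 1920x1080 while the layout accounts for 1088 luma rows.
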


Enumeration
===========

Every user of pixel buffers must be able to enumerate a set of supported formats
and modifiers, described together. Within KMS, this is achieved with the
``IN_FORMATS`` property on each DRM plane, listing the supported DRM formats, and
the modifiers supported for each format. In userspace, this is supported through
the `EGL_EXT_image_dma_buf_import_modifiers`_ extension entrypoints for EGL, the
`VK_EXT_image_drm_format_modifier`_ extension for Vulkan, and the
`zwp_linux_dmabuf_v1`_ extension for Wayland.

Each of these interfaces allows users to query a set of supported
format+modifier combinations.


Negotiation
===========

It is the responsibility of userspace to negotiate an acceptable format+modifier
combination for its usage. This is performed through a simple intersection of
lists. For example, if a user wants to use Vulkan to render an image to be
displayed on a KMS plane, it must:

 - query KMS for the ``IN_FORMATS`` property for the given plane
 - query Vulkan for the supported formats for its physical device, making sure
   to pass the ``VkImageUsageFlagBits`` and ``VkImageCreateFlagBits``
   corresponding to the intended rendering use
 - intersect these formats to determine the most appropriate one
 - for this format, intersect the lists of supported modifiers for both KMS and
   Vulkan, to obtain a final list of acceptable modifiers for that format

This intersection must be performed for all usages. For example, if the user
also wishes to encode the image to a video stream, it must query the media API
it intends to use for encoding for the set of modifiers it supports, and
additionally intersect against this list.

If the intersection of all lists is an empty list, it is not possible to share
buffers in this way, and an alternate strategy must be considered (e.g. using
CPU access routines to copy data between the different uses, with the
corresponding performance cost).

The resulting modifier list is unsorted; the order is not significant.
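
The intersection step can be sketched as a plain loop over two modifier
arrays, as might be obtained from two of the enumeration interfaces above. The
function name is hypothetical, and the quadratic search is chosen only for
brevity. To intersect against a third user, the output is fed back in as one of
the inputs.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /*
     * Copy every modifier of a that also appears in b into out, which must
     * have room for na entries. Returns the number of entries written; a
     * return value of 0 means the buffers cannot be shared this way.
     */
    static size_t intersect_modifiers(const uint64_t *a, size_t na,
                                      const uint64_t *b, size_t nb,
                                      uint64_t *out)
    {
        size_t n = 0;

        for (size_t i = 0; i < na; i++) {
            for (size_t j = 0; j < nb; j++) {
                if (a[i] == b[j]) {
                    out[n++] = a[i];
                    break;
                }
            }
        }
        return n;
    }
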


Allocation
==========

Once userspace has determined an appropriate format, and corresponding list of
acceptable modifiers, it must allocate the buffer. As there is no universal
buffer-allocation interface available at either kernel or userspace level, the
client makes an arbitrary choice of allocation interface such as Vulkan, GBM, or
a media API.

Each allocation request must take, at a minimum: the pixel format, a list of
acceptable modifiers, and the buffer's width and height. Each API may extend
this set of properties in different ways, such as allowing allocation in more
than two dimensions, intended usage patterns, etc.

The component which allocates the buffer will make an arbitrary choice of what
it considers the 'best' modifier within the acceptable list for the requested
allocation, any padding required, and further properties of the underlying
memory buffers such as whether they are stored in system or device-specific
memory, whether or not they are physically contiguous, and their cache mode.
These properties of the memory buffer are not visible to userspace; however, the
``dma-heaps`` API is an effort to address this.

After allocation, the client must query the allocator to determine the actual
modifier selected for the buffer, as well as the per-plane offset and stride.
Allocators are not permitted to vary the format in use, to select a modifier not
provided within the acceptable list, nor to vary the pixel dimensions other than
the padding expressed through offset, stride, and size.
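
A client could check the allocator's answer against these rules along the
following lines. The two structs are hypothetical stand-ins for whatever a
particular allocation API provides, and the sketch only covers the case where
a modifier list was supplied.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    struct alloc_request {
        uint32_t format;
        uint32_t width;
        uint32_t height;
        const uint64_t *modifiers;  /* acceptable modifiers */
        size_t num_modifiers;
    };

    struct alloc_result {
        uint32_t format;
        uint32_t width;
        uint32_t height;
        uint64_t modifier;          /* modifier chosen by the allocator */
    };

    /* True if format and dimensions are unchanged and the modifier was offered. */
    static bool alloc_result_honours_request(const struct alloc_request *req,
                                             const struct alloc_result *res)
    {
        if (res->format != req->format ||
            res->width != req->width || res->height != req->height)
            return false;

        for (size_t i = 0; i < req->num_modifiers; i++) {
            if (req->modifiers[i] == res->modifier)
                return true;
        }
        return false;
    }
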

Communicating additional constraints, such as alignment of stride or offset,
placement within a particular memory area, etc., is out of scope of dma-buf,
and is not solved by format and modifier tokens.


Import
======

To use a buffer within a different context, device, or subsystem, the user
passes these parameters (format, modifier, width, height, and per-plane offset
and stride) to an importing API.

Each memory buffer is referred to by a buffer handle, which may be unique or
duplicated within an image. For example, a ``DRM_FORMAT_NV12`` buffer may have
the luma and chroma buffers combined into a single memory buffer by use of the
per-plane offset parameters, or they may be completely separate allocations in
memory. For this reason, each import and allocation API must provide a separate
handle for each plane.

Each kernel subsystem has its own types and interfaces for buffer management.
DRM uses GEM buffer objects (BOs), V4L2 has its own references, etc. These types
are not portable between contexts, processes, devices, or subsystems.

To address this, ``dma-buf`` handles are used as the universal interchange for
buffers. Subsystem-specific operations are used to export native buffer handles
to a ``dma-buf`` file descriptor, and to import those file descriptors into a
native buffer handle. dma-buf file descriptors can be transferred between
contexts, processes, devices, and subsystems.

For example, a Wayland media player may use V4L2 to decode a video frame into a
``DRM_FORMAT_NV12`` buffer. This will result in two memory planes (luma and
chroma) being dequeued by the user from V4L2. These planes are then exported to
one dma-buf file descriptor per plane, and these descriptors are then sent along
with the metadata (format, modifier, width, height, per-plane offset and stride)
to the Wayland server. The Wayland server will then import these file
descriptors as an EGLImage for use through EGL/OpenGL (ES), a VkImage for use
through Vulkan, or a KMS framebuffer object; each of these import operations
will take the same metadata and convert the dma-buf file descriptors into their
native buffer handles.
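
The metadata that travels with the file descriptors could be modelled as
below. The struct layout and names are hypothetical and do not correspond to
any particular API; the limit of four planes is an assumption borrowed from
the four plane slots of ``DRM_IOCTL_MODE_ADDFB2``. The helper shows that one
image may be backed by one memory buffer or by several.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    #define SKETCH_MAX_PLANES 4

    /* Same packing as the fourcc codes in drm_fourcc.h. */
    #define SKETCH_FOURCC(a, b, c, d) \
        ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
         ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

    struct plane_desc {
        int fd;             /* dma-buf file descriptor for this plane */
        uint32_t offset;    /* bytes from the start of the dma-buf */
        uint32_t stride;    /* bytes between consecutive rows */
    };

    struct image_desc {
        uint32_t format;    /* DRM_FORMAT_* token */
        uint64_t modifier;  /* DRM_FORMAT_MOD_* token */
        uint32_t width;
        uint32_t height;
        unsigned int num_planes;
        struct plane_desc planes[SKETCH_MAX_PLANES];
    };

    /*
     * Count memory buffers, treating equal descriptor numbers as the same
     * buffer. Distinct descriptors may still refer to one dma-buf, which
     * this sketch does not detect.
     */
    static unsigned int count_memory_buffers(const struct image_desc *img)
    {
        unsigned int count = 0;

        for (unsigned int i = 0; i < img->num_planes; i++) {
            bool seen = false;

            for (unsigned int j = 0; j < i; j++) {
                if (img->planes[j].fd == img->planes[i].fd)
                    seen = true;
            }
            if (!seen)
                count++;
        }
        return count;
    }
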

Having a non-empty intersection of supported modifiers does not guarantee that
import will succeed into all consumers; they may have constraints beyond those
implied by modifiers which must be satisfied.


Implicit modifiers
==================

The concept of modifiers post-dates all of the subsystems mentioned above. As
such, it has been retrofitted into all of these APIs, and in order to ensure
backwards compatibility, support is needed for drivers and userspace which do
not (yet) support modifiers.

As an example, GBM is used to allocate buffers to be shared between EGL for
rendering and KMS for display. It has two entrypoints for allocating buffers:
``gbm_bo_create`` which only takes the format, width, height, and a usage token,
and ``gbm_bo_create_with_modifiers`` which extends this with a list of modifiers.

In the latter case, the allocation is as discussed above, being provided with a
list of acceptable modifiers that the implementation can choose from (or fail if
it is not possible to allocate within those constraints). In the former case
where modifiers are not provided, the GBM implementation must make its own
choice as to what is likely to be the 'best' layout. Such a choice is entirely
implementation-specific: some will internally use tiled layouts which are not
CPU-accessible if the implementation decides that is a good idea through
whatever heuristic. It is the implementation's responsibility to ensure that
this choice is appropriate.

To support this case where the layout is not known because there is no awareness
of modifiers, a special ``DRM_FORMAT_MOD_INVALID`` token has been defined. This
pseudo-modifier declares that the layout is not known, and that the driver
should use its own logic to determine what the underlying layout may be.

.. note::

  ``DRM_FORMAT_MOD_INVALID`` is a non-zero value. The modifier value zero is
  ``DRM_FORMAT_MOD_LINEAR``, which is an explicit guarantee that the image
  has the linear layout. Care and attention should be taken to ensure that
  zero as a default value is not mixed up with either no modifier or the linear
  modifier. Also note that in some APIs the invalid modifier value is specified
  with an out-of-band flag, like in ``DRM_IOCTL_MODE_ADDFB2``.
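
The pitfall in the note can be sketched as follows. The two constants carry
the values that ``drm_fourcc.h`` gives ``DRM_FORMAT_MOD_LINEAR`` and
``DRM_FORMAT_MOD_INVALID``, redefined under hypothetical names so that the
sketch is self-contained; the struct and helpers are likewise hypothetical.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Same values as DRM_FORMAT_MOD_LINEAR and DRM_FORMAT_MOD_INVALID. */
    #define SKETCH_MOD_LINEAR  UINT64_C(0)
    #define SKETCH_MOD_INVALID UINT64_C(0x00ffffffffffffff)

    struct buffer_info {
        uint64_t modifier;
    };

    /*
     * Start from "layout unknown". Relying on zero-initialisation instead
     * would silently claim a linear layout.
     */
    static void buffer_info_init(struct buffer_info *info)
    {
        info->modifier = SKETCH_MOD_INVALID;
    }

    /* True if the modifier makes an explicit statement about the layout. */
    static bool modifier_is_explicit(uint64_t modifier)
    {
        return modifier != SKETCH_MOD_INVALID;
    }
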

There are four cases where this token may be used:

  - during enumeration, an interface may return ``DRM_FORMAT_MOD_INVALID``, either
    as the sole member of a modifier list to declare that explicit modifiers are
    not supported, or as part of a larger list to declare that implicit modifiers
    may be used
  - during allocation, a user may supply ``DRM_FORMAT_MOD_INVALID``, either as the
    sole member of a modifier list (equivalent to not supplying a modifier list
    at all) to declare that explicit modifiers are not supported and must not be
    used, or as part of a larger list to declare that an allocation using implicit
    modifiers is acceptable
  - in a post-allocation query, an implementation may return
    ``DRM_FORMAT_MOD_INVALID`` as the modifier of the allocated buffer to declare
    that the underlying layout is implementation-defined and that an explicit
    modifier description is not available; per the above rules, this may only be
    returned when the user has included ``DRM_FORMAT_MOD_INVALID`` as part of the
    list of acceptable modifiers, or not provided a list
  - when importing a buffer, the user may supply ``DRM_FORMAT_MOD_INVALID`` as the
    buffer modifier (or not supply a modifier) to indicate that the modifier is
    unknown for whatever reason; this is only acceptable when the buffer has
    not been allocated with an explicit modifier

It follows from this that for any single buffer, the complete chain of operations
formed by the producer and all the consumers must be either fully implicit or fully
explicit. For example, if a user wishes to allocate a buffer for use between
GPU, display, and media, but the media API does not support modifiers, then the
user **must not** allocate the buffer with explicit modifiers and attempt to
import the buffer into the media API with no modifier, but either perform the
allocation using implicit modifiers, or allocate the buffer for media use
separately and copy between the two buffers.

As one exception to the above, allocations may be 'upgraded' from implicit
to explicit modifiers. For example, if the buffer is allocated with
``gbm_bo_create`` (taking no modifiers), the user may then query the modifier with
``gbm_bo_get_modifier`` and then use this modifier as an explicit modifier token
if a valid modifier is returned.
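
One possible reading of the import rules, including the upgrade exception, is
sketched below. This is an interpretation for illustration rather than a
normative check. The function and parameter names are hypothetical, and the
constant repeats the value of ``DRM_FORMAT_MOD_INVALID`` under a hypothetical
name.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Same value as DRM_FORMAT_MOD_INVALID in drm_fourcc.h. */
    #define SKETCH_MOD_INVALID UINT64_C(0x00ffffffffffffff)

    /*
     * allocated_explicit: the buffer was allocated with explicit modifiers.
     * queried_mod: modifier reported by the allocator after allocation.
     * import_mod: modifier the user intends to pass to the importing API.
     */
    static bool import_modifier_allowed(bool allocated_explicit,
                                        uint64_t queried_mod,
                                        uint64_t import_mod)
    {
        /* An implicit import is only valid for an implicit allocation. */
        if (import_mod == SKETCH_MOD_INVALID)
            return !allocated_explicit;

        /*
         * An explicit import must use the modifier the allocator reported.
         * This also covers the upgrade of an implicit allocation whose
         * query returned a valid modifier.
         */
        return queried_mod == import_mod;
    }
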

When allocating buffers for exchange between different users where modifiers are
not available, implementations are strongly encouraged to use
``DRM_FORMAT_MOD_LINEAR`` for their allocation, as this is the universal baseline
for exchange. However, it is not guaranteed that this will result in the correct
interpretation of buffer content, as implicit modifier operation may still be
subject to driver-specific heuristics.

Any new users (userspace programs and protocols, kernel subsystems, etc.)
wishing to exchange buffers must offer interoperability through dma-buf file
descriptors for memory planes, DRM format tokens to describe the format, DRM
format modifiers to describe the layout in memory, at least width and height for
dimensions, and at least offset and stride for each memory plane.

.. _zwp_linux_dmabuf_v1: https://gitlab.freedesktop.org/wayland/wayland-protocols/-/blob/main/unstable/linux-dmabuf/linux-dmabuf-unstable-v1.xml
.. _VK_EXT_image_drm_format_modifier: https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VK_EXT_image_drm_format_modifier.html
.. _EGL_EXT_image_dma_buf_import_modifiers: https://registry.khronos.org/EGL/extensions/EXT/EGL_EXT_image_dma_buf_import_modifiers.txt
