.. SPDX-License-Identifier: GPL-2.0
.. c:namespace:: V4L

.. _stateless_decoder:

**************************************************
Memory-to-memory Stateless Video Decoder Interface
**************************************************

A stateless decoder is a decoder that works without retaining any kind of state
between processed frames. This means that each frame is decoded independently
of any previous and future frames, and that the client is responsible for
maintaining the decoding state and providing it to the decoder with each
decoding request. This is in contrast to the stateful video decoder interface,
where the hardware and driver maintain the decoding state and all the client
has to do is to provide the raw encoded stream and dequeue decoded frames in
display order.

This section describes how user-space ("the client") is expected to communicate
with stateless decoders in order to successfully decode an encoded stream.
Compared to stateful codecs, the decoder/client sequence is simpler, but the
cost of this simplicity is extra complexity in the client which is responsible
for maintaining a consistent decoding state.

Stateless decoders make use of the :ref:`media-request-api`. A stateless
decoder must expose the ``V4L2_BUF_CAP_SUPPORTS_REQUESTS`` capability on its
``OUTPUT`` queue when :c:func:`VIDIOC_REQBUFS` or :c:func:`VIDIOC_CREATE_BUFS`
are invoked.

Depending on the encoded formats supported by the decoder, a single decoded
frame may be the result of several decode requests (for instance, H.264 streams
with multiple slices per frame). Decoders that support such formats must also
expose the ``V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF`` capability on their
``OUTPUT`` queue.

Querying capabilities
=====================

1. To enumerate the set of coded formats supported by the decoder, the client
   calls :c:func:`VIDIOC_ENUM_FMT` on the ``OUTPUT`` queue.

   * The driver must always return the full set of supported ``OUTPUT`` formats,
     irrespective of the format currently set on the ``CAPTURE`` queue.

   * Simultaneously, the driver must restrict the set of values returned by
     codec-specific capability controls (such as H.264 profiles) to the set
     actually supported by the hardware.

2. To enumerate the set of supported raw formats, the client calls
   :c:func:`VIDIOC_ENUM_FMT` on the ``CAPTURE`` queue.

   * The driver must return only the formats supported for the format currently
     active on the ``OUTPUT`` queue.

   * Depending on the currently set ``OUTPUT`` format, the set of supported raw
     formats may depend on the value of some codec-dependent controls.
     The client is responsible for making sure that these controls are set
     before querying the ``CAPTURE`` queue. Failure to do so will result in the
     default values for these controls being used, and a returned set of formats
     that may not be usable for the media the client is trying to decode.

3. The client may use :c:func:`VIDIOC_ENUM_FRAMESIZES` to detect supported
   resolutions for a given format, passing the desired pixel format in
   :c:type:`v4l2_frmsizeenum`'s ``pixel_format``.

4. Supported profiles and levels for the current ``OUTPUT`` format, if
   applicable, may be queried using their respective controls via
   :c:func:`VIDIOC_QUERYCTRL`.

Initialization
==============

1. Set the coded format on the ``OUTPUT`` queue via :c:func:`VIDIOC_S_FMT`.

   * **Required fields:**

     ``type``
         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``.

     ``pixelformat``
         a coded pixel format.

     ``width``, ``height``
         coded width and height parsed from the stream.

     other fields
         follow standard semantics.

   .. note::

      Changing the ``OUTPUT`` format may change the currently set ``CAPTURE``
      format. The driver will derive a new ``CAPTURE`` format from the
      ``OUTPUT`` format being set, including resolution, colorimetry
      parameters, etc. If the client needs a specific ``CAPTURE`` format,
      it must adjust it afterwards.

2. Call :c:func:`VIDIOC_S_EXT_CTRLS` to set all the controls (parsed headers,
   etc.) required by the ``OUTPUT`` format to enumerate the ``CAPTURE`` formats.

3. Call :c:func:`VIDIOC_G_FMT` on the ``CAPTURE`` queue to get the format for
   the destination buffers parsed/decoded from the bytestream.

   * **Required fields:**

     ``type``
         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``.

   * **Returned fields:**

     ``width``, ``height``
         frame buffer resolution for the decoded frames.

     ``pixelformat``
         pixel format for decoded frames.

     ``num_planes`` (for _MPLANE ``type`` only)
         number of planes for pixelformat.

     ``sizeimage``, ``bytesperline``
         as per standard semantics; matching frame buffer format.

   .. note::

      The value of ``pixelformat`` may be any pixel format supported for the
      ``OUTPUT`` format, based on the hardware capabilities. It is suggested
      that the driver chooses the preferred/optimal format for the current
      configuration. For example, a YUV format may be preferred over an RGB
      format, if an additional conversion step would be required for RGB.

4. *[optional]* Enumerate ``CAPTURE`` formats via :c:func:`VIDIOC_ENUM_FMT` on
   the ``CAPTURE`` queue. The client may use this ioctl to discover which
   alternative raw formats are supported for the current ``OUTPUT`` format and
   select one of them via :c:func:`VIDIOC_S_FMT`.

   .. note::

      The driver will return only formats supported for the currently selected
      ``OUTPUT`` format and currently set controls, even if more formats may be
      supported by the decoder in general.

      For example, a decoder may support YUV and RGB formats for
      resolutions 1920x1088 and lower, but only YUV for higher resolutions (due
      to hardware limitations). After setting a resolution of 1920x1088 or lower
      as the ``OUTPUT`` format, :c:func:`VIDIOC_ENUM_FMT` may return a set of
      YUV and RGB pixel formats, but after setting a resolution higher than
      1920x1088, the driver will not return RGB pixel formats, since they are
      unsupported for this resolution.

5. *[optional]* Choose a ``CAPTURE`` format other than the suggested one via
   :c:func:`VIDIOC_S_FMT` on the ``CAPTURE`` queue. It is possible for the
   client to choose a different format than selected/suggested by the driver in
   :c:func:`VIDIOC_G_FMT`.

    * **Required fields:**

      ``type``
          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``.

      ``pixelformat``
          a raw pixel format.

      ``width``, ``height``
          frame buffer resolution of the decoded stream; typically unchanged
          from what was returned with :c:func:`VIDIOC_G_FMT`, but it may be
          different if the hardware supports composition and/or scaling.

   After performing this step, the client must perform step 3 again in order
   to obtain up-to-date information about the buffer size and layout.

6. Allocate source (bytestream) buffers via :c:func:`VIDIOC_REQBUFS` on the
   ``OUTPUT`` queue.

    * **Required fields:**

      ``count``
          requested number of buffers to allocate; greater than zero.

      ``type``
          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``.

      ``memory``
          follows standard semantics.

    * **Returned fields:**

      ``count``
          actual number of buffers allocated.

    * If required, the driver will raise ``count`` so that it is at least the
      minimum number of ``OUTPUT`` buffers required for the given format. The
      client must check this value after the ioctl returns to get the actual
      number of buffers allocated.

7. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS` on the
   ``CAPTURE`` queue.

    * **Required fields:**

      ``count``
          requested number of buffers to allocate; greater than zero. The client
          is responsible for deducing the minimum number of buffers required
          for the stream to be properly decoded (taking e.g. reference frames
          into account) and for passing an equal or greater number.

      ``type``
          a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``.

      ``memory``
          follows standard semantics. ``V4L2_MEMORY_USERPTR`` is not supported
          for ``CAPTURE`` buffers.

    * **Returned fields:**

      ``count``
          adjusted to allocated number of buffers, in case the codec requires
          more buffers than requested.

    * If the requested count is lower than the minimum number of ``CAPTURE``
      buffers required for the current format and stream configuration, the
      driver must raise ``count`` to that minimum. The client must check this
      value after the ioctl returns to get the number of buffers allocated.

8. Allocate requests (likely one per ``OUTPUT`` buffer) via
    :c:func:`MEDIA_IOC_REQUEST_ALLOC` on the media device.

9. Start streaming on both ``OUTPUT`` and ``CAPTURE`` queues via
    :c:func:`VIDIOC_STREAMON`.

Decoding
========

For each frame, the client is responsible for submitting at least one request to
which the following is attached:

* The amount of encoded data expected by the codec for its current
  configuration, as a buffer submitted to the ``OUTPUT`` queue. Typically, this
  corresponds to one frame worth of encoded data, but some formats may allow (or
  require) different amounts per unit.
* All the metadata needed to decode the submitted encoded data, in the form of
  controls relevant to the format being decoded.

The amount of data and contents of the source ``OUTPUT`` buffer, as well as the
controls that must be set on the request, depend on the active coded pixel
format and might be affected by codec-specific extended controls, as stated in
documentation of each format.

If there is a possibility that the decoded frame will require one or more
decode requests after the current one in order to be produced, then the client
must set the ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag on the ``OUTPUT``
buffer. This will result in the (potentially partially) decoded ``CAPTURE``
buffer not being made available for dequeueing, and reused for the next decode
request if the timestamp of the next ``OUTPUT`` buffer has not changed.

A typical frame would thus be decoded using the following sequence:

1. Queue an ``OUTPUT`` buffer containing one unit of encoded bytestream data for
   the decoding request, using :c:func:`VIDIOC_QBUF`.

    * **Required fields:**

      ``index``
          index of the buffer being queued.

      ``type``
          type of the buffer.

      ``bytesused``
          number of bytes taken by the encoded data frame in the buffer.

      ``flags``
          the ``V4L2_BUF_FLAG_REQUEST_FD`` flag must be set. Additionally, if
          the client is not sure that the current decode request is the last
          one needed to produce a fully decoded frame, then
          ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` must also be set.

      ``request_fd``
          must be set to the file descriptor of the decoding request.

      ``timestamp``
          must be set to a unique value per frame. This value will be propagated
          into the decoded frame's buffer and can also be used to designate this
          frame as a reference for another. If using multiple decode requests
          per frame, then the timestamps of all the ``OUTPUT`` buffers for a
          given frame must be identical. If the timestamp changes, then the
          currently held ``CAPTURE`` buffer will be made available for dequeuing
          and the current request will work on a new ``CAPTURE`` buffer.

2. Set the codec-specific controls for the decoding request, using
   :c:func:`VIDIOC_S_EXT_CTRLS`.

    * **Required fields:**

      ``which``
          must be ``V4L2_CTRL_WHICH_REQUEST_VAL``.

      ``request_fd``
          must be set to the file descriptor of the decoding request.

      other fields
          other fields are set as usual when setting controls. The ``controls``
          array must contain all the codec-specific controls required to decode
          a frame.

   .. note::

      It is possible to specify the controls in different invocations of
      :c:func:`VIDIOC_S_EXT_CTRLS`, or to overwrite a previously set control, as
      long as ``request_fd`` and ``which`` are properly set. The controls state
      at the moment of request submission is the one that will be considered.

   .. note::

      The order in which steps 1 and 2 take place is interchangeable.

3. Submit the request by invoking :c:func:`MEDIA_REQUEST_IOC_QUEUE` on the
   request FD.

    If the request is submitted without an ``OUTPUT`` buffer, or if some of the
    required controls are missing from the request, then
    :c:func:`MEDIA_REQUEST_IOC_QUEUE` will return ``-ENOENT``. If more than one
    ``OUTPUT`` buffer is queued, then it will return ``-EINVAL``.
    :c:func:`MEDIA_REQUEST_IOC_QUEUE` returning non-zero means that no
    ``CAPTURE`` buffer will be produced for this request.

``CAPTURE`` buffers must not be part of the request, and are queued
independently. They are returned in decode order (i.e. the same order as coded
frames were submitted to the ``OUTPUT`` queue).

Runtime decoding errors are signaled by the dequeued ``CAPTURE`` buffers
carrying the ``V4L2_BUF_FLAG_ERROR`` flag. If a decoded reference frame has an
error, then all following decoded frames that refer to it also have the
``V4L2_BUF_FLAG_ERROR`` flag set, although the decoder will still try to
produce (likely corrupted) frames.

Buffer management while decoding
================================

Contrary to stateful decoders, a stateless decoder does not perform any kind of
buffer management: it only guarantees that dequeued ``CAPTURE`` buffers can be
used by the client for as long as they are not queued again. "Used" here
encompasses using the buffer for compositing or display.

A dequeued capture buffer can also be used as the reference frame of another
buffer.

A frame is specified as reference by converting its timestamp into nanoseconds,
and storing it into the relevant member of a codec-dependent control structure.
The :c:func:`v4l2_timeval_to_ns` function must be used to perform that
conversion. The timestamp of a frame can be used to reference it as soon as all
its units of encoded data are successfully submitted to the ``OUTPUT`` queue.

A decoded buffer containing a reference frame must not be reused as a decoding
target until all the frames referencing it have been decoded. The safest way to
achieve this is to refrain from queueing a reference buffer until all the
decoded frames referencing it have been dequeued. However, if the driver can
guarantee that buffers queued to the ``CAPTURE`` queue are processed in queued
order, then user-space can take advantage of this guarantee and queue a
reference buffer when the following conditions are met:

1. All the requests for frames affected by the reference frame have been
   queued, and

2. A sufficient number of ``CAPTURE`` buffers to cover all the decoded
   referencing frames have been queued.

When a decoding request is queued, the driver increases the reference count of
all the resources associated with reference frames. This means that the client
can e.g. close the DMABUF file descriptors of reference frame buffers if it
won't need them afterwards.

Seeking
=======

In order to seek, the client just needs to submit requests using input buffers
corresponding to the new stream position. It must however be aware that
resolution may have changed and follow the dynamic resolution change sequence in
that case. Also, depending on the codec used, picture parameters (e.g. SPS/PPS
for H.264) may have changed and the client is responsible for making sure that a
valid state is sent to the decoder.

The client is then free to ignore any returned ``CAPTURE`` buffer that comes
from the pre-seek position.

Pausing
=======

In order to pause, the client can just cease queuing buffers onto the ``OUTPUT``
queue. Without source bytestream data, there is no data to process and the codec
will remain idle.

Dynamic resolution change
=========================

If the client detects a resolution change in the stream, it will need to perform
the initialization sequence again with the new resolution:

1. If the last submitted request resulted in a ``CAPTURE`` buffer being
   held by the use of the ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag, then the
   last frame is not available on the ``CAPTURE`` queue. In this case, a
   ``V4L2_DEC_CMD_FLUSH`` command shall be sent. This will make the driver
   dequeue the held ``CAPTURE`` buffer.

2. Wait until all submitted requests have completed and dequeue the
   corresponding output buffers.

3. Call :c:func:`VIDIOC_STREAMOFF` on both the ``OUTPUT`` and ``CAPTURE``
   queues.

4. Free all ``CAPTURE`` buffers by calling :c:func:`VIDIOC_REQBUFS` on the
   ``CAPTURE`` queue with a buffer count of zero.

5. Perform the initialization sequence again (minus the allocation of
   ``OUTPUT`` buffers), with the new resolution set on the ``OUTPUT`` queue.
   Note that due to resolution constraints, a different format may need to be
   picked on the ``CAPTURE`` queue.

Drain
=====

If the last submitted request resulted in a ``CAPTURE`` buffer being
held by the use of the ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag, then the
last frame is not available on the ``CAPTURE`` queue. In this case, a
``V4L2_DEC_CMD_FLUSH`` command shall be sent. This will make the driver
dequeue the held ``CAPTURE`` buffer.

After that, in order to drain the stream on a stateless decoder, the client
just needs to wait until all the submitted requests are completed.