=========================
ALSA Compress-Offload API
=========================

Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>

Vinod Koul <vinod.koul@linux.intel.com>


Overview
========
Since its early days, the ALSA API has been defined with PCM support or
constant-bitrate payloads such as IEC61937 in mind. Arguments and
return values expressed in frames are the norm, which makes it a
challenge to extend the existing API to compressed data streams.

In recent years, audio digital signal processors (DSPs) have been
integrated in system-on-chip designs, and DSPs are also integrated in
audio codecs. Processing compressed data on such DSPs results in a
dramatic reduction of power consumption compared to host-based
processing. Support for such hardware has not been very good in Linux,
mostly because of the lack of a generic API available in the mainline
kernel.

Rather than requiring a compatibility break with an API change of the
ALSA PCM interface, a new 'Compressed Data' API is introduced to
provide a control and data-streaming interface for audio DSPs.

The design of this API was inspired by the 2-year experience with the
Intel Moorestown SOC, with many corrections required to upstream the
API in the mainline kernel instead of the staging tree and make it
usable by others.


Requirements
============
The main requirements are:

- Separation between byte counts and time. Compressed formats may have
  a header per file, per frame, or no header at all. The payload size
  may vary from frame to frame. As a result, it is not possible to
  reliably estimate the duration of audio buffers when handling
  compressed data. Dedicated mechanisms are required to allow for
  reliable audio-video synchronization, which requires precise
  reporting of the number of samples rendered at any given time.

- Handling of multiple formats. PCM data only requires a specification
  of the sampling rate, number of channels and bits per sample. In
  contrast, compressed data comes in a variety of formats. Audio DSPs
  may also provide support for a limited number of audio encoders and
  decoders embedded in firmware, or may support more choices through
  dynamic download of libraries.

- Focus on main formats. This API provides support for the most
  popular formats used for audio and video capture and playback. It is
  likely that as audio compression technology advances, new formats
  will be added.

- Handling of multiple configurations. Even for a given format like
  AAC, some implementations may support AAC multichannel but HE-AAC
  stereo. Likewise, WMA10 level M3 may require too much memory and too
  many CPU cycles. The new API needs to provide a generic way of
  listing these formats.

- Rendering/Grabbing only. This API does not provide any means of
  hardware acceleration, where PCM samples are provided back to
  user-space for additional processing. This API focuses instead on
  streaming compressed data to a DSP, with the assumption that the
  decoded samples are routed to a physical output or logical back-end.

- Complexity hiding. Existing user-space multimedia frameworks all
  have existing enums/structures for each compressed format. This new
  API assumes the existence of a platform-specific compatibility layer
  to expose, translate and make use of the capabilities of the audio
  DSP, e.g. Android HAL or PulseAudio sinks. By construction, regular
  applications are not supposed to make use of this API.
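
To illustrate the first requirement, the following is a minimal sketch
showing why a byte count alone does not give a duration. The helper
names are invented for this example and are not part of the API. It
assumes a format in which every compressed frame decodes to a fixed
number of PCM samples (1152 for MPEG-1 Layer III) while the frame size
in bytes varies.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: count how many whole compressed frames fit in
 * 'budget' bytes, given the size in bytes of each frame.
 */
size_t whole_frames_in(const uint32_t *frame_bytes, size_t n,
		       uint64_t budget)
{
	uint64_t used = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (used + frame_bytes[i] > budget)
			break;
		used += frame_bytes[i];
	}
	return i;
}

/*
 * Duration in milliseconds of 'frames' compressed frames, each decoding
 * to 'samples_per_frame' PCM samples at 'rate' Hz.
 */
uint64_t frames_to_ms(size_t frames, uint32_t samples_per_frame,
		      uint32_t rate)
{
	return (uint64_t)frames * samples_per_frame * 1000u / rate;
}
```

With these assumptions, a 1000-byte buffer holds five 200-byte frames
(120 ms at 48 kHz) but only two 500-byte frames (48 ms), so the same
number of bytes maps to different playback times.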


Design
======
The new API shares a number of concepts with the PCM API for flow
control. Start, pause, resume, drain and stop commands have the same
semantics no matter what the content is.

The concept of a memory ring buffer divided into a set of fragments is
borrowed from the ALSA PCM API. However, only sizes in bytes can be
specified.
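
A byte-based ring buffer of this kind can be sketched as follows. The
structure and function names are assumptions made for this example
only, and the bookkeeping with two cumulative counters is one possible
way of tracking free space, not necessarily what a given driver does.

```c
#include <stdint.h>

/*
 * Hypothetical description of a byte-based ring buffer; the field
 * names are chosen for this example.
 */
struct ring_cfg {
	uint32_t fragment_size;	/* bytes per fragment */
	uint32_t fragments;	/* number of fragments */
};

/* Total ring-buffer size in bytes. */
uint64_t ring_bytes(const struct ring_cfg *cfg)
{
	return (uint64_t)cfg->fragment_size * cfg->fragments;
}

/*
 * Bytes that can still be written for playback, given cumulative
 * counters of bytes written by the application and bytes consumed by
 * the DSP. Assumes written >= consumed and that the difference never
 * exceeds the ring size.
 */
uint64_t ring_free_bytes(const struct ring_cfg *cfg,
			 uint64_t written, uint64_t consumed)
{
	return ring_bytes(cfg) - (written - consumed);
}

/* Number of whole fragments that can be written right now. */
uint32_t ring_free_fragments(const struct ring_cfg *cfg,
			     uint64_t written, uint64_t consumed)
{
	return (uint32_t)(ring_free_bytes(cfg, written, consumed) /
			  cfg->fragment_size);
}
```

Note that nothing in this sketch refers to frames or time: all
quantities are bytes, in line with the paragraph above.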

Seeks/trick modes are assumed to be handled by the host.

The notion of rewinds/forwards is not supported. Data committed to the
ring buffer cannot be invalidated, except when dropping all buffers.

The Compressed Data API does not make any assumptions on how the data
is transmitted to the audio DSP. DMA transfers from main memory to an
embedded audio cluster or to a SPI interface for external DSPs are
possible. As in the ALSA PCM case, a core set of routines is exposed;
each driver implementer will have to write support for a set of
mandatory routines and possibly make use of optional ones.

The main additions are:

get_caps
  This routine returns the list of audio formats supported. Querying the
  codecs on a capture stream will return encoders; decoders will be
  listed for playback streams.

get_codec_caps
  For each codec, this routine returns a list of
  capabilities. The intent is to make sure all the capabilities
  correspond to valid settings, and to minimize the risks of
  configuration failures. For example, for a complex codec such as AAC,
  the number of channels supported may depend on a specific profile. If
  the capabilities were exposed with a single descriptor, it may happen
  that a specific combination of profiles/channels/formats is not
  supported. Likewise, since embedded DSPs have limited memory and CPU
  cycles, it is likely that some implementations make the list of
  capabilities dynamic and dependent on existing workloads. In addition
  to codec settings, this routine returns the minimum buffer size
  handled by the implementation. This information can be a function of
  the DMA buffer sizes, the number of bytes required to synchronize,
  etc., and can be used by userspace to define how much needs to be
  written in the ring buffer before playback can start.

set_params
  This routine sets the configuration chosen for a specific codec. The
  most important field in the parameters is the codec type; in most
  cases decoders will ignore other fields, while encoders will strictly
  comply with the settings.

get_params
  This routine returns the actual settings used by the DSP. Changes to
  the settings should remain the exception.

get_timestamp
  The timestamp becomes a multiple-field structure. It lists the number
  of bytes transferred, the number of samples processed and the number
  of samples rendered/grabbed. All these values can be used to determine
  the average bitrate, figure out if the ring buffer needs to be
  refilled or the delay due to decoding/encoding/io on the DSP.
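
As an illustration of how the timestamp fields can be combined, here is
a minimal sketch for the playback direction. The structure below is
hypothetical: its field names are invented for this example and do not
claim to match the structure exposed by the kernel. It assumes the
number of samples processed is never smaller than the number rendered.

```c
#include <stdint.h>

/*
 * Hypothetical snapshot of the three counters described above; the
 * field names are invented for this example.
 */
struct ex_tstamp {
	uint64_t bytes_transferred;	/* compressed bytes consumed */
	uint64_t samples_processed;	/* PCM samples decoded so far */
	uint64_t samples_rendered;	/* PCM samples actually played */
	uint32_t sampling_rate;		/* in Hz */
};

/* Average bitrate in bits per second, or 0 if nothing was decoded. */
uint64_t ex_avg_bitrate(const struct ex_tstamp *ts)
{
	if (ts->samples_processed == 0)
		return 0;
	return ts->bytes_transferred * 8 * ts->sampling_rate /
	       ts->samples_processed;
}

/* Delay between decoding and rendering, in microseconds. */
uint64_t ex_dsp_delay_us(const struct ex_tstamp *ts)
{
	if (ts->sampling_rate == 0)
		return 0;
	return (ts->samples_processed - ts->samples_rendered) *
	       1000000ull / ts->sampling_rate;
}
```

For instance, 16000 bytes transferred for 48000 samples decoded at
48 kHz gives an average of 128000 bits per second, and a gap of 960
samples between decoded and rendered counts corresponds to 20 ms of
delay on the DSP.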

Note that the list of codecs/profiles/modes was derived from the
OpenMAX AL specification instead of reinventing the wheel.
Modifications include:

- Addition of FLAC and IEC formats
- Merge of encoder/decoder capabilities
- Profiles/modes listed as bitmasks to make descriptors more compact
- Addition of set_params for decoders (missing in OpenMAX AL)
- Addition of AMR/AMR-WB encoding modes (missing in OpenMAX AL)
- Addition of format information for WMA
- Addition of encoding options when required (derived from OpenMAX IL)
- Addition of rateControlSupported (missing in OpenMAX AL)
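
The combination of bitmask profiles and several descriptors per codec
can be sketched as below. All names and bit values are hypothetical and
exist only for this example. The sketch models the AAC case mentioned
earlier, where the channel count depends on the profile, using one
descriptor per valid combination.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical profile bits, defined only for this example. */
#define EX_PROFILE_AAC_LC	(1u << 0)
#define EX_PROFILE_HE_AAC	(1u << 1)

/*
 * Hypothetical capability descriptor: every combination of a profile
 * bit set in 'profiles' and a channel count up to 'max_ch' is valid.
 */
struct ex_codec_desc {
	uint32_t profiles;	/* bitmask of supported profiles */
	uint32_t max_ch;	/* maximum number of channels */
};

/*
 * Return true if some descriptor covers the requested profile and
 * channel count. 'profile' must be a single profile bit.
 */
bool ex_config_supported(const struct ex_codec_desc *descs, size_t n,
			 uint32_t profile, uint32_t channels)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if ((descs[i].profiles & profile) &&
		    channels >= 1 && channels <= descs[i].max_ch)
			return true;
	}
	return false;
}
```

With one descriptor allowing AAC-LC up to 6 channels and another
allowing HE-AAC up to 2 channels, a request for 6-channel HE-AAC is
rejected, which a single merged descriptor could not express.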
e9df12c3STakashi Iwai
State Machine
=============

The compressed audio stream state machine is described below ::

                                        +----------+
                                        |          |
                                        |   OPEN   |
                                        |          |
                                        +----------+
                                             |
                                             |
                                             | compr_set_params()
                                             |
                                             v
         compr_free()                  +----------+
  +------------------------------------|          |
  |                                    |   SETUP  |
  |          +-------------------------|          |<-------------------------+
  |          |       compr_write()     +----------+                          |
  |          |                              ^                                |
  |          |                              | compr_drain_notify()           |
  |          |                              |        or                      |
  |          |                              |     compr_stop()               |
  |          |                              |                                |
  |          |                         +----------+                          |
  |          |                         |          |                          |
  |          |                         |   DRAIN  |                          |
  |          |                         |          |                          |
  |          |                         +----------+                          |
  |          |                              ^                                |
  |          |                              |                                |
  |          |                              | compr_drain()                  |
  |          |                              |                                |
  |          v                              |                                |
  |    +----------+                    +----------+                          |
  |    |          |    compr_start()   |          |        compr_stop()      |
  |    | PREPARE  |------------------->|  RUNNING |--------------------------+
  |    |          |                    |          |                          |
  |    +----------+                    +----------+                          |
  |          |                            |    ^                             |
  |          |compr_free()                |    |                             |
  |          |              compr_pause() |    | compr_resume()              |
  |          |                            |    |                             |
  |          v                            v    |                             |
  |    +----------+                   +----------+                           |
  |    |          |                   |          |         compr_stop()      |
  +--->|   FREE   |                   |  PAUSE   |---------------------------+
       |          |                   |          |
       +----------+                   +----------+
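
The diagram can be transcribed into a small transition function, as in
the sketch below. This is only a reading aid for the diagram: the enum
and function names are invented for this example, and the code does not
claim to reflect how the kernel tracks stream state internally.

```c
/* States and events as drawn in the diagram; names are hypothetical. */
enum ex_state {
	EX_OPEN, EX_SETUP, EX_PREPARE, EX_RUNNING,
	EX_PAUSE, EX_DRAIN, EX_FREE, EX_INVALID,
};

enum ex_event {
	EX_EV_SET_PARAMS, EX_EV_WRITE, EX_EV_START, EX_EV_PAUSE,
	EX_EV_RESUME, EX_EV_DRAIN, EX_EV_DRAIN_NOTIFY, EX_EV_STOP,
	EX_EV_FREE,
};

/* Return the next state, or EX_INVALID if the diagram has no such edge. */
enum ex_state ex_next_state(enum ex_state s, enum ex_event ev)
{
	switch (s) {
	case EX_OPEN:
		if (ev == EX_EV_SET_PARAMS)
			return EX_SETUP;
		break;
	case EX_SETUP:
		if (ev == EX_EV_WRITE)
			return EX_PREPARE;
		if (ev == EX_EV_FREE)
			return EX_FREE;
		break;
	case EX_PREPARE:
		if (ev == EX_EV_START)
			return EX_RUNNING;
		if (ev == EX_EV_FREE)
			return EX_FREE;
		break;
	case EX_RUNNING:
		if (ev == EX_EV_PAUSE)
			return EX_PAUSE;
		if (ev == EX_EV_DRAIN)
			return EX_DRAIN;
		if (ev == EX_EV_STOP)
			return EX_SETUP;
		break;
	case EX_PAUSE:
		if (ev == EX_EV_RESUME)
			return EX_RUNNING;
		if (ev == EX_EV_STOP)
			return EX_SETUP;
		break;
	case EX_DRAIN:
		if (ev == EX_EV_DRAIN_NOTIFY || ev == EX_EV_STOP)
			return EX_SETUP;
		break;
	default:
		break;
	}
	return EX_INVALID;
}
```

Any event that has no arrow in the diagram, such as a start request on
a stream that is still in OPEN, yields EX_INVALID in this sketch.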


Gapless Playback
================
When playing through an album, the decoders have the ability to skip the
encoder delay and padding and move directly from one track's content to the
next. The end user perceives this as gapless playback, since there is no
silence while switching from one track to another.

Also, there might be low-intensity noises due to encoding. Perfect gapless is
difficult to reach with all types of compressed data, but works fine with most
music content. The decoder needs to know the encoder delay and encoder padding,
so this information must be passed to the DSP. This metadata is extracted from
ID3/MP4 headers and is not present by default in the bitstream, hence the need
for a new interface to pass this information to the DSP. The DSP and userspace
also need to switch from one track to another and start using the data for the
second track.
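
The effect of the two values can be sketched with a small calculation.
The function name is invented for this example; it assumes the encoder
delay counts samples added at the start of the track and the encoder
padding counts samples added at the end, both per channel.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of samples that belong to the actual
 * track once 'delay' samples at the start and 'padding' samples at
 * the end of the decoded output are dropped.
 */
uint64_t ex_playable_samples(uint64_t decoded, uint32_t delay,
			     uint32_t padding)
{
	uint64_t skip = (uint64_t)delay + padding;

	return decoded > skip ? decoded - skip : 0;
}
```

For a track that decodes to 115200 samples with a delay of 576 samples
and a padding of 1000 samples, 113624 samples remain to be rendered.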

The main additions are:

set_metadata
  This routine sets the encoder delay and encoder padding. The decoder can use
  these values to strip the silence. They need to be set before the data in the
  track is written.

set_next_track
  This routine tells the DSP that the metadata and write operations sent after
  this call correspond to the subsequent track.

partial drain
  This is called when the end of file is reached. Userspace can inform the DSP
  that EOF is reached, and the DSP can then start skipping the padding delay.
  The data written next belongs to the next track.

The sequence flow for gapless playback would be:

- Open
- Get caps / codec caps
- Set params
- Set metadata of the first track
- Fill data of the first track
- Trigger start
- User-space finishes sending all the data of the first track
- Indicate next track data by sending set_next_track
- Set metadata of the next track
- Call partial_drain to flush most of the buffer in the DSP
- Fill data of the next track
- DSP switches to the second track

(Note: the order of partial_drain and the write for the next track can be
reversed as well.)
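
The sequence above can be sketched as a small driver loop. Everything
here is hypothetical: the step names, the callback type and the
recorder exist only for this example, and a real user would map each
step onto the calls offered by its own compress-offload layer. The
sketch issues partial_drain before writing the next track, which is one
of the two orders allowed by the note above.

```c
#include <stddef.h>

/* Steps of the gapless sequence; names invented for this example. */
enum seq_step {
	SEQ_OPEN, SEQ_GET_CAPS, SEQ_SET_PARAMS, SEQ_SET_METADATA,
	SEQ_WRITE, SEQ_START, SEQ_NEXT_TRACK, SEQ_PARTIAL_DRAIN,
};

/* Performs one step for the track being prepared; 0 means success. */
typedef int (*seq_fn)(void *ctx, enum seq_step step, unsigned int track);

static int seq_run(seq_fn fn, void *ctx, const enum seq_step *steps,
		   size_t n, unsigned int track)
{
	size_t i;
	int err;

	for (i = 0; i < n; i++) {
		err = fn(ctx, steps[i], track);
		if (err)
			return err;
	}
	return 0;
}

/* Play 'ntracks' tracks back to back, stopping at the first error. */
int seq_play_gapless(seq_fn fn, void *ctx, unsigned int ntracks)
{
	static const enum seq_step first[] = {
		SEQ_OPEN, SEQ_GET_CAPS, SEQ_SET_PARAMS,
		SEQ_SET_METADATA, SEQ_WRITE, SEQ_START,
	};
	static const enum seq_step next[] = {
		SEQ_NEXT_TRACK, SEQ_SET_METADATA,
		SEQ_PARTIAL_DRAIN, SEQ_WRITE,
	};
	unsigned int t;
	int err;

	if (ntracks == 0)
		return 0;

	err = seq_run(fn, ctx, first, sizeof(first) / sizeof(first[0]), 0);
	for (t = 1; !err && t < ntracks; t++)
		err = seq_run(fn, ctx, next,
			      sizeof(next) / sizeof(next[0]), t);
	return err;
}

/* Example callback that only records the order of the steps. */
struct seq_log {
	enum seq_step steps[32];
	size_t n;
};

int seq_record(void *ctx, enum seq_step step, unsigned int track)
{
	struct seq_log *log = ctx;

	(void)track;
	if (log->n >= sizeof(log->steps) / sizeof(log->steps[0]))
		return -1;
	log->steps[log->n++] = step;
	return 0;
}
```

Calling seq_play_gapless(seq_record, &log, 2) with an empty log records
ten steps: the six steps for the first track, followed by next track,
set metadata, partial drain and write for the second one.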

Gapless Playback SM
===================

For gapless playback, we move from the running state to partial drain and
back, along with setting the metadata and signalling the next track ::


                                        +----------+
                compr_drain_notify()    |          |
              +------------------------>|  RUNNING |
              |                         |          |
              |                         +----------+
              |                              |
              |                              |
              |                              | compr_next_track()
              |                              |
              |                              V
              |                         +----------+
              |    compr_set_params()   |          |
              |             +-----------|NEXT_TRACK|
              |             |           |          |
              |             |           +--+-------+
              |             |              | |
              |             +--------------+ |
              |                              |
              |                              | compr_partial_drain()
              |                              |
              |                              V
              |                         +----------+
              |                         |          |
              +------------------------ | PARTIAL_ |
                                        |  DRAIN   |
                                        +----------+
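
The gapless diagram has few enough edges to be written as a table. The
sketch below does so with invented names, purely as a way to read the
diagram; it is not taken from the kernel sources.

```c
#include <stddef.h>

/* States and events of the gapless diagram; names are hypothetical. */
enum gl_state { GL_RUNNING, GL_NEXT_TRACK, GL_PARTIAL_DRAIN, GL_INVALID };

enum gl_event {
	GL_EV_NEXT_TRACK, GL_EV_SET_PARAMS,
	GL_EV_PARTIAL_DRAIN, GL_EV_DRAIN_NOTIFY,
};

struct gl_edge {
	enum gl_state from;
	enum gl_event ev;
	enum gl_state to;
};

/* One entry per arrow in the diagram. */
static const struct gl_edge gl_edges[] = {
	{ GL_RUNNING,       GL_EV_NEXT_TRACK,    GL_NEXT_TRACK },
	{ GL_NEXT_TRACK,    GL_EV_SET_PARAMS,    GL_NEXT_TRACK },
	{ GL_NEXT_TRACK,    GL_EV_PARTIAL_DRAIN, GL_PARTIAL_DRAIN },
	{ GL_PARTIAL_DRAIN, GL_EV_DRAIN_NOTIFY,  GL_RUNNING },
};

/* Return the next state, or GL_INVALID if there is no matching arrow. */
enum gl_state gl_next(enum gl_state s, enum gl_event ev)
{
	size_t i;

	for (i = 0; i < sizeof(gl_edges) / sizeof(gl_edges[0]); i++) {
		if (gl_edges[i].from == s && gl_edges[i].ev == ev)
			return gl_edges[i].to;
	}
	return GL_INVALID;
}
```

The self-loop on NEXT_TRACK shows that parameters may be set for the
upcoming track without leaving that state, while a partial drain
requested directly from RUNNING has no arrow and is reported as
GL_INVALID here.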

Not supported
=============
- Support for VoIP/circuit-switched calls is not the target of this
  API. Support for dynamic bit-rate changes would require a tight
  coupling between the DSP and the host stack, limiting power savings.

- Packet-loss concealment is not supported. This would require an
  additional interface to let the decoder synthesize data when frames
  are lost during transmission. This may be added in the future.

- Volume control/routing is not handled by this API. Devices exposing a
  compressed data interface will be considered as regular ALSA devices;
  volume changes and routing information will be provided with regular
  ALSA kcontrols.

- Embedded audio effects. Such effects should be enabled in the same
  manner, no matter if the input was PCM or compressed.

- Multichannel IEC encoding. Unclear if this is required.

- Encoding/decoding acceleration is not supported as mentioned
  above. It is possible to route the output of a decoder to a capture
  stream, or even implement transcoding capabilities. This routing
  would be enabled with ALSA kcontrols.

- Audio policy/resource management. This API does not provide any
  hooks to query the utilization of the audio DSP, nor any preemption
  mechanisms.

- No notion of underrun/overrun. Since the bytes written are compressed
  in nature and data written/read doesn't translate directly to
  rendered output in time, this API does not deal with underrun/overrun;
  these may be dealt with in a user-space library.


Credits
=======
- Mark Brown and Liam Girdwood for discussions on the need for this API
- Harsha Priya for her work on intel_sst compressed API
- Rakesh Ughreja for valuable feedback
- Sing Nallasellan, Sikkandar Madar and Prasanna Samaga for
  demonstrating and quantifying the benefits of audio offload on a
  real platform.