Internal mechanics
==========================

Internal workings of *libcbor* are mostly derived from the specification. The purpose of this document is to describe technical choices made during design & implementation and to explicate the reasoning behind those choices.

Terminology
---------------
===  ======================  ========================================================================================================================================
MTB  Major Type Byte         https://www.rfc-editor.org/rfc/rfc8949.html#section-3.1
---  ----------------------  ----------------------------------------------------------------------------------------------------------------------------------------
DST  Dynamically Sized Type  Type whose storage requirements cannot be determined
                             during compilation (originated in the `Rust <http://www.rust-lang.org/>`_ community)
===  ======================  ========================================================================================================================================

Conventions
--------------
API symbols start with a ``cbor_`` or ``CBOR_`` prefix; internal symbols have a ``_cbor_`` or ``_CBOR_`` prefix.

Inspiration & related projects
-------------------------------
The API is largely modelled after existing JSON libraries, including

 - `Jansson <http://www.digip.org/jansson/>`_
 - `json-c <https://github.com/json-c/json-c>`_
 - Gnome's `JsonGlib <https://wiki.gnome.org/action/show/Projects/JsonGlib?action=show&redirect=JsonGlib>`_

and also borrows from

 - `msgpack-c <https://github.com/msgpack/msgpack-c>`_
 - `Google Protocol Buffers <http://code.google.com/p/protobuf/>`_.

General notes on the API design
--------------------------------
The API design has two main driving principles:

 1. Let the client manage the memory as much as possible
 2. Behave exactly as specified by the standard

Combining these two principles in practice turns out to be quite difficult. Indefinite-length strings, arrays, and maps require the client to handle every fixed-size chunk explicitly in order to

 - ensure the client never runs out of memory due to *libcbor*
 - use :func:`realloc` sparingly and predictably [#]_

    - provide strong guarantees about its usage (to prevent latency spikes)
    - provide APIs to avoid :func:`realloc` altogether
 - allow proper handling of (streamed) data bigger than available memory

 .. [#] Reasonable handling of DSTs requires reallocation if the API is to remain sane.
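
To make the chunk-by-chunk idea concrete, here is a minimal, self-contained sketch that walks an indefinite-length byte string and hands each fixed-size chunk to a client callback without allocating or copying. It does not use *libcbor*; the names ``chunk_cb``, ``walk_indef_bytestring`` and ``count_bytes`` are hypothetical, and only chunk lengths up to 255 bytes are handled. The wire format follows RFC 8949, section 3.2.3.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical callback type: invoked once per definite-length chunk. */
    typedef void (*chunk_cb)(const uint8_t *chunk, size_t len, void *ctx);

    /*
     * Walk an indefinite-length byte string and pass every chunk to the
     * callback in place (no copy, no allocation). Only chunk headers with an
     * immediate or one-byte length are handled in this sketch.
     * Returns the number of bytes consumed, or 0 on malformed or truncated
     * input.
     */
    static size_t walk_indef_bytestring(const uint8_t *buf, size_t size,
                                        chunk_cb cb, void *ctx)
    {
        size_t pos = 0;

        if (size == 0 || buf[pos++] != 0x5F) /* 0x5F: type 2, indefinite */
            return 0;

        while (pos < size) {
            uint8_t head = buf[pos++];
            uint8_t info = head & 0x1F;
            size_t len;

            if (head == 0xFF) /* "break" stop code ends the string */
                return pos;
            if ((head >> 5) != 2) /* every chunk must be a byte string */
                return 0;

            if (info < 24) {
                len = info; /* length stored in the header byte */
            } else if (info == 24) {
                if (pos >= size)
                    return 0;
                len = buf[pos++]; /* length stored in the next byte */
            } else {
                return 0; /* longer encodings omitted from this sketch */
            }

            if (len > size - pos)
                return 0; /* chunk is truncated */
            cb(buf + pos, len, ctx);
            pos += len;
        }
        return 0; /* input ended before the break byte */
    }

    /* Example callback: accumulate the total payload length. */
    static void count_bytes(const uint8_t *chunk, size_t len, void *ctx)
    {
        (void)chunk;
        *(size_t *)ctx += len;
    }

Because the callback sees one chunk at a time, the client decides whether to buffer, forward, or discard the data, which is what allows streams larger than available memory to be processed.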


Coding style
-------------
This code loosely follows the `Linux kernel coding style <https://www.kernel.org/doc/Documentation/CodingStyle>`_. Tabs are tabs, and they are 4 characters wide.


Memory layout
---------------
CBOR is very dynamic in the sense that it contains many data elements of variable length, sometimes even indefinite length. This section describes the internal representation of all CBOR data types.

Generally speaking, data items consist of three parts:

 - a generic :type:`handle <cbor_item_t>`,
 - the associated :type:`metadata <cbor_item_metadata>`,
 - and the actual data

.. type:: cbor_item_t

    Represents the item. Used as an opaque type

    .. member:: cbor_type type

        Type discriminator

    .. member:: size_t refcount

        Reference counter. Used by :func:`cbor_decref`, :func:`cbor_incref`

    .. member:: union cbor_item_metadata metadata

        Union discriminated by :member:`type`. Contains type-specific metadata

    .. member:: unsigned char * data

        Contains pointer to the actual data. Small, fixed-size items (:doc:`api/type_0_1`, :doc:`api/type_6`, :doc:`api/type_7`) are allocated as a single memory block.

        Consider the following snippet:

        .. code-block:: c

            cbor_item_t * item = cbor_new_int8();

        The memory is then laid out as follows:

        ::

            +-----------+---------------+---------------+-----------------------------------++-----------+
            |           |               |               |                                   ||           |
            |   type    |   refcount    |   metadata    |              data                 ||  uint8_t  |
            |           |               |               |   (= item + sizeof(cbor_item_t))  ||           |
            +-----------+---------------+---------------+-----------------------------------++-----------+
            ^                                                                                ^
            |                                                                                |
            +--- item                                                                        +--- item->data

        Dynamically sized types (:doc:`api/type_2`, :doc:`api/type_3`, :doc:`api/type_4`, :doc:`api/type_5`) may store handle and data in separate locations. This enables creating large items (e.g. :doc:`byte strings <api/type_2>`) without :func:`realloc` or copying large blocks of memory. One simply attaches the correct pointer to the handle.
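
        The two layouts can be illustrated with a simplified, self-contained sketch. ``struct sketch_item``, ``sketch_new_int8`` and ``sketch_wrap_bytes`` are hypothetical stand-ins, not the actual *libcbor* definitions; the metadata union is left out for brevity.

        .. code-block:: c

            #include <stddef.h>
            #include <stdint.h>
            #include <stdlib.h>

            /* Hypothetical stand-in for the handle; not the real cbor_item_t. */
            struct sketch_item {
                int type;
                size_t refcount;
                unsigned char *data;
            };

            /* Small fixed-size item: handle and payload share one allocation. */
            static struct sketch_item *sketch_new_int8(uint8_t value)
            {
                struct sketch_item *item = malloc(sizeof(*item) + sizeof(uint8_t));

                if (item == NULL)
                    return NULL;
                item->type = 0;
                item->refcount = 1;
                /* The payload lives directly behind the handle. */
                item->data = (unsigned char *)item + sizeof(*item);
                *item->data = value;
                return item;
            }

            /* Dynamically sized item: the handle points at bytes owned elsewhere. */
            static struct sketch_item *sketch_wrap_bytes(unsigned char *bytes)
            {
                struct sketch_item *item = malloc(sizeof(*item));

                if (item == NULL)
                    return NULL;
                item->type = 2;
                item->refcount = 1;
                item->data = bytes; /* attach the pointer: no copy, no realloc */
                return item;
            }

        In the first case a single ``free`` of the handle releases everything; in the second, the payload has its own lifetime and must be released separately from the handle.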


.. type:: cbor_item_metadata

    Union type of the following members, based on the item type:

    .. member:: struct _cbor_int_metadata int_metadata

        Used by both :doc:`api/type_0_1`

    .. member:: struct _cbor_bytestring_metadata bytestring_metadata
    .. member:: struct _cbor_string_metadata string_metadata
    .. member:: struct _cbor_array_metadata array_metadata
    .. member:: struct _cbor_map_metadata map_metadata
    .. member:: struct _cbor_tag_metadata tag_metadata
    .. member:: struct _cbor_float_ctrl_metadata float_ctrl_metadata

Decoding
---------

As outlined in :doc:`api`, decoding is based on the streaming decoder. Essentially, the decoder is a custom set of callbacks for the streaming decoder.
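
The relationship between the two decoders can be sketched with a toy example. This is a rough illustration under stated assumptions rather than the *libcbor* implementation: ``struct sketch_callbacks``, ``sketch_stream_decode``, ``struct sketch_builder`` and ``sketch_load`` are hypothetical names, and only unsigned integers and definite-length array headers with values up to 255 are recognised.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical callback table; a real one would have many more hooks. */
    struct sketch_callbacks {
        void (*on_uint)(void *ctx, uint64_t value);
        void (*on_array_start)(void *ctx, size_t length);
    };

    /*
     * Toy streaming step: decode one item header and fire the matching
     * callback. Returns bytes consumed, or 0 if the input is unsupported
     * or truncated.
     */
    static size_t sketch_stream_decode(const uint8_t *buf, size_t size,
                                       const struct sketch_callbacks *cbs,
                                       void *ctx)
    {
        uint8_t major, info;
        uint64_t arg;
        size_t used = 1;

        if (size == 0)
            return 0;
        major = buf[0] >> 5;   /* top three bits: major type */
        info = buf[0] & 0x1F;  /* low five bits: additional information */

        if (info < 24) {
            arg = info;
        } else if (info == 24 && size >= 2) {
            arg = buf[1];
            used = 2;
        } else {
            return 0;
        }

        if (major == 0)
            cbs->on_uint(ctx, arg);
        else if (major == 4)
            cbs->on_array_start(ctx, (size_t)arg);
        else
            return 0;
        return used;
    }

    /* State of the "full" decoder, filled in purely through callbacks. */
    struct sketch_builder {
        uint64_t sum;
        size_t arrays;
    };

    static void build_uint(void *ctx, uint64_t value)
    {
        ((struct sketch_builder *)ctx)->sum += value;
    }

    static void build_array(void *ctx, size_t length)
    {
        (void)length;
        ((struct sketch_builder *)ctx)->arrays++;
    }

    /* The full decoder: the streaming step driven with builder callbacks. */
    static int sketch_load(const uint8_t *buf, size_t size,
                           struct sketch_builder *out)
    {
        const struct sketch_callbacks cbs = { build_uint, build_array };
        size_t pos = 0;

        while (pos < size) {
            size_t n = sketch_stream_decode(buf + pos, size - pos, &cbs, out);

            if (n == 0)
                return -1;
            pos += n;
        }
        return 0;
    }

The design choice this illustrates is that the streaming layer owns all parsing, while the higher-level decoder only supplies callbacks and a context object that accumulates the result.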