Internal mechanics
==========================

Internal workings of *libcbor* are mostly derived from the specification. The purpose of this document is to describe technical choices made during design & implementation and to explicate the reasoning behind those choices.

Terminology
---------------
===  ======================  ========================================================================================================================================
MTB  Major Type Byte         http://tools.ietf.org/html/rfc7049#section-2.1
---  ----------------------  ----------------------------------------------------------------------------------------------------------------------------------------
DST  Dynamically Sized Type  Type whose storage requirements cannot be determined
                             during compilation (originated in the `Rust <http://www.rust-lang.org/>`_ community)
===  ======================  ========================================================================================================================================
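
As a small illustration of the MTB entry above: per RFC 7049, section 2.1, the initial byte of every data item carries the major type in its three most significant bits and the additional information in its five least significant bits. The helper names below are hypothetical and are not *libcbor* symbols; this is only a sketch of the bit layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names for illustration; not libcbor symbols. */

/* Major type: the three most significant bits of the MTB (0..7). */
static uint8_t demo_mtb_major_type(uint8_t mtb) {
    return (uint8_t)(mtb >> 5);
}

/* Additional information: the five least significant bits (0..31). */
static uint8_t demo_mtb_additional_info(uint8_t mtb) {
    return (uint8_t)(mtb & 0x1F);
}

/* Reassembles an MTB from its two fields. */
static uint8_t demo_mtb_make(uint8_t major_type, uint8_t additional_info) {
    assert(major_type <= 7 && additional_info <= 31);
    return (uint8_t)((major_type << 5) | additional_info);
}
```

For example, ``0x5F`` splits into major type 2 (byte string) and additional information 31, which marks the start of an indefinite-length byte string.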

Conventions
--------------
API symbols start with the ``cbor_`` or ``CBOR_`` prefix; internal symbols use the ``_cbor_`` or ``_CBOR_`` prefix.

Inspiration & related projects
-------------------------------
The API is largely modelled after existing JSON libraries, including

 - `Jansson <http://www.digip.org/jansson/>`_
 - `json-c <https://github.com/json-c/json-c>`_
 - GNOME's `JsonGlib <https://wiki.gnome.org/action/show/Projects/JsonGlib?action=show&redirect=JsonGlib>`_

and also borrows from

 - `msgpack-c <https://github.com/msgpack/msgpack-c>`_
 - `Google Protocol Buffers <http://code.google.com/p/protobuf/>`_.

General notes on the API design
--------------------------------
The API design has two main driving principles:

 1. Let the client manage the memory as much as possible
 2. Behave exactly as specified by the standard

Combining these two principles in practice turns out to be quite difficult. Indefinite-length strings, arrays, and maps require the client to handle every fixed-size chunk explicitly in order to

 - ensure the client never runs out of memory due to *libcbor*
 - use :func:`realloc` sparingly and predictably [#]_

    - provide strong guarantees about its usage (to prevent latency spikes)
    - provide APIs to avoid :func:`realloc` altogether

 - allow proper handling of (streamed) data bigger than available memory

 .. [#] Reasonable handling of DSTs requires reallocation if the API is to remain sane.
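
The chunk-by-chunk handling described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the *libcbor* implementation: all ``demo_`` names are hypothetical, payload buffers stay owned by the client and are never copied, and the only reallocation is the geometric growth of a small descriptor array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical types for illustration; not the libcbor structures. */
struct demo_chunk {
    const unsigned char *data; /* client-owned buffer, never copied */
    size_t length;
};

struct demo_indef_string {
    struct demo_chunk *chunks; /* descriptor array, grown on demand */
    size_t count;
    size_t capacity;
};

/* Appends one chunk. Returns 0 on success and -1 if growing the descriptor
 * array failed, in which case the string is left unchanged.
 * Overflow of the capacity computation is ignored in this sketch. */
static int demo_add_chunk(struct demo_indef_string *s,
                          const unsigned char *data, size_t length) {
    assert(s != NULL);
    if (s->count == s->capacity) {
        size_t new_capacity = s->capacity == 0 ? 4 : s->capacity * 2;
        struct demo_chunk *grown =
            realloc(s->chunks, new_capacity * sizeof(struct demo_chunk));
        if (grown == NULL) return -1;
        s->chunks = grown;
        s->capacity = new_capacity;
    }
    s->chunks[s->count].data = data;
    s->chunks[s->count].length = length;
    s->count++;
    return 0;
}

/* Total payload length across all chunks. */
static size_t demo_total_length(const struct demo_indef_string *s) {
    size_t total = 0;
    for (size_t i = 0; i < s->count; i++) total += s->chunks[i].length;
    return total;
}

/* Releases the descriptor array only; chunk buffers belong to the client. */
static void demo_release(struct demo_indef_string *s) {
    free(s->chunks);
    s->chunks = NULL;
    s->count = 0;
    s->capacity = 0;
}
```

Because each chunk is handed over explicitly, the client decides when and how much memory is committed, and the cost of every reallocation is bounded by the size of the descriptor array rather than by the size of the payload.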


Coding style
-------------
This code loosely follows the `Linux kernel coding style <https://www.kernel.org/doc/Documentation/CodingStyle>`_. Tabs are tabs, and they are 4 characters wide.


Memory layout
---------------
CBOR is very dynamic in the sense that it contains many data elements of variable length, sometimes even indefinite length. This section describes the internal representation of all CBOR data types.

Generally speaking, data items consist of three parts:

 - a generic :type:`handle <cbor_item_t>`,
 - the associated :type:`metadata <cbor_item_metadata>`,
 - and the actual data

.. type:: cbor_item_t

    Represents the item. Used as an opaque type

    .. member:: cbor_type type

        Type discriminator

    .. member:: size_t refcount

        Reference counter. Used by :func:`cbor_decref`, :func:`cbor_incref`

    .. member:: union cbor_item_metadata metadata

        Union discriminated by :member:`type`. Contains type-specific metadata

    .. member:: unsigned char * data

        Contains a pointer to the actual data. Small, fixed-size items (:doc:`api/type_0_1`, :doc:`api/type_6`, :doc:`api/type_7`) are allocated as a single memory block.

        Consider the following snippet:

        .. code-block:: c

            cbor_item_t * item = cbor_new_int8();

        The memory is then laid out as follows:

        ::

            +-----------+---------------+---------------+-----------------------------------++-----------+
            |           |               |               |                                   ||           |
            |   type    |   refcount    |   metadata    |              data                 ||  uint8_t  |
            |           |               |               |   (= item + sizeof(cbor_item_t))  ||           |
            +-----------+---------------+---------------+-----------------------------------++-----------+
            ^                                                                                ^
            |                                                                                |
            +--- item                                                                        +--- item->data

        Dynamically sized types (:doc:`api/type_2`, :doc:`api/type_3`, :doc:`api/type_4`, :doc:`api/type_5`) may store the handle and the data in separate locations. This enables creating large items (e.g. :doc:`byte strings <api/type_2>`) without :func:`realloc` or copying large blocks of memory. One simply attaches the correct pointer to the handle.
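
The two layouts described for the ``data`` member can be sketched with a reduced, hypothetical handle. ``struct demo_flat_item`` and both constructors are assumptions made for illustration; the real ``cbor_item_t`` also carries the metadata union, and the real constructors may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for the item handle. */
struct demo_flat_item {
    int type;
    size_t refcount;
    unsigned char *data;
};

/* Fixed-size item: handle and one-byte payload share a single allocation,
 * so data points just past the handle (item + sizeof(handle)). */
static struct demo_flat_item *demo_new_int8(uint8_t value) {
    struct demo_flat_item *item =
        malloc(sizeof(struct demo_flat_item) + sizeof(uint8_t));
    if (item == NULL) return NULL;
    item->type = 0;
    item->refcount = 1;
    item->data = (unsigned char *)item + sizeof(struct demo_flat_item);
    *item->data = value;
    return item;
}

/* Dynamically sized item: only the handle is allocated; the caller-owned
 * buffer is attached by pointer and never copied. */
static struct demo_flat_item *demo_wrap_bytes(unsigned char *buffer) {
    assert(buffer != NULL);
    struct demo_flat_item *item = malloc(sizeof(struct demo_flat_item));
    if (item == NULL) return NULL;
    item->type = 2;
    item->refcount = 1;
    item->data = buffer;
    return item;
}
```

In this sketch a single ``free(item)`` releases a fixed-size item completely, whereas freeing a wrapped item releases only the handle and leaves the attached buffer to its owner.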


.. type:: cbor_item_metadata

    Union type of the following members, based on the item type:

    .. member:: struct _cbor_int_metadata int_metadata

        Used by both integer types (:doc:`api/type_0_1`)

    .. member:: struct _cbor_bytestring_metadata bytestring_metadata
    .. member:: struct _cbor_string_metadata string_metadata
    .. member:: struct _cbor_array_metadata array_metadata
    .. member:: struct _cbor_map_metadata map_metadata
    .. member:: struct _cbor_tag_metadata tag_metadata
    .. member:: struct _cbor_float_ctrl_metadata float_ctrl_metadata
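
A union discriminated by a type field, as described above, is commonly read through a ``switch`` on the discriminator. The following sketch uses hypothetical, heavily reduced types; the field names inside the metadata structs are assumptions and do not mirror the real ``_cbor_*_metadata`` definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced types; not the libcbor definitions. */
enum demo_type { DEMO_TYPE_UINT, DEMO_TYPE_BYTESTRING, DEMO_TYPE_ARRAY };

struct demo_int_metadata { size_t width_bytes; };
struct demo_bytestring_metadata { size_t length; };
struct demo_array_metadata { size_t item_count; };

union demo_metadata {
    struct demo_int_metadata int_metadata;
    struct demo_bytestring_metadata bytestring_metadata;
    struct demo_array_metadata array_metadata;
};

struct demo_tagged_item {
    enum demo_type type;          /* selects the valid union member */
    union demo_metadata metadata;
};

/* Returns the size recorded in the metadata member selected by the type. */
static size_t demo_item_size(const struct demo_tagged_item *item) {
    switch (item->type) {
    case DEMO_TYPE_UINT:
        return item->metadata.int_metadata.width_bytes;
    case DEMO_TYPE_BYTESTRING:
        return item->metadata.bytestring_metadata.length;
    case DEMO_TYPE_ARRAY:
        return item->metadata.array_metadata.item_count;
    }
    assert(0 && "unknown item type");
    return 0;
}
```

Only the member matching ``type`` is meaningful at any time, which is why the discriminator has to be checked before the union is read.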

Decoding
---------

As outlined in :doc:`api`, decoding is based on the streaming decoder. Essentially, the decoder is a custom set of callbacks for the streaming decoder.
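
The idea of a decoder expressed as a set of callbacks can be sketched as follows. This is a simplified illustration, not the *libcbor* streaming decoder: every ``demo_`` name is hypothetical, and only unsigned integers (major type 0) are handled. The streaming function reads one item, invokes the matching callback, and reports how many bytes it consumed; the "decoder" is simply whatever the callbacks do with the values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback set; a full one would cover every CBOR item kind. */
struct demo_callbacks {
    void (*on_uint)(void *context, uint64_t value);
};

enum demo_status { DEMO_DECODED, DEMO_NEED_MORE_DATA, DEMO_UNSUPPORTED };

/* Decodes at most one item, invokes the callback, and stores the number of
 * consumed bytes in *read (0 unless an item was decoded). */
static enum demo_status demo_stream_decode(const unsigned char *source,
                                           size_t size,
                                           const struct demo_callbacks *callbacks,
                                           void *context, size_t *read) {
    assert(callbacks != NULL && read != NULL);
    *read = 0;
    if (size == 0) return DEMO_NEED_MORE_DATA;

    uint8_t major_type = (uint8_t)(source[0] >> 5);
    uint8_t info = (uint8_t)(source[0] & 0x1F);
    if (major_type != 0) return DEMO_UNSUPPORTED;

    size_t extra; /* number of bytes following the initial byte */
    if (info < 24) extra = 0;                  /* value stored inline */
    else if (info <= 27) extra = (size_t)1 << (info - 24); /* 1, 2, 4, 8 */
    else return DEMO_UNSUPPORTED;              /* 28..31 invalid for integers */
    if (size < 1 + extra) return DEMO_NEED_MORE_DATA;

    uint64_t value = info;
    if (extra > 0) {
        value = 0;
        for (size_t i = 0; i < extra; i++) /* big-endian (network order) */
            value = (value << 8) | source[1 + i];
    }
    callbacks->on_uint(context, value);
    *read = 1 + extra;
    return DEMO_DECODED;
}

/* Example callback: accumulates every decoded integer into the context. */
static void demo_sum_uint(void *context, uint64_t value) {
    *(uint64_t *)context += value;
}

/* Feeds a buffer item by item; returns the number of bytes consumed. */
static size_t demo_decode_all(const unsigned char *source, size_t size,
                              const struct demo_callbacks *callbacks,
                              void *context) {
    size_t offset = 0;
    while (offset < size) {
        size_t read = 0;
        if (demo_stream_decode(source + offset, size - offset, callbacks,
                               context, &read) != DEMO_DECODED)
            break;
        offset += read;
    }
    return offset;
}
```

Swapping in a different callback set changes what gets built from the same byte stream, without touching the parsing loop.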