Internal mechanics
==========================

Internal workings of *libcbor* are mostly derived from the specification. The purpose of this document is to describe technical choices made during design & implementation and to explicate the reasoning behind those choices.

Terminology
---------------
===  ======================  ========================================================================================================================================
MTB  Major Type Byte         https://www.rfc-editor.org/rfc/rfc8949.html#section-3.1
---  ----------------------  ----------------------------------------------------------------------------------------------------------------------------------------
DST  Dynamically Sized Type  Type whose storage requirements cannot be determined

                             during compilation (originated in the `Rust <http://www.rust-lang.org/>`_ community)
===  ======================  ========================================================================================================================================

Conventions
--------------
API symbols start with the ``cbor_`` or ``CBOR_`` prefix; internal symbols have the ``_cbor_`` or ``_CBOR_`` prefix.

Inspiration & related projects
-------------------------------
The API is largely modelled after existing JSON libraries, including

 - `Jansson <http://www.digip.org/jansson/>`_
 - `json-c <https://github.com/json-c/json-c>`_
 - Gnome's `JsonGlib <https://wiki.gnome.org/action/show/Projects/JsonGlib?action=show&redirect=JsonGlib>`_

and also borrows from

 - `msgpack-c <https://github.com/msgpack/msgpack-c>`_
 - `Google Protocol Buffers <http://code.google.com/p/protobuf/>`_.

General notes on the API design
--------------------------------
The API design has two main driving principles:

 1. Let the client manage the memory as much as possible
 2. Behave exactly as specified by the standard

Combining these two principles in practice turns out to be quite difficult. Indefinite-length strings, arrays, and maps require the client to handle every fixed-size chunk explicitly in order to

 - ensure the client never runs out of memory due to *libcbor*
 - use :func:`realloc` sparingly and predictably [#]_

    - provide strong guarantees about its usage (to prevent latency spikes)
    - provide APIs to avoid :func:`realloc` altogether

 - allow proper handling of (streamed) data bigger than available memory

 .. [#] Reasonable handling of DSTs requires reallocation if the API is to remain sane.
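
The chunk-by-chunk handling described above can be sketched in plain C. This is a minimal illustration, not *libcbor* code: ``chunk_next``, ``chunk_total_length`` and the ``CHUNK_*`` status codes are hypothetical names, and only chunks whose length fits in the header byte (``0x40`` to ``0x57``) are handled. The client pulls one chunk at a time from an indefinite-length byte string (opened by ``0x5F`` and closed by the ``0xFF`` break code, see RFC 8949) and never allocates.

.. code-block:: c

    #include <stddef.h>

    /* Hypothetical status codes for this sketch; not part of libcbor. */
    enum chunk_status { CHUNK_OK, CHUNK_DONE, CHUNK_ERROR };

    /*
     * Yields the next chunk of an indefinite-length byte string.
     * *pos must be 0 on the first call. Only chunk headers 0x40..0x57
     * (length stored in the header byte itself) are supported here.
     */
    enum chunk_status chunk_next(const unsigned char *buf, size_t len,
                                 size_t *pos, const unsigned char **chunk,
                                 size_t *chunk_len)
    {
        if (*pos == 0) {
            /* 0x5F opens an indefinite-length byte string. */
            if (len == 0 || buf[0] != 0x5F)
                return CHUNK_ERROR;
            *pos = 1;
        }
        if (*pos >= len)
            return CHUNK_ERROR; /* truncated input */
        unsigned char header = buf[*pos];
        if (header == 0xFF) { /* "break" stop code */
            *pos += 1;
            return CHUNK_DONE;
        }
        if (header < 0x40 || header > 0x57)
            return CHUNK_ERROR; /* unsupported or malformed chunk header */
        size_t n = (size_t)(header - 0x40);
        if (len - *pos - 1 < n)
            return CHUNK_ERROR; /* chunk runs past the end of the buffer */
        *chunk = buf + *pos + 1;
        *chunk_len = n;
        *pos += 1 + n;
        return CHUNK_OK;
    }

    /* Client side: sums chunk lengths without allocating. Returns -1 on error. */
    long chunk_total_length(const unsigned char *buf, size_t len)
    {
        size_t pos = 0, chunk_len = 0;
        const unsigned char *chunk = NULL;
        long total = 0;
        for (;;) {
            enum chunk_status st = chunk_next(buf, len, &pos, &chunk, &chunk_len);
            if (st == CHUNK_DONE)
                return total;
            if (st == CHUNK_ERROR)
                return -1;
            total += (long)chunk_len;
        }
    }

Because each chunk is exposed as a pointer into the input buffer, the client decides whether to copy it, forward it, or drop it, which is the kind of control the first principle asks for.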


Coding style
-------------
This code loosely follows the `Linux kernel coding style <https://www.kernel.org/doc/Documentation/CodingStyle>`_. Tabs are tabs, and they are 4 characters wide.


Memory layout
---------------
CBOR is very dynamic in the sense that it contains many data elements of variable length, sometimes even indefinite length. This section describes the internal representation of all CBOR data types.

Generally speaking, data items consist of three parts:

 - a generic :type:`handle <cbor_item_t>`,
 - the associated :type:`metadata <cbor_item_metadata>`,
 - and the actual data

.. type:: cbor_item_t

    Represents the item. Used as an opaque type.

    .. member:: cbor_type type

        Type discriminator

    .. member:: size_t refcount

        Reference counter. Used by :func:`cbor_decref`, :func:`cbor_incref`
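
        How such a counter typically behaves can be sketched as follows. This is an illustration under stated assumptions, not the *libcbor* implementation: ``rc_item``, ``rc_new``, ``rc_incref`` and ``rc_decref`` are hypothetical names, and the conventions shown (new items start at one reference, the last release frees the item and clears the caller's pointer) are assumptions of the sketch.

        .. code-block:: c

            #include <stddef.h>
            #include <stdlib.h>

            /* Hypothetical stand-in for an item handle; not the libcbor definition. */
            struct rc_item {
                size_t refcount;
                int payload;
            };

            /* New items start with one reference, owned by the caller. */
            struct rc_item *rc_new(int payload)
            {
                struct rc_item *item = malloc(sizeof *item);
                if (item == NULL)
                    return NULL; /* allocation failed */
                item->refcount = 1;
                item->payload = payload;
                return item;
            }

            /* Takes an additional reference and returns the same handle. */
            struct rc_item *rc_incref(struct rc_item *item)
            {
                item->refcount++;
                return item;
            }

            /* Drops one reference; at zero, frees the item and clears the pointer. */
            void rc_decref(struct rc_item **item)
            {
                if (--(*item)->refcount == 0) {
                    free(*item);
                    *item = NULL;
                }
            }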

    .. member:: union cbor_item_metadata metadata

        Union discriminated by :member:`type`. Contains type-specific metadata

    .. member:: unsigned char * data

        Contains pointer to the actual data. Small, fixed size items (:doc:`api/type_0_1`, :doc:`api/type_6`, :doc:`api/type_7`) are allocated as a single memory block.

        Consider the following snippet

        .. code-block:: c

            cbor_item_t * item = cbor_new_int8();

        then the memory is laid out as follows

        ::

            +-----------+---------------+---------------+-----------------------------------++-----------+
            |           |               |               |                                   ||           |
            |   type    |   refcount    |   metadata    |              data                 ||  uint8_t  |
            |           |               |               |   (= item + sizeof(cbor_item_t))  ||           |
            +-----------+---------------+---------------+-----------------------------------++-----------+
            ^                                                                                ^
            |                                                                                |
            +--- item                                                                        +--- item->data

        Dynamically sized types (:doc:`api/type_2`, :doc:`api/type_3`, :doc:`api/type_4`, :doc:`api/type_5`) may store handle and data in separate locations. This enables creating large items (e.g. :doc:`byte strings <api/type_2>`) without :func:`realloc` or copying large blocks of memory. One simply attaches the correct pointer to the handle.
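
        The two layouts can be sketched with a simplified handle. The names ``layout_item``, ``layout_new_int8`` and ``layout_new_bytes`` are hypothetical and the struct omits the ``type`` and ``metadata`` fields; the sketch only shows where ``data`` points in each case, not how *libcbor* allocates items.

        .. code-block:: c

            #include <stddef.h>
            #include <stdint.h>
            #include <stdlib.h>

            /* Hypothetical, simplified handle; the real cbor_item_t has more fields. */
            struct layout_item {
                size_t refcount;
                unsigned char *data;
            };

            /* Small fixed-size item: handle and payload share one allocation. */
            struct layout_item *layout_new_int8(uint8_t value)
            {
                struct layout_item *item = malloc(sizeof *item + sizeof(uint8_t));
                if (item == NULL)
                    return NULL;
                item->refcount = 1;
                /* The payload lives just past the handle, as in the diagram. */
                item->data = (unsigned char *)item + sizeof *item;
                *item->data = value;
                return item;
            }

            /* Dynamically sized item: the handle adopts an existing buffer, no copy. */
            struct layout_item *layout_new_bytes(unsigned char *buffer)
            {
                struct layout_item *item = malloc(sizeof *item);
                if (item == NULL)
                    return NULL;
                item->refcount = 1;
                item->data = buffer; /* attach the pointer; the bytes stay in place */
                return item;
            }

        In the first case a single ``free`` releases both handle and payload; in the second, ownership of the buffer has to be agreed on separately, which is the price of avoiding the copy.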


.. type:: cbor_item_metadata

    Union type of the following members, based on the item type:

    .. member:: struct _cbor_int_metadata int_metadata

        Used by both integer types (:doc:`api/type_0_1`)

    .. member:: struct _cbor_bytestring_metadata bytestring_metadata
    .. member:: struct _cbor_string_metadata string_metadata
    .. member:: struct _cbor_array_metadata array_metadata
    .. member:: struct _cbor_map_metadata map_metadata
    .. member:: struct _cbor_tag_metadata tag_metadata
    .. member:: struct _cbor_float_ctrl_metadata float_ctrl_metadata
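
    The discriminated-union pattern can be sketched on a reduced example. All names below (``meta_type``, ``meta_item``, ``meta_size`` and the struct fields) are hypothetical and cover only two types; the actual fields of the ``_cbor_*_metadata`` structs are not reproduced here.

    .. code-block:: c

        #include <stddef.h>

        /* Hypothetical, reduced set of types; names are illustrative only. */
        enum meta_type { META_TYPE_BYTESTRING, META_TYPE_ARRAY };

        struct meta_bytestring { size_t length; };
        struct meta_array { size_t capacity; size_t count; };

        union meta_metadata {
            struct meta_bytestring bytestring;
            struct meta_array array;
        };

        struct meta_item {
            enum meta_type type;          /* discriminator */
            union meta_metadata metadata; /* only the member matching type is valid */
        };

        /* Number of bytes or elements held, selected via the discriminator. */
        size_t meta_size(const struct meta_item *item)
        {
            switch (item->type) {
            case META_TYPE_BYTESTRING:
                return item->metadata.bytestring.length;
            case META_TYPE_ARRAY:
                return item->metadata.array.count;
            }
            return 0; /* unknown discriminator */
        }

    Reading a union member that does not match the discriminator yields meaningless values, so every accessor has to check the type first.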

Decoding
---------

As outlined in :doc:`api`, decoding is based on the streaming decoder. Essentially, the decoder is a custom set of callbacks for the streaming decoder.
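
The relationship between the two layers can be sketched as follows. This is not the *libcbor* decoder: ``dec_callbacks``, ``dec_stream`` and ``dec_builder`` are hypothetical names, and the streaming part understands only small unsigned integers (``0x00`` to ``0x17``) and short definite-length arrays (``0x80`` to ``0x97``) as encoded per RFC 8949. The point is that the higher-level decoder is nothing more than a context struct plus callbacks that update it.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical callback set; a real streaming decoder covers every type. */
    struct dec_callbacks {
        void (*on_uint)(void *ctx, uint8_t value);
        void (*on_array_start)(void *ctx, size_t length);
    };

    /*
     * Streams the buffer and fires one callback per data item header.
     * Returns the number of bytes consumed; stops at the first byte
     * outside the subset this sketch supports.
     */
    size_t dec_stream(const unsigned char *buf, size_t len,
                      const struct dec_callbacks *cb, void *ctx)
    {
        size_t i;
        for (i = 0; i < len; i++) {
            unsigned char b = buf[i];
            if (b <= 0x17)
                cb->on_uint(ctx, b);
            else if (b >= 0x80 && b <= 0x97)
                cb->on_array_start(ctx, (size_t)(b - 0x80));
            else
                break;
        }
        return i;
    }

    /* State a higher-level decoder keeps between callbacks. */
    struct dec_builder {
        size_t array_length; /* announced by the array header */
        size_t count;        /* integers seen so far */
        unsigned sum;        /* running sum of those integers */
    };

    void dec_builder_on_uint(void *ctx, uint8_t value)
    {
        struct dec_builder *b = ctx;
        b->count++;
        b->sum += value;
    }

    void dec_builder_on_array_start(void *ctx, size_t length)
    {
        struct dec_builder *b = ctx;
        b->array_length = length;
    }

Feeding the bytes ``0x83 0x01 0x02 0x03`` through ``dec_stream`` with these callbacks records an array length of 3 and three integers; a full decoder would build items at the same points instead of summing.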
128
129