Internal mechanics
==========================

Internal workings of *libcbor* are mostly derived from the specification. The purpose of this document is to describe technical choices made during design & implementation and to explicate the reasoning behind those choices.

Terminology
---------------

===  ======================  ========================================================================================================================================
MTB  Major Type Byte         http://tools.ietf.org/html/rfc7049#section-2.1
DST  Dynamically Sized Type  Type whose storage requirements cannot be determined
                             during compilation (originated in the `Rust <http://www.rust-lang.org/>`_ community)
===  ======================  ========================================================================================================================================

Conventions
--------------
API symbols start with the ``cbor_`` or ``CBOR_`` prefix; internal symbols use the ``_cbor_`` or ``_CBOR_`` prefix.

Inspiration & related projects
-------------------------------
The API is largely modelled after existing JSON libraries, including

 - `Jansson <http://www.digip.org/jansson/>`_
 - `json-c <https://github.com/json-c/json-c>`_
 - Gnome's `JsonGlib <https://wiki.gnome.org/action/show/Projects/JsonGlib?action=show&redirect=JsonGlib>`_

and also borrows from

 - `msgpack-c <https://github.com/msgpack/msgpack-c>`_
 - `Google Protocol Buffers <http://code.google.com/p/protobuf/>`_.

General notes on the API design
--------------------------------
The API design has two main driving principles:

 1. Let the client manage the memory as much as possible
 2. Behave exactly as specified by the standard

Combining these two principles in practice turns out to be quite difficult. Indefinite-length strings, arrays, and maps require the client to handle every fixed-size chunk explicitly in order to

 - ensure the client never runs out of memory due to *libcbor*
 - use :func:`realloc` sparingly and predictably [#]_

    - provide strong guarantees about its usage (to prevent latency spikes)
    - provide APIs to avoid :func:`realloc` altogether

 - allow proper handling of (streamed) data bigger than available memory

 .. [#] Reasonable handling of DSTs requires reallocation if the API is to remain sane.
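
The chunk-by-chunk handling described above can be illustrated with a minimal, standalone sketch. It is not *libcbor* code: ``chunk_cb``, ``walk_indef_bytestring`` and ``count_bytes`` are hypothetical names, and only chunk lengths up to 255 are supported for brevity. The walker hands each fixed-size chunk of an indefinite-length byte string to the client as a view into the input buffer, so the walker itself never allocates or reallocates.

.. code-block:: c

    #include <stddef.h>
    #include <stdio.h>

    /* Hypothetical callback type: receives one chunk as a view into the input. */
    typedef void (*chunk_cb)(void *ctx, const unsigned char *chunk, size_t len);

    /*
     * Walks an indefinite-length byte string as encoded per RFC 7049.
     * Returns the number of bytes consumed, or 0 on malformed or truncated input.
     */
    static size_t walk_indef_bytestring(const unsigned char *buf, size_t size,
                                        chunk_cb cb, void *ctx)
    {
        size_t pos = 0;

        if (size == 0 || buf[pos++] != 0x5F) /* 0x5F: indefinite byte string */
            return 0;
        while (pos < size) {
            unsigned char mtb = buf[pos++];
            size_t len;

            if (mtb == 0xFF) /* "break" terminates the string */
                return pos;
            if (mtb >= 0x40 && mtb <= 0x57) {
                len = (size_t)(mtb - 0x40); /* length embedded in the MTB */
            } else if (mtb == 0x58) {
                if (pos >= size)
                    return 0;
                len = buf[pos++]; /* length stored in the following byte */
            } else {
                return 0; /* not a definite-length byte string chunk */
            }
            if (len > size - pos)
                return 0;
            cb(ctx, buf + pos, len); /* the client decides what to do with it */
            pos += len;
        }
        return 0; /* input ended before the break byte */
    }

    /* Client-side handler: keeps a running total and buffers nothing. */
    static void count_bytes(void *ctx, const unsigned char *chunk, size_t len)
    {
        (void)chunk;
        *(size_t *)ctx += len;
    }

    int main(void)
    {
        /* (_ h'0102', h'030405'), an example from RFC 7049, Appendix A */
        const unsigned char input[] = {0x5F, 0x42, 0x01, 0x02,
                                       0x43, 0x03, 0x04, 0x05, 0xFF};
        size_t total = 0;
        size_t used = walk_indef_bytestring(input, sizeof input,
                                            count_bytes, &total);

        printf("consumed %zu bytes, payload %zu bytes\n", used, total);
        return 0;
    }

Because the client sees every chunk separately, it can copy, stream, or discard each one, which is what makes data larger than available memory manageable.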


Coding style
-------------
This code loosely follows the `Linux kernel coding style <https://www.kernel.org/doc/Documentation/CodingStyle>`_. Tabs are tabs, and they are 4 characters wide.


Memory layout
---------------
CBOR is very dynamic in the sense that it contains many data elements of variable length, sometimes even indefinite length. This section describes the internal representation of all CBOR data types.

Generally speaking, data items consist of three parts:

 - a generic :type:`handle <cbor_item_t>`,
 - the associated :type:`metadata <cbor_item_metadata>`,
 - and the actual data

.. type:: cbor_item_t

    Represents the item. Used as an opaque type.

    .. member:: cbor_type type

        Type discriminator

    .. member:: size_t refcount

        Reference counter. Used by :func:`cbor_decref`, :func:`cbor_incref`

    .. member:: union cbor_item_metadata metadata

        Union discriminated by :member:`type`. Contains type-specific metadata

    .. member:: unsigned char * data

        Contains pointer to the actual data. Small, fixed size items (:doc:`api/type_0_1`, :doc:`api/type_6`, :doc:`api/type_7`) are allocated as a single memory block.

        Consider the following snippet

        .. code-block:: c

            cbor_item_t * item = cbor_new_int8();

        then the memory is laid out as follows

        ::

            +-----------+---------------+---------------+-----------------------------------++-----------+
            |           |               |               |                                   ||           |
            |   type    |   refcount    |   metadata    |               data                ||  uint8_t  |
            |           |               |               |  (= item + sizeof(cbor_item_t))   ||           |
            +-----------+---------------+---------------+-----------------------------------++-----------+
            ^                                                                                ^
            |                                                                                |
            +--- item                                                                        +--- item->data

        Dynamically sized types (:doc:`api/type_2`, :doc:`api/type_3`, :doc:`api/type_4`, :doc:`api/type_5`) may store handle and data in separate locations. This enables creating large items (e.g. :doc:`byte strings <api/type_2>`) without :func:`realloc` or copying large blocks of memory. One simply attaches the correct pointer to the handle.


.. type:: cbor_item_metadata

    Union type of the following members, based on the item type:

    .. member:: struct _cbor_int_metadata int_metadata

        Used by both integer types (:doc:`api/type_0_1`)

    .. member:: struct _cbor_bytestring_metadata bytestring_metadata
    .. member:: struct _cbor_string_metadata string_metadata
    .. member:: struct _cbor_array_metadata array_metadata
    .. member:: struct _cbor_map_metadata map_metadata
    .. member:: struct _cbor_tag_metadata tag_metadata
    .. member:: struct _cbor_float_ctrl_metadata float_ctrl_metadata

Decoding
---------

As outlined in :doc:`api`, decoding is based on the streaming decoder. Essentially, the decoder is a custom set of callbacks for the streaming decoder.
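
The relationship between the two layers can be sketched with a small, standalone toy. None of the names below (``toy_callbacks``, ``toy_stream_decode``, ``toy_builder``, ``toy_load``) belong to *libcbor*; they are hypothetical stand-ins, and the toy only understands unsigned integers with an embedded or one-byte value. The point is the shape: the streaming layer reports events through a callback table, and the "decoder" is nothing more than a particular set of callbacks that accumulates those events.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical event table standing in for streaming decoder callbacks. */
    struct toy_callbacks {
        void (*on_uint)(void *ctx, uint64_t value);
    };

    /*
     * Toy streaming step: decodes one unsigned integer (major type 0) and
     * reports it through the callbacks. Returns the number of bytes consumed,
     * or 0 if the input is unsupported or incomplete.
     */
    static size_t toy_stream_decode(const unsigned char *buf, size_t size,
                                    const struct toy_callbacks *cbs, void *ctx)
    {
        if (size == 0)
            return 0;
        if (buf[0] <= 0x17) { /* value embedded in the MTB */
            cbs->on_uint(ctx, buf[0]);
            return 1;
        }
        if (buf[0] == 0x18 && size >= 2) { /* value in the following byte */
            cbs->on_uint(ctx, buf[1]);
            return 2;
        }
        return 0;
    }

    /* Builder state: collects decoded values into a fixed-size array. */
    struct toy_builder {
        uint64_t values[8];
        size_t count;
    };

    static void builder_on_uint(void *ctx, uint64_t value)
    {
        struct toy_builder *b = ctx;

        if (b->count < 8)
            b->values[b->count++] = value;
    }

    /* The "decoder": the streaming step driven with the builder callbacks. */
    static size_t toy_load(const unsigned char *buf, size_t size,
                           struct toy_builder *out)
    {
        static const struct toy_callbacks builder = {
            .on_uint = builder_on_uint,
        };
        size_t pos = 0;

        out->count = 0;
        while (pos < size) {
            size_t used = toy_stream_decode(buf + pos, size - pos, &builder, out);

            if (used == 0)
                break; /* unsupported or truncated input */
            pos += used;
        }
        return pos;
    }

    int main(void)
    {
        /* CBOR encodings of 10, 100 and 23 */
        const unsigned char input[] = {0x0A, 0x18, 0x64, 0x17};
        struct toy_builder result;
        size_t used = toy_load(input, sizeof input, &result);

        printf("consumed %zu bytes, decoded %zu values\n", used, result.count);
        return 0;
    }

Swapping ``builder_on_uint`` for a different handler changes what is done with the decoded values without touching the streaming step.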