Internal mechanics
==========================

Internal workings of *libcbor* are mostly derived from the specification. The purpose of this document is to describe technical choices made during design & implementation and to explicate the reasoning behind those choices.

Terminology
---------------
===  ======================  ============================================================================================================================================
MTB  Major Type Byte         https://www.rfc-editor.org/rfc/rfc8949.html#section-3.1
DST  Dynamically Sized Type  Type whose storage requirements cannot be determined during compilation (originated in the `Rust <http://www.rust-lang.org/>`_ community)
===  ======================  ============================================================================================================================================

Conventions
--------------
API symbols start with ``cbor_`` or ``CBOR_`` prefix, internal symbols have ``_cbor_`` or ``_CBOR_`` prefix.

Inspiration & related projects
-------------------------------
The API is largely modelled after existing JSON libraries, including

 - `Jansson <http://www.digip.org/jansson/>`_
 - `json-c <https://github.com/json-c/json-c>`_
 - Gnome's `JsonGlib <https://wiki.gnome.org/action/show/Projects/JsonGlib?action=show&redirect=JsonGlib>`_

and also borrows from

 - `msgpack-c <https://github.com/msgpack/msgpack-c>`_
 - `Google Protocol Buffers <http://code.google.com/p/protobuf/>`_.

General notes on the API design
--------------------------------
The API design has two main driving principles:

 1. Let the client manage the memory as much as possible
 2. Behave exactly as specified by the standard

Combining these two principles in practice turns out to be quite difficult. Indefinite-length strings, arrays, and maps require the client to handle every fixed-size chunk explicitly in order to

 - ensure the client never runs out of memory due to *libcbor*
 - use :func:`realloc` sparingly and predictably [#]_

   - provide strong guarantees about its usage (to prevent latency spikes)
   - provide APIs to avoid :func:`realloc` altogether

 - allow proper handling of (streamed) data bigger than available memory

 .. [#] Reasonable handling of DSTs requires reallocation if the API is to remain sane.
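
The chunk-by-chunk handling described above can be sketched in plain C. This is only an illustration under stated assumptions, not *libcbor*'s API: ``SKETCH_CHUNK_SIZE``, ``sketch_chunk_handler``, ``sketch_for_each_chunk``, ``struct sketch_stats`` and ``sketch_count_chunk`` are hypothetical names introduced here. The point it tries to show is that a client which only ever sees one fixed-size chunk at a time needs memory bounded by the chunk size, not by the total length of the data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chunk size; a real client would pick one that fits its memory budget. */
#define SKETCH_CHUNK_SIZE 4

/* Hypothetical client callback: receives exactly one chunk per call. */
typedef void (*sketch_chunk_handler)(void *ctx, const unsigned char *chunk, size_t len);

/* Hands the data to the client in chunks of at most SKETCH_CHUNK_SIZE bytes.
 * Nothing is copied or reallocated. Returns the number of chunks delivered. */
static size_t sketch_for_each_chunk(const unsigned char *src, size_t len,
                                    sketch_chunk_handler handle, void *ctx)
{
    size_t chunks = 0;

    assert(handle != NULL);
    for (size_t off = 0; off < len; off += SKETCH_CHUNK_SIZE) {
        size_t remaining = len - off;
        size_t n = remaining < SKETCH_CHUNK_SIZE ? remaining : SKETCH_CHUNK_SIZE;
        handle(ctx, src + off, n);
        chunks++;
    }
    return chunks;
}

/* Example client state: keeps running totals without retaining any chunk. */
struct sketch_stats {
    size_t total_bytes;
    size_t largest_chunk;
};

static void sketch_count_chunk(void *ctx, const unsigned char *chunk, size_t len)
{
    struct sketch_stats *stats = ctx;

    (void)chunk; /* a real client would process or forward the bytes here */
    stats->total_bytes += len;
    if (len > stats->largest_chunk)
        stats->largest_chunk = len;
}
```

Passing a 10-byte buffer and ``sketch_count_chunk`` to ``sketch_for_each_chunk`` delivers three chunks of 4, 4, and 2 bytes, so the client never holds more than ``SKETCH_CHUNK_SIZE`` bytes of payload at once.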


Coding style
-------------
This code loosely follows the `Linux kernel coding style <https://www.kernel.org/doc/Documentation/CodingStyle>`_. Tabs are tabs, and they are 4 characters wide.


Memory layout
---------------
CBOR is very dynamic in the sense that it contains many data elements of variable length, sometimes even indefinite length. This section describes the internal representation of all CBOR data types.

Generally speaking, data items consist of three parts:

 - a generic :type:`handle <cbor_item_t>`,
 - the associated :type:`metadata <cbor_item_metadata>`,
 - and the actual data

.. type:: cbor_item_t

    Represents the item. Used as an opaque type

    .. member:: cbor_type type

        Type discriminator

    .. member:: size_t refcount

        Reference counter. Used by :func:`cbor_decref`, :func:`cbor_incref`

    .. member:: union cbor_item_metadata metadata

        Union discriminated by :member:`type`. Contains type-specific metadata

    .. member:: unsigned char * data

        Contains pointer to the actual data. Small, fixed size items (:doc:`api/type_0_1`, :doc:`api/type_6`, :doc:`api/type_7`) are allocated as a single memory block.

        Consider the following snippet

        .. code-block:: c

            cbor_item_t * item = cbor_new_int8();

        then the memory is laid out as follows

        ::

            +-----------+---------------+---------------+-----------------------------------++-----------+
            |           |               |               |                                   ||           |
            |   type    |   refcount    |   metadata    |              data                 ||  uint8_t  |
            |           |               |               |   (= item + sizeof(cbor_item_t))  ||           |
            +-----------+---------------+---------------+-----------------------------------++-----------+
            ^                                                                                ^
            |                                                                                |
            +--- item                                                                        +--- item->data

        Dynamically sized types (:doc:`api/type_2`, :doc:`api/type_3`, :doc:`api/type_4`, :doc:`api/type_5`) may store handle and data in separate locations. This enables creating large items (e.g. :doc:`byte strings <api/type_2>`) without :func:`realloc` or copying large blocks of memory. One simply attaches the correct pointer to the handle.


.. type:: cbor_item_metadata

    Union type of the following members, based on the item type:

    .. member:: struct _cbor_int_metadata int_metadata

        Used by both :doc:`api/type_0_1`

    .. member:: struct _cbor_bytestring_metadata bytestring_metadata
    .. member:: struct _cbor_string_metadata string_metadata
    .. member:: struct _cbor_array_metadata array_metadata
    .. member:: struct _cbor_map_metadata map_metadata
    .. member:: struct _cbor_tag_metadata tag_metadata
    .. member:: struct _cbor_float_ctrl_metadata float_ctrl_metadata

Decoding
---------

As outlined in :doc:`api`, decoding is based on the streaming decoder. Essentially, the decoder is a custom set of callbacks for the streaming decoder.
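
The idea of a decoder that is "a set of callbacks for the streaming decoder" can be sketched as follows. This is a simplified illustration, not *libcbor*'s actual implementation or API: every ``sketch_`` name is hypothetical, and the toy streaming decoder only understands small unsigned integers (major type 0, up to one following byte) and short definite-length array headers (major type 4). The bit layout of the MTB (major type in the top three bits, additional information in the low five) follows RFC 8949.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback set invoked by the streaming layer. */
struct sketch_callbacks {
    void (*on_uint8)(void *ctx, uint8_t value);
    void (*on_array_start)(void *ctx, size_t length);
};

/* Toy streaming decoder: decodes ONE item header from buf and fires a callback.
 * Returns the number of bytes consumed, or 0 if the input is too short or the
 * item is not supported by this sketch. */
static size_t sketch_stream_decode(const unsigned char *buf, size_t len,
                                   const struct sketch_callbacks *cb, void *ctx)
{
    if (len == 0)
        return 0;

    uint8_t major = buf[0] >> 5;   /* top 3 bits of the MTB */
    uint8_t info = buf[0] & 0x1F;  /* low 5 bits: additional information */

    if (major == 0 && info < 24) {         /* value embedded in the MTB */
        cb->on_uint8(ctx, info);
        return 1;
    }
    if (major == 0 && info == 24) {        /* value in the following byte */
        if (len < 2)
            return 0;
        cb->on_uint8(ctx, buf[1]);
        return 2;
    }
    if (major == 4 && info < 24) {         /* short definite-length array */
        cb->on_array_start(ctx, info);
        return 1;
    }
    return 0;
}

/* The "decoder" proper: a context plus callbacks that record what was seen. */
struct sketch_builder {
    size_t items_expected;
    size_t items_seen;
    unsigned sum;
};

static void sketch_builder_uint8(void *ctx, uint8_t value)
{
    struct sketch_builder *b = ctx;
    b->items_seen++;
    b->sum += value;
}

static void sketch_builder_array_start(void *ctx, size_t length)
{
    struct sketch_builder *b = ctx;
    b->items_expected = length;
}

/* Drives the streaming decoder over the whole buffer with the builder callbacks. */
static struct sketch_builder sketch_decode_all(const unsigned char *buf, size_t len)
{
    struct sketch_builder b = {0, 0, 0};
    const struct sketch_callbacks cb = {sketch_builder_uint8, sketch_builder_array_start};
    size_t off = 0;

    assert(buf != NULL || len == 0);
    while (off < len) {
        size_t n = sketch_stream_decode(buf + off, len - off, &cb, &b);
        if (n == 0)
            break; /* truncated or unsupported input: stop */
        off += n;
    }
    return b;
}
```

Feeding the bytes ``0x83 0x01 0x02 0x18 0x64`` (the array ``[1, 2, 100]``) to ``sketch_decode_all`` leaves the builder with ``items_expected == 3``, ``items_seen == 3`` and ``sum == 103``. Keeping the byte-level parsing separate from the callbacks is what lets different consumers reuse the same streaming layer.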