.. index:: encoder

Encoders
========

This section gives an overview of encoders, details on the encoders
that ship with libxo, and documentation for developers of future
encoders.

Overview
--------

The libxo library contains software to generate four "built-in"
formats: text, XML, JSON, and HTML.  These formats are common and
useful, but there are other common and useful formats that users will
want, and including them all in the libxo software would be difficult
and cumbersome.

To allow support for additional encodings, libxo includes a
"pluggable" extension mechanism for dynamically loading new encoders.
libxo-based applications can automatically use any installed encoder.

Use the "encoder=XXX" option to access encoders.  The following
example uses the "cbor" encoder, saving the output into a file::

    df --libxo encoder=cbor > df-output.cbor

Encoders can support specific options that can be accessed by
following the encoder name with a colon (':') or a plus sign ('+') and
one or more options, separated by the same character::

    df --libxo encoder=csv+path=filesystem+leaf=name+no-header
    df --libxo encoder=csv:path=filesystem:leaf=name:no-header

These examples instruct libxo to load the "csv" encoder and pass the
following options::

   path=filesystem
   leaf=name
   no-header

Each of these options is interpreted by the encoder, and all such
option names and semantics are specific to the particular encoder.
Refer to the intended encoder for documentation on its options.

The string "@" can be used in place of the string "encoder="::

    df --libxo @csv:no-header
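
The argument syntax described above can be illustrated with a small
stand-alone parser.  This is only a sketch and is not the code libxo
uses: the names `enc_spec`, `parse_encoder_spec`, and `MAX_OPTS` are
invented for this example, and it assumes the first colon or plus sign
after the encoder name selects the separator for the remaining
options.

```c
#include <assert.h>
#include <string.h>

#define MAX_OPTS 8

/* Hypothetical holder for a parsed encoder argument (not a libxo type) */
struct enc_spec {
    char es_buf[128];               /* Private copy, split in place */
    const char *es_name;            /* Encoder name, e.g. "csv" */
    const char *es_opts[MAX_OPTS];  /* Option strings, e.g. "no-header" */
    int es_nopts;
};

/*
 * Split "encoder=NAME<sep>opt<sep>opt" or "@NAME<sep>opt" into a name
 * and a list of options.  Returns 0 on success, -1 on bad input.
 */
static int
parse_encoder_spec (const char *arg, struct enc_spec *esp)
{
    const char *body;
    char *cp, sep;

    assert(arg != NULL && esp != NULL);

    if (strncmp(arg, "encoder=", 8) == 0)
        body = arg + 8;
    else if (arg[0] == '@')
        body = arg + 1;
    else
        return -1;

    if (strlen(body) >= sizeof(esp->es_buf))
        return -1;
    strcpy(esp->es_buf, body);
    esp->es_name = esp->es_buf;
    esp->es_nopts = 0;

    cp = strpbrk(esp->es_buf, ":+");
    if (cp == esp->es_buf || esp->es_buf[0] == '\0')
        return -1;              /* Missing encoder name */
    if (cp == NULL)
        return 0;               /* Name only, no options */

    sep = *cp;                  /* Remaining options use this separator */
    *cp++ = '\0';

    while (*cp != '\0') {
        char *next = strchr(cp, sep);

        if (esp->es_nopts == MAX_OPTS)
            return -1;
        esp->es_opts[esp->es_nopts++] = cp;
        if (next == NULL)
            break;
        *next = '\0';
        cp = next + 1;
    }

    return 0;
}
```

With this sketch, both "encoder=csv:no-header" and "@csv:no-header"
yield the name "csv" and the single option "no-header".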

.. _csv_encoder:

CSV - Comma Separated Values
----------------------------

libxo ships with a custom encoder for "CSV" files, a common format for
comma separated values.  The output of the CSV encoder can be loaded
directly into spreadsheets or similar applications.

A standard for CSV files is provided in :RFC:`4180`, but since the
format predates that standard by decades, there are many minor
differences in CSV file consumers and their expectations.  The CSV
encoder has a number of options to tailor output to those
expectations.

Consider the following XML::

  % list-items --libxo xml,pretty
  <top>
    <data test="value">
      <item test2="value2">
        <sku test3="value3" key="key">GRO-000-415</sku>
        <name key="key">gum</name>
        <sold>1412</sold>
        <in-stock>54</in-stock>
        <on-order>10</on-order>
      </item>
      <item>
        <sku test3="value3" key="key">HRD-000-212</sku>
        <name key="key">rope</name>
        <sold>85</sold>
        <in-stock>4</in-stock>
        <on-order>2</on-order>
      </item>
      <item>
        <sku test3="value3" key="key">HRD-000-517</sku>
        <name key="key">ladder</name>
        <sold>0</sold>
        <in-stock>2</in-stock>
        <on-order>1</on-order>
      </item>
    </data>
  </top>

This output is a list of `instances` (named "item"), each containing a
set of `leafs` ("sku", "name", etc.).

The CSV encoder will emit the leaf values in this output as `fields`
inside a CSV `record`, which is a line containing a set of
comma-separated values::

  % list-items --libxo encoder=csv
  sku,name,sold,in-stock,on-order
  GRO-000-415,gum,1412,54,10
  HRD-000-212,rope,85,4,2
  HRD-000-517,ladder,0,2,1

Be aware that since the CSV encoder looks for data instances, when
used with :ref:`xo`, the `--instance` option will be needed::

  % xo --libxo encoder=csv --instance foo 'The {:product} is {:status}\n' stereo "in route"
  product,status
  stereo,in route

.. _csv_path:

The `path` Option
~~~~~~~~~~~~~~~~~

By default, the CSV encoder will attempt to emit any list instance
generated by the application.  In some cases, this may be
unacceptable, and a specific list may be desired.

Use the "path" option to limit the processing of output to a specific
hierarchy.  The path should be one or more names of containers or
lists.

For example, if the "list-items" application generates other lists,
the user can give "path=top/data/item" as a path::

  % list-items --libxo encoder=csv:path=top/data/item
  sku,name,sold,in-stock,on-order
  GRO-000-415,gum,1412,54,10
  HRD-000-212,rope,85,4,2
  HRD-000-517,ladder,0,2,1

Paths are "relative", meaning they need not be a complete set
of names to the list.  This means that "path=item" may be sufficient
for the above example.
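
One way to picture a "relative" path is as a match against the
trailing names of the full hierarchy.  The following sketch assumes
that interpretation; the function name `path_matches_tail` is invented
here, and the matching rules of the real encoder may differ.

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if "want" names the trailing components of "full", with
 * the match starting on a '/' boundary; otherwise return 0.
 */
static int
path_matches_tail (const char *full, const char *want)
{
    size_t flen, wlen;

    assert(full != NULL && want != NULL);

    flen = strlen(full);
    wlen = strlen(want);

    if (wlen == 0 || wlen > flen)
        return 0;
    if (strcmp(full + flen - wlen, want) != 0)
        return 0;

    /* Accept a whole-path match or one that starts right after a '/' */
    return flen == wlen || full[flen - wlen - 1] == '/';
}
```

Under this reading, "item", "data/item", and "top/data/item" all
select the list in the example, while "data" alone does not.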

.. _csv_leafs:

The `leafs` Option
~~~~~~~~~~~~~~~~~~

The CSV encoding requires that all lines of output have the same
number of fields with the same order.  In contrast, XML and JSON allow
any order (though libxo forces key leafs to appear before other
leafs).

To maintain a consistent set of fields inside the CSV file, the same
set of leafs must be selected from each list item.  By default, the
CSV encoder records the set of leafs that appear in the first list
instance it processes, and extracts only those leafs from future
instances.  If the first instance is missing a leaf that is desired by
the consumer, the "leafs" option can be used to ensure that an empty
value is recorded for instances that lack a particular leaf.

The "leafs" option can also be used to exclude leafs, limiting the
output to only those leafs provided.

In addition, the order of the output fields follows the order in which
the leafs are listed.  "leafs=one.two" and "leafs=two.one" give
distinct output.

So the "leafs" option can be used to expand, limit, and order the set
of leafs.

The value of the leafs option should be one or more leaf names,
separated by a period (".")::

  % list-items --libxo encoder=csv:leafs=sku.on-order
  sku,on-order
  GRO-000-415,10
  HRD-000-212,2
  HRD-000-517,1
  % list-items --libxo encoder=csv:leafs=on-order.sku
  on-order,sku
  10,GRO-000-415
  2,HRD-000-212
  1,HRD-000-517

Note that libxo uses terminology from YANG (:RFC:`7950`), the data
modeling language for NETCONF (:RFC:`6241`).  YANG uses "leafs" as the
plural form of "leaf", and libxo follows that convention.
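
The selection and ordering behavior described above can be sketched
as a lookup driven by the period-separated list.  This is an
illustration only: `struct leaf` and `build_record` are names invented
for the example, and the sketch ignores quoting.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical name/value pair for one leaf of a list instance */
struct leaf {
    const char *l_name;
    const char *l_value;
};

/*
 * Build one record in "out" by walking the names in "leafs" in order
 * and looking each one up in the instance.  Returns the number of
 * fields written, or -1 if the buffer is too small.
 */
static int
build_record (const char *leafs, const struct leaf *inst, size_t ninst,
              char *out, size_t outsize)
{
    size_t used = 0;
    int nfields = 0;
    const char *cp = leafs;

    assert(leafs != NULL && out != NULL);
    if (outsize == 0)
        return -1;
    out[0] = '\0';

    while (*cp != '\0') {
        size_t len = strcspn(cp, ".");  /* Length of the next leaf name */
        const char *value = "";         /* Missing leafs become empty fields */
        size_t i;
        int n;

        for (i = 0; i < ninst; i++) {
            if (strlen(inst[i].l_name) == len
                    && strncmp(inst[i].l_name, cp, len) == 0) {
                value = inst[i].l_value;
                break;
            }
        }

        n = snprintf(out + used, outsize - used, "%s%s",
                     nfields > 0 ? "," : "", value);
        if (n < 0 || (size_t) n >= outsize - used)
            return -1;          /* Output buffer too small */

        used += (size_t) n;
        nfields += 1;
        cp += len;
        if (*cp == '.')
            cp += 1;
    }

    return nfields;
}
```

Because the output is driven by the option value and not by the
instance, leafs that are not listed are dropped, the field order
follows the list, and a listed leaf that is absent from an instance
still produces an empty field.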

.. _csv_no_header:

The `no-header` Option
~~~~~~~~~~~~~~~~~~~~~~

CSV files typically begin with a line that defines the fields included
in that file, in an attempt to make the contents self-defining::

    sku,name,sold,in-stock,on-order
    GRO-000-415,gum,1412,54,10
    HRD-000-212,rope,85,4,2
    HRD-000-517,ladder,0,2,1

There is no reliable mechanism for determining whether this header
line is included, so the consumer must make an assumption.

The CSV encoder defaults to producing the header line, but the
"no-header" option can be included to avoid the header line.

.. _csv_no_quotes:

The `no-quotes` Option
~~~~~~~~~~~~~~~~~~~~~~

:RFC:`4180` specifies that fields containing commas, double quotes, or
line breaks should be quoted, but many CSV consumers do not handle
quotes.  The "no-quotes" option instructs the CSV encoder to avoid the
use of quotes.

.. _csv_dos:

The `dos` Option
~~~~~~~~~~~~~~~~

:RFC:`4180` defines the end-of-line marker as a carriage return
followed by a newline.  This `CRLF` convention dates from the distant
past, but its use was anchored in the 1980s by the `DOS` operating
system.

The CSV encoder defaults to using the standard Unix end-of-line
marker, a simple newline.  Use the "dos" option to use the `CRLF`
convention.
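
The effect of the "no-quotes" and "dos" options on a single record
can be sketched as follows.  The flag names `CSV_NO_QUOTES` and
`CSV_DOS` and the function `csv_format_record` are invented for this
example.  The sketch assumes the :RFC:`4180` rules: a field is quoted
when it contains a comma, a double quote, or a line break, and an
embedded double quote is doubled.  Whether the libxo encoder applies
exactly these rules is an assumption.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flags mirroring the "no-quotes" and "dos" options */
#define CSV_NO_QUOTES (1 << 0)
#define CSV_DOS       (1 << 1)

/* Append one character, always leaving room for the trailing NUL */
static int
put_char (char *out, size_t outsize, size_t *usedp, char ch)
{
    if (*usedp + 1 >= outsize)
        return -1;
    out[(*usedp)++] = ch;
    return 0;
}

/*
 * Format one record into "out".  Returns 0 on success, or -1 if the
 * record was truncated because the buffer is too small.
 */
static int
csv_format_record (char *out, size_t outsize, const char **fields,
                   size_t nfields, unsigned flags)
{
    size_t used = 0, i;
    int rc = 0;

    assert(out != NULL && outsize > 0);

    for (i = 0; i < nfields; i++) {
        const char *cp = fields[i];
        int quote = !(flags & CSV_NO_QUOTES)
            && strpbrk(cp, ",\"\r\n") != NULL;

        if (i > 0)
            rc |= put_char(out, outsize, &used, ',');
        if (quote)
            rc |= put_char(out, outsize, &used, '"');
        for ( ; *cp != '\0'; cp++) {
            if (quote && *cp == '"')    /* Double any embedded quote */
                rc |= put_char(out, outsize, &used, '"');
            rc |= put_char(out, outsize, &used, *cp);
        }
        if (quote)
            rc |= put_char(out, outsize, &used, '"');
    }

    if (flags & CSV_DOS)                /* CRLF in place of a bare newline */
        rc |= put_char(out, outsize, &used, '\r');
    rc |= put_char(out, outsize, &used, '\n');

    out[used] = '\0';
    return rc ? -1 : 0;
}
```

Note that with `CSV_NO_QUOTES` set, a field that itself contains a
comma becomes indistinguishable from two fields, which is the
trade-off a consumer accepts when asking for unquoted output.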

The Encoder API
---------------

The encoder API consists of three distinct phases:

- loading the encoder
- initializing the encoder
- feeding operations to the encoder

To load the encoder, libxo will open a shared library named::

   ${prefix}/lib/libxo/encoder/${name}.enc

This file is typically a symbolic link to a dynamic library, suitable
for `dlopen()`.  libxo looks for a symbol called
`xo_encoder_library_init` inside that library and calls it with the
arguments defined in the header file "xo_encoder.h".  This function
should look as follows::

  int
  xo_encoder_library_init (XO_ENCODER_INIT_ARGS)
  {
      arg->xei_version = XO_ENCODER_VERSION;
      arg->xei_handler = test_handler;

      return 0;
  }

Several features here allow for future compatibility: the macro
XO_ENCODER_INIT_ARGS allows the arguments to this function to change
over time, and the XO_ENCODER_VERSION allows the library to tell libxo
which version of the API it was compiled with.

The function placed in xei_handler should have the signature::

  static int
  test_handler (XO_ENCODER_HANDLER_ARGS)
  {
       ...

This function will be called with the "op" codes defined in
"xo_encoder.h".  Each op code represents a distinct event in the libxo
processing model.  For example, OP_OPEN_CONTAINER tells the encoder
that a new container has been opened, and the encoder can behave in an
appropriate manner.
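
The event-driven shape of a handler can be sketched without libxo
itself.  Everything in the following example is a stand-in: the
`demo_op` codes, `struct demo_state`, and the explicit argument list
of `demo_handler` are invented so the example compiles on its own.  A
real encoder would use XO_ENCODER_HANDLER_ARGS and the op codes from
"xo_encoder.h".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in op codes for this sketch; the real ones are in xo_encoder.h */
enum demo_op {
    DEMO_OP_CREATE,
    DEMO_OP_OPEN_CONTAINER,
    DEMO_OP_CLOSE_CONTAINER,
    DEMO_OP_STRING,
    DEMO_OP_FINISH
};

/* Private state an encoder might keep between calls */
struct demo_state {
    int ds_depth;               /* Current container nesting depth */
    char ds_out[256];           /* Accumulated output */
};

/* Handle one event; returns 0 on success, -1 on error */
static int
demo_handler (struct demo_state *dsp, enum demo_op op,
              const char *name, const char *value)
{
    size_t used, left;
    int n = 0;

    assert(dsp != NULL);

    if (op == DEMO_OP_CREATE) {         /* Start from a clean state */
        memset(dsp, 0, sizeof(*dsp));
        return 0;
    }

    used = strlen(dsp->ds_out);
    left = sizeof(dsp->ds_out) - used;

    switch (op) {
    case DEMO_OP_OPEN_CONTAINER:
        n = snprintf(dsp->ds_out + used, left, "%*s%s {\n",
                     dsp->ds_depth * 2, "", name);
        dsp->ds_depth += 1;
        break;

    case DEMO_OP_CLOSE_CONTAINER:
        if (dsp->ds_depth == 0)
            return -1;                  /* Close without a matching open */
        dsp->ds_depth -= 1;
        n = snprintf(dsp->ds_out + used, left, "%*s}\n",
                     dsp->ds_depth * 2, "");
        break;

    case DEMO_OP_STRING:
        n = snprintf(dsp->ds_out + used, left, "%*s%s = %s\n",
                     dsp->ds_depth * 2, "", name, value);
        break;

    default:                            /* Events this sketch ignores */
        break;
    }

    return (n < 0 || (size_t) n >= left) ? -1 : 0;
}
```

The point of the sketch is the structure: the encoder keeps its own
state, and each call reports one event that the encoder turns into
output in whatever way suits its format.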