.. index:: encoder

Encoders
========

This section gives an overview of encoders, details on the encoders
that ship with libxo, and documentation for developers of future
encoders.

Overview
--------

The libxo library contains software to generate four "built-in"
formats: text, XML, JSON, and HTML.  These formats are common and
useful, but there are other common and useful formats that users will
want, and including them all in the libxo software would be difficult
and cumbersome.

To allow support for additional encodings, libxo includes a
"pluggable" extension mechanism for dynamically loading new encoders.
libxo-based applications can automatically use any installed encoder.

Use the "encoder=XXX" option to access encoders.  The following
example uses the "cbor" encoder, saving the output into a file::

    df --libxo encoder=cbor > df-output.cbor

Encoders can support specific options that can be accessed by
following the encoder name with a colon (':') or a plus sign ('+') and
one or more options, separated by the same character::

    df --libxo encoder=csv+path=filesystem+leaf=name+no-header
    df --libxo encoder=csv:path=filesystem:leaf=name:no-header

These examples instruct libxo to load the "csv" encoder and pass the
following options::

   path=filesystem
   leaf=name
   no-header

Each of these options is interpreted by the encoder, and all such
option names and semantics are specific to the particular encoder.
Refer to the intended encoder for documentation on its options.

The string "@" can be used in place of the string "encoder="::

    df --libxo @csv:no-header

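The option syntax above can be illustrated with a small, self-contained
sketch.  This is not libxo's own parser; the names
``demo_encoder_spec`` and ``demo_parse_encoder_spec`` are hypothetical,
and the sketch assumes that the first ':' or '+' found after the
encoder name selects the separator used for the rest of the string::

    #include <stddef.h>
    #include <string.h>

    #define DEMO_MAX_OPTS 8         /* arbitrary limit for this sketch */

    /* Hypothetical result type; not a libxo structure */
    struct demo_encoder_spec {
        char *name;                 /* encoder name, e.g. "csv" */
        char *opts[DEMO_MAX_OPTS];  /* raw option strings, e.g. "no-header" */
        size_t nopts;               /* number of entries used in opts[] */
    };

    /*
     * Split a string such as "encoder=csv+path=filesystem+no-header" or
     * "@csv:no-header" in place.  Returns 0 on success, or -1 if there
     * are more than DEMO_MAX_OPTS options.
     */
    static int
    demo_parse_encoder_spec (char *spec, struct demo_encoder_spec *out)
    {
        /* Accept either the "@" shorthand or the "encoder=" prefix */
        if (*spec == '@')
            spec += 1;
        else if (strncmp(spec, "encoder=", 8) == 0)
            spec += 8;

        out->name = spec;
        out->nopts = 0;

        char *cp = strpbrk(spec, ":+");
        if (cp == NULL)
            return 0;               /* Encoder name only, no options */

        char sep = *cp;             /* Assumption: first one seen wins */
        *cp++ = '\0';               /* Terminate the encoder name */

        while (*cp != '\0') {
            if (out->nopts >= DEMO_MAX_OPTS)
                return -1;
            out->opts[out->nopts++] = cp;

            char *next = strchr(cp, sep);
            if (next == NULL)
                break;
            *next = '\0';           /* Terminate this option */
            cp = next + 1;
        }

        return 0;
    }

Because the string is split in place, the caller must pass a writable
buffer.  Under the stated assumption, a string such as
"csv+path=a:b" yields a single option, "path=a:b", since only the plus
sign is treated as the separator.
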
.. _csv_encoder:

CSV - Comma Separated Values
----------------------------

libxo ships with a custom encoder for "CSV" files, a common format for
comma separated values.  The output of the CSV encoder can be loaded
directly into spreadsheets or similar applications.

A standard for CSV files is provided in :RFC:`4180`, but since the
format predates that standard by decades, there are many minor
differences in CSV file consumers and their expectations.  The CSV
encoder has a number of options to tailor output to those
expectations.

Consider the following XML::

  % list-items --libxo xml,pretty
  <top>
    <data test="value">
      <item test2="value2">
        <sku test3="value3" key="key">GRO-000-415</sku>
        <name key="key">gum</name>
        <sold>1412</sold>
        <in-stock>54</in-stock>
        <on-order>10</on-order>
      </item>
      <item>
        <sku test3="value3" key="key">HRD-000-212</sku>
        <name key="key">rope</name>
        <sold>85</sold>
        <in-stock>4</in-stock>
        <on-order>2</on-order>
      </item>
      <item>
        <sku test3="value3" key="key">HRD-000-517</sku>
        <name key="key">ladder</name>
        <sold>0</sold>
        <in-stock>2</in-stock>
        <on-order>1</on-order>
      </item>
    </data>
  </top>

This output is a list of `instances` (named "item"), each containing a
set of `leafs` ("sku", "name", etc).

The CSV encoder will emit the leaf values in this output as `fields`
inside a CSV `record`, which is a line containing a set of
comma-separated values::

  % list-items --libxo encoder=csv
  sku,name,sold,in-stock,on-order
  GRO-000-415,gum,1412,54,10
  HRD-000-212,rope,85,4,2
  HRD-000-517,ladder,0,2,1

Be aware that since the CSV encoder looks for data instances, when
used with :ref:`xo`, the `--instance` option will be needed::

  % xo --libxo encoder=csv --instance foo 'The {:product} is {:status}\n' stereo "in route"
  product,status
  stereo,in route

.. _csv_path:

The `path` Option
~~~~~~~~~~~~~~~~~

By default, the CSV encoder will attempt to emit any list instance
generated by the application.  In some cases, this may be
unacceptable, and a specific list may be desired.

Use the "path" option to limit the processing of output to a specific
hierarchy.  The path should be one or more names of containers or
lists.

For example, if the "list-items" application generates other lists,
the user can give "path=top/data/item" as a path::

  % list-items --libxo encoder=csv:path=top/data/item
  sku,name,sold,in-stock,on-order
  GRO-000-415,gum,1412,54,10
  HRD-000-212,rope,85,4,2
  HRD-000-517,ladder,0,2,1

Paths are "relative", meaning they need not be a complete set
of names to the list.  This means that "path=item" may be sufficient
for the above example.

.. _csv_leafs:

The `leafs` Option
~~~~~~~~~~~~~~~~~~

The CSV encoding requires that all lines of output have the same
number of fields with the same order.  In contrast, XML and JSON allow
any order (though libxo forces key leafs to appear before other
leafs).

To maintain a consistent set of fields inside the CSV file, the same
set of leafs must be selected from each list item.  By default, the
CSV encoder records the set of leafs that appear in the first list
instance it processes, and extracts only those leafs from future
instances.  If the first instance is missing a leaf that is desired by
the consumer, the "leafs" option can be used to ensure that an empty
value is recorded for instances that lack a particular leaf.

The "leafs" option can also be used to exclude leafs, limiting the
output to only those leafs provided.

In addition, the order of the output fields follows the order in which
the leafs are listed.  "leafs=one.two" and "leafs=two.one" give
distinct output.

So the "leafs" option can be used to expand, limit, and order the set
of leafs.

The value of the leafs option should be one or more leaf names,
separated by a period (".")::

  % list-items --libxo encoder=csv:leafs=sku.on-order
  sku,on-order
  GRO-000-415,10
  HRD-000-212,2
  HRD-000-517,1
  % list-items --libxo encoder=csv:leafs=on-order.sku
  on-order,sku
  10,GRO-000-415
  2,HRD-000-212
  1,HRD-000-517

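The selection and ordering behavior described above can be sketched in
a few lines of C.  This is an illustration only, not the CSV encoder's
implementation: ``demo_leaf`` and ``demo_emit_record`` are hypothetical
names, an instance is modeled as a flat array of name/value pairs, and
quoting is ignored::

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical name/value pair standing in for one leaf */
    struct demo_leaf {
        const char *name;
        const char *value;
    };

    /*
     * Build one CSV record in "buf", taking fields in the order given
     * by the period-separated "leafs" string.  A leaf that the
     * instance lacks yields an empty field.  Returns 0 on success, or
     * -1 if "buf" is too small.
     */
    static int
    demo_emit_record (const char *leafs, const struct demo_leaf *inst,
                      size_t ninst, char *buf, size_t bufsiz)
    {
        size_t used = 0;
        int first = 1;

        if (bufsiz == 0)
            return -1;
        buf[0] = '\0';

        for (const char *cp = leafs; *cp != '\0'; ) {
            size_t len = strcspn(cp, ".");  /* Length of this leaf name */
            const char *value = "";         /* Default: empty field */

            for (size_t i = 0; i < ninst; i++) {
                if (strlen(inst[i].name) == len
                    && strncmp(inst[i].name, cp, len) == 0) {
                    value = inst[i].value;
                    break;
                }
            }

            int rc = snprintf(buf + used, bufsiz - used, "%s%s",
                              first ? "" : ",", value);
            if (rc < 0 || (size_t) rc >= bufsiz - used)
                return -1;                  /* Output was truncated */
            used += (size_t) rc;
            first = 0;

            cp += len;
            if (*cp == '.')
                cp += 1;                    /* Skip the separator */
        }

        return 0;
    }

Calling this with the leafs of the first "item" above and
"sku.on-order" produces "GRO-000-415,10", while "on-order.sku"
produces "10,GRO-000-415".  A name that the instance lacks, as in
"sku.missing.name", leaves an empty field: "GRO-000-415,,gum".
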
Note that libxo uses terminology from YANG (:RFC:`7950`), the
data modeling language for NETCONF (:RFC:`6241`), which uses "leafs"
as the plural form of "leaf".  libxo follows that convention.

.. _csv_no_header:

The `no-header` Option
~~~~~~~~~~~~~~~~~~~~~~

CSV files typically begin with a line that defines the fields included
in that file, in an attempt to make the contents self-defining::

    sku,name,sold,in-stock,on-order
    GRO-000-415,gum,1412,54,10
    HRD-000-212,rope,85,4,2
    HRD-000-517,ladder,0,2,1

There is no reliable mechanism for determining whether this header
line is included, so the consumer must make an assumption.

The CSV encoder defaults to producing the header line, but the
"no-header" option can be included to avoid the header line.

.. _csv_no_quotes:

The `no-quotes` Option
~~~~~~~~~~~~~~~~~~~~~~

:RFC:`4180` specifies that fields containing commas, double quotes, or
line breaks should be enclosed in double quotes, but many CSV
consumers do not handle quotes.  The "no-quotes" option instructs the
CSV encoder to avoid the use of quotes.

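The following sketch shows the quoting rule from :RFC:`4180` together
with a "no-quotes" style switch.  It is a minimal illustration, not the
CSV encoder's code: ``demo_csv_field`` is a hypothetical name, and the
real encoder may apply additional rules of its own::

    #include <stddef.h>
    #include <string.h>

    /*
     * Copy "value" into "buf" as a single CSV field.  Unless
     * "no_quotes" is set, a value containing a comma, a double quote,
     * or a line break is enclosed in double quotes, and each embedded
     * double quote is doubled.  Returns the number of bytes written
     * (excluding the NUL), or -1 if "buf" may be too small.
     */
    static int
    demo_csv_field (const char *value, int no_quotes,
                    char *buf, size_t bufsiz)
    {
        int quote = !no_quotes && strpbrk(value, ",\"\r\n") != NULL;
        size_t used = 0;

        /* Worst case: every byte doubled, two enclosing quotes, a NUL */
        if (bufsiz < 2 * strlen(value) + 3)
            return -1;

        if (quote)
            buf[used++] = '"';

        for (const char *cp = value; *cp != '\0'; cp++) {
            if (quote && *cp == '"')
                buf[used++] = '"';  /* Escape a quote by doubling it */
            buf[used++] = *cp;
        }

        if (quote)
            buf[used++] = '"';
        buf[used] = '\0';

        return (int) used;
    }

In this sketch a value such as "in route" is copied unchanged, since a
space alone does not trigger quoting, while "a,b" is emitted with
enclosing double quotes unless ``no_quotes`` is set.  The size check is
deliberately conservative so that the copy loop needs no further bounds
tests.
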
.. _csv_dos:

The `dos` Option
~~~~~~~~~~~~~~~~

:RFC:`4180` defines the end-of-line marker as a carriage return
followed by a newline.  This `CRLF` convention dates from the distant
past, but its use was anchored in the 1980s by the `DOS` operating
system.

The CSV encoder defaults to using the standard Unix end-of-line
marker, a simple newline.  Use the "dos" option to use the `CRLF`
convention.

The Encoder API
---------------

The encoder API consists of three distinct phases:

- loading the encoder
- initializing the encoder
- feeding operations to the encoder

To load the encoder, libxo will open a shared library named::

   ${prefix}/lib/libxo/encoder/${name}.enc

This file is typically a symbolic link to a dynamic library, suitable
for `dlopen()`.  libxo looks for a symbol called
`xo_encoder_library_init` inside that library and calls it with the
arguments defined in the header file "xo_encoder.h".  This function
should look as follows::

  int
  xo_encoder_library_init (XO_ENCODER_INIT_ARGS)
  {
      arg->xei_version = XO_ENCODER_VERSION;
      arg->xei_handler = test_handler;

      return 0;
  }

Several features here allow for future compatibility: the macro
XO_ENCODER_INIT_ARGS allows the arguments to this function to change
over time, and the XO_ENCODER_VERSION allows the library to tell libxo
which version of the API it was compiled with.

The function placed in xei_handler should have the signature::

  static int
  test_handler (XO_ENCODER_HANDLER_ARGS)
  {
       ...
  }

This function will be called with the "op" codes defined in
"xo_encoder.h".  Each op code represents a distinct event in the libxo
processing model.  For example, XO_OP_OPEN_CONTAINER tells the encoder
that a new container has been opened, and the encoder can behave in an
appropriate manner.

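The event-driven shape of a handler can be sketched without the libxo
headers.  Everything in the following block is a hypothetical stand-in:
the ``demo_op`` values are not the op codes from "xo_encoder.h", and
``demo_handler`` takes plain parameters rather than
XO_ENCODER_HANDLER_ARGS.  It only illustrates how a handler might
dispatch on the op code and keep per-handle state::

    #include <stdio.h>

    /* Hypothetical op codes; the real ones are in "xo_encoder.h" */
    enum demo_op {
        DEMO_OP_OPEN_CONTAINER,
        DEMO_OP_CLOSE_CONTAINER,
        DEMO_OP_STRING,
        DEMO_OP_FINISH
    };

    /* Hypothetical private state an encoder might keep per handle */
    struct demo_state {
        int depth;                  /* Number of open containers */
        int leafs;                  /* Number of leaf values seen */
    };

    /*
     * React to one event.  Returns 0 on success, or -1 if the event
     * stream is unbalanced.
     */
    static int
    demo_handler (struct demo_state *sp, enum demo_op op,
                  const char *name, const char *value)
    {
        switch (op) {
        case DEMO_OP_OPEN_CONTAINER:
            printf("%*sopen %s\n", sp->depth * 2, "", name);
            sp->depth += 1;
            break;

        case DEMO_OP_CLOSE_CONTAINER:
            if (sp->depth == 0)
                return -1;          /* Close without a matching open */
            sp->depth -= 1;
            printf("%*sclose %s\n", sp->depth * 2, "", name);
            break;

        case DEMO_OP_STRING:
            printf("%*s%s=%s\n", sp->depth * 2, "", name, value);
            sp->leafs += 1;
            break;

        case DEMO_OP_FINISH:
            return (sp->depth == 0) ? 0 : -1;
        }

        return 0;
    }

A real encoder would follow the same pattern with the op codes and
arguments supplied by "xo_encoder.h", typically writing its encoded
output instead of the indented trace printed here.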
275