<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
5  <head>
6    <!--
7                            __  __            _
8                         ___\ \/ /_ __   __ _| |_
9                        / _ \\  /| '_ \ / _` | __|
10                       |  __//  \| |_) | (_| | |_
11                        \___/_/\_\ .__/ \__,_|\__|
12                                 |_| XML parser
13
14   Copyright (c) 2000      Clark Cooper <coopercc@users.sourceforge.net>
15   Copyright (c) 2000-2004 Fred L. Drake, Jr. <fdrake@users.sourceforge.net>
16   Copyright (c) 2002-2012 Karl Waclawek <karl@waclawek.net>
17   Copyright (c) 2017-2026 Sebastian Pipping <sebastian@pipping.org>
18   Copyright (c) 2017      Jakub Wilk <jwilk@jwilk.net>
19   Copyright (c) 2021      Tomas Korbar <tkorbar@redhat.com>
20   Copyright (c) 2021      Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
21   Copyright (c) 2022      Thijs Schreijer <thijs@thijsschreijer.nl>
22   Copyright (c) 2023-2025 Hanno Böck <hanno@gentoo.org>
23   Copyright (c) 2023      Sony Corporation / Snild Dolkow <snild@sony.com>
24   Licensed under the MIT license:
25
26   Permission is  hereby granted,  free of charge,  to any  person obtaining
27   a  copy  of  this  software   and  associated  documentation  files  (the
28   "Software"),  to  deal in  the  Software  without restriction,  including
29   without  limitation the  rights  to use,  copy,  modify, merge,  publish,
30   distribute, sublicense, and/or sell copies of the Software, and to permit
31   persons  to whom  the Software  is  furnished to  do so,  subject to  the
32   following conditions:
33
34   The above copyright  notice and this permission notice  shall be included
35   in all copies or substantial portions of the Software.
36
37   THE  SOFTWARE  IS  PROVIDED  "AS  IS",  WITHOUT  WARRANTY  OF  ANY  KIND,
38   EXPRESS  OR IMPLIED,  INCLUDING  BUT  NOT LIMITED  TO  THE WARRANTIES  OF
39   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN
40   NO EVENT SHALL THE AUTHORS OR  COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
41   DAMAGES OR  OTHER LIABILITY, WHETHER  IN AN  ACTION OF CONTRACT,  TORT OR
42   OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
43   USE OR OTHER DEALINGS IN THE SOFTWARE.
44-->
45
46    <title>
47      Expat XML Parser
48    </title>
49    <meta name="author" content="Clark Cooper, coopercc@netheaven.com" />
50    <link href="ok.min.css" rel="stylesheet" />
51    <link href="style.css" rel="stylesheet" />
52  </head>
53  <body>
54    <div>
55      <h1>
56        The Expat XML Parser <small>Release 2.8.0</small>
57      </h1>
58    </div>
59
60    <div class="content">
61      <p>
62        Expat is a library, written in C, for parsing XML documents. It's the underlying
63        XML parser for the open source Mozilla project, Perl's <code>XML::Parser</code>,
64        Python's <code>xml.parsers.expat</code>, and other open-source XML parsers.
65      </p>
66
67      <p>
        This library is the creation of James Clark, who has also given us groff (an
        nroff look-alike), Jade (an implementation of ISO's DSSSL stylesheet language for
        SGML), XP (a Java XML parser package), and XT (a Java XSL engine). James was also
        the technical lead on the XML Working Group at W3C that produced the XML
        specification.
73      </p>
74
75      <p>
76        This is free software, licensed under the <a href="../COPYING">MIT/X Consortium
77        license</a>. You may download it from <a href="https://libexpat.github.io/">the
78        Expat home page</a>.
79      </p>
80
81      <p>
82        The bulk of this document was originally commissioned as an article by <a href=
83        "https://www.xml.com/">XML.com</a>. They graciously allowed Clark Cooper to
84        retain copyright and to distribute it with Expat. This version has been
85        substantially extended to include documentation on features which have been added
86        since the original article was published, and additional information on using the
87        original interface.
88      </p>
89
90      <hr />
91
92      <h2>
93        Table of Contents
94      </h2>
95
96      <ul>
97        <li>
98          <a href="#overview">Overview</a>
99        </li>
100
101        <li>
102          <a href="#building">Building and Installing</a>
103        </li>
104
105        <li>
106          <a href="#using">Using Expat</a>
107        </li>
108
109        <li>
110          <a href="#reference">Reference</a>
111          <ul>
112            <li>
113              <a href="#creation">Parser Creation Functions</a>
114              <ul>
115                <li>
116                  <a href="#XML_ParserCreate">XML_ParserCreate</a>
117                </li>
118
119                <li>
120                  <a href="#XML_ParserCreateNS">XML_ParserCreateNS</a>
121                </li>
122
123                <li>
124                  <a href="#XML_ParserCreate_MM">XML_ParserCreate_MM</a>
125                </li>
126
127                <li>
128                  <a href=
129                  "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a>
130                </li>
131
132                <li>
133                  <a href="#XML_ParserFree">XML_ParserFree</a>
134                </li>
135
136                <li>
137                  <a href="#XML_ParserReset">XML_ParserReset</a>
138                </li>
139              </ul>
140            </li>
141
142            <li>
143              <a href="#parsing">Parsing Functions</a>
144              <ul>
145                <li>
146                  <a href="#XML_Parse">XML_Parse</a>
147                </li>
148
149                <li>
150                  <a href="#XML_ParseBuffer">XML_ParseBuffer</a>
151                </li>
152
153                <li>
154                  <a href="#XML_GetBuffer">XML_GetBuffer</a>
155                </li>
156
157                <li>
158                  <a href="#XML_StopParser">XML_StopParser</a>
159                </li>
160
161                <li>
162                  <a href="#XML_ResumeParser">XML_ResumeParser</a>
163                </li>
164
165                <li>
166                  <a href="#XML_GetParsingStatus">XML_GetParsingStatus</a>
167                </li>
168              </ul>
169            </li>
170
171            <li>
172              <a href="#setting">Handler Setting Functions</a>
173              <ul>
174                <li>
175                  <a href="#XML_SetStartElementHandler">XML_SetStartElementHandler</a>
176                </li>
177
178                <li>
179                  <a href="#XML_SetEndElementHandler">XML_SetEndElementHandler</a>
180                </li>
181
182                <li>
183                  <a href="#XML_SetElementHandler">XML_SetElementHandler</a>
184                </li>
185
186                <li>
187                  <a href="#XML_SetCharacterDataHandler">XML_SetCharacterDataHandler</a>
188                </li>
189
190                <li>
191                  <a href=
192                  "#XML_SetProcessingInstructionHandler">XML_SetProcessingInstructionHandler</a>
193                </li>
194
195                <li>
196                  <a href="#XML_SetCommentHandler">XML_SetCommentHandler</a>
197                </li>
198
199                <li>
200                  <a href=
201                  "#XML_SetStartCdataSectionHandler">XML_SetStartCdataSectionHandler</a>
202                </li>
203
204                <li>
205                  <a href=
206                  "#XML_SetEndCdataSectionHandler">XML_SetEndCdataSectionHandler</a>
207                </li>
208
209                <li>
210                  <a href="#XML_SetCdataSectionHandler">XML_SetCdataSectionHandler</a>
211                </li>
212
213                <li>
214                  <a href="#XML_SetDefaultHandler">XML_SetDefaultHandler</a>
215                </li>
216
217                <li>
218                  <a href="#XML_SetDefaultHandlerExpand">XML_SetDefaultHandlerExpand</a>
219                </li>
220
221                <li>
222                  <a href=
223                  "#XML_SetExternalEntityRefHandler">XML_SetExternalEntityRefHandler</a>
224                </li>
225
226                <li>
227                  <a href=
228                  "#XML_SetExternalEntityRefHandlerArg">XML_SetExternalEntityRefHandlerArg</a>
229                </li>
230
231                <li>
232                  <a href="#XML_SetSkippedEntityHandler">XML_SetSkippedEntityHandler</a>
233                </li>
234
235                <li>
236                  <a href=
237                  "#XML_SetUnknownEncodingHandler">XML_SetUnknownEncodingHandler</a>
238                </li>
239
240                <li>
241                  <a href=
242                  "#XML_SetStartNamespaceDeclHandler">XML_SetStartNamespaceDeclHandler</a>
243                </li>
244
245                <li>
246                  <a href=
247                  "#XML_SetEndNamespaceDeclHandler">XML_SetEndNamespaceDeclHandler</a>
248                </li>
249
250                <li>
251                  <a href="#XML_SetNamespaceDeclHandler">XML_SetNamespaceDeclHandler</a>
252                </li>
253
254                <li>
255                  <a href="#XML_SetXmlDeclHandler">XML_SetXmlDeclHandler</a>
256                </li>
257
258                <li>
259                  <a href=
260                  "#XML_SetStartDoctypeDeclHandler">XML_SetStartDoctypeDeclHandler</a>
261                </li>
262
263                <li>
264                  <a href=
265                  "#XML_SetEndDoctypeDeclHandler">XML_SetEndDoctypeDeclHandler</a>
266                </li>
267
268                <li>
269                  <a href="#XML_SetDoctypeDeclHandler">XML_SetDoctypeDeclHandler</a>
270                </li>
271
272                <li>
273                  <a href="#XML_SetElementDeclHandler">XML_SetElementDeclHandler</a>
274                </li>
275
276                <li>
277                  <a href="#XML_SetAttlistDeclHandler">XML_SetAttlistDeclHandler</a>
278                </li>
279
280                <li>
281                  <a href="#XML_SetEntityDeclHandler">XML_SetEntityDeclHandler</a>
282                </li>
283
284                <li>
285                  <a href=
286                  "#XML_SetUnparsedEntityDeclHandler">XML_SetUnparsedEntityDeclHandler</a>
287                </li>
288
289                <li>
290                  <a href="#XML_SetNotationDeclHandler">XML_SetNotationDeclHandler</a>
291                </li>
292
293                <li>
294                  <a href="#XML_SetNotStandaloneHandler">XML_SetNotStandaloneHandler</a>
295                </li>
296              </ul>
297            </li>
298
299            <li>
300              <a href="#position">Parse Position and Error Reporting Functions</a>
301              <ul>
302                <li>
303                  <a href="#XML_GetErrorCode">XML_GetErrorCode</a>
304                </li>
305
306                <li>
307                  <a href="#XML_ErrorString">XML_ErrorString</a>
308                </li>
309
310                <li>
311                  <a href="#XML_GetCurrentByteIndex">XML_GetCurrentByteIndex</a>
312                </li>
313
314                <li>
315                  <a href="#XML_GetCurrentLineNumber">XML_GetCurrentLineNumber</a>
316                </li>
317
318                <li>
319                  <a href="#XML_GetCurrentColumnNumber">XML_GetCurrentColumnNumber</a>
320                </li>
321
322                <li>
323                  <a href="#XML_GetCurrentByteCount">XML_GetCurrentByteCount</a>
324                </li>
325
326                <li>
327                  <a href="#XML_GetInputContext">XML_GetInputContext</a>
328                </li>
329              </ul>
330            </li>
331
332            <li>
333              <a href="#attack-protection">Attack Protection</a>
334              <ul>
335                <li>
336                  <a href=
337                  "#XML_SetBillionLaughsAttackProtectionMaximumAmplification">XML_SetBillionLaughsAttackProtectionMaximumAmplification</a>
338                </li>
339
340                <li>
341                  <a href=
342                  "#XML_SetBillionLaughsAttackProtectionActivationThreshold">XML_SetBillionLaughsAttackProtectionActivationThreshold</a>
343                </li>
344
345                <li>
346                  <a href=
347                  "#XML_SetAllocTrackerMaximumAmplification">XML_SetAllocTrackerMaximumAmplification</a>
348                </li>
349
350                <li>
351                  <a href=
352                  "#XML_SetAllocTrackerActivationThreshold">XML_SetAllocTrackerActivationThreshold</a>
353                </li>
354
355                <li>
356                  <a href=
357                  "#XML_SetReparseDeferralEnabled">XML_SetReparseDeferralEnabled</a>
358                </li>
359              </ul>
360            </li>
361
362            <li>
363              <a href="#miscellaneous">Miscellaneous Functions</a>
364              <ul>
365                <li>
366                  <a href="#XML_SetUserData">XML_SetUserData</a>
367                </li>
368
369                <li>
370                  <a href="#XML_GetUserData">XML_GetUserData</a>
371                </li>
372
373                <li>
374                  <a href="#XML_UseParserAsHandlerArg">XML_UseParserAsHandlerArg</a>
375                </li>
376
377                <li>
378                  <a href="#XML_SetBase">XML_SetBase</a>
379                </li>
380
381                <li>
382                  <a href="#XML_GetBase">XML_GetBase</a>
383                </li>
384
385                <li>
386                  <a href=
387                  "#XML_GetSpecifiedAttributeCount">XML_GetSpecifiedAttributeCount</a>
388                </li>
389
390                <li>
391                  <a href="#XML_GetIdAttributeIndex">XML_GetIdAttributeIndex</a>
392                </li>
393
394                <li>
395                  <a href="#XML_GetAttributeInfo">XML_GetAttributeInfo</a>
396                </li>
397
398                <li>
399                  <a href="#XML_SetEncoding">XML_SetEncoding</a>
400                </li>
401
402                <li>
403                  <a href="#XML_SetParamEntityParsing">XML_SetParamEntityParsing</a>
404                </li>
405
406                <li>
407                  <a href="#XML_SetHashSalt">XML_SetHashSalt</a> (deprecated)
408                </li>
409
410                <li>
411                  <a href="#XML_SetHashSalt16Bytes">XML_SetHashSalt16Bytes</a>
412                </li>
413
414                <li>
415                  <a href="#XML_UseForeignDTD">XML_UseForeignDTD</a>
416                </li>
417
418                <li>
419                  <a href="#XML_SetReturnNSTriplet">XML_SetReturnNSTriplet</a>
420                </li>
421
422                <li>
423                  <a href="#XML_DefaultCurrent">XML_DefaultCurrent</a>
424                </li>
425
426                <li>
427                  <a href="#XML_ExpatVersion">XML_ExpatVersion</a>
428                </li>
429
430                <li>
431                  <a href="#XML_ExpatVersionInfo">XML_ExpatVersionInfo</a>
432                </li>
433
434                <li>
435                  <a href="#XML_GetFeatureList">XML_GetFeatureList</a>
436                </li>
437
438                <li>
439                  <a href="#XML_FreeContentModel">XML_FreeContentModel</a>
440                </li>
441
442                <li>
443                  <a href="#XML_MemMalloc">XML_MemMalloc</a>
444                </li>
445
446                <li>
447                  <a href="#XML_MemRealloc">XML_MemRealloc</a>
448                </li>
449
450                <li>
451                  <a href="#XML_MemFree">XML_MemFree</a>
452                </li>
453              </ul>
454            </li>
455          </ul>
456        </li>
457      </ul>
458
459      <hr />
460
461      <h2>
462        <a id="overview" name="overview">Overview</a>
463      </h2>
464
465      <p>
        Expat is a stream-oriented parser. You register callback (or handler) functions
        with the parser and then start feeding it the document. As the parser recognizes
        parts of the document, it calls the appropriate handler for that part (if
        you've registered one). The document is fed to the parser in pieces, so you can
        start parsing before you have the whole document. This also allows you to parse
        very large documents that won't fit into memory.
472      </p>
473
474      <p>
475        Expat can be intimidating due to the many kinds of handlers and options you can
476        set. But you only need to learn four functions in order to do 90% of what you'll
477        want to do with it:
478      </p>
479
480      <dl>
481        <dt>
482          <code><a href="#XML_ParserCreate">XML_ParserCreate</a></code>
483        </dt>
484
485        <dd>
486          Create a new parser object.
487        </dd>
488
489        <dt>
490          <code><a href="#XML_SetElementHandler">XML_SetElementHandler</a></code>
491        </dt>
492
493        <dd>
494          Set handlers for start and end tags.
495        </dd>
496
497        <dt>
498          <code><a href=
499          "#XML_SetCharacterDataHandler">XML_SetCharacterDataHandler</a></code>
500        </dt>
501
502        <dd>
503          Set handler for text.
504        </dd>
505
506        <dt>
507          <code><a href="#XML_Parse">XML_Parse</a></code>
508        </dt>
509
510        <dd>
          Pass a buffer full of document to the parser.
512        </dd>
513      </dl>
514
515      <p>
516        These functions and others are described in the <a href=
517        "#reference">reference</a> part of this document. The reference section also
518        describes in detail the parameters passed to the different types of handlers.
519      </p>
520
521      <p>
        Let's look at a very simple example program that uses only three of the above
        functions (it doesn't need to set a character data handler). The program <a href=
        "../examples/outline.c">outline.c</a> prints an element outline, indenting child
        elements to distinguish them from the parent element that contains them. The
        start handler does all the work. It prints two indenting spaces for every level
        of ancestor elements, then it prints the element and attribute information.
        Finally, it increments the global <code>Depth</code> variable.
529      </p>
530
      <pre class="eg">
int Depth;

void XMLCALL
start(void *data, const char *el, const char **attr) {
  int i;

  for (i = 0; i &lt; Depth; i++)
    printf("  ");

  printf("%s", el);

  for (i = 0; attr[i]; i += 2) {
    printf(" %s='%s'", attr[i], attr[i + 1]);
  }

  printf("\n");
  Depth++;
}  /* End of start handler */
</pre>
551      <p>
        The end handler simply does the bookkeeping work of decrementing <code>Depth</code>.
553      </p>
554
      <pre class="eg">
void XMLCALL
end(void *data, const char *el) {
  Depth--;
}  /* End of end handler */
</pre>
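      <p>
        The start handler above walks <code>attr</code> as a flat array of name/value
        pairs terminated by a <code>NULL</code> pointer. The same layout makes it easy to
        look up a single attribute. The following is a minimal sketch, not part of
        <code>outline.c</code>: the helper name <code>find_attr</code> is hypothetical,
        and the code assumes a default build in which <code>XML_Char</code> is
        <code>char</code>.
      </p>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of Expat): return the value of the
   attribute called `name`, or NULL if the element does not carry it.
   `attr` is laid out as name, value, name, value, ..., NULL. */
const char *
find_attr(const char **attr, const char *name) {
  size_t i;

  /* Step by two: even slots hold names, odd slots hold values. */
  for (i = 0; attr[i] != NULL; i += 2) {
    if (strcmp(attr[i], name) == 0)
      return attr[i + 1];
  }
  return NULL;
}
```

      <p>
        Inside a start handler this could be called as
        <code>find_attr(attr, "id")</code>. The returned pointer refers to memory owned
        by the parser, so copy the string if it is needed after the handler returns.
      </p>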
561      <p>
        Note the <code>XMLCALL</code> annotation used for the callbacks. This is used to
        ensure that Expat and the callbacks are using the same calling convention in
564        case the compiler options used for Expat itself and the client code are
565        different. Expat tries not to care what the default calling convention is, though
566        it may require that it be compiled with a default convention of "cdecl" on some
567        platforms. For code which uses Expat, however, the calling convention is
568        specified by the <code>XMLCALL</code> annotation on most platforms; callbacks
569        should be defined using this annotation.
570      </p>
571
572      <p>
573        The <code>XMLCALL</code> annotation was added in Expat 1.95.7, but existing
574        working Expat applications don't need to add it (since they are already using the
575        "cdecl" calling convention, or they wouldn't be working). The annotation is only
576        needed if the default calling convention may be something other than "cdecl". To
577        use the annotation safely with older versions of Expat, you can conditionally
578        define it <em>after</em> including Expat's header file:
579      </p>
580
      <pre class="eg">
#include &lt;expat.h&gt;

#ifndef XMLCALL
#if defined(_MSC_VER) &amp;&amp; !defined(__BEOS__) &amp;&amp; !defined(__CYGWIN__)
#define XMLCALL __cdecl
#elif defined(__GNUC__)
#define XMLCALL __attribute__((cdecl))
#else
#define XMLCALL
#endif
#endif
</pre>
594      <p>
595        After creating the parser, the main program just has the job of shoveling the
596        document to the parser so that it can do its work.
597      </p>
598
599      <hr />
600
601      <h2>
602        <a id="building" name="building">Building and Installing Expat</a>
603      </h2>
604
605      <p>
        The Expat distribution comes as a compressed (with GNU gzip) tar file. You may
        download the latest version from <a href=
        "https://sourceforge.net/projects/expat/">SourceForge</a>. After unpacking it,
        <code>cd</code> into the resulting directory. Then follow either the Win32
        directions or the Unix directions below.
611      </p>
612
613      <h3>
614        Building under Win32
615      </h3>
616
617      <p>
        If you're using the GNU compiler under Cygwin, follow the Unix directions in the
        next section. Otherwise, if you have Microsoft Visual Studio installed, you can
        use CMake to generate a <code>.sln</code> file, e.g. <code>cmake -G"Visual
        Studio 17 2022" -DCMAKE_BUILD_TYPE=RelWithDebInfo .</code>, and then build Expat
        using <code>msbuild /m expat.sln</code>.
623      </p>
624
625      <p>
626        Alternatively, you may download the Win32 binary package that contains the
627        "expat.h" include file and a pre-built DLL.
628      </p>
629
630      <h3>
631        Building under Unix (or GNU)
632      </h3>
633
634      <p>
635        First you'll need to run the configure shell script in order to configure the
636        Makefiles and headers for your system.
637      </p>
638
639      <p>
640        If you're happy with all the defaults that configure picks for you, and you have
641        permission on your system to install into /usr/local, you can install Expat with
642        this sequence of commands:
643      </p>
644
      <pre class="eg">
./configure
make
make install
</pre>
650      <p>
651        There are some options that you can provide to this script, but the only one
652        we'll mention here is the <code>--prefix</code> option. You can find out all the
653        options available by running configure with just the <code>--help</code> option.
654      </p>
655
656      <p>
657        By default, the configure script sets things up so that the library gets
658        installed in <code>/usr/local/lib</code> and the associated header file in
659        <code>/usr/local/include</code>. But if you were to give the option,
660        <code>--prefix=/home/me/mystuff</code>, then the library and header would get
661        installed in <code>/home/me/mystuff/lib</code> and
662        <code>/home/me/mystuff/include</code> respectively.
663      </p>
664
665      <h3>
666        Configuring Expat Using the Pre-Processor
667      </h3>
668
669      <p>
670        Expat's feature set can be configured using a small number of pre-processor
671        definitions. The symbols are:
672      </p>
673
674      <dl class="cpp-symbols">
675        <dt>
676          <a id="XML_GE" name="XML_GE">XML_GE</a>
677        </dt>
678
679        <dd>
680          Added in Expat 2.6.0. Include support for <a href=
681          "https://www.w3.org/TR/2006/REC-xml-20060816/#sec-physical-struct">general
682          entities</a> (syntax <code>&amp;e1;</code> to reference and syntax
683          <code>&lt;!ENTITY e1 'value1'&gt;</code> (an internal general entity) or
684          <code>&lt;!ENTITY e2 SYSTEM 'file2'&gt;</code> (an external general entity) to
685          declare). With <code>XML_GE</code> enabled, general entities will be replaced
686          by their declared replacement text; for this to work for <em>external</em>
687          general entities, in addition an <code><a href=
688          "#XML_SetExternalEntityRefHandler">XML_ExternalEntityRefHandler</a></code> must
689          be set using <code><a href=
690          "#XML_SetExternalEntityRefHandler">XML_SetExternalEntityRefHandler</a></code>.
691          Also, enabling <code>XML_GE</code> makes the functions <code><a href=
692          "#XML_SetBillionLaughsAttackProtectionMaximumAmplification">XML_SetBillionLaughsAttackProtectionMaximumAmplification</a></code>
693          and <code><a href=
694          "#XML_SetBillionLaughsAttackProtectionActivationThreshold">XML_SetBillionLaughsAttackProtectionActivationThreshold</a></code>
695          available.<br />
696          With <code>XML_GE</code> disabled, Expat has a smaller memory footprint and can
697          be faster, but will not load external general entities and will replace all
698          general entities (except the <a href=
699          "https://www.w3.org/TR/2006/REC-xml-20060816/#sec-predefined-ent">predefined
700          five</a>: <code>amp</code>, <code>apos</code>, <code>gt</code>,
701          <code>lt</code>, <code>quot</code>) with a self-reference: for example,
702          referencing an entity <code>e1</code> via <code>&amp;e1;</code> will be
703          replaced by text <code>&amp;e1;</code>.
704        </dd>
705
706        <dt>
707          <a id="XML_DTD" name="XML_DTD">XML_DTD</a>
708        </dt>
709
710        <dd>
711          Include support for using and reporting DTD-based content. If this is defined,
712          default attribute values from an external DTD subset are reported and attribute
713          value normalization occurs based on the type of attributes defined in the
714          external subset. Without this, Expat has a smaller memory footprint and can be
715          faster, but will not load external parameter entities or process conditional
716          sections. If defined, makes the functions <code><a href=
717          "#XML_SetBillionLaughsAttackProtectionMaximumAmplification">XML_SetBillionLaughsAttackProtectionMaximumAmplification</a></code>
718          and <code><a href=
719          "#XML_SetBillionLaughsAttackProtectionActivationThreshold">XML_SetBillionLaughsAttackProtectionActivationThreshold</a></code>
720          available.
721        </dd>
722
723        <dt>
724          <a id="XML_NS" name="XML_NS">XML_NS</a>
725        </dt>
726
727        <dd>
728          When defined, support for the <cite><a href=
729          "https://www.w3.org/TR/REC-xml-names/">Namespaces in XML</a></cite>
730          specification is included.
731        </dd>
732
733        <dt>
734          <a id="XML_UNICODE" name="XML_UNICODE">XML_UNICODE</a>
735        </dt>
736
737        <dd>
738          When defined, character data reported to the application is encoded in UTF-16
739          using wide characters of the type <code>XML_Char</code>. This is implied if
740          <code>XML_UNICODE_WCHAR_T</code> is defined.
741        </dd>
742
743        <dt>
744          <a id="XML_UNICODE_WCHAR_T" name="XML_UNICODE_WCHAR_T">XML_UNICODE_WCHAR_T</a>
745        </dt>
746
747        <dd>
748          If defined, causes the <code>XML_Char</code> character type to be defined using
749          the <code>wchar_t</code> type; otherwise, <code>unsigned short</code> is used.
750          Defining this implies <code>XML_UNICODE</code>.
751        </dd>
752
753        <dt>
754          <a id="XML_LARGE_SIZE" name="XML_LARGE_SIZE">XML_LARGE_SIZE</a>
755        </dt>
756
757        <dd>
758          If defined, causes the <code>XML_Size</code> and <code>XML_Index</code> integer
759          types to be at least 64 bits in size. This is intended to support processing of
760          very large input streams, where the return values of <code><a href=
761          "#XML_GetCurrentByteIndex">XML_GetCurrentByteIndex</a></code>, <code><a href=
762          "#XML_GetCurrentLineNumber">XML_GetCurrentLineNumber</a></code> and
763          <code><a href=
764          "#XML_GetCurrentColumnNumber">XML_GetCurrentColumnNumber</a></code> could
765          overflow. It may not be supported by all compilers, and is turned off by
766          default.
767        </dd>
768
769        <dt>
770          <a id="XML_CONTEXT_BYTES" name="XML_CONTEXT_BYTES">XML_CONTEXT_BYTES</a>
771        </dt>
772
773        <dd>
774          The number of input bytes of markup context which the parser will ensure are
775          available for reporting via <code><a href=
776          "#XML_GetInputContext">XML_GetInputContext</a></code>. This is normally set to
777          1024, and must be set to a positive integer to enable. If this is set to zero,
778          the input context will not be available and <code><a href=
779          "#XML_GetInputContext">XML_GetInputContext</a></code> will always report
780          <code>NULL</code>. Without this, Expat has a smaller memory footprint and can
781          be faster.
782        </dd>
783
784        <dt>
785          <a id="XML_STATIC" name="XML_STATIC">XML_STATIC</a>
786        </dt>
787
788        <dd>
          On Windows, this should be set if Expat is going to be linked statically with
          the code that calls it; this is required so that the MSVC-specific linkage
          annotations in the header come out right. This is ignored on other platforms.
792        </dd>
793
794        <dt>
795          <a id="XML_ATTR_INFO" name="XML_ATTR_INFO">XML_ATTR_INFO</a>
796        </dt>
797
798        <dd>
799          If defined, makes the additional function <code><a href=
800          "#XML_GetAttributeInfo">XML_GetAttributeInfo</a></code> available for reporting
801          attribute byte offsets.
802        </dd>
803      </dl>
804
805      <hr />
806
807      <h2>
808        <a id="using" name="using">Using Expat</a>
809      </h2>
810
811      <h3>
812        Compiling and Linking Against Expat
813      </h3>
814
815      <p>
816        Unless you installed Expat in a location not expected by your compiler and
817        linker, all you have to do to use Expat in your programs is to include the Expat
818        header (<code>#include &lt;expat.h&gt;</code>) in your files that make calls to
819        it and to tell the linker that it needs to link against the Expat library. On
820        Unix systems, this would usually be done with the <code>-lexpat</code> argument.
821        Otherwise, you'll need to tell the compiler where to look for the Expat header
822        and the linker where to find the Expat library. You may also need to take steps
823        to tell the operating system where to find this library at run time.
824      </p>
825
826      <p>
827        On a Unix-based system, here's what a Makefile might look like when Expat is
828        installed in a standard location:
829      </p>
830
      <pre class="eg">
CC=cc
LDFLAGS=
LIBS= -lexpat
xmlapp: xmlapp.o
        $(CC) $(LDFLAGS) -o xmlapp xmlapp.o $(LIBS)
</pre>
838      <p>
839        If you installed Expat in, say, <code>/home/me/mystuff</code>, then the Makefile
840        would look like this:
841      </p>
842
      <pre class="eg">
CC=cc
CFLAGS= -I/home/me/mystuff/include
LDFLAGS=
LIBS= -L/home/me/mystuff/lib -lexpat
xmlapp: xmlapp.o
        $(CC) $(LDFLAGS) -o xmlapp xmlapp.o $(LIBS)
</pre>
851      <p>
852        You'd also have to set the environment variable <code>LD_LIBRARY_PATH</code> to
853        <code>/home/me/mystuff/lib</code> (or to
854        <code>${LD_LIBRARY_PATH}:/home/me/mystuff/lib</code> if LD_LIBRARY_PATH already
855        has some directories in it) in order to run your application.
856      </p>
857
858      <h3>
859        Expat Basics
860      </h3>
861
862      <p>
863        As we saw in the example in the overview, the first step in parsing an XML
864        document with Expat is to create a parser object. There are <a href=
865        "#creation">three functions</a> in the Expat API for creating a parser object.
866        However, only two of these (<code><a href=
867        "#XML_ParserCreate">XML_ParserCreate</a></code> and <code><a href=
868        "#XML_ParserCreateNS">XML_ParserCreateNS</a></code>) can be used for constructing
869        a parser for a top-level document. The object returned by these functions is an
        opaque pointer (i.e. <code>expat.h</code> declares it as a pointer to an incomplete structure type) to data with further
871        internal structure. In order to free the memory associated with this object you
872        must call <code><a href="#XML_ParserFree">XML_ParserFree</a></code>. Note that if
873        you have provided any <a href="#userdata">user data</a> that gets stored in the
874        parser, then your application is responsible for freeing it prior to calling
875        <code>XML_ParserFree</code>.
876      </p>
877
878      <p>
879        The objects returned by the parser creation functions are good for parsing only
880        one XML document or external parsed entity. If your application needs to parse
        many XML documents, then it needs to create a parser object for each one, or reuse a
        finished parser after calling <code><a href="#XML_ParserReset">XML_ParserReset</a></code>. The
882        best way to deal with this is to create a higher level object that contains all
883        the default initialization you want for your parser objects.
884      </p>
885
886      <p>
887        Walking through a document hierarchy with a stream oriented parser will require a
888        good stack mechanism in order to keep track of current context. For instance, to
889        answer the simple question, "What element does this text belong to?" requires a
890        stack, since the parser may have descended into other elements that are children
891        of the current one and has encountered this text on the way out.
892      </p>
893
894      <p>
895        The things you're likely to want to keep on a stack are the currently opened
        element and its attributes. You push this information onto the stack in the
897        start handler and you pop it off in the end handler.
898      </p>
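      <p>
        As a minimal sketch of that push/pop discipline, the handlers below keep a
        fixed-size stack of element names in the user data structure. The names
        <code>ElementStack</code>, <code>MAX_DEPTH</code>, <code>on_start</code>,
        <code>on_end</code> and <code>current_element</code> are hypothetical;
        <code>char</code> stands in for <code>XML_Char</code> and <code>XMLCALL</code>
        is omitted so that the fragment compiles without the Expat headers. It
        conservatively assumes that the name pointer should not be kept after the
        handler returns, and therefore copies it.
      </p>

```c
#include <stdlib.h>
#include <string.h>

#define MAX_DEPTH 64 /* hypothetical limit chosen for this sketch */

typedef struct {
  char *names[MAX_DEPTH]; /* private copies of the open element names */
  int depth;              /* number of entries currently on the stack */
  int overflow;           /* open elements that did not fit */
} ElementStack;

/* Start handler: push a copy of the element name. */
void
on_start(void *userData, const char *el, const char **attr) {
  ElementStack *st = (ElementStack *)userData;
  (void)attr; /* attributes are not recorded in this sketch */

  if (st->depth < MAX_DEPTH) {
    size_t n = strlen(el) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
      memcpy(copy, el, n);
    st->names[st->depth++] = copy; /* NULL if the allocation failed */
  } else {
    st->overflow++;
  }
}

/* End handler: pop the entry pushed by the matching start handler. */
void
on_end(void *userData, const char *el) {
  ElementStack *st = (ElementStack *)userData;
  (void)el;

  if (st->overflow > 0) {
    st->overflow--;
  } else if (st->depth > 0) {
    free(st->names[--st->depth]);
    st->names[st->depth] = NULL;
  }
}

/* Name of the innermost open element, or NULL outside the root element. */
const char *
current_element(const ElementStack *st) {
  return st->depth > 0 ? st->names[st->depth - 1] : NULL;
}
```

      <p>
        A character data handler that shares the same user data could then call
        <code>current_element</code> to find out which element its text belongs to.
      </p>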
899
900      <p>
901        For some tasks, it is sufficient to just keep information on what the depth of
        the stack is (or would be if you had one). The outline program shown above
903        presents one example. Another such task would be skipping over a complete
904        element. When you see the start tag for the element you want to skip, you set a
905        skip flag and record the depth at which the element started. When the end tag
906        handler encounters the same depth, the skipped element has ended and the flag may
907        be cleared. If you follow the convention that the root element starts at 1, then
908        you can use the same variable for skip flag and skip depth.
909      </p>
910
      <pre class="eg">
void
init_info(Parseinfo *info) {
  info-&gt;skip = 0;
  info-&gt;depth = 1;
  /* Other initializations here */
}  /* End of init_info */

void XMLCALL
rawstart(void *data, const char *el, const char **attr) {
  Parseinfo *inf = (Parseinfo *) data;

  if (! inf-&gt;skip) {
    if (should_skip(inf, el, attr)) {
      inf-&gt;skip = inf-&gt;depth;
    }
    else
      start(inf, el, attr);     /* This does rest of start handling */
  }

  inf-&gt;depth++;
}  /* End of rawstart */

void XMLCALL
rawend(void *data, const char *el) {
  Parseinfo *inf = (Parseinfo *) data;

  inf-&gt;depth--;

  if (! inf-&gt;skip)
    end(inf, el);              /* This does rest of end handling */

  if (inf-&gt;skip == inf-&gt;depth)
    inf-&gt;skip = 0;
}  /* End rawend */
</pre>
947      <p>
948        Notice in the above example the difference in how depth is manipulated in the
949        start and end handlers. The end tag handler should be the mirror image of the
950        start tag handler. This is necessary to properly model containment. Since, in the
951        start tag handler, we incremented depth <em>after</em> the main body of start tag
952        code, then in the end handler, we need to manipulate it <em>before</em> the main
953        body. If we'd decided to increment it first thing in the start handler, then we'd
954        have had to decrement it last thing in the end handler.
955      </p>
956
957      <h3 id="userdata">
958        Communicating between handlers
959      </h3>
960
961      <p>
962        In order to be able to pass information between different handlers without using
963        globals, you'll need to define a data structure to hold the shared variables. You
964        can then tell Expat (with the <code><a href=
965        "#XML_SetUserData">XML_SetUserData</a></code> function) to pass a pointer to this
966        structure to the handlers. This is the first argument received by most handlers.
967        In the <a href="#reference">reference section</a>, an argument to a callback
        function is named <code>userData</code> and has type <code>void *</code> if the
969        user data is passed; it will have the type <code>XML_Parser</code> if the parser
970        itself is passed. When the parser is passed, the user data may be retrieved using
971        <code><a href="#XML_GetUserData">XML_GetUserData</a></code>.
972      </p>
973
974      <p>
975        One common case where multiple calls to a single handler may need to communicate
976        using an application data structure is the case when content passed to the
977        character data handler (set by <code><a href=
978        "#XML_SetCharacterDataHandler">XML_SetCharacterDataHandler</a></code>) needs to
979        be accumulated. A common first-time mistake with any of the event-oriented
980        interfaces to an XML parser is to expect all the text contained in an element to
981        be reported by a single call to the character data handler. Expat, like many
982        other XML parsers, reports such data as a sequence of calls; there's no way to
983        know when the end of the sequence is reached until a different callback is made.
984        A buffer referenced by the user data structure proves both an effective and
985        convenient place to accumulate character data.
986      </p>
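      <p>
        One way to accumulate character data could look like the following sketch.
        <code>TextBuffer</code>, <code>on_chardata</code> and <code>reset_text</code>
        are hypothetical names, <code>char</code> stands in for <code>XML_Char</code>,
        and <code>XMLCALL</code> is omitted so that the fragment compiles without the
        Expat headers. The handler has the shape of a character data handler: it
        receives a pointer and a length, and the data is not NUL-terminated.
      </p>

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
  char *text; /* accumulated character data, NUL-terminated once non-NULL */
  size_t len; /* bytes used, not counting the terminator */
  size_t cap; /* bytes allocated */
  int failed; /* set when an allocation fails */
} TextBuffer;

/* Character data handler: append the len bytes at s to the buffer. */
void
on_chardata(void *userData, const char *s, int len) {
  TextBuffer *buf = (TextBuffer *)userData;
  size_t need;

  if (buf->failed || len <= 0)
    return;
  need = buf->len + (size_t)len + 1; /* +1 for the terminator */
  if (need > buf->cap) {
    size_t cap = buf->cap ? buf->cap : 64;
    char *grown;
    while (cap < need)
      cap *= 2;
    grown = realloc(buf->text, cap);
    if (grown == NULL) {
      buf->failed = 1;
      return;
    }
    buf->text = grown;
    buf->cap = cap;
  }
  memcpy(buf->text + buf->len, s, (size_t)len);
  buf->len += (size_t)len;
  buf->text[buf->len] = '\0';
}

/* Empty the buffer while keeping its storage for the next run of text. */
void
reset_text(TextBuffer *buf) {
  buf->len = 0;
  if (buf->text != NULL)
    buf->text[0] = '\0';
}
```

      <p>
        The start and end element handlers are natural places to consume the
        accumulated text and call <code>reset_text</code>, since either callback marks
        the end of a run of character data.
      </p>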
987      <!-- XXX example needed here -->
988
989      <h3>
990        XML Version
991      </h3>
992
993      <p>
994        Expat is an XML 1.0 parser, and as such never complains based on the value of the
995        <code>version</code> pseudo-attribute in the XML declaration, if present.
996      </p>
997
998      <p>
999        If an application needs to check the version number (to support alternate
1000        processing), it should use the <code><a href=
1001        "#XML_SetXmlDeclHandler">XML_SetXmlDeclHandler</a></code> function to set a
1002        handler that uses the information in the XML declaration to determine what to do.
1003        This example shows how to check that only a version number of <code>"1.0"</code>
1004        is accepted:
1005      </p>
1006
      <pre class="eg">
static int wrong_version;
static XML_Parser parser;

static void XMLCALL
xmldecl_handler(void            *userData,
                const XML_Char  *version,
                const XML_Char  *encoding,
                int              standalone)
{
  static const XML_Char Version_1_0[] = {'1', '.', '0', 0};

  int i;

  for (i = 0; i &lt; (sizeof(Version_1_0) / sizeof(Version_1_0[0])); ++i) {
    if (version[i] != Version_1_0[i]) {
      wrong_version = 1;
      /* also clear all other handlers: */
      XML_SetCharacterDataHandler(parser, NULL);
      ...
      return;
    }
  }
  ...
}
</pre>
1033      <h3>
1034        Namespace Processing
1035      </h3>
1036
1037      <p>
1038        When the parser is created using the <code><a href=
        "#XML_ParserCreateNS">XML_ParserCreateNS</a></code> function, Expat performs
1040        namespace processing. Under namespace processing, Expat consumes
1041        <code>xmlns</code> and <code>xmlns:...</code> attributes, which declare
1042        namespaces for the scope of the element in which they occur. This means that your
1043        start handler will not see these attributes. Your application can still be
1044        informed of these declarations by setting namespace declaration handlers with
1045        <a href=
1046        "#XML_SetNamespaceDeclHandler"><code>XML_SetNamespaceDeclHandler</code></a>.
1047      </p>
1048
1049      <p>
1050        Element type and attribute names that belong to a given namespace are passed to
1051        the appropriate handler in expanded form. By default this expanded form is a
1052        concatenation of the namespace URI, the separator character (which is the 2nd
1053        argument to <code><a href="#XML_ParserCreateNS">XML_ParserCreateNS</a></code>),
1054        and the local name (i.e. the part after the colon). Names with undeclared
1055        prefixes are not well-formed when namespace processing is enabled, and will
1056        trigger an error. Unprefixed attribute names are never expanded, and unprefixed
1057        element names are only expanded when they are in the scope of a default
1058        namespace.
1059      </p>
1060
1061      <p>
1062        However if <code><a href=
1063        "#XML_SetReturnNSTriplet">XML_SetReturnNSTriplet</a></code> has been called with
1064        a non-zero <code>do_nst</code> parameter, then the expanded form for names with
1065        an explicit prefix is a concatenation of: URI, separator, local name, separator,
1066        prefix.
1067      </p>
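      <p>
        A handler that needs the individual parts can split an expanded name at the
        separator again. The sketch below assumes a non-null separator that cannot
        occur in a namespace URI; <code>NameParts</code> and
        <code>split_expanded_name</code> are hypothetical names, and <code>char</code>
        stands in for <code>XML_Char</code>. It handles unexpanded names, the default
        URI/local-name form, and the triplet form.
      </p>

```c
#include <stddef.h>
#include <string.h>

typedef struct {
  const char *uri;    /* NULL when the name was not expanded */
  size_t uri_len;
  const char *local;  /* local part; not NUL-terminated in the triplet form */
  size_t local_len;
  const char *prefix; /* NULL unless a triplet was reported */
} NameParts;

/* Split name at sep into URI, local name and optional prefix.
   The parts point into name; nothing is copied. */
void
split_expanded_name(const char *name, char sep, NameParts *out) {
  /* With a null separator the parts cannot be recovered. */
  const char *first = (sep != '\0') ? strchr(name, sep) : NULL;
  const char *second;

  out->uri = NULL;
  out->uri_len = 0;
  out->prefix = NULL;
  if (first == NULL) { /* no namespace: the whole name is local */
    out->local = name;
    out->local_len = strlen(name);
    return;
  }
  out->uri = name;
  out->uri_len = (size_t)(first - name);
  out->local = first + 1;
  second = strchr(out->local, sep);
  if (second == NULL) { /* URI, separator, local name */
    out->local_len = strlen(out->local);
  } else {              /* URI, separator, local name, separator, prefix */
    out->local_len = (size_t)(second - out->local);
    out->prefix = second + 1;
  }
}
```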
1068
1069      <p>
1070        You can set handlers for the start of a namespace declaration and for the end of
1071        a scope of a declaration with the <code><a href=
1072        "#XML_SetNamespaceDeclHandler">XML_SetNamespaceDeclHandler</a></code> function.
1073        The StartNamespaceDeclHandler is called prior to the start tag handler and the
1074        EndNamespaceDeclHandler is called after the corresponding end tag that ends the
1075        namespace's scope. The namespace start handler gets passed the prefix and URI for
1076        the namespace. For a default namespace declaration (xmlns='...'), the prefix will
1077        be <code>NULL</code>. The URI will be <code>NULL</code> for the case where the
1078        default namespace is being unset. The namespace end handler just gets the prefix
1079        for the closing scope.
1080      </p>
1081
1082      <p>
1083        These handlers are called for each declaration. So if, for instance, a start tag
1084        had three namespace declarations, then the StartNamespaceDeclHandler would be
1085        called three times before the start tag handler is called, once for each
1086        declaration.
1087      </p>
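      <p>
        The following sketch shows one way a pair of namespace declaration handlers
        might track the bindings that are currently in scope, including the
        <code>NULL</code> prefix and <code>NULL</code> URI cases described above. All
        names (<code>NsScope</code>, <code>on_start_ns</code>, <code>on_end_ns</code>,
        <code>lookup_uri</code>) and the fixed sizes are hypothetical;
        <code>char</code> stands in for <code>XML_Char</code> and <code>XMLCALL</code>
        is omitted so that the fragment compiles without the Expat headers.
      </p>

```c
#include <stddef.h>
#include <string.h>

#define MAX_BINDINGS 32 /* hypothetical limit; extra bindings are ignored */

typedef struct {
  char prefix[32]; /* "" stands for the default namespace */
  char uri[128];   /* "" stands for an unset default namespace */
} Binding;

typedef struct {
  Binding items[MAX_BINDINGS];
  int count;
} NsScope;

/* Copy src into dst, truncating if needed; NULL is stored as "". */
static void
copy_or_empty(char *dst, size_t size, const char *src) {
  size_t n = (src != NULL) ? strlen(src) : 0;
  if (n >= size)
    n = size - 1; /* a real application would allocate instead */
  if (n > 0)
    memcpy(dst, src, n);
  dst[n] = '\0';
}

/* Start handler: called once per declaration, before the start tag handler. */
void
on_start_ns(void *userData, const char *prefix, const char *uri) {
  NsScope *ns = (NsScope *)userData;
  if (ns->count < MAX_BINDINGS) {
    Binding *b = &ns->items[ns->count++];
    copy_or_empty(b->prefix, sizeof b->prefix, prefix);
    copy_or_empty(b->uri, sizeof b->uri, uri);
  }
}

/* End handler: drop the innermost binding recorded for this prefix. */
void
on_end_ns(void *userData, const char *prefix) {
  NsScope *ns = (NsScope *)userData;
  const char *p = (prefix != NULL) ? prefix : "";
  int i;
  for (i = ns->count - 1; i >= 0; i--) {
    if (strcmp(ns->items[i].prefix, p) == 0) {
      memmove(&ns->items[i], &ns->items[i + 1],
              (size_t)(ns->count - i - 1) * sizeof(Binding));
      ns->count--;
      return;
    }
  }
}

/* URI currently bound to prefix (NULL prefix means default), or NULL. */
const char *
lookup_uri(const NsScope *ns, const char *prefix) {
  const char *p = (prefix != NULL) ? prefix : "";
  int i;
  for (i = ns->count - 1; i >= 0; i--)
    if (strcmp(ns->items[i].prefix, p) == 0)
      return ns->items[i].uri;
  return NULL;
}
```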
1088
1089      <h3>
1090        Character Encodings
1091      </h3>
1092
1093      <p>
        While XML is based on Unicode, and every XML processor is required to recognize
        UTF-8 and UTF-16 (encodings of Unicode built on 8-bit and 16-bit code units,
        respectively), other encodings may be
1096        declared in XML documents or entities. For the main document, an XML declaration
1097        may contain an encoding declaration:
1098      </p>
1099
      <pre>
&lt;?xml version="1.0" encoding="ISO-8859-2"?&gt;
</pre>
1103      <p>
1104        External parsed entities may begin with a text declaration, which looks like an
1105        XML declaration with just an encoding declaration:
1106      </p>
1107
1108      <pre>
1109&lt;?xml encoding="Big5"?&gt;
1110</pre>
1111      <p>
1112        With Expat, you may also specify an encoding at the time of creating a parser.
1113        This is useful when the encoding information may come from a source outside the
        document itself (like a higher-level protocol).
1115      </p>
1116
1117      <p>
1118        <a id="builtin_encodings" name="builtin_encodings"></a>There are four built-in
1119        encodings in Expat:
1120      </p>
1121
1122      <ul>
1123        <li>UTF-8
1124        </li>
1125
1126        <li>UTF-16
1127        </li>
1128
1129        <li>ISO-8859-1
1130        </li>
1131
1132        <li>US-ASCII
1133        </li>
1134      </ul>
1135
1136      <p>
        Anything else discovered in an encoding declaration, or in the protocol encoding
        specified in the parser constructor, triggers a call to the
1139        <code>UnknownEncodingHandler</code>. This handler gets passed the encoding name
1140        and a pointer to an <code>XML_Encoding</code> data structure. Your handler must
1141        fill in this structure and return <code>XML_STATUS_OK</code> if it knows how to
1142        deal with the encoding. Otherwise the handler should return
1143        <code>XML_STATUS_ERROR</code>. The handler also gets passed a pointer to an
1144        optional application data structure that you may indicate when you set the
1145        handler.
1146      </p>
1147
1148      <p>
        Expat places the following restrictions on the character encodings that it can
        support by filling in the <code>XML_Encoding</code> structure:
1151      </p>
1152
1153      <ol>
        <li>Every ASCII character that can appear in a well-formed XML document must be
        represented by a single byte, and that byte must correspond to its ASCII
        encoding (except for the characters $@\^'{}~).
        </li>

        <li>Characters must be encoded in 4 bytes or fewer.
        </li>

        <li>All characters encoded must have Unicode scalar values less than or equal to
        65535 (0xFFFF). <em>This does not apply to the built-in support for UTF-16 and
        UTF-8.</em>
        </li>

        <li>No character may be encoded by more than one distinct sequence of bytes.
        </li>
1169      </ol>
1170
1171      <p>
1172        <code>XML_Encoding</code> contains an array of integers that correspond to the
1173        1st byte of an encoding sequence. If the value in the array for a byte is zero or
1174        positive, then the byte is a single byte encoding that encodes the Unicode scalar
1175        value contained in the array. A -1 in this array indicates a malformed byte. If
1176        the value is -2, -3, or -4, then the byte is the beginning of a 2, 3, or 4 byte
1177        sequence respectively. Multi-byte sequences are sent to the convert function
1178        pointed at in the <code>XML_Encoding</code> structure. This function should
1179        return the Unicode scalar value for the sequence or -1 if the sequence is
1180        malformed.
1181      </p>
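      <p>
        To illustrate how the array is filled for a single-byte encoding, the sketch
        below handles ISO-8859-15, which matches ISO-8859-1 except at eight byte
        values. <code>EncodingInfo</code> is a hypothetical stand-in that mirrors the
        layout of <code>XML_Encoding</code> so that the fragment compiles without the
        Expat headers; <code>fill_iso_8859_15</code> and
        <code>on_unknown_encoding</code> are likewise hypothetical, and the name
        comparison is deliberately simplistic (exact match only).
      </p>

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for XML_Encoding (XMLCALL omitted from the function pointers). */
typedef struct {
  int map[256];
  void *data;
  int (*convert)(void *data, const char *s);
  void (*release)(void *data);
} EncodingInfo;

/* Every byte is a complete character, so each map entry holds the Unicode
   scalar value for that byte and no convert function is needed. */
void
fill_iso_8859_15(EncodingInfo *info) {
  static const struct {
    unsigned char byte;
    int scalar;
  } diffs[] = {
    {0xA4, 0x20AC}, {0xA6, 0x0160}, {0xA8, 0x0161}, {0xB4, 0x017D},
    {0xB8, 0x017E}, {0xBC, 0x0152}, {0xBD, 0x0153}, {0xBE, 0x0178}
  };
  int i;

  for (i = 0; i < 256; i++)
    info->map[i] = i; /* same as ISO-8859-1 by default */
  for (i = 0; i < (int)(sizeof diffs / sizeof diffs[0]); i++)
    info->map[diffs[i].byte] = diffs[i].scalar;
  info->data = NULL;
  info->convert = NULL; /* only needed for multi-byte sequences */
  info->release = NULL;
}

/* Has the shape of an unknown encoding handler: returns 1 (the value of
   XML_STATUS_OK) when info was filled in, 0 (XML_STATUS_ERROR) otherwise. */
int
on_unknown_encoding(void *encodingHandlerData, const char *name,
                    EncodingInfo *info) {
  (void)encodingHandlerData;
  if (strcmp(name, "ISO-8859-15") == 0) {
    fill_iso_8859_15(info);
    return 1;
  }
  return 0;
}
```

      <p>
        An encoding with multi-byte characters would instead store -2, -3 or -4 for
        each lead byte and supply a <code>convert</code> function for the sequences.
      </p>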
1182
1183      <p>
1184        One pitfall that novice Expat users are likely to fall into is that although
1185        Expat may accept input in various encodings, the strings that it passes to the
1186        handlers are always encoded in UTF-8 or UTF-16 (depending on how Expat was
1187        compiled). Your application is responsible for any translation of these strings
1188        into other encodings.
1189      </p>
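      <p>
        As an example of such a translation, the sketch below converts UTF-8 text, as
        a handler would receive it from a build of Expat that uses UTF-8 internally,
        to ISO-8859-1. The function name <code>utf8_to_latin1</code> and the choice
        of <code>'?'</code> as a replacement character are assumptions made for this
        illustration, and the input is assumed to be well-formed UTF-8.
      </p>

```c
#include <stddef.h>

/* Convert len bytes of well-formed UTF-8 at s to ISO-8859-1. At most
   outsize - 1 bytes plus a terminating NUL are written to out. Code points
   above U+00FF are replaced by '?'. Returns the number of bytes written,
   not counting the terminator. */
size_t
utf8_to_latin1(const char *s, int len, char *out, size_t outsize) {
  size_t n = 0;
  int i = 0;

  if (outsize == 0)
    return 0;
  while (i < len && n + 1 < outsize) {
    unsigned char c = (unsigned char)s[i];
    if (c < 0x80) {
      /* one byte: U+0000..U+007F */
      out[n++] = (char)c;
      i += 1;
    } else if ((c == 0xC2 || c == 0xC3) && i + 1 < len) {
      /* two bytes with lead 0xC2 or 0xC3: U+0080..U+00FF */
      unsigned char c2 = (unsigned char)s[i + 1];
      out[n++] = (char)(((c & 0x03) << 6) | (c2 & 0x3F));
      i += 2;
    } else {
      /* every other sequence encodes a code point above U+00FF */
      out[n++] = '?';
      i += (c >= 0xF0) ? 4 : (c >= 0xE0) ? 3 : 2;
    }
  }
  out[n] = '\0';
  return n;
}
```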
1190
1191      <h3>
1192        Handling External Entity References
1193      </h3>
1194
1195      <p>
1196        Expat does not read or parse external entities directly. Note that any external
1197        DTD is a special case of an external entity. If you've set no
1198        <code>ExternalEntityRefHandler</code>, then external entity references are
1199        silently ignored. Otherwise, it calls your handler with the information needed to
1200        read and parse the external entity.
1201      </p>
1202
1203      <p>
1204        Your handler isn't actually responsible for parsing the entity, but it is
1205        responsible for creating a subsidiary parser with <code><a href=
1206        "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code> that
1207        will do the job. This returns an instance of <code>XML_Parser</code> that has
1208        handlers and other data structures initialized from the parent parser. You may
1209        then use <code><a href="#XML_Parse">XML_Parse</a></code> or <code><a href=
1210        "#XML_ParseBuffer">XML_ParseBuffer</a></code> calls against this parser. Since
        external entities may refer to other external entities, your handler should be
1212        prepared to be called recursively.
1213      </p>
1214
1215      <h3>
1216        Parsing DTDs
1217      </h3>
1218
1219      <p>
1220        In order to parse parameter entities, before starting the parse, you must call
1221        <code><a href="#XML_SetParamEntityParsing">XML_SetParamEntityParsing</a></code>
1222        with one of the following arguments:
1223      </p>
1224
1225      <dl>
1226        <dt>
1227          <code>XML_PARAM_ENTITY_PARSING_NEVER</code>
1228        </dt>
1229
1230        <dd>
1231          Don't parse parameter entities or the external subset
1232        </dd>
1233
1234        <dt>
1235          <code>XML_PARAM_ENTITY_PARSING_UNLESS_STANDALONE</code>
1236        </dt>
1237
1238        <dd>
1239          Parse parameter entities and the external subset unless <code>standalone</code>
1240          was set to "yes" in the XML declaration.
1241        </dd>
1242
1243        <dt>
1244          <code>XML_PARAM_ENTITY_PARSING_ALWAYS</code>
1245        </dt>
1246
1247        <dd>
1248          Always parse parameter entities and the external subset
1249        </dd>
1250      </dl>
1251
1252      <p>
1253        In order to read an external DTD, you also have to set an external entity
1254        reference handler as described above.
1255      </p>
1256
1257      <h3 id="stop-resume">
1258        Temporarily Stopping Parsing
1259      </h3>
1260
1261      <p>
        Expat 1.95.8 introduces a new feature: it's now possible to stop parsing
        temporarily from within a handler function, even if more data has already been
        passed into the parser. Applications for this include:
1265      </p>
1266
1267      <ul>
1268        <li>Supporting the <a href="https://www.w3.org/TR/xinclude/">XInclude</a>
1269        specification.
1270        </li>
1271
1272        <li>Delaying further processing until additional information is available from
1273        some other source.
1274        </li>
1275
1276        <li>Adjusting processor load as task priorities shift within an application.
1277        </li>
1278
1279        <li>Stopping parsing completely (simply free or reset the parser instead of
1280        resuming in the outer parsing loop). This can be useful if an application-domain
1281        error is found in the XML being parsed or if the result of the parse is
1282        determined not to be useful after all.
1283        </li>
1284      </ul>
1285
1286      <p>
1287        To take advantage of this feature, the main parsing loop of an application needs
1288        to support this specifically. It cannot be supported with a parsing loop
1289        compatible with Expat 1.95.7 or earlier (though existing loops will continue to
1290        work without supporting the stop/resume feature).
1291      </p>
1292
1293      <p>
1294        An application that uses this feature for a single parser will have the rough
1295        structure (in pseudo-code):
1296      </p>
1297
      <pre class="pseudocode">
fd = open_input()
p = create_parser()

if parse_xml(p, fd) {
  /* suspended */

  int suspended = 1;

  while (suspended) {
    do_something_else()
    if ready_to_resume() {
      suspended = continue_parsing(p, fd);
    }
  }
}
</pre>
1315      <p>
1316        An application that may resume any of several parsers based on input (either from
1317        the XML being parsed or some other source) will certainly have more interesting
1318        control structures.
1319      </p>
1320
1321      <p>
1322        This C function could be used for the <code>parse_xml</code> function mentioned
1323        in the pseudo-code above:
1324      </p>
1325
      <pre class="eg">
#include &lt;unistd.h&gt;  /* read() */
#include &lt;expat.h&gt;

#define BUFF_SIZE 10240

/* Parse a document from the open file descriptor 'fd' until the parse
   is complete (the document has been completely parsed, or there's
   been an error), or the parse is stopped.  Return non-zero when
   the parse is merely suspended.
*/
int
parse_xml(XML_Parser p, int fd)
{
  for (;;) {
    int bytes_read;
    enum XML_Status status;

    void *buff = XML_GetBuffer(p, BUFF_SIZE);
    if (buff == NULL) {
      /* handle error... */
      return 0;
    }
    bytes_read = (int)read(fd, buff, BUFF_SIZE);
    if (bytes_read &lt; 0) {
      /* handle error... */
      return 0;
    }
    /* A read of zero bytes means end of input, so that call is the final one. */
    status = XML_ParseBuffer(p, bytes_read, bytes_read == 0);
    switch (status) {
      case XML_STATUS_ERROR:
        /* handle error... */
        return 0;
      case XML_STATUS_SUSPENDED:
        return 1;
      case XML_STATUS_OK:
        break;
    }
    if (bytes_read == 0)
      return 0;
  }
}
</pre>
1365      <p>
1366        The corresponding <code>continue_parsing</code> function is somewhat simpler,
        since it only needs to deal with the return code from <code><a href=
1368        "#XML_ResumeParser">XML_ResumeParser</a></code>; it can delegate the input
1369        handling to the <code>parse_xml</code> function:
1370      </p>
1371
      <pre class="eg">
/* Continue parsing a document which had been suspended.  The 'p' and
   'fd' arguments are the same as passed to parse_xml().  Return
   non-zero when the parse is suspended.
*/
int
continue_parsing(XML_Parser p, int fd)
{
  enum XML_Status status = XML_ResumeParser(p);
  switch (status) {
    case XML_STATUS_ERROR:
      /* handle error...; XML_GetErrorCode(p) yields
         XML_ERROR_NOT_SUSPENDED if the parser was not suspended */
      return 0;
    case XML_STATUS_SUSPENDED:
      return 1;
    case XML_STATUS_OK:
      break;
  }
  return parse_xml(p, fd);
}
</pre>
1394      <p>
1395        Now that we've seen what a mess the top-level parsing loop can become, what have
1396        we gained? Very simply, we can now use the <code><a href=
1397        "#XML_StopParser">XML_StopParser</a></code> function to stop parsing, without
1398        having to go to great lengths to avoid additional processing that we're expecting
1399        to ignore. As a bonus, we get to stop parsing <em>temporarily</em>, and come back
1400        to it when we're ready.
1401      </p>
1402
1403      <p>
1404        To stop parsing from a handler function, use the <code><a href=
1405        "#XML_StopParser">XML_StopParser</a></code> function. This function takes two
        arguments: the parser being stopped and a flag indicating whether the parse can
1407        be resumed in the future.
1408      </p>
1409      <!-- XXX really need more here -->
1410
1411      <hr />
1412      <!-- ================================================================ -->
1413
1414      <h2>
1415        <a id="reference" name="reference">Expat Reference</a>
1416      </h2>
1417
1418      <h3>
1419        <a id="creation" name="creation">Parser Creation</a>
1420      </h3>
1421
1422      <h4 id="XML_ParserCreate">
1423        XML_ParserCreate
1424      </h4>
1425
      <pre class="fcndec">
XML_Parser XMLCALL
XML_ParserCreate(const XML_Char *encoding);
</pre>
1430      <div class="fcndef">
1431        <p>
1432          Construct a new parser. If encoding is non-<code>NULL</code>, it specifies a
1433          character encoding to use for the document. This overrides the document
1434          encoding declaration. There are four built-in encodings:
1435        </p>
1436
1437        <ul>
1438          <li>US-ASCII
1439          </li>
1440
1441          <li>UTF-8
1442          </li>
1443
1444          <li>UTF-16
1445          </li>
1446
1447          <li>ISO-8859-1
1448          </li>
1449        </ul>
1450
1451        <p>
1452          Any other value will invoke a call to the UnknownEncodingHandler.
1453        </p>
1454      </div>
1455
1456      <h4 id="XML_ParserCreateNS">
1457        XML_ParserCreateNS
1458      </h4>
1459
      <pre class="fcndec">
XML_Parser XMLCALL
XML_ParserCreateNS(const XML_Char *encoding,
                   XML_Char sep);
</pre>
1465      <div class="fcndef">
1466        Constructs a new parser that has namespace processing in effect. Namespace
1467        expanded element names and attribute names are returned as a concatenation of the
1468        namespace URI, <em>sep</em>, and the local part of the name. This means that you
        should pick a character for <em>sep</em> that can't be part of a URI. Since
1470        Expat does not check namespace URIs for conformance, the only safe choice for a
1471        namespace separator is a character that is illegal in XML. For instance,
1472        <code>'\xFF'</code> is not legal in UTF-8, and <code>'\xFFFF'</code> is not legal
1473        in UTF-16. There is a special case when <em>sep</em> is the null character
1474        <code>'\0'</code>: the namespace URI and the local part will be concatenated
1475        without any separator - this is intended to support RDF processors. It is a
1476        programming error to use the null separator with <a href=
1477        "#XML_SetReturnNSTriplet">namespace triplets</a>.
1478      </div>
1479
1480      <p>
1481        <strong>Note:</strong> Expat does not validate namespace URIs (beyond encoding)
1482        against RFC 3986 today (and is not required to do so with regard to the XML 1.0
1483        namespaces specification) but it may start doing that in future releases. Before
1484        that, an application using Expat must be ready to receive namespace URIs
1485        containing non-URI characters.
1486      </p>
1487
1488      <h4 id="XML_ParserCreate_MM">
1489        XML_ParserCreate_MM
1490      </h4>
1491
      <pre class="fcndec">
XML_Parser XMLCALL
XML_ParserCreate_MM(const XML_Char *encoding,
                    const XML_Memory_Handling_Suite *ms,
                    const XML_Char *sep);
</pre>

      <pre class="signature">
typedef struct {
  void *(XMLCALL *malloc_fcn)(size_t size);
  void *(XMLCALL *realloc_fcn)(void *ptr, size_t size);
  void (XMLCALL *free_fcn)(void *ptr);
} XML_Memory_Handling_Suite;
</pre>
1506      <div class="fcndef">
1507        <p>
1508          Construct a new parser using the suite of memory handling functions specified
1509          in <code>ms</code>. If <code>ms</code> is <code>NULL</code>, then use the
1510          standard set of memory management functions. If <code>sep</code> is
1511          non-<code>NULL</code>, then namespace processing is enabled in the created
1512          parser and the character pointed at by sep is used as the separator between the
1513          namespace URI and the local part of the name.
1514        </p>
1515      </div>
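      <p>
        A memory handling suite is just three function pointers, so a simple use is to
        wrap the standard allocator in order to count calls. In the sketch below,
        <code>MemorySuite</code> is a hypothetical stand-in that mirrors
        <code>XML_Memory_Handling_Suite</code> (without <code>XMLCALL</code>) so that
        the fragment compiles without the Expat headers; the counter and function
        names are likewise made up for this illustration.
      </p>

```c
#include <stdlib.h>

/* Stand-in for XML_Memory_Handling_Suite. */
typedef struct {
  void *(*malloc_fcn)(size_t size);
  void *(*realloc_fcn)(void *ptr, size_t size);
  void (*free_fcn)(void *ptr);
} MemorySuite;

/* Counters are global because the suite's functions get no context pointer. */
unsigned long g_allocs, g_reallocs, g_frees;

void *
counting_malloc(size_t size) {
  g_allocs++;
  return malloc(size);
}

void *
counting_realloc(void *ptr, size_t size) {
  g_reallocs++;
  return realloc(ptr, size);
}

void
counting_free(void *ptr) {
  if (ptr != NULL) /* freeing NULL is a no-op and is not counted */
    g_frees++;
  free(ptr);
}

const MemorySuite counting_suite
    = {counting_malloc, counting_realloc, counting_free};
```

      <p>
        With Expat itself, a suite of the real type would be passed as the
        <code>ms</code> argument, and it would need to stay valid for as long as the
        parser might use it.
      </p>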
1516
1517      <h4 id="XML_ExternalEntityParserCreate">
1518        XML_ExternalEntityParserCreate
1519      </h4>
1520
      <pre class="fcndec">
XML_Parser XMLCALL
XML_ExternalEntityParserCreate(XML_Parser p,
                               const XML_Char *context,
                               const XML_Char *encoding);
</pre>
1527      <div class="fcndef">
1528        <p>
1529          Construct a new <code>XML_Parser</code> object for parsing an external general
          entity. Context is the context argument passed in a call to an
          ExternalEntityRefHandler. Other state information such as handlers, user data,
          and namespace processing is inherited from the parser passed as the 1st argument.
1533          So you shouldn't need to call any of the behavior changing functions on this
1534          parser (unless you want it to act differently than the parent parser).
1535        </p>
1536
1537        <p>
1538          <strong>Note:</strong> Please be sure to free subparsers created by
1539          <code><a href=
1540          "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code>
1541          <em>prior to</em> freeing their related parent parser, as subparsers reference
1542          and use parts of their respective parent parser, internally. Parent parsers
1543          must outlive subparsers.
1544        </p>
1545      </div>
1546
1547      <h4 id="XML_ParserFree">
1548        XML_ParserFree
1549      </h4>
1550
      <pre class="fcndec">
void XMLCALL
XML_ParserFree(XML_Parser p);
</pre>
1555      <div class="fcndef">
1556        <p>
1557          Free memory used by the parser.
1558        </p>
1559
1560        <p>
1561          <strong>Note:</strong> Your application is responsible for freeing any memory
1562          associated with <a href="#userdata">user data</a>.
1563        </p>
1564
1565        <p>
1566          <strong>Note:</strong> Please be sure to free subparsers created by
1567          <code><a href=
1568          "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code>
1569          <em>prior to</em> freeing their related parent parser, as subparsers reference
1570          and use parts of their respective parent parser, internally. Parent parsers
1571          must outlive subparsers.
1572        </p>
1573      </div>
1574
1575      <h4 id="XML_ParserReset">
1576        XML_ParserReset
1577      </h4>
1578
      <pre class="fcndec">
XML_Bool XMLCALL
XML_ParserReset(XML_Parser p,
                const XML_Char *encoding);
</pre>
1584      <div class="fcndef">
1585        Clean up the memory structures maintained by the parser so that it may be used
        again. After this has been called, <code>p</code> is ready to start parsing
1587        a new document. All handlers are cleared from the parser, except for the
1588        unknownEncodingHandler. The parser's external state is re-initialized except for
1589        the values of ns and ns_triplets. This function may not be used on a parser
1590        created using <code><a href=
1591        "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code>; it
1592        will return <code>XML_FALSE</code> in that case. Returns <code>XML_TRUE</code> on
1593        success. Your application is responsible for dealing with any memory associated
1594        with <a href="#userdata">user data</a>.
1595      </div>
1596
1597      <h3>
1598        <a id="parsing" name="parsing">Parsing</a>
1599      </h3>
1600
1601      <p>
1602        To state the obvious: the three parsing functions <code><a href=
1603        "#XML_Parse">XML_Parse</a></code>, <code><a href=
1604        "#XML_ParseBuffer">XML_ParseBuffer</a></code> and <code><a href=
1605        "#XML_GetBuffer">XML_GetBuffer</a></code> must not be called from within a
1606        handler unless they operate on a separate parser instance, that is, one that did
1607        not call the handler. For example, it is OK to call the parsing functions from
1608        within an <code>XML_ExternalEntityRefHandler</code>, if they apply to the parser
1609        created by <code><a href=
1610        "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code>.
1611      </p>
1612
1613      <p>
1614        Note: The <code>len</code> argument passed to these functions should be
1615        considerably less than the maximum value for an integer, as it could create an
1616        integer overflow situation if the added lengths of a buffer and the unprocessed
1617        portion of the previous buffer exceed the maximum integer value. Input data at
1618        the end of a buffer will remain unprocessed if it is part of an XML token for
1619        which the end is not part of that buffer.
1620      </p>
1621
1622      <p>
1623        <a id="isFinal" name="isFinal"></a>The application <em>must</em> make a
1624        concluding <code><a href="#XML_Parse">XML_Parse</a></code> or <code><a href=
1625        "#XML_ParseBuffer">XML_ParseBuffer</a></code> call with <code>isFinal</code> set
1626        to <code>XML_TRUE</code>.
1627      </p>
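
      <p>
        The two points above (keeping <code>len</code> well below the integer
        maximum, and making a concluding call) can be sketched as follows. The
        helper <code>parse_in_chunks</code> and the constant <code>CHUNK_MAX</code>
        are hypothetical names chosen for this illustration, and
        <code>data</code> is assumed to be non-<code>NULL</code>.
      </p>

      <pre class="eg">
#include &lt;expat.h&gt;
#include &lt;stddef.h&gt;

/* Assumed upper bound per call; any value far below INT_MAX would do. */
#define CHUNK_MAX ((size_t)65536)

/* Feed an in-memory document of arbitrary size to the parser in bounded
   pieces. The last piece is passed with isFinal set to true. */
static enum XML_Status
parse_in_chunks(XML_Parser p, const char *data, size_t size) {
  size_t offset = 0;
  for (;;) {
    size_t n = size - offset;
    if (n &gt; CHUNK_MAX)
      n = CHUNK_MAX;
    const int isFinal = (offset + n == size);
    const enum XML_Status status
        = XML_Parse(p, data + offset, (int)n, isFinal);
    /* Stop on error or suspension, and after the concluding call. */
    if (status != XML_STATUS_OK || isFinal)
      return status;
    offset += n;
  }
}
</pre>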
1628
1629      <h4 id="XML_Parse">
1630        XML_Parse
1631      </h4>
1632
1633      <pre class="fcndec">
1634enum XML_Status XMLCALL
1635XML_Parse(XML_Parser p,
1636          const char *s,
1637          int len,
1638          int isFinal);
1639</pre>
1640
1641      <pre class="signature">
1642enum XML_Status {
1643  XML_STATUS_ERROR = 0,
1644  XML_STATUS_OK = 1
1645};
1646</pre>
1647      <div class="fcndef">
1648        <p>
1649          Parse some more of the document. The string <code>s</code> is a buffer
1650          containing part (or perhaps all) of the document. The number of bytes of <code>s</code> that
1651          are part of the document is indicated by <code>len</code>. This means that
1652          <code>s</code> doesn't have to be null-terminated. It also means that if
1653          <code>len</code> is larger than the number of bytes in the block of memory that
1654          <code>s</code> points at, then a memory fault is likely. Negative values for
1655          <code>len</code> are rejected since Expat 2.2.1. The <code>isFinal</code>
1656          parameter informs the parser that this is the last piece of the document.
1657          Frequently, the last piece is empty (i.e. <code>len</code> is zero).
1658        </p>
1659
1660        <p>
1661          If a parse error occurs, it returns <code>XML_STATUS_ERROR</code>. Otherwise
1662          it returns <code>XML_STATUS_OK</code>. Note that regardless of the return
1663          value, there is no guarantee that all provided input has been parsed; only
1664          after <a href="#isFinal">the concluding call</a> will all handler callbacks and
1665          parsing errors have happened.
1666        </p>
1667
1668        <p>
1669          Simplified, <code>XML_Parse</code> can be considered a convenience wrapper that
1670          pairs calls to <code><a href="#XML_GetBuffer">XML_GetBuffer</a></code> and
1671          <code><a href="#XML_ParseBuffer">XML_ParseBuffer</a></code> (when Expat is
1672          built with macro <code>XML_CONTEXT_BYTES</code> defined to a positive value,
1673          which is both common and default). <code>XML_Parse</code> is then functionally
1674          equivalent to calling <code><a href="#XML_GetBuffer">XML_GetBuffer</a></code>,
1675          <code>memcpy</code>, and <code><a href=
1676          "#XML_ParseBuffer">XML_ParseBuffer</a></code>.
1677        </p>
1678
1679        <p>
1680          To avoid double copying of the input, direct use of functions <code><a href=
1681          "#XML_GetBuffer">XML_GetBuffer</a></code> and <code><a href=
1682          "#XML_ParseBuffer">XML_ParseBuffer</a></code> is advised for most production
1683          use, e.g. if you're using <code>read</code> or similar functionality to fill
1684          your buffers, fill directly into the buffer from <code><a href=
1685          "#XML_GetBuffer">XML_GetBuffer</a></code>, then parse with <code><a href=
1686          "#XML_ParseBuffer">XML_ParseBuffer</a></code>.
1687        </p>
1688      </div>
1689
1690      <h4 id="XML_ParseBuffer">
1691        XML_ParseBuffer
1692      </h4>
1693
1694      <pre class="fcndec">
1695enum XML_Status XMLCALL
1696XML_ParseBuffer(XML_Parser p,
1697                int len,
1698                int isFinal);
1699</pre>
1700      <div class="fcndef">
1701        <p>
1702          This is just like <code><a href="#XML_Parse">XML_Parse</a></code>, except in
1703          this case Expat provides the buffer. By obtaining the buffer from Expat with
1704          the <code><a href="#XML_GetBuffer">XML_GetBuffer</a></code> function, the
1705          application can avoid double copying of the input.
1706        </p>
1707
1708        <p>
1709          Negative values for <code>len</code> are rejected since Expat 2.6.3.
1710        </p>
1711      </div>
1712
1713      <h4 id="XML_GetBuffer">
1714        XML_GetBuffer
1715      </h4>
1716
1717      <pre class="fcndec">
1718void * XMLCALL
1719XML_GetBuffer(XML_Parser p,
1720              int len);
1721</pre>
1722      <div class="fcndef">
1723        Obtain a buffer of size <code>len</code> to read a piece of the document into. A
1724        <code>NULL</code> value is returned if Expat can't allocate enough memory for
1725        this buffer. A <code>NULL</code> value may also be returned if <code>len</code>
1726        is zero. This has to be called prior to every call to <code><a href=
1727        "#XML_ParseBuffer">XML_ParseBuffer</a></code>. A typical use would look like
1728        this:
1729
1730        <pre class="eg">
1731for (;;) {
1732  int bytes_read;
1733  void *buff = XML_GetBuffer(p, BUFF_SIZE);
1734  if (buff == NULL) {
1735    /* handle error */
1736  }
1737
1738  bytes_read = read(docfd, buff, BUFF_SIZE);
1739  if (bytes_read &lt; 0) {
1740    /* handle error */
1741  }
1742
1743  if (! XML_ParseBuffer(p, bytes_read, bytes_read == 0)) {
1744    /* handle parse error */
1745  }
1746
1747  if (bytes_read == 0)
1748    break;
1749}
1750</pre>
1751      </div>
1752
1753      <h4 id="XML_StopParser">
1754        XML_StopParser
1755      </h4>
1756
1757      <pre class="fcndec">
1758enum XML_Status XMLCALL
1759XML_StopParser(XML_Parser p,
1760               XML_Bool resumable);
1761</pre>
1762      <div class="fcndef">
1763        <p>
1764          Stops parsing, causing <code><a href="#XML_Parse">XML_Parse</a></code> or
1765          <code><a href="#XML_ParseBuffer">XML_ParseBuffer</a></code> to return. Must be
1766          called from within a call-back handler, except when aborting (when
1767          <code>resumable</code> is <code>XML_FALSE</code>) an already suspended parser.
1768          Some call-backs may still follow because they would otherwise get lost,
1769          including
1770        </p>
1771
1772        <ul>
1773          <li>the end element handler for empty elements when stopped in the start
1774          element handler,
1775          </li>
1776
1777          <li>the end namespace declaration handler when stopped in the end element
1778          handler,
1779          </li>
1780
1781          <li>the character data handler when stopped in the character data handler while
1782          making multiple call-backs on a contiguous chunk of characters,
1783          </li>
1784        </ul>
1785
1786        <p>
1787          and possibly others.
1788        </p>
1789
1790        <p>
1791          This can be called from most handlers, including DTD related call-backs, except
1792          when parsing an external parameter entity and <code>resumable</code> is
1793          <code>XML_TRUE</code>. Returns <code>XML_STATUS_OK</code> when successful,
1794          <code>XML_STATUS_ERROR</code> otherwise. The possible error codes are:
1795        </p>
1796
1797        <dl>
1798          <dt>
1799            <code>XML_ERROR_NOT_STARTED</code>
1800          </dt>
1801
1802          <dd>
1803            when stopping or suspending a parser before it has started, added in Expat
1804            2.6.4.
1805          </dd>
1806
1807          <dt>
1808            <code>XML_ERROR_SUSPENDED</code>
1809          </dt>
1810
1811          <dd>
1812            when suspending an already suspended parser.
1813          </dd>
1814
1815          <dt>
1816            <code>XML_ERROR_FINISHED</code>
1817          </dt>
1818
1819          <dd>
1820            when the parser has already finished.
1821          </dd>
1822
1823          <dt>
1824            <code>XML_ERROR_SUSPEND_PE</code>
1825          </dt>
1826
1827          <dd>
1828            when suspending while parsing an external PE.
1829          </dd>
1830        </dl>
1831
1832        <p>
1833          Since the stop/resume feature requires application support in the outer parsing
1834          loop, it is an error to call this function for a parser not being handled
1835          appropriately; see <a href="#stop-resume">Temporarily Stopping Parsing</a> for
1836          more information.
1837        </p>
1838
1839        <p>
1840          When <code>resumable</code> is <code>XML_TRUE</code> then parsing is
1841          <em>suspended</em>, that is, <code><a href="#XML_Parse">XML_Parse</a></code>
1842          and <code><a href="#XML_ParseBuffer">XML_ParseBuffer</a></code> return
1843          <code>XML_STATUS_SUSPENDED</code>. Otherwise, parsing is <em>aborted</em>, that
1844          is, <code><a href="#XML_Parse">XML_Parse</a></code> and <code><a href=
1845          "#XML_ParseBuffer">XML_ParseBuffer</a></code> return
1846          <code>XML_STATUS_ERROR</code> with error code <code>XML_ERROR_ABORTED</code>.
1847        </p>
1848
1849        <p>
1850          <strong>Note:</strong> This will be applied to the current parser instance
1851          only, that is, if there is a parent parser then it will continue parsing when
1852          the external entity reference handler returns. It is up to the implementation
1853          of that handler to call <code><a href=
1854          "#XML_StopParser">XML_StopParser</a></code> on the parent parser (recursively),
1855          if one wants to stop parsing altogether.
1856        </p>
1857
1858        <p>
1859          When suspended, parsing can be resumed by calling <code><a href=
1860          "#XML_ResumeParser">XML_ResumeParser</a></code>.
1861        </p>
1862
1863        <p>
1864          New in Expat 1.95.8.
1865        </p>
1866      </div>
1867
1868      <h4 id="XML_ResumeParser">
1869        XML_ResumeParser
1870      </h4>
1871
1872      <pre class="fcndec">
1873enum XML_Status XMLCALL
1874XML_ResumeParser(XML_Parser p);
1875</pre>
1876      <div class="fcndef">
1877        <p>
1878          Resumes parsing after it has been suspended with <code><a href=
1879          "#XML_StopParser">XML_StopParser</a></code>. Must not be called from within a
1880          handler call-back. Returns the same status codes as <code><a href=
1881          "#XML_Parse">XML_Parse</a></code> or <code><a href=
1882          "#XML_ParseBuffer">XML_ParseBuffer</a></code>. An additional error code,
1883          <code>XML_ERROR_NOT_SUSPENDED</code>, will be returned if the parser was not
1884          currently suspended.
1885        </p>
1886
1887        <p>
1888          <strong>Note:</strong> This must be called on the most deeply nested child
1889          parser instance first, and on its parent parser only after the child parser has
1890          finished, to be applied recursively until the document entity's parser is
1891          restarted. That is, the parent parser will not resume by itself and it is up to
1892          the application to call <code><a href=
1893          "#XML_ResumeParser">XML_ResumeParser</a></code> on it at the appropriate
1894          moment.
1895        </p>
1896
1897        <p>
1898          New in Expat 1.95.8.
1899        </p>
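
        <p>
          A minimal sketch of the suspend and resume cycle, assuming a single
          parser without external entities. The names <code>PauseState</code>,
          <code>pause_on_start</code> and <code>parse_with_pauses</code> are
          hypothetical. The handler suspends on every start tag, and the outer
          loop, which runs outside of any handler, resumes until parsing ends.
        </p>

        <pre class="eg">
#include &lt;expat.h&gt;
#include &lt;string.h&gt;

/* Hypothetical application state shared with the handler. */
typedef struct {
  XML_Parser parser;
  int suspensions;
} PauseState;

/* Start element handler that suspends the parser on every start tag. */
static void XMLCALL
pause_on_start(void *userData, const XML_Char *name, const XML_Char **atts) {
  PauseState *state = (PauseState *)userData;
  (void)name;
  (void)atts;
  if (XML_StopParser(state-&gt;parser, XML_TRUE) == XML_STATUS_OK)
    state-&gt;suspensions++;
}

/* Parse a complete, null-terminated document, resuming after each pause. */
static enum XML_Status
parse_with_pauses(XML_Parser p, const char *doc, PauseState *state) {
  state-&gt;parser = p;
  state-&gt;suspensions = 0;
  XML_SetUserData(p, state);
  XML_SetStartElementHandler(p, pause_on_start);

  enum XML_Status status = XML_Parse(p, doc, (int)strlen(doc), XML_TRUE);
  while (status == XML_STATUS_SUSPENDED) {
    /* The application could do other work here before continuing. */
    status = XML_ResumeParser(p);
  }
  return status;
}
</pre>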
1900      </div>
1901
1902      <h4 id="XML_GetParsingStatus">
1903        XML_GetParsingStatus
1904      </h4>
1905
1906      <pre class="fcndec">
1907void XMLCALL
1908XML_GetParsingStatus(XML_Parser p,
1909                     XML_ParsingStatus *status);
1910</pre>
1911
1912      <pre class="signature">
1913enum XML_Parsing {
1914  XML_INITIALIZED,
1915  XML_PARSING,
1916  XML_FINISHED,
1917  XML_SUSPENDED
1918};
1919
1920typedef struct {
1921  enum XML_Parsing parsing;
1922  XML_Bool finalBuffer;
1923} XML_ParsingStatus;
1924</pre>
1925      <div class="fcndef">
1926        <p>
1927          Returns status of parser with respect to being initialized, parsing, finished,
1928          or suspended, and whether the final buffer is being processed. The
1929          <code>status</code> parameter <em>must not</em> be <code>NULL</code>.
1930        </p>
1931
1932        <p>
1933          New in Expat 1.95.8.
1934        </p>
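
        <p>
          For example, an application might wrap the call in a small predicate
          such as the following sketch; the helper name
          <code>parser_is_suspended</code> is hypothetical.
        </p>

        <pre class="eg">
#include &lt;expat.h&gt;

/* Returns non-zero if the parser is currently suspended. */
static int
parser_is_suspended(XML_Parser p) {
  XML_ParsingStatus status;
  XML_GetParsingStatus(p, &amp;status);
  return status.parsing == XML_SUSPENDED;
}
</pre>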
1935      </div>
1936
1937      <h3>
1938        <a id="setting" name="setting">Handler Setting</a>
1939      </h3>
1940
1941      <p>
1942        Although handlers are typically set prior to parsing and left alone, an
1943        application may choose to set or change the handler for a parsing event while the
1944        parse is in progress. For instance, your application may choose to ignore all
1945        text not descended from a <code>para</code> element. One way it could do this is
1946        to set the character data handler when a <code>para</code> start tag is seen, and unset it for the
1947        corresponding end tag.
1948      </p>
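
      <p>
        The <code>para</code> scenario above could be sketched like this,
        assuming <code>XML_Char</code> is <code>char</code> (a build without
        <code>XML_UNICODE</code>). All names (<code>ParaState</code>,
        <code>para_text</code>, <code>para_start</code>, <code>para_end</code>)
        are hypothetical, and the depth counter is one possible way to cope with
        nested <code>para</code> elements.
      </p>

      <pre class="eg">
#include &lt;expat.h&gt;
#include &lt;stddef.h&gt;
#include &lt;string.h&gt;

/* Hypothetical state: the parser, the para nesting depth, and the number
   of text bytes seen inside para elements. */
typedef struct {
  XML_Parser parser;
  int para_depth;
  size_t text_bytes;
} ParaState;

static void XMLCALL
para_text(void *userData, const XML_Char *s, int len) {
  ParaState *state = (ParaState *)userData;
  (void)s;
  state-&gt;text_bytes += (size_t)len;
}

static void XMLCALL
para_start(void *userData, const XML_Char *name, const XML_Char **atts) {
  ParaState *state = (ParaState *)userData;
  (void)atts;
  /* Entering the outermost para: start listening for text. */
  if (strcmp(name, "para") == 0 &amp;&amp; state-&gt;para_depth++ == 0)
    XML_SetCharacterDataHandler(state-&gt;parser, para_text);
}

static void XMLCALL
para_end(void *userData, const XML_Char *name) {
  ParaState *state = (ParaState *)userData;
  /* Leaving the outermost para: unset the handler again. */
  if (strcmp(name, "para") == 0 &amp;&amp; --state-&gt;para_depth == 0)
    XML_SetCharacterDataHandler(state-&gt;parser, NULL);
}
</pre>

      <p>
        The application would register <code>para_start</code> and
        <code>para_end</code> with <code>XML_SetElementHandler</code> and pass a
        <code>ParaState</code> as user data before parsing.
      </p>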
1949
1950      <p>
1951        A handler may be <em>unset</em> by providing a <code>NULL</code> pointer to the
1952        appropriate handler setter. None of the handler setting functions have a return
1953        value.
1954      </p>
1955
1956      <p>
1957        Your handlers will be receiving strings in arrays of type <code>XML_Char</code>.
1958        This type is conditionally defined in expat.h as either <code>char</code>,
1959        <code>wchar_t</code> or <code>unsigned short</code>. The first implies UTF-8
1960        encoding, the latter two imply UTF-16 encoding. Note that you'll receive them in
1961        this form independent of the original encoding of the document.
1962      </p>
1963
1964      <div class="handler">
1965        <h4 id="XML_SetStartElementHandler">
1966          XML_SetStartElementHandler
1967        </h4>
1968
1969        <pre class="setter">
1970void XMLCALL
1971XML_SetStartElementHandler(XML_Parser p,
1972                           XML_StartElementHandler start);
1973</pre>
1974
1975        <pre class="signature">
1976typedef void
1977(XMLCALL *XML_StartElementHandler)(void *userData,
1978                                   const XML_Char *name,
1979                                   const XML_Char **atts);
1980</pre>
1981        <p>
1982          Set handler for start (and empty) tags. Attributes are passed to the start
1983          handler as a pointer to a vector of <code>XML_Char</code> pointers. Each attribute seen in a
1984          start (or empty) tag occupies 2 consecutive places in this vector: the
1985          attribute name followed by the attribute value. These pairs are terminated by a
1986          <code>NULL</code> pointer.
1987        </p>
1988
1989        <p>
1990          Note that an empty tag generates a call to both start and end handlers (in that
1991          order).
1992        </p>
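
        <p>
          A short sketch of walking the attribute vector. The handler name
          <code>count_attributes</code> is hypothetical, and the user data is
          assumed to point at an <code>int</code> counter owned by the
          application.
        </p>

        <pre class="eg">
#include &lt;expat.h&gt;
#include &lt;stddef.h&gt;

/* Hypothetical handler: adds the number of attributes of each element to
   the int that userData points to. */
static void XMLCALL
count_attributes(void *userData, const XML_Char *name, const XML_Char **atts) {
  int *total = (int *)userData;
  (void)name;
  /* atts[i] is a name and atts[i + 1] its value; a NULL name ends the list. */
  for (size_t i = 0; atts[i] != NULL; i += 2)
    (*total)++;
}
</pre>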
1993      </div>
1994
1995      <div class="handler">
1996        <h4 id="XML_SetEndElementHandler">
1997          XML_SetEndElementHandler
1998        </h4>
1999
2000        <pre class="setter">
2001void XMLCALL
2002XML_SetEndElementHandler(XML_Parser p,
2003                         XML_EndElementHandler end);
2004</pre>
2005
2006        <pre class="signature">
2007typedef void
2008(XMLCALL *XML_EndElementHandler)(void *userData,
2009                                 const XML_Char *name);
2010</pre>
2011        <p>
2012          Set handler for end (and empty) tags. As noted above, an empty tag generates a
2013          call to both start and end handlers.
2014        </p>
2015      </div>
2016
2017      <div class="handler">
2018        <h4 id="XML_SetElementHandler">
2019          XML_SetElementHandler
2020        </h4>
2021
2022        <pre class="setter">
2023void XMLCALL
2024XML_SetElementHandler(XML_Parser p,
2025                      XML_StartElementHandler start,
2026                      XML_EndElementHandler end);
2027</pre>
2028        <p>
2029          Set handlers for start and end tags with one call.
2030        </p>
2031      </div>
2032
2033      <div class="handler">
2034        <h4 id="XML_SetCharacterDataHandler">
2035          XML_SetCharacterDataHandler
2036        </h4>
2037
2038        <pre class="setter">
2039void XMLCALL
2040XML_SetCharacterDataHandler(XML_Parser p,
2041                            XML_CharacterDataHandler charhndl);
2042</pre>
2043
2044        <pre class="signature">
2045typedef void
2046(XMLCALL *XML_CharacterDataHandler)(void *userData,
2047                                    const XML_Char *s,
2048                                    int len);
2049</pre>
2050        <p>
2051          Set a text handler. The string your handler receives is <em>NOT
2052          null-terminated</em>. You have to use the length argument to deal with the end
2053          of the string. A single block of contiguous text free of markup may still
2054          result in a sequence of calls to this handler. In other words, if you're
2055          searching for a pattern in the text, it may be split across calls to this
2056          handler. Note: Setting this handler to <code>NULL</code> may <em>NOT
2057          immediately</em> terminate call-backs if the parser is currently processing
2058          such a single block of contiguous markup-free text, as the parser will continue
2059          calling back until the end of the block is reached.
2060        </p>
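
        <p>
          Because the text is not null-terminated and may arrive in several
          pieces, a handler typically appends each piece to a buffer of its own.
          The following is a sketch under the assumption that
          <code>XML_Char</code> is <code>char</code>; the names
          <code>TextBuf</code> and <code>append_text</code> are hypothetical.
        </p>

        <pre class="eg">
#include &lt;expat.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;

/* Hypothetical growable buffer; initialize it to {NULL, 0}. */
typedef struct {
  char *data;
  size_t len;
} TextBuf;

/* Character data handler that appends each piece and keeps the
   accumulated text null-terminated. */
static void XMLCALL
append_text(void *userData, const XML_Char *s, int len) {
  TextBuf *buf = (TextBuf *)userData;
  char *grown = realloc(buf-&gt;data, buf-&gt;len + (size_t)len + 1);
  if (grown == NULL)
    return; /* out of memory: keep what was collected so far */
  memcpy(grown + buf-&gt;len, s, (size_t)len);
  buf-&gt;len += (size_t)len;
  grown[buf-&gt;len] = '\0';
  buf-&gt;data = grown;
}
</pre>

        <p>
          The application remains responsible for calling <code>free</code> on
          <code>data</code> once it is done with the text.
        </p>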
2061      </div>
2062
2063      <div class="handler">
2064        <h4 id="XML_SetProcessingInstructionHandler">
2065          XML_SetProcessingInstructionHandler
2066        </h4>
2067
2068        <pre class="setter">
2069void XMLCALL
2070XML_SetProcessingInstructionHandler(XML_Parser p,
2071                                    XML_ProcessingInstructionHandler proc);
2072</pre>
2073
2074        <pre class="signature">
2075typedef void
2076(XMLCALL *XML_ProcessingInstructionHandler)(void *userData,
2077                                            const XML_Char *target,
2078                                            const XML_Char *data);
2079
2080</pre>
2081        <p>
2082          Set a handler for processing instructions. The target is the first word in the
2083          processing instruction. The data is the rest of the characters in it after
2084          skipping all whitespace after the initial word.
2085        </p>
2086      </div>
2087
2088      <div class="handler">
2089        <h4 id="XML_SetCommentHandler">
2090          XML_SetCommentHandler
2091        </h4>
2092
2093        <pre class="setter">
2094void XMLCALL
2095XML_SetCommentHandler(XML_Parser p,
2096                      XML_CommentHandler cmnt);
2097</pre>
2098
2099        <pre class="signature">
2100typedef void
2101(XMLCALL *XML_CommentHandler)(void *userData,
2102                              const XML_Char *data);
2103</pre>
2104        <p>
2105          Set a handler for comments. The data is all text inside the comment delimiters.
2106        </p>
2107      </div>
2108
2109      <div class="handler">
2110        <h4 id="XML_SetStartCdataSectionHandler">
2111          XML_SetStartCdataSectionHandler
2112        </h4>
2113
2114        <pre class="setter">
2115void XMLCALL
2116XML_SetStartCdataSectionHandler(XML_Parser p,
2117                                XML_StartCdataSectionHandler start);
2118</pre>
2119
2120        <pre class="signature">
2121typedef void
2122(XMLCALL *XML_StartCdataSectionHandler)(void *userData);
2123</pre>
2124        <p>
2125          Set a handler that gets called at the beginning of a CDATA section.
2126        </p>
2127      </div>
2128
2129      <div class="handler">
2130        <h4 id="XML_SetEndCdataSectionHandler">
2131          XML_SetEndCdataSectionHandler
2132        </h4>
2133
2134        <pre class="setter">
2135void XMLCALL
2136XML_SetEndCdataSectionHandler(XML_Parser p,
2137                              XML_EndCdataSectionHandler end);
2138</pre>
2139
2140        <pre class="signature">
2141typedef void
2142(XMLCALL *XML_EndCdataSectionHandler)(void *userData);
2143</pre>
2144        <p>
2145          Set a handler that gets called at the end of a CDATA section.
2146        </p>
2147      </div>
2148
2149      <div class="handler">
2150        <h4 id="XML_SetCdataSectionHandler">
2151          XML_SetCdataSectionHandler
2152        </h4>
2153
2154        <pre class="setter">
2155void XMLCALL
2156XML_SetCdataSectionHandler(XML_Parser p,
2157                           XML_StartCdataSectionHandler start,
2158                           XML_EndCdataSectionHandler end);
2159</pre>
2160        <p>
2161          Sets both CDATA section handlers with one call.
2162        </p>
2163      </div>
2164
2165      <div class="handler">
2166        <h4 id="XML_SetDefaultHandler">
2167          XML_SetDefaultHandler
2168        </h4>
2169
2170        <pre class="setter">
2171void XMLCALL
2172XML_SetDefaultHandler(XML_Parser p,
2173                      XML_DefaultHandler hndl);
2174</pre>
2175
2176        <pre class="signature">
2177typedef void
2178(XMLCALL *XML_DefaultHandler)(void *userData,
2179                              const XML_Char *s,
2180                              int len);
2181</pre>
2182        <p>
2183          Sets a handler for any characters in the document which wouldn't otherwise be
2184          handled. This includes both data for which no handlers can be set (like some
2185          kinds of DTD declarations) and data which could be reported but which currently
2186          has no handler set. The characters are passed exactly as they were present in
2187          the XML document except that they will be encoded in UTF-8 or UTF-16. Line
2188          boundaries are not normalized. Note that a byte order mark character is not
2189          passed to the default handler. There are no guarantees about how characters are
2190          divided between calls to the default handler: for example, a comment might be
2191          split between multiple calls. Setting the handler with this call has the side
2192          effect of turning off expansion of references to internally defined general
2193          entities. Instead these references are passed to the default handler.
2194        </p>
2195
2196        <p>
2197          See also <code><a href="#XML_DefaultCurrent">XML_DefaultCurrent</a></code>.
2198        </p>
2199      </div>
2200
2201      <div class="handler">
2202        <h4 id="XML_SetDefaultHandlerExpand">
2203          XML_SetDefaultHandlerExpand
2204        </h4>
2205
2206        <pre class="setter">
2207void XMLCALL
2208XML_SetDefaultHandlerExpand(XML_Parser p,
2209                            XML_DefaultHandler hndl);
2210</pre>
2211
2212        <pre class="signature">
2213typedef void
2214(XMLCALL *XML_DefaultHandler)(void *userData,
2215                              const XML_Char *s,
2216                              int len);
2217</pre>
2218        <p>
2219          This sets a default handler, but doesn't inhibit the expansion of internal
2220          entity references. The entity reference will not be passed to the default
2221          handler.
2222        </p>
2223
2224        <p>
2225          See also <code><a href="#XML_DefaultCurrent">XML_DefaultCurrent</a></code>.
2226        </p>
2227      </div>
2228
2229      <div class="handler">
2230        <h4 id="XML_SetExternalEntityRefHandler">
2231          XML_SetExternalEntityRefHandler
2232        </h4>
2233
2234        <pre class="setter">
2235void XMLCALL
2236XML_SetExternalEntityRefHandler(XML_Parser p,
2237                                XML_ExternalEntityRefHandler hndl);
2238</pre>
2239
2240        <pre class="signature">
2241typedef int
2242(XMLCALL *XML_ExternalEntityRefHandler)(XML_Parser p,
2243                                        const XML_Char *context,
2244                                        const XML_Char *base,
2245                                        const XML_Char *systemId,
2246                                        const XML_Char *publicId);
2247</pre>
2248        <p>
2249          Set an external entity reference handler. This handler is also called for
2250          processing an external DTD subset if parameter entity parsing is in effect.
2251          (See <a href=
2252          "#XML_SetParamEntityParsing"><code>XML_SetParamEntityParsing</code></a>.)
2253        </p>
2254
2255        <p>
2256          <strong>Warning:</strong> Using an external entity reference handler can lead
2257          to <a href="https://libexpat.github.io/doc/xml-security/#external-entities">XXE
2258          vulnerabilities</a>. It should only be used in applications that do not parse
2259          untrusted XML input.
2260        </p>
2261
2262        <p>
2263          The <code>context</code> parameter specifies the parsing context in the format
2264          expected by the <code>context</code> argument to <code><a href=
2265          "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code>.
2266          <code>context</code> is valid only until the handler returns, so if the referenced
2267          entity is to be parsed later, it must be copied. <code>context</code> is
2268          <code>NULL</code> only when the entity is a parameter entity, which is how one
2269          can differentiate between general and parameter entities.
2270        </p>
2271
2272        <p>
2273          The <code>base</code> parameter is the base to use for relative system
2274          identifiers. It is set by <code><a href="#XML_SetBase">XML_SetBase</a></code>
2275          and may be <code>NULL</code>. The <code>publicId</code> parameter is the public
2276          id given in the entity declaration and may be <code>NULL</code>.
2277          <code>systemId</code> is the system identifier specified in the entity
2278          declaration and is never <code>NULL</code>.
2279        </p>
2280
2281        <p>
2282          There are a couple of ways in which this handler differs from others. First,
2283          this handler returns a status indicator (an integer).
2284          <code>XML_STATUS_OK</code> should be returned for successful handling of the
2285          external entity reference. Returning <code>XML_STATUS_ERROR</code> indicates
2286          failure, and causes the calling parser to return an
2287          <code>XML_ERROR_EXTERNAL_ENTITY_HANDLING</code> error.
2288        </p>
2289
2290        <p>
2291          Second, instead of having the user data as its first argument, it receives the
2292          parser that encountered the entity reference. This, along with the context
2293          parameter, may be used as arguments to a call to <code><a href=
2294          "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code>.
2295          Using the returned parser, the body of the external entity can be recursively
2296          parsed.
2297        </p>
2298
2299        <p>
2300          Since this handler may be called recursively, it should not be saving
2301          information into global or static variables.
2302        </p>
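
        <p>
          The following sketch shows the general shape of such a handler. To stay
          clear of the XXE concerns mentioned above, it does not resolve
          <code>systemId</code> at all and instead parses a fixed in-memory
          replacement text; the handler name <code>fixed_text_entity</code> and
          the replacement text are hypothetical, and the sketch only deals with
          general entities.
        </p>

        <pre class="eg">
#include &lt;expat.h&gt;
#include &lt;string.h&gt;

/* Hypothetical handler: treats every external general entity as if its
   content were the fixed text below. Nothing is read from disk or network. */
static int XMLCALL
fixed_text_entity(XML_Parser parent, const XML_Char *context,
                  const XML_Char *base, const XML_Char *systemId,
                  const XML_Char *publicId) {
  const char replacement[] = "&lt;included/&gt;";
  (void)base;
  (void)systemId;
  (void)publicId;

  /* A NULL context means a parameter entity, which this sketch rejects. */
  if (context == NULL)
    return XML_STATUS_ERROR;

  XML_Parser child = XML_ExternalEntityParserCreate(parent, context, NULL);
  if (child == NULL)
    return XML_STATUS_ERROR;

  const enum XML_Status status
      = XML_Parse(child, replacement, (int)strlen(replacement), XML_TRUE);

  /* Free the subparser before returning to its parent parser. */
  XML_ParserFree(child);
  return (status == XML_STATUS_OK) ? XML_STATUS_OK : XML_STATUS_ERROR;
}
</pre>

        <p>
          All state lives in local variables, so the handler remains safe when it
          is entered recursively.
        </p>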
2303      </div>
2304
2305      <h4 id="XML_SetExternalEntityRefHandlerArg">
2306        XML_SetExternalEntityRefHandlerArg
2307      </h4>
2308
2309      <pre class="fcndec">
2310void XMLCALL
2311XML_SetExternalEntityRefHandlerArg(XML_Parser p,
2312                                   void *arg);
2313</pre>
2314      <div class="fcndef">
2315        <p>
2316          Set the argument passed to the ExternalEntityRefHandler. If <code>arg</code> is
2317          not <code>NULL</code>, it is the new value passed to the handler set using
2318          <code><a href=
2319          "#XML_SetExternalEntityRefHandler">XML_SetExternalEntityRefHandler</a></code>;
2320          if <code>arg</code> is <code>NULL</code>, the argument passed to the handler
2321          function will be the parser object itself.
2322        </p>
2323
2324        <p>
2325          <strong>Note:</strong> The type of <code>arg</code> and the type of the first
2326          argument to the ExternalEntityRefHandler do not match. This function takes a
2327          <code>void *</code> to be passed to the handler, while the handler accepts an
2328          <code>XML_Parser</code>. This is a historical accident, but will not be
2329          corrected before Expat 2.0 (at the earliest) to avoid causing compiler warnings
2330          for code that's known to work with this API. It is the responsibility of the
2331          application code to know the actual type of the argument passed to the handler
2332          and to manage it properly.
2333        </p>
2334      </div>
2335
2336      <div class="handler">
2337        <h4 id="XML_SetSkippedEntityHandler">
2338          XML_SetSkippedEntityHandler
2339        </h4>
2340
2341        <pre class="setter">
2342void XMLCALL
2343XML_SetSkippedEntityHandler(XML_Parser p,
2344                            XML_SkippedEntityHandler handler);
2345</pre>
2346
2347        <pre class="signature">
2348typedef void
2349(XMLCALL *XML_SkippedEntityHandler)(void *userData,
2350                                    const XML_Char *entityName,
2351                                    int is_parameter_entity);
2352</pre>
2353        <p>
2354          Set a skipped entity handler. This is called in two situations:
2355        </p>
2356
2357        <ol>
2358          <li>An entity reference is encountered for which no declaration has been read
2359          <em>and</em> this is not an error.
2360          </li>
2361
2362          <li>An internal entity reference is read, but not expanded, because <a href=
2363          "#XML_SetDefaultHandler"><code>XML_SetDefaultHandler</code></a> has been
2364          called.
2365          </li>
2366        </ol>
2367
2368        <p>
2369          The <code>is_parameter_entity</code> argument will be non-zero for a parameter
2370          entity and zero for a general entity.
2371        </p>
2372
2373        <p>
2374          Note: Skipped parameter entities in declarations and skipped general entities
2375          in attribute values cannot be reported, because the event would be out of sync
2376          with the reporting of the declarations or attribute values.
2377        </p>
2378      </div>
2379
2380      <div class="handler">
2381        <h4 id="XML_SetUnknownEncodingHandler">
2382          XML_SetUnknownEncodingHandler
2383        </h4>
2384
2385        <pre class="setter">
2386void XMLCALL
2387XML_SetUnknownEncodingHandler(XML_Parser p,
2388                              XML_UnknownEncodingHandler enchandler,
2389                              void *encodingHandlerData);
2390</pre>
2391
2392        <pre class="signature">
2393typedef int
2394(XMLCALL *XML_UnknownEncodingHandler)(void *encodingHandlerData,
2395                                      const XML_Char *name,
2396                                      XML_Encoding *info);
2397
2398typedef struct {
2399  int map[256];
2400  void *data;
2401  int (XMLCALL *convert)(void *data, const char *s);
2402  void (XMLCALL *release)(void *data);
2403} XML_Encoding;
2404</pre>
2405        <p>
2406          Set a handler to deal with encodings other than the <a href=
          "#builtin_encodings">built-in set</a>. This should be done before
          <code><a href="#XML_Parse">XML_Parse</a></code> or <code><a href=
          "#XML_ParseBuffer">XML_ParseBuffer</a></code> has been called on the given
2410          parser.
2411        </p>
2412
2413        <p>
2414          If the handler knows how to deal with an encoding with the given name, it
2415          should fill in the <code>info</code> data structure and return
2416          <code>XML_STATUS_OK</code>. Otherwise it should return
2417          <code>XML_STATUS_ERROR</code>. The handler will be called at most once per
2418          parsed (external) entity. The optional application data pointer
2419          <code>encodingHandlerData</code> will be passed back to the handler.
2420        </p>
2421
2422        <p>
          The <code>map</code> array contains information for every possible leading byte
          in a byte sequence. If the corresponding value is &gt;= 0, then it's a
          single-byte sequence and the byte encodes that Unicode value. If the value is
          -1, then that byte is invalid as the initial byte in a sequence. If the value
          is -n, where n is an integer &gt; 1, then n is the number of bytes in the
          sequence and the actual conversion is accomplished by a call to the function
          pointed at by <code>convert</code>. This function may return -1 if the sequence
          itself is invalid. The <code>convert</code> pointer may be <code>NULL</code> if
          there are only single-byte codes. The <code>data</code> parameter passed to the
          convert function is the <code>data</code> pointer from
          <code>XML_Encoding</code>. The string <code>s</code> is <em>NOT</em>
          null-terminated and points at the sequence of bytes to be converted.
2434        </p>
2435
2436        <p>
2437          The function pointed at by <code>release</code> is called by the parser when it
2438          is finished with the encoding. It may be <code>NULL</code>.
2439        </p>
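        <p>
          As a rough illustration of the <code>map</code> conventions described above, the
          following sketch fills in a made-up encoding: ASCII bytes map to themselves,
          bytes 0x80 to 0x8F start two-byte sequences, and every other byte is an invalid
          lead byte. The names <code>DemoEncoding</code>, <code>demo_convert</code> and
          <code>demo_fill_encoding</code> are hypothetical, the encoding is invented for
          this example, and the struct is only a local stand-in that mirrors
          <code>XML_Encoding</code> (without <code>XMLCALL</code>) so that the sketch
          compiles without the Expat headers.
        </p>

```c
#include <stddef.h>

/* Local stand-in mirroring the XML_Encoding layout shown above (XMLCALL
   omitted) so that this sketch compiles without the Expat headers. */
typedef struct {
  int map[256];
  void *data;
  int (*convert)(void *data, const char *s);
  void (*release)(void *data);
} DemoEncoding;

/* Hypothetical converter for an invented two-byte scheme: lead byte
   0x80..0x8F, trail byte 0x40..0x7F.  Returns a code point, or -1 if the
   sequence is invalid.  `s` is not null-terminated; exactly two bytes are
   read because the map announces two-byte sequences for these lead bytes. */
static int demo_convert(void *data, const char *s) {
  const unsigned char lead = (unsigned char)s[0];
  const unsigned char trail = (unsigned char)s[1];
  (void)data; /* unused in this sketch */
  if (trail < 0x40 || trail > 0x7F)
    return -1;
  return 0x4E00 + (lead - 0x80) * 0x40 + (trail - 0x40);
}

/* Fills `info` the way an unknown encoding handler would. */
static void demo_fill_encoding(DemoEncoding *info) {
  int i;
  for (i = 0x00; i < 0x80; i++)
    info->map[i] = i;  /* >= 0: single byte encoding that code point */
  for (i = 0x80; i < 0x90; i++)
    info->map[i] = -2; /* -n: lead byte of an n-byte sequence (n = 2) */
  for (i = 0x90; i < 0x100; i++)
    info->map[i] = -1; /* -1: invalid as the first byte of a sequence */
  info->data = NULL;   /* handed back to convert and release */
  info->convert = demo_convert;
  info->release = NULL; /* nothing to free in this sketch */
}
```

        <p>
          A real handler would fill the <code>XML_Encoding</code> passed as
          <code>info</code> in the same way and then return <code>XML_STATUS_OK</code>.
        </p>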
2440      </div>
2441
2442      <div class="handler">
2443        <h4 id="XML_SetStartNamespaceDeclHandler">
2444          XML_SetStartNamespaceDeclHandler
2445        </h4>
2446
2447        <pre class="setter">
2448void XMLCALL
2449XML_SetStartNamespaceDeclHandler(XML_Parser p,
2450                                 XML_StartNamespaceDeclHandler start);
2451</pre>
2452
2453        <pre class="signature">
2454typedef void
2455(XMLCALL *XML_StartNamespaceDeclHandler)(void *userData,
2456                                         const XML_Char *prefix,
2457                                         const XML_Char *uri);
2458</pre>
2459        <p>
          Set a handler to be called when a namespace is declared. Namespace declarations
          occur inside start tags, but the namespace declaration start handler is called
          before the start tag handler for each namespace declared in that start tag.
2463        </p>
2464      </div>
2465
2466      <div class="handler">
2467        <h4 id="XML_SetEndNamespaceDeclHandler">
2468          XML_SetEndNamespaceDeclHandler
2469        </h4>
2470
2471        <pre class="setter">
2472void XMLCALL
2473XML_SetEndNamespaceDeclHandler(XML_Parser p,
2474                               XML_EndNamespaceDeclHandler end);
2475</pre>
2476
2477        <pre class="signature">
2478typedef void
2479(XMLCALL *XML_EndNamespaceDeclHandler)(void *userData,
2480                                       const XML_Char *prefix);
2481</pre>
2482        <p>
2483          Set a handler to be called when leaving the scope of a namespace declaration.
2484          This will be called, for each namespace declaration, after the handler for the
2485          end tag of the element in which the namespace was declared.
2486        </p>
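        <p>
          Because the start and end handlers bracket the scope of each declaration, one
          way to use them is to maintain a stack of prefix bindings. The sketch below
          assumes <code>XML_Char</code> is <code>char</code>; <code>DemoNsStack</code>
          and the <code>demo_*</code> functions are hypothetical names, and treating
          <code>NULL</code> arguments as empty strings is a defensive choice made for
          this example rather than something stated above.
        </p>

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_BINDINGS 16

/* Stack of prefix-to-URI bindings that are currently in scope. */
typedef struct {
  char prefix[DEMO_MAX_BINDINGS][32];
  char uri[DEMO_MAX_BINDINGS][128];
  int depth; /* number of open bindings; may exceed what is stored */
} DemoNsStack;

/* Copies `src` (or "" when src is NULL) into `dst`, truncating if needed. */
static void demo_copy(char *dst, size_t cap, const char *src) {
  if (src == NULL)
    src = "";
  strncpy(dst, src, cap - 1);
  dst[cap - 1] = '\0';
}

/* Same shape as XML_StartNamespaceDeclHandler: push a binding. */
static void demo_start_ns(void *userData, const char *prefix,
                          const char *uri) {
  DemoNsStack *st = (DemoNsStack *)userData;
  if (st->depth < DEMO_MAX_BINDINGS) {
    demo_copy(st->prefix[st->depth], sizeof st->prefix[0], prefix);
    demo_copy(st->uri[st->depth], sizeof st->uri[0], uri);
  }
  st->depth++; /* counted even when not stored, so pops stay balanced */
}

/* Same shape as XML_EndNamespaceDeclHandler: pop one binding. */
static void demo_end_ns(void *userData, const char *prefix) {
  DemoNsStack *st = (DemoNsStack *)userData;
  (void)prefix;
  if (st->depth > 0)
    st->depth--;
}

/* Returns the innermost stored URI bound to `prefix`, or NULL if none. */
static const char *demo_lookup_ns(const DemoNsStack *st, const char *prefix) {
  int i = st->depth < DEMO_MAX_BINDINGS ? st->depth : DEMO_MAX_BINDINGS;
  if (prefix == NULL)
    prefix = "";
  while (i-- > 0) {
    if (strcmp(st->prefix[i], prefix) == 0)
      return st->uri[i];
  }
  return NULL;
}
```

        <p>
          With Expat, the two functions would be registered through
          <code>XML_SetNamespaceDeclHandler</code> and the stack passed as user data.
        </p>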
2487      </div>
2488
2489      <div class="handler">
2490        <h4 id="XML_SetNamespaceDeclHandler">
2491          XML_SetNamespaceDeclHandler
2492        </h4>
2493
2494        <pre class="setter">
2495void XMLCALL
2496XML_SetNamespaceDeclHandler(XML_Parser p,
2497                            XML_StartNamespaceDeclHandler start,
2498                            XML_EndNamespaceDeclHandler end)
2499</pre>
2500        <p>
2501          Sets both namespace declaration handlers with a single call.
2502        </p>
2503      </div>
2504
2505      <div class="handler">
2506        <h4 id="XML_SetXmlDeclHandler">
2507          XML_SetXmlDeclHandler
2508        </h4>
2509
2510        <pre class="setter">
2511void XMLCALL
2512XML_SetXmlDeclHandler(XML_Parser p,
2513                      XML_XmlDeclHandler xmldecl);
2514</pre>
2515
2516        <pre class="signature">
2517typedef void
2518(XMLCALL *XML_XmlDeclHandler)(void            *userData,
2519                              const XML_Char  *version,
2520                              const XML_Char  *encoding,
2521                              int             standalone);
2522</pre>
2523        <p>
2524          Sets a handler that is called for XML declarations and also for text
2525          declarations discovered in external entities. The way to distinguish is that
2526          the <code>version</code> parameter will be <code>NULL</code> for text
2527          declarations. The <code>encoding</code> parameter may be <code>NULL</code> for
2528          an XML declaration. The <code>standalone</code> argument will contain -1, 0, or
2529          1 indicating respectively that there was no standalone parameter in the
2530          declaration, that it was given as no, or that it was given as yes.
2531        </p>
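        <p>
          The two distinctions described above (XML declaration versus text declaration,
          and the three <code>standalone</code> values) can be sketched with two small
          helpers. Both function names are hypothetical, and the sketch assumes
          <code>XML_Char</code> is <code>char</code>.
        </p>

```c
#include <stddef.h>

/* A NULL version marks a text declaration found in an external entity. */
static const char *demo_decl_kind(const char *version) {
  return version == NULL ? "text declaration" : "XML declaration";
}

/* Maps the standalone argument to a readable label: 1 means "yes", 0 means
   "no"; -1 (and, defensively, any other value) means it was not given. */
static const char *demo_standalone_text(int standalone) {
  switch (standalone) {
  case 1:
    return "yes";
  case 0:
    return "no";
  default:
    return "not specified";
  }
}
```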
2532      </div>
2533
2534      <div class="handler">
2535        <h4 id="XML_SetStartDoctypeDeclHandler">
2536          XML_SetStartDoctypeDeclHandler
2537        </h4>
2538
2539        <pre class="setter">
2540void XMLCALL
2541XML_SetStartDoctypeDeclHandler(XML_Parser p,
2542                               XML_StartDoctypeDeclHandler start);
2543</pre>
2544
2545        <pre class="signature">
2546typedef void
2547(XMLCALL *XML_StartDoctypeDeclHandler)(void           *userData,
2548                                       const XML_Char *doctypeName,
2549                                       const XML_Char *sysid,
2550                                       const XML_Char *pubid,
2551                                       int            has_internal_subset);
2552</pre>
2553        <p>
2554          Set a handler that is called at the start of a DOCTYPE declaration, before any
2555          external or internal subset is parsed. Both <code>sysid</code> and
2556          <code>pubid</code> may be <code>NULL</code>. The
2557          <code>has_internal_subset</code> will be non-zero if the DOCTYPE declaration
2558          has an internal subset.
2559        </p>
2560      </div>
2561
2562      <div class="handler">
2563        <h4 id="XML_SetEndDoctypeDeclHandler">
2564          XML_SetEndDoctypeDeclHandler
2565        </h4>
2566
2567        <pre class="setter">
2568void XMLCALL
2569XML_SetEndDoctypeDeclHandler(XML_Parser p,
2570                             XML_EndDoctypeDeclHandler end);
2571</pre>
2572
2573        <pre class="signature">
2574typedef void
2575(XMLCALL *XML_EndDoctypeDeclHandler)(void *userData);
2576</pre>
2577        <p>
2578          Set a handler that is called at the end of a DOCTYPE declaration, after parsing
2579          any external subset.
2580        </p>
2581      </div>
2582
2583      <div class="handler">
2584        <h4 id="XML_SetDoctypeDeclHandler">
2585          XML_SetDoctypeDeclHandler
2586        </h4>
2587
2588        <pre class="setter">
2589void XMLCALL
2590XML_SetDoctypeDeclHandler(XML_Parser p,
2591                          XML_StartDoctypeDeclHandler start,
2592                          XML_EndDoctypeDeclHandler end);
2593</pre>
2594        <p>
2595          Set both doctype handlers with one call.
2596        </p>
2597      </div>
2598
2599      <div class="handler">
2600        <h4 id="XML_SetElementDeclHandler">
2601          XML_SetElementDeclHandler
2602        </h4>
2603
2604        <pre class="setter">
2605void XMLCALL
2606XML_SetElementDeclHandler(XML_Parser p,
2607                          XML_ElementDeclHandler eldecl);
2608</pre>
2609
2610        <pre class="signature">
2611typedef void
2612(XMLCALL *XML_ElementDeclHandler)(void *userData,
2613                                  const XML_Char *name,
2614                                  XML_Content *model);
2615</pre>
2616
2617        <pre class="signature">
2618enum XML_Content_Type {
2619  XML_CTYPE_EMPTY = 1,
2620  XML_CTYPE_ANY,
2621  XML_CTYPE_MIXED,
2622  XML_CTYPE_NAME,
2623  XML_CTYPE_CHOICE,
2624  XML_CTYPE_SEQ
2625};
2626
2627enum XML_Content_Quant {
2628  XML_CQUANT_NONE,
2629  XML_CQUANT_OPT,
2630  XML_CQUANT_REP,
2631  XML_CQUANT_PLUS
2632};
2633
2634typedef struct XML_cp XML_Content;
2635
2636struct XML_cp {
2637  enum XML_Content_Type         type;
2638  enum XML_Content_Quant        quant;
2639  const XML_Char *              name;
2640  unsigned int                  numchildren;
2641  XML_Content *                 children;
2642};
2643</pre>
2644        <p>
2645          Sets a handler for element declarations in a DTD. The handler gets called with
2646          the name of the element in the declaration and a pointer to a structure that
          contains the element model. It's the user code's responsibility to free the
          model when finished with it via a call to <code><a href=
          "#XML_FreeContentModel">XML_FreeContentModel</a></code>. There is no need to
          free the model from within the handler; it can be kept around and freed at a
          later stage.
2652        </p>
2653
2654        <p>
2655          The <code>model</code> argument is the root of a tree of
2656          <code>XML_Content</code> nodes. If <code>type</code> equals
2657          <code>XML_CTYPE_EMPTY</code> or <code>XML_CTYPE_ANY</code>, then
2658          <code>quant</code> will be <code>XML_CQUANT_NONE</code>, and the other fields
2659          will be zero or <code>NULL</code>. If <code>type</code> is
          <code>XML_CTYPE_MIXED</code>, then <code>quant</code> will be
          <code>XML_CQUANT_NONE</code> or <code>XML_CQUANT_REP</code>,
          <code>numchildren</code> will contain the number of elements that are allowed
          to be mixed in, and <code>children</code> will point to an array of
          <code>XML_Content</code> structures that all have type
          <code>XML_CTYPE_NAME</code> with no quantification. Only the root node can be
          of type <code>XML_CTYPE_EMPTY</code>,
2666          <code>XML_CTYPE_ANY</code>, or <code>XML_CTYPE_MIXED</code>.
2667        </p>
2668
2669        <p>
2670          For type <code>XML_CTYPE_NAME</code>, the <code>name</code> field points to the
2671          name and the <code>numchildren</code> and <code>children</code> fields will be
2672          zero and <code>NULL</code>. The <code>quant</code> field will indicate any
2673          quantifiers placed on the name.
2674        </p>
2675
2676        <p>
2677          Types <code>XML_CTYPE_CHOICE</code> and <code>XML_CTYPE_SEQ</code> indicate a
2678          choice or sequence respectively. The <code>numchildren</code> field indicates
          how many nodes are in the choice or sequence, and <code>children</code> points to
2680          the nodes.
2681        </p>
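        <p>
          To make the tree layout more concrete, here is a minimal sketch of a recursive
          walk that renders a content model back into DTD-like notation. The
          <code>Demo*</code> types are local stand-ins that mirror the enums and
          <code>struct XML_cp</code> shown above (with <code>char</code> in place of
          <code>XML_Char</code>) so that the sketch compiles without the Expat headers;
          all <code>demo_*</code> names are hypothetical.
        </p>

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins mirroring XML_Content_Type, XML_Content_Quant and XML_cp. */
enum DemoType {
  DEMO_EMPTY = 1, DEMO_ANY, DEMO_MIXED, DEMO_NAME, DEMO_CHOICE, DEMO_SEQ
};
enum DemoQuant { DEMO_NONE, DEMO_OPT, DEMO_REP, DEMO_PLUS };

typedef struct DemoContent DemoContent;
struct DemoContent {
  enum DemoType type;
  enum DemoQuant quant;
  const char *name;
  unsigned int numchildren;
  DemoContent *children;
};

/* Appends `s` to the string in `out`, truncating to fit in `cap` bytes. */
static void demo_append(char *out, size_t cap, const char *s) {
  size_t used = strlen(out);
  if (used + 1 < cap)
    strncat(out, s, cap - used - 1);
}

/* Appends a textual form of model `m` to `out`, which must already hold a
   null-terminated string (for example an empty one). */
static void demo_format(const DemoContent *m, char *out, size_t cap) {
  static const char *const quant[] = {"", "?", "*", "+"};
  unsigned int i;
  switch (m->type) {
  case DEMO_EMPTY:
    demo_append(out, cap, "EMPTY");
    return;
  case DEMO_ANY:
    demo_append(out, cap, "ANY");
    return;
  case DEMO_NAME:
    demo_append(out, cap, m->name);
    break;
  case DEMO_MIXED:
  case DEMO_CHOICE:
  case DEMO_SEQ:
    demo_append(out, cap, "(");
    if (m->type == DEMO_MIXED)
      demo_append(out, cap, "#PCDATA");
    for (i = 0; i < m->numchildren; i++) {
      /* separator: "," inside a sequence, "|" otherwise */
      if (i > 0 || m->type == DEMO_MIXED)
        demo_append(out, cap, m->type == DEMO_SEQ ? "," : "|");
      demo_format(&m->children[i], out, cap);
    }
    demo_append(out, cap, ")");
    break;
  }
  demo_append(out, cap, quant[m->quant]);
}
```

        <p>
          In a real element declaration handler the same walk would run over the
          <code>XML_Content</code> tree, followed at some point by a call to
          <code>XML_FreeContentModel</code>.
        </p>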
2682      </div>
2683
2684      <div class="handler">
2685        <h4 id="XML_SetAttlistDeclHandler">
2686          XML_SetAttlistDeclHandler
2687        </h4>
2688
2689        <pre class="setter">
2690void XMLCALL
2691XML_SetAttlistDeclHandler(XML_Parser p,
2692                          XML_AttlistDeclHandler attdecl);
2693</pre>
2694
2695        <pre class="signature">
2696typedef void
2697(XMLCALL *XML_AttlistDeclHandler)(void           *userData,
2698                                  const XML_Char *elname,
2699                                  const XML_Char *attname,
2700                                  const XML_Char *att_type,
2701                                  const XML_Char *dflt,
2702                                  int            isrequired);
2703</pre>
2704        <p>
2705          Set a handler for attlist declarations in the DTD. This handler is called for
2706          <em>each</em> attribute. So a single attlist declaration with multiple
2707          attributes declared will generate multiple calls to this handler. The
2708          <code>elname</code> parameter returns the name of the element for which the
2709          attribute is being declared. The attribute name is in the <code>attname</code>
2710          parameter. The attribute type is in the <code>att_type</code> parameter. It is
2711          the string representing the type in the declaration with whitespace removed.
2712        </p>
2713
2714        <p>
2715          The <code>dflt</code> parameter holds the default value. It will be
2716          <code>NULL</code> in the case of "#IMPLIED" or "#REQUIRED" attributes. You can
2717          distinguish these two cases by checking the <code>isrequired</code> parameter,
2718          which will be true in the case of "#REQUIRED" attributes. Attributes which are
          "#FIXED" will also have a true <code>isrequired</code>, but they will have
2720          the non-<code>NULL</code> fixed value in the <code>dflt</code> parameter.
2721        </p>
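        <p>
          The four combinations of <code>dflt</code> and <code>isrequired</code>
          described above can be summarized in a small helper. This is only a sketch:
          <code>demo_attr_default_kind</code> is a hypothetical name, and
          <code>XML_Char</code> is assumed to be <code>char</code>.
        </p>

```c
#include <stddef.h>

/* Classifies an attribute declaration from the dflt and isrequired
   arguments of the attlist declaration handler. */
static const char *demo_attr_default_kind(const char *dflt, int isrequired) {
  if (dflt == NULL)
    return isrequired ? "#REQUIRED" : "#IMPLIED";
  /* A non-NULL dflt is either a fixed value or an ordinary default. */
  return isrequired ? "#FIXED" : "default";
}
```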
2722      </div>
2723
2724      <div class="handler">
2725        <h4 id="XML_SetEntityDeclHandler">
2726          XML_SetEntityDeclHandler
2727        </h4>
2728
2729        <pre class="setter">
2730void XMLCALL
2731XML_SetEntityDeclHandler(XML_Parser p,
2732                         XML_EntityDeclHandler handler);
2733</pre>
2734
2735        <pre class="signature">
2736typedef void
2737(XMLCALL *XML_EntityDeclHandler)(void           *userData,
2738                                 const XML_Char *entityName,
2739                                 int            is_parameter_entity,
2740                                 const XML_Char *value,
2741                                 int            value_length,
2742                                 const XML_Char *base,
2743                                 const XML_Char *systemId,
2744                                 const XML_Char *publicId,
2745                                 const XML_Char *notationName);
2746</pre>
2747        <p>
2748          Sets a handler that will be called for all entity declarations. The
2749          <code>is_parameter_entity</code> argument will be non-zero in the case of
2750          parameter entities and zero otherwise.
2751        </p>
2752
2753        <p>
2754          For internal entities (<code>&lt;!ENTITY foo "bar"&gt;</code>),
2755          <code>value</code> will be non-<code>NULL</code> and <code>systemId</code>,
2756          <code>publicId</code>, and <code>notationName</code> will all be
2757          <code>NULL</code>. The value string is <em>not</em> null-terminated; the length
2758          is provided in the <code>value_length</code> parameter. Do not use
2759          <code>value_length</code> to test for internal entities, since it is legal to
2760          have zero-length values. Instead check for whether or not <code>value</code> is
2761          <code>NULL</code>.
2762        </p>
2763
2764        <p>
2765          The <code>notationName</code> argument will have a non-<code>NULL</code> value
2766          only for unparsed entity declarations.
2767        </p>
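        <p>
          A handler might apply these rules roughly as follows. The sketch assumes
          <code>XML_Char</code> is <code>char</code>; the enum and function names are
          hypothetical. Treating every declaration with neither a value nor a notation
          as an external parsed entity follows from the rules above rather than being
          stated by them.
        </p>

```c
#include <stddef.h>
#include <string.h>

enum DemoEntityKind {
  DEMO_ENTITY_INTERNAL,
  DEMO_ENTITY_UNPARSED,
  DEMO_ENTITY_EXTERNAL_PARSED
};

/* Classifies a declaration from the handler's value and notationName
   arguments.  value_length is deliberately not consulted, because an
   internal entity may have a zero-length value. */
static enum DemoEntityKind demo_entity_kind(const char *value,
                                            const char *notationName) {
  if (value != NULL)
    return DEMO_ENTITY_INTERNAL;
  if (notationName != NULL)
    return DEMO_ENTITY_UNPARSED;
  return DEMO_ENTITY_EXTERNAL_PARSED;
}

/* Copies the value, which is not null-terminated, into `out` and adds the
   terminator.  Returns the number of bytes copied (truncated to cap - 1). */
static size_t demo_copy_value(const char *value, int value_length, char *out,
                              size_t cap) {
  size_t n;
  if (cap == 0)
    return 0;
  n = (value != NULL && value_length > 0) ? (size_t)value_length : 0;
  if (n > cap - 1)
    n = cap - 1;
  if (n > 0)
    memcpy(out, value, n);
  out[n] = '\0';
  return n;
}
```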
2768      </div>
2769
2770      <div class="handler">
2771        <h4 id="XML_SetUnparsedEntityDeclHandler">
2772          XML_SetUnparsedEntityDeclHandler
2773        </h4>
2774
2775        <pre class="setter">
2776void XMLCALL
2777XML_SetUnparsedEntityDeclHandler(XML_Parser p,
2778                                 XML_UnparsedEntityDeclHandler h)
2779</pre>
2780
2781        <pre class="signature">
2782typedef void
2783(XMLCALL *XML_UnparsedEntityDeclHandler)(void *userData,
2784                                         const XML_Char *entityName,
2785                                         const XML_Char *base,
2786                                         const XML_Char *systemId,
2787                                         const XML_Char *publicId,
2788                                         const XML_Char *notationName);
2789</pre>
2790        <p>
2791          Set a handler that receives declarations of unparsed entities. These are entity
2792          declarations that have a notation (NDATA) field:
2793        </p>
2794
2795        <div id="eg">
2796          <pre>
2797&lt;!ENTITY logo SYSTEM "images/logo.gif" NDATA gif&gt;
2798</pre>
2799        </div>
2800
2801        <p>
          This handler is obsolete and is provided for backwards compatibility. Use
          <a href="#XML_SetEntityDeclHandler">XML_SetEntityDeclHandler</a> instead.
2804        </p>
2805      </div>
2806
2807      <div class="handler">
2808        <h4 id="XML_SetNotationDeclHandler">
2809          XML_SetNotationDeclHandler
2810        </h4>
2811
2812        <pre class="setter">
2813void XMLCALL
2814XML_SetNotationDeclHandler(XML_Parser p,
2815                           XML_NotationDeclHandler h)
2816</pre>
2817
2818        <pre class="signature">
2819typedef void
2820(XMLCALL *XML_NotationDeclHandler)(void *userData,
2821                                   const XML_Char *notationName,
2822                                   const XML_Char *base,
2823                                   const XML_Char *systemId,
2824                                   const XML_Char *publicId);
2825</pre>
2826        <p>
2827          Set a handler that receives notation declarations.
2828        </p>
2829      </div>
2830
2831      <div class="handler">
2832        <h4 id="XML_SetNotStandaloneHandler">
2833          XML_SetNotStandaloneHandler
2834        </h4>
2835
2836        <pre class="setter">
2837void XMLCALL
2838XML_SetNotStandaloneHandler(XML_Parser p,
2839                            XML_NotStandaloneHandler h)
2840</pre>
2841
2842        <pre class="signature">
2843typedef int
2844(XMLCALL *XML_NotStandaloneHandler)(void *userData);
2845</pre>
2846        <p>
2847          Set a handler that is called if the document is not "standalone". This happens
          when there is an external subset or a reference to a parameter entity, but the
          document does not have standalone set to "yes" in an XML declaration. If this
          handler returns <code>XML_STATUS_ERROR</code>, then the parser will throw an
2851          <code>XML_ERROR_NOT_STANDALONE</code> error.
2852        </p>
2853      </div>
2854
2855      <h3>
2856        <a id="position" name="position">Parse position and error reporting functions</a>
2857      </h3>
2858
2859      <p>
2860        These are the functions you'll want to call when the parse functions return
2861        <code>XML_STATUS_ERROR</code> (a parse error has occurred), although the position
2862        reporting functions are useful outside of errors. The position reported is the
2863        byte position (in the original document or entity encoding) of the first of the
2864        sequence of characters that generated the current event (or the error that caused
        the parse functions to return <code>XML_STATUS_ERROR</code>). The exceptions are
        callbacks triggered by declarations in the document prologue, in which case the
        exact position reported is somewhere in the relevant markup, but not necessarily
2868        as meaningful as for other events.
2869      </p>
2870
2871      <p>
2872        The position reporting functions are accurate only outside of the DTD. In other
2873        words, they usually return bogus information when called from within a DTD
2874        declaration handler.
2875      </p>
2876
2877      <h4 id="XML_GetErrorCode">
2878        XML_GetErrorCode
2879      </h4>
2880
2881      <pre class="fcndec">
2882enum XML_Error XMLCALL
2883XML_GetErrorCode(XML_Parser p);
2884</pre>
2885      <div class="fcndef">
2886        Return what type of error has occurred.
2887      </div>
2888
2889      <h4 id="XML_ErrorString">
2890        XML_ErrorString
2891      </h4>
2892
2893      <pre class="fcndec">
2894const XML_LChar * XMLCALL
2895XML_ErrorString(enum XML_Error code);
2896</pre>
2897      <div class="fcndef">
2898        Return a string describing the error corresponding to code. The code should be
2899        one of the enums that can be returned from <code><a href=
2900        "#XML_GetErrorCode">XML_GetErrorCode</a></code>.
2901      </div>
2902
2903      <h4 id="XML_GetCurrentByteIndex">
2904        XML_GetCurrentByteIndex
2905      </h4>
2906
2907      <pre class="fcndec">
2908XML_Index XMLCALL
2909XML_GetCurrentByteIndex(XML_Parser p);
2910</pre>
2911      <div class="fcndef">
2912        Return the byte offset of the position. This always corresponds to the values
2913        returned by <code><a href=
2914        "#XML_GetCurrentLineNumber">XML_GetCurrentLineNumber</a></code> and
2915        <code><a href="#XML_GetCurrentColumnNumber">XML_GetCurrentColumnNumber</a></code>.
2916      </div>
2917
2918      <h4 id="XML_GetCurrentLineNumber">
2919        XML_GetCurrentLineNumber
2920      </h4>
2921
2922      <pre class="fcndec">
2923XML_Size XMLCALL
2924XML_GetCurrentLineNumber(XML_Parser p);
2925</pre>
2926      <div class="fcndef">
2927        Return the line number of the position. The first line is reported as
2928        <code>1</code>.
2929      </div>
2930
2931      <h4 id="XML_GetCurrentColumnNumber">
2932        XML_GetCurrentColumnNumber
2933      </h4>
2934
2935      <pre class="fcndec">
2936XML_Size XMLCALL
2937XML_GetCurrentColumnNumber(XML_Parser p);
2938</pre>
2939      <div class="fcndef">
2940        Return the <em>offset</em>, from the beginning of the current line, of the
2941        position. The first column is reported as <code>0</code>.
2942      </div>
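      <p>
        Since the line number starts at 1 while the column number starts at 0, code
        that prints positions for humans typically adjusts only the column. The helper
        below is a sketch with a hypothetical name; it takes plain
        <code>unsigned long</code> values, so results of type <code>XML_Size</code>
        would need a cast if that type is wider on a given build.
      </p>

```c
#include <stdio.h>

/* Formats "line:column: message" with both numbers 1-based.  `line` is
   expected to come from XML_GetCurrentLineNumber (first line is 1) and
   `column0` from XML_GetCurrentColumnNumber (first column is 0).  Returns
   whatever snprintf returns. */
static int demo_format_position(char *out, size_t cap, unsigned long line,
                                unsigned long column0, const char *message) {
  return snprintf(out, cap, "%lu:%lu: %s", line, column0 + 1, message);
}
```

      <p>
        The message argument would usually be the result of
        <code>XML_ErrorString(XML_GetErrorCode(p))</code>.
      </p>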
2943
2944      <h4 id="XML_GetCurrentByteCount">
2945        XML_GetCurrentByteCount
2946      </h4>
2947
2948      <pre class="fcndec">
2949int XMLCALL
2950XML_GetCurrentByteCount(XML_Parser p);
2951</pre>
2952      <div class="fcndef">
2953        Return the number of bytes in the current event. Returns <code>0</code> if the
2954        event is inside a reference to an internal entity and for the end-tag event for
        empty element tags (the latter can be used to distinguish empty-element tags from
2956        empty elements using separate start and end tags).
2957      </div>
2958
2959      <h4 id="XML_GetInputContext">
2960        XML_GetInputContext
2961      </h4>
2962
2963      <pre class="fcndec">
2964const char * XMLCALL
2965XML_GetInputContext(XML_Parser p,
2966                    int *offset,
2967                    int *size);
2968</pre>
2969      <div class="fcndef">
2970        <p>
2971          Returns the parser's input buffer, sets the integer pointed at by
2972          <code>offset</code> to the offset within this buffer of the current parse
          position, and sets the integer pointed at by <code>size</code> to the size of
2974          the returned buffer.
2975        </p>
2976
2977        <p>
2978          This should only be called from within a handler during an active parse and the
2979          returned buffer should only be referred to from within the handler that made
2980          the call. This input buffer contains the untranslated bytes of the input.
2981        </p>
2982
2983        <p>
2984          Only a limited amount of context is kept, so if the event triggering a call
2985          spans over a very large amount of input, the actual parse position may be
2986          before the beginning of the buffer.
2987        </p>
2988
2989        <p>
2990          If <code>XML_CONTEXT_BYTES</code> is zero, this will always return
2991          <code>NULL</code>.
2992        </p>
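        <p>
          Because the returned buffer may be <code>NULL</code> and must not be used
          after the handler returns, a handler that wants to keep some context can copy
          it right away. The following sketch (with the hypothetical name
          <code>demo_copy_context</code>) copies the bytes from the current parse
          position to the end of the buffer, up to a caller-chosen limit. The bytes are
          untranslated input, so the copy is readable text only if the document
          encoding is compatible with how it is later displayed.
        </p>

```c
#include <stddef.h>
#include <string.h>

/* Copies up to cap - 1 bytes starting at `offset` within `buffer` (the
   values as reported by XML_GetInputContext) into `out` and terminates the
   copy with a zero byte.  Returns the number of bytes copied, or 0 if no
   usable context is available. */
static size_t demo_copy_context(const char *buffer, int offset, int size,
                                char *out, size_t cap) {
  size_t avail;
  if (cap == 0)
    return 0;
  out[0] = '\0';
  if (buffer == NULL || offset < 0 || size < 0 || offset >= size)
    return 0;
  avail = (size_t)(size - offset);
  if (avail > cap - 1)
    avail = cap - 1;
  memcpy(out, buffer + offset, avail);
  out[avail] = '\0';
  return avail;
}
```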
2993      </div>
2994
2995      <h3>
2996        <a id="attack-protection" name="attack-protection">Attack Protection</a><a id=
2997        "billion-laughs" name="billion-laughs"></a>
2998      </h3>
2999
3000      <h4 id="XML_SetBillionLaughsAttackProtectionMaximumAmplification">
3001        XML_SetBillionLaughsAttackProtectionMaximumAmplification
3002      </h4>
3003
3004      <pre class="fcndec">
3005/* Added in Expat 2.4.0. */
3006XML_Bool XMLCALL
3007XML_SetBillionLaughsAttackProtectionMaximumAmplification(XML_Parser p,
3008                                                         float maximumAmplificationFactor);
3009</pre>
3010      <div class="fcndef">
3011        <p>
3012          Sets the maximum tolerated amplification factor for protection against <a href=
3013          "https://en.wikipedia.org/wiki/Billion_laughs_attack">billion laughs
3014          attacks</a> (default: <code>100.0</code>) of parser <code>p</code> to
3015          <code>maximumAmplificationFactor</code>, and returns <code>XML_TRUE</code> upon
3016          success and <code>XML_FALSE</code> upon error.
3017        </p>
3018
3019        <p>
3020          Once the <a href=
3021          "#XML_SetBillionLaughsAttackProtectionActivationThreshold">threshold for
3022          activation</a> is reached, the amplification factor is calculated as ..
3023        </p>
3024
3025        <pre>amplification := (direct + indirect) / direct</pre>
3026        <p>
          .. while parsing, where <code>direct</code> is the number of bytes read from
3028          the primary document in parsing and <code>indirect</code> is the number of
3029          bytes added by expanding entities and reading of external DTD files, combined.
3030        </p>
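        <p>
          As a worked example of the arithmetic above, the sketch below computes the
          amplification factor and applies the two limits. It illustrates the rule as
          described in this section and is not Expat's implementation; the function
          names are hypothetical, and returning <code>1.0</code> when no direct bytes
          have been read is an assumption made here to avoid dividing by zero.
        </p>

```c
/* amplification := (direct + indirect) / direct */
static float demo_amplification(unsigned long long direct,
                                unsigned long long indirect) {
  if (direct == 0)
    return 1.0f; /* assumption of this sketch: avoid dividing by zero */
  return (float)(direct + indirect) / (float)direct;
}

/* Returns 1 if the input would still be tolerated, 0 if it would be
   rejected.  Below the activation threshold nothing is rejected; at or
   above it, the amplification factor must not exceed the maximum. */
static int demo_is_tolerated(unsigned long long direct,
                             unsigned long long indirect,
                             unsigned long long activationThresholdBytes,
                             float maximumAmplificationFactor) {
  if (direct + indirect < activationThresholdBytes)
    return 1;
  return demo_amplification(direct, indirect) <= maximumAmplificationFactor;
}
```

        <p>
          With the defaults of 8 MiB and <code>100.0</code>, 1 KiB of direct input that
          expands to 9 MiB is rejected (the factor is above 9000), whereas the same
          1 KiB expanding to 1 MiB is tolerated because the activation threshold has
          not been reached.
        </p>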
3031
3032        <p>
3033          For a call to
3034          <code>XML_SetBillionLaughsAttackProtectionMaximumAmplification</code> to
3035          succeed:
3036        </p>
3037
3038        <ul>
3039          <li>parser <code>p</code> must be a non-<code>NULL</code> root parser (without
3040          any parent parsers) and
3041          </li>
3042
3043          <li>
3044            <code>maximumAmplificationFactor</code> must be non-<code>NaN</code> and
3045            greater than or equal to <code>1.0</code>.
3046          </li>
3047        </ul>
3048
3049        <p>
3050          <strong>Note:</strong> If you ever need to increase this value for non-attack
3051          payload, please <a href="https://github.com/libexpat/libexpat/issues">file a
3052          bug report</a>.
3053        </p>
3054
3055        <p>
3056          <strong>Note:</strong> Peak amplifications of factor 15,000 for the entire
3057          payload and of factor 30,000 in the middle of parsing have been observed with
3058          small benign files in practice. So if you do reduce the maximum allowed
3059          amplification, please make sure that the activation threshold is still big
3060          enough to not end up with undesired false positives (i.e. benign files being
3061          rejected).
3062        </p>
3063      </div>
3064
3065      <h4 id="XML_SetBillionLaughsAttackProtectionActivationThreshold">
3066        XML_SetBillionLaughsAttackProtectionActivationThreshold
3067      </h4>
3068
3069      <pre class="fcndec">
3070/* Added in Expat 2.4.0. */
3071XML_Bool XMLCALL
3072XML_SetBillionLaughsAttackProtectionActivationThreshold(XML_Parser p,
3073                                                        unsigned long long activationThresholdBytes);
3074</pre>
3075      <div class="fcndef">
3076        <p>
          Sets the number of output bytes (including amplification from entity expansion and
3078          reading DTD files) needed to activate protection against <a href=
3079          "https://en.wikipedia.org/wiki/Billion_laughs_attack">billion laughs
3080          attacks</a> (default: <code>8 MiB</code>) of parser <code>p</code> to
3081          <code>activationThresholdBytes</code>, and returns <code>XML_TRUE</code> upon
3082          success and <code>XML_FALSE</code> upon error.
3083        </p>
3084
3085        <p>
3086          For a call to
3087          <code>XML_SetBillionLaughsAttackProtectionActivationThreshold</code> to
3088          succeed:
3089        </p>
3090
3091        <ul>
3092          <li>parser <code>p</code> must be a non-<code>NULL</code> root parser (without
3093          any parent parsers).
3094          </li>
3095        </ul>
3096
3097        <p>
3098          <strong>Note:</strong> If you ever need to increase this value for non-attack
3099          payload, please <a href="https://github.com/libexpat/libexpat/issues">file a
3100          bug report</a>.
3101        </p>
3102
3103        <p>
3104          <strong>Note:</strong> Activation thresholds below 4 MiB are known to break
3105          support for <a href=
3106          "https://en.wikipedia.org/wiki/Darwin_Information_Typing_Architecture">DITA</a>
3107          1.3 payload and are hence not recommended.
3108        </p>
3109      </div>
3110
3111      <h4 id="XML_SetAllocTrackerMaximumAmplification">
3112        XML_SetAllocTrackerMaximumAmplification
3113      </h4>
3114
3115      <pre class="fcndec">
3116/* Added in Expat 2.7.2. */
3117XML_Bool
3118XML_SetAllocTrackerMaximumAmplification(XML_Parser p,
3119                                        float maximumAmplificationFactor);
3120</pre>
3121      <div class="fcndef">
3122        <p>
3123          Sets the maximum tolerated amplification factor between direct input and bytes
3124          of dynamic memory allocated (default: <code>100.0</code>) of parser
3125          <code>p</code> to <code>maximumAmplificationFactor</code>, and returns
3126          <code>XML_TRUE</code> upon success and <code>XML_FALSE</code> upon error.
3127        </p>
3128
3129        <p>
3130          <strong>Note:</strong> There are three types of allocations that intentionally
3131          bypass tracking and limiting:
3132        </p>
3133
3134        <ul>
3135          <li>application calls to functions <code><a href=
3136          "#XML_MemMalloc">XML_MemMalloc</a></code> and <code><a href="#XML_MemRealloc">
            XML_MemRealloc</a></code> (<em>healthy</em> use of these two functions
            continues to be a responsibility of the application using Expat),
3139          </li>
3140
3141          <li>the main character buffer used by functions <code><a href="#XML_GetBuffer">
3142            XML_GetBuffer</a></code> and <code><a href=
3143            "#XML_ParseBuffer">XML_ParseBuffer</a></code> (and thus also by plain
3144            <code><a href="#XML_Parse">XML_Parse</a></code>), and
3145          </li>
3146
3147          <li>the <a href="#XML_SetElementDeclHandler">content model memory</a> (that is
3148          passed to the <a href="#XML_SetElementDeclHandler">element declaration
3149          handler</a> and freed by a call to <code><a href=
3150          "#XML_FreeContentModel">XML_FreeContentModel</a></code>).
3151          </li>
3152        </ul>
3153
3154        <p>
3155          Once the <a href="#XML_SetAllocTrackerActivationThreshold">threshold for
3156          activation</a> is reached, the amplification factor is calculated as ..
3157        </p>
3158
3159        <pre>amplification := allocated / direct</pre>
3160        <p>
          .. while parsing, where <code>direct</code> is the number of bytes read from
3162          the primary document in parsing and <code>allocated</code> is the number of
3163          bytes of dynamic memory allocated in the parser hierarchy.
3164        </p>
3165
3166        <p>
3167          For a call to <code>XML_SetAllocTrackerMaximumAmplification</code> to succeed:
3168        </p>
3169
3170        <ul>
3171          <li>parser <code>p</code> must be a non-<code>NULL</code> root parser (without
3172          any parent parsers) and
3173          </li>
3174
3175          <li>
3176            <code>maximumAmplificationFactor</code> must be non-<code>NaN</code> and
3177            greater than or equal to <code>1.0</code>.
3178          </li>
3179        </ul>
3180
3181        <p>
3182          <strong>Note:</strong> If you ever need to increase this value for non-attack
3183          payload, please <a href="https://github.com/libexpat/libexpat/issues">file a
3184          bug report</a>.
3185        </p>
3186
3187        <p>
          <strong>Note:</strong> Amplification factors greater than <code>100.0</code>
          can be observed near the start of parsing even with benign files in practice.
3190          So if you do reduce the maximum allowed amplification, please make sure that
3191          the activation threshold is still big enough to not end up with undesired false
3192          positives (i.e. benign files being rejected).
3193        </p>
3194      </div>
3195
3196      <h4 id="XML_SetAllocTrackerActivationThreshold">
3197        XML_SetAllocTrackerActivationThreshold
3198      </h4>
3199
3200      <pre class="fcndec">
3201/* Added in Expat 2.7.2. */
3202XML_Bool
3203XML_SetAllocTrackerActivationThreshold(XML_Parser p,
3204                                       unsigned long long activationThresholdBytes);
3205</pre>
3206      <div class="fcndef">
3207        <p>
          Sets the number of allocated bytes of dynamic memory needed to activate protection
3209          against disproportionate use of RAM (default: <code>64 MiB</code>) of parser
3210          <code>p</code> to <code>activationThresholdBytes</code>, and returns
3211          <code>XML_TRUE</code> upon success and <code>XML_FALSE</code> upon error.
3212        </p>
3213
3214        <p>
3215          <strong>Note:</strong> For types of allocations that intentionally bypass
3216          tracking and limiting, please see <code><a href=
3217          "#XML_SetAllocTrackerMaximumAmplification">XML_SetAllocTrackerMaximumAmplification</a></code>
3218          above.
3219        </p>
3220
3221        <p>
3222          For a call to <code>XML_SetAllocTrackerActivationThreshold</code> to succeed:
3223        </p>
3224
3225        <ul>
3226          <li>parser <code>p</code> must be a non-<code>NULL</code> root parser (without
3227          any parent parsers).
3228          </li>
3229        </ul>
3230
3231        <p>
3232          <strong>Note:</strong> If you ever need to increase this value for non-attack
3233          payload, please <a href="https://github.com/libexpat/libexpat/issues">file a
3234          bug report</a>.
3235        </p>
3236      </div>
3237
3238      <h4 id="XML_SetReparseDeferralEnabled">
3239        XML_SetReparseDeferralEnabled
3240      </h4>
3241
3242      <pre class="fcndec">
3243/* Added in Expat 2.6.0. */
3244XML_Bool XMLCALL
3245XML_SetReparseDeferralEnabled(XML_Parser parser, XML_Bool enabled);
3246</pre>
3247      <div class="fcndef">
3248        <p>
          A large token may require many parse calls before enough data is available for
3250          Expat to parse it in full. If Expat retried parsing the token on every parse
3251          call, parsing could take quadratic time. To avoid this, Expat only retries once
3252          a significant amount of new data is available. This function allows disabling
3253          this behavior.
3254        </p>
3255
3256        <p>
3257          The <code>enabled</code> argument should be <code>XML_TRUE</code> or
3258          <code>XML_FALSE</code>.
3259        </p>
3260
3261        <p>
3262          Returns <code>XML_TRUE</code> on success, and <code>XML_FALSE</code> on error.
3263        </p>
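        <p>
          To see why retrying on every call is quadratic, consider the toy model below.
          It is not Expat's implementation: the function <code>bytes_scanned</code> is
          hypothetical, and the rule "retry once the buffer has at least doubled" is an
          assumption standing in for "a significant amount of new data".
        </p>

```c
#include <stddef.h>

/* Toy model: one token of token_size bytes arrives in chunks of chunk_size
   bytes, and every parse attempt rescans everything buffered so far.
   Returns the total number of bytes scanned across all attempts. */
unsigned long long bytes_scanned(size_t token_size, size_t chunk_size,
                                 int defer) {
  unsigned long long scanned = 0;
  size_t buffered = 0;
  size_t last_attempt = 0; /* buffer size at the previous attempt */

  if (chunk_size == 0)
    return 0;

  while (buffered < token_size) {
    size_t remaining = token_size - buffered;
    buffered += remaining < chunk_size ? remaining : chunk_size;

    /* Without deferral: retry on every call.
       With deferral: retry only once the buffer has at least doubled since
       the last attempt (assumed rule), or when the last chunk has arrived
       (as with a final parse call). */
    if (!defer || buffered >= 2 * last_attempt || buffered == token_size) {
      scanned += buffered;
      last_attempt = buffered;
    }
  }
  return scanned;
}
```

        <p>
          For a 1024-byte token fed one byte at a time, the model scans 524800 bytes
          without deferral and 2047 bytes with it.
        </p>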
3264      </div>
3265
3266      <h3>
3267        <a id="miscellaneous" name="miscellaneous">Miscellaneous functions</a>
3268      </h3>
3269
3270      <p>
3271        The functions in this section either obtain state information from the parser or
3272        can be used to dynamically set parser options.
3273      </p>
3274
3275      <h4 id="XML_SetUserData">
3276        XML_SetUserData
3277      </h4>
3278
3279      <pre class="fcndec">
3280void XMLCALL
3281XML_SetUserData(XML_Parser p,
3282                void *userData);
3283</pre>
3284      <div class="fcndef">
3285        This sets the user data pointer that gets passed to handlers. It overwrites any
3286        previous value for this pointer. Note that the application is responsible for
3287        freeing the memory associated with <code>userData</code> when it is finished with
3288        the parser. So if you call this when there's already a pointer there, and you
3289        haven't freed the memory associated with it, then you've probably just leaked
3290        memory.
3291      </div>
3292
3293      <h4 id="XML_GetUserData">
3294        XML_GetUserData
3295      </h4>
3296
3297      <pre class="fcndec">
3298void * XMLCALL
3299XML_GetUserData(XML_Parser p);
3300</pre>
3301      <div class="fcndef">
3302        This returns the user data pointer that gets passed to handlers. It is actually
3303        implemented as a macro.
3304      </div>
3305
3306      <h4 id="XML_UseParserAsHandlerArg">
3307        XML_UseParserAsHandlerArg
3308      </h4>
3309
3310      <pre class="fcndec">
3311void XMLCALL
3312XML_UseParserAsHandlerArg(XML_Parser p);
3313</pre>
3314      <div class="fcndef">
3315        After this is called, handlers receive the parser in their <code>userData</code>
3316        arguments. The user data can still be obtained using the <code><a href=
3317        "#XML_GetUserData">XML_GetUserData</a></code> function.
3318      </div>
3319
3320      <h4 id="XML_SetBase">
3321        XML_SetBase
3322      </h4>
3323
3324      <pre class="fcndec">
3325enum XML_Status XMLCALL
3326XML_SetBase(XML_Parser p,
3327            const XML_Char *base);
3328</pre>
3329      <div class="fcndef">
3330        Set the base to be used for resolving relative URIs in system identifiers. The
3331        return value is <code>XML_STATUS_ERROR</code> if there's no memory to store base,
3332        otherwise it's <code>XML_STATUS_OK</code>.
3333      </div>
3334
3335      <h4 id="XML_GetBase">
3336        XML_GetBase
3337      </h4>
3338
3339      <pre class="fcndec">
3340const XML_Char * XMLCALL
3341XML_GetBase(XML_Parser p);
3342</pre>
3343      <div class="fcndef">
3344        Return the base for resolving relative URIs.
3345      </div>
3346
3347      <h4 id="XML_GetSpecifiedAttributeCount">
3348        XML_GetSpecifiedAttributeCount
3349      </h4>
3350
3351      <pre class="fcndec">
3352int XMLCALL
3353XML_GetSpecifiedAttributeCount(XML_Parser p);
3354</pre>
3355      <div class="fcndef">
        When attributes are reported to the start handler in the <code>atts</code> vector, attributes
3357        that were explicitly set in the element occur before any attributes that receive
3358        their value from default information in an ATTLIST declaration. This function
3359        returns the number of attributes that were explicitly set times two, thus giving
3360        the offset in the <code>atts</code> array passed to the start tag handler of the
3361        first attribute set due to defaults. It supplies information for the last call to
3362        a start handler. If called inside a start handler, then that means the current
3363        call.
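        <p>
          As a sketch of how the returned offset can be used, the hypothetical helper
          below counts the attributes that were filled in from defaults. It assumes
          that <code>XML_Char</code> is <code>char</code> and that <code>atts</code> is
          the <code>NULL</code>-terminated name/value array a start handler receives.
        </p>

```c
#include <stddef.h>

/* Hypothetical helper: counts attributes supplied by ATTLIST defaults.
   atts            - NULL-terminated array of name/value pairs
   specified_count - value returned by XML_GetSpecifiedAttributeCount
                     for the same start handler call */
int count_defaulted_attributes(const char **atts, int specified_count) {
  int defaulted = 0;
  /* Entries before specified_count were written in the start tag;
     defaulted name/value pairs begin at that offset. */
  for (int i = specified_count; atts[i] != NULL; i += 2)
    defaulted++;
  return defaulted;
}
```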
3364      </div>
3365
3366      <h4 id="XML_GetIdAttributeIndex">
3367        XML_GetIdAttributeIndex
3368      </h4>
3369
3370      <pre class="fcndec">
3371int XMLCALL
3372XML_GetIdAttributeIndex(XML_Parser p);
3373</pre>
3374      <div class="fcndef">
3375        Returns the index of the ID attribute passed in the atts array in the last call
3376        to <code><a href="#XML_StartElementHandler">XML_StartElementHandler</a></code>,
3377        or -1 if there is no ID attribute. If called inside a start handler, then that
3378        means the current call.
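        <p>
          Because the index refers to the <code>atts</code> array, the attribute name
          sits at that index and its value directly after it. A minimal sketch,
          assuming <code>XML_Char</code> is <code>char</code>; the helper name
          <code>id_attribute_value</code> is hypothetical:
        </p>

```c
#include <stddef.h>

/* Hypothetical helper: returns the value of the ID attribute, or NULL if
   the element has none.
   atts     - name/value array passed to the start handler
   id_index - value returned by XML_GetIdAttributeIndex for that call */
const char *id_attribute_value(const char **atts, int id_index) {
  if (id_index < 0)
    return NULL;            /* -1 means "no ID attribute" */
  return atts[id_index + 1]; /* atts[id_index] is the attribute name */
}
```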
3379      </div>
3380
3381      <h4 id="XML_GetAttributeInfo">
3382        XML_GetAttributeInfo
3383      </h4>
3384
3385      <pre class="fcndec">
3386const XML_AttrInfo * XMLCALL
3387XML_GetAttributeInfo(XML_Parser parser);
3388</pre>
3389
3390      <pre class="signature">
3391typedef struct {
3392  XML_Index  nameStart;  /* Offset to beginning of the attribute name. */
3393  XML_Index  nameEnd;    /* Offset after the attribute name's last byte. */
3394  XML_Index  valueStart; /* Offset to beginning of the attribute value. */
3395  XML_Index  valueEnd;   /* Offset after the attribute value's last byte. */
3396} XML_AttrInfo;
3397</pre>
3398      <div class="fcndef">
3399        Returns an array of <code>XML_AttrInfo</code> structures for the attribute/value
3400        pairs passed in the last call to the <code>XML_StartElementHandler</code> that
3401        were specified in the start-tag rather than defaulted. Each attribute/value pair
3402        counts as 1; thus the number of entries in the array is
3403        <code>XML_GetSpecifiedAttributeCount(parser) / 2</code>.
3404      </div>
3405
3406      <h4 id="XML_SetEncoding">
3407        XML_SetEncoding
3408      </h4>
3409
3410      <pre class="fcndec">
3411enum XML_Status XMLCALL
3412XML_SetEncoding(XML_Parser p,
3413                const XML_Char *encoding);
3414</pre>
3415      <div class="fcndef">
3416        Set the encoding to be used by the parser. It is equivalent to passing a
3417        non-<code>NULL</code> encoding argument to the parser creation functions. It must
3418        not be called after <code><a href="#XML_Parse">XML_Parse</a></code> or
        <code><a href="#XML_ParseBuffer">XML_ParseBuffer</a></code> has been called on
3420        the given parser. Returns <code>XML_STATUS_OK</code> on success or
3421        <code>XML_STATUS_ERROR</code> on error.
3422      </div>
3423
3424      <h4 id="XML_SetParamEntityParsing">
3425        XML_SetParamEntityParsing
3426      </h4>
3427
3428      <pre class="fcndec">
3429int XMLCALL
3430XML_SetParamEntityParsing(XML_Parser p,
3431                          enum XML_ParamEntityParsing code);
3432</pre>
3433      <div class="fcndef">
3434        This enables parsing of parameter entities, including the external parameter
3435        entity that is the external DTD subset, according to <code>code</code>. The
3436        choices for <code>code</code> are:
3437        <ul>
3438          <li>
3439            <code>XML_PARAM_ENTITY_PARSING_NEVER</code>
3440          </li>
3441
3442          <li>
3443            <code>XML_PARAM_ENTITY_PARSING_UNLESS_STANDALONE</code>
3444          </li>
3445
3446          <li>
3447            <code>XML_PARAM_ENTITY_PARSING_ALWAYS</code>
3448          </li>
3449        </ul>
3450        <b>Note:</b> If <code>XML_SetParamEntityParsing</code> is called after
3451        <code>XML_Parse</code> or <code>XML_ParseBuffer</code>, then it has no effect and
3452        will always return 0.
3453      </div>
3454
3455      <h4 id="XML_SetHashSalt">
3456        XML_SetHashSalt (deprecated)
3457      </h4>
3458
3459      <pre class="fcndec">
3460int XMLCALL
3461XML_SetHashSalt(XML_Parser parser,
3462                unsigned long hash_salt);
3463</pre>
3464      <div class="fcndef">
3465        Sets the hash salt to use for internal hash calculations. Helps in preventing DoS
3466        attacks based on predicting hash function behavior. In order to have an effect
3467        this must be called before parsing has started. Returns 1 if successful, 0 when
3468        called after <code>XML_Parse</code> or <code>XML_ParseBuffer</code> or when
3469        <code>parser</code> is <code>NULL</code>.
3470        <p>
3471          <b>Note:</b> Function <code>XML_SetHashSalt</code> is
3472          <strong>deprecated</strong>. Please use function <code><a href=
3473          "#XML_SetHashSalt16Bytes">XML_SetHashSalt16Bytes</a></code> instead for better
3474          security. <code>XML_SetHashSalt</code> only provides 4 to 8 bytes of entropy
3475          (depending on the size of type <code>unsigned long</code>) while the SipHash
3476          implementation used by Expat can leverage up to 16 bytes of entropy — at least
3477          twice as much. Function <code><a href=
3478          "#XML_SetHashSalt16Bytes">XML_SetHashSalt16Bytes</a></code> of Expat &gt;=2.8.0
3479          (and where backported) matches the amount of entropy supported by SipHash.
3480        </p>
3481
3482        <p>
3483          <b>Note:</b> This call is optional, as the parser will auto-generate a new
3484          random salt value internally if no value has been set by the start of parsing.
3485        </p>
3486
3487        <p>
3488          <b>Note:</b> One should not call <code>XML_SetHashSalt</code> with a hash salt
3489          value of 0, as this value is used as sentinel value to indicate that
3490          <code>XML_SetHashSalt</code> has <b>not</b> been called. Consequently such a
3491          call will have no effect, even if it returns 1.
3492        </p>
3493      </div>
3494
3495      <h4 id="XML_SetHashSalt16Bytes">
3496        XML_SetHashSalt16Bytes
3497      </h4>
3498
3499      <pre class="fcndec">
3500/* Added in Expat 2.8.0. */
3501XML_Bool XMLCALL
3502XML_SetHashSalt16Bytes(XML_Parser parser,
3503                       const uint8_t entropy[16]);
3504</pre>
3505      <div class="fcndef">
3506        Sets the hash salt to use for internal hash calculations. Helps in preventing DoS
3507        attacks based on predicting hash function behavior. In order to have an effect
3508        this must be called before parsing has started. Returns <code>XML_TRUE</code> if
3509        successful, <code>XML_FALSE</code> when called after <code>XML_Parse</code> or
3510        <code>XML_ParseBuffer</code> or when <code>parser</code> is <code>NULL</code>.
3511        <p>
3512          <b>Note:</b> Setting a salt that is <em>not</em> from a source of high quality
3513          entropy (like <code>getentropy(3)</code>) will make the parser vulnerable to
3514          hash flooding attacks.
3515        </p>
3516
3517        <p>
3518          <b>Note:</b> This call is optional, as the parser will auto-generate a new
3519          random salt value internally if no value has been set by the start of parsing.
3520        </p>
3521      </div>
3522
3523      <h4 id="XML_UseForeignDTD">
3524        XML_UseForeignDTD
3525      </h4>
3526
3527      <pre class="fcndec">
3528enum XML_Error XMLCALL
3529XML_UseForeignDTD(XML_Parser parser, XML_Bool useDTD);
3530</pre>
3531      <div class="fcndef">
3532        <p>
3533          This function allows an application to provide an external subset for the
3534          document type declaration for documents which do not specify an external subset
3535          of their own. For documents which specify an external subset in their DOCTYPE
3536          declaration, the application-provided subset will be ignored. If the document
3537          does not contain a DOCTYPE declaration at all and <code>useDTD</code> is true,
3538          the application-provided subset will be parsed, but the
3539          <code>startDoctypeDeclHandler</code> and <code>endDoctypeDeclHandler</code>
3540          functions, if set, will not be called. The setting of parameter entity parsing,
3541          controlled using <code><a href=
3542          "#XML_SetParamEntityParsing">XML_SetParamEntityParsing</a></code>, will be
3543          honored.
3544        </p>
3545
3546        <p>
3547          The application-provided external subset is read by calling the external entity
3548          reference handler set via <code><a href=
3549          "#XML_SetExternalEntityRefHandler">XML_SetExternalEntityRefHandler</a></code>
3550          with both <code>publicId</code> and <code>systemId</code> set to
3551          <code>NULL</code>.
3552        </p>
3553
3554        <p>
3555          If this function is called after parsing has begun, it returns
3556          <code>XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING</code> and ignores
3557          <code>useDTD</code>. If called when Expat has been compiled without DTD
3558          support, it returns <code>XML_ERROR_FEATURE_REQUIRES_XML_DTD</code>. Otherwise,
3559          it returns <code>XML_ERROR_NONE</code>.
3560        </p>
3561
3562        <p>
3563          <b>Note:</b> For the purpose of checking WFC: Entity Declared, passing
3564          <code>useDTD == XML_TRUE</code> will make the parser behave as if the document
3565          had a DTD with an external subset. This holds true even if the external entity
3566          reference handler returns without action.
3567        </p>
3568      </div>
3569
3570      <h4 id="XML_SetReturnNSTriplet">
3571        XML_SetReturnNSTriplet
3572      </h4>
3573
3574      <pre class="fcndec">
3575void XMLCALL
3576XML_SetReturnNSTriplet(XML_Parser parser,
3577                       int        do_nst);
3578</pre>
3579      <div class="fcndef">
3580        <p>
3581          This function only has an effect when using a parser created with
3582          <code><a href="#XML_ParserCreateNS">XML_ParserCreateNS</a></code>, i.e. when
          namespace processing is in effect. The <code>do_nst</code> argument sets whether or not
3584          prefixes are returned with names qualified with a namespace prefix. If this
3585          function is called with <code>do_nst</code> non-zero, then afterwards namespace
3586          qualified names (that is qualified with a prefix as opposed to belonging to a
3587          default namespace) are returned as a triplet with the three parts separated by
3588          the namespace separator specified when the parser was created. The order of
3589          returned parts is URI, local name, and prefix.
3590        </p>
3591
3592        <p>
3593          If <code>do_nst</code> is zero, then namespaces are reported in the default
3594          manner, URI then local_name separated by the namespace separator.
3595        </p>
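        <p>
          A handler that wants the individual parts has to split the name itself. The
          sketch below is one way to do that; <code>split_ns_name</code> is a
          hypothetical helper, it assumes <code>XML_Char</code> is <code>char</code>,
          and it assumes the separator character never occurs inside the URI, local
          name, or prefix.
        </p>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits "URI<sep>local<sep>prefix" in place.
   name must be a writable copy of the string the handler received.
   Parts that are absent are reported as NULL. */
void split_ns_name(char *name, char sep, const char **uri,
                   const char **local, const char **prefix) {
  char *first = strchr(name, sep);
  *uri = NULL;
  *prefix = NULL;

  if (first == NULL) { /* name is not in any namespace */
    *local = name;
    return;
  }
  *first = '\0';
  *uri = name;
  *local = first + 1;

  /* A second separator is only present for triplets. */
  char *second = strchr(first + 1, sep);
  if (second != NULL) {
    *second = '\0';
    *prefix = second + 1;
  }
}
```

        <p>
          The same helper also handles the default two-part form (URI and local name),
          in which case <code>prefix</code> is left as <code>NULL</code>.
        </p>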
3596      </div>
3597
3598      <h4 id="XML_DefaultCurrent">
3599        XML_DefaultCurrent
3600      </h4>
3601
3602      <pre class="fcndec">
3603void XMLCALL
3604XML_DefaultCurrent(XML_Parser parser);
3605</pre>
3606      <div class="fcndef">
3607        This can be called within a handler for a start element, end element, processing
3608        instruction or character data. It causes the corresponding markup to be passed to
3609        the default handler set by <code><a href=
3610        "#XML_SetDefaultHandler">XML_SetDefaultHandler</a></code> or <code><a href=
3611        "#XML_SetDefaultHandlerExpand">XML_SetDefaultHandlerExpand</a></code>. It does
        nothing if there is no default handler.
3613      </div>
3614
3615      <h4 id="XML_ExpatVersion">
3616        XML_ExpatVersion
3617      </h4>
3618
3619      <pre class="fcndec">
3620XML_LChar * XMLCALL
3621XML_ExpatVersion();
3622</pre>
3623      <div class="fcndef">
3624        Return the library version as a string (e.g. <code>"expat_1.95.1"</code>).
3625      </div>
3626
3627      <h4 id="XML_ExpatVersionInfo">
3628        XML_ExpatVersionInfo
3629      </h4>
3630
3631      <pre class="fcndec">
XML_Expat_Version XMLCALL
3633XML_ExpatVersionInfo();
3634</pre>
3635
3636      <pre class="signature">
3637typedef struct {
3638  int major;
3639  int minor;
3640  int micro;
3641} XML_Expat_Version;
3642</pre>
3643      <div class="fcndef">
3644        Return the library version information as a structure. Some macros are also
3645        defined that support compile-time tests of the library version:
3646        <ul>
3647          <li>
3648            <code>XML_MAJOR_VERSION</code>
3649          </li>
3650
3651          <li>
3652            <code>XML_MINOR_VERSION</code>
3653          </li>
3654
3655          <li>
3656            <code>XML_MICRO_VERSION</code>
3657          </li>
3658        </ul>
3659        Testing these constants is currently the best way to determine if particular
3660        parts of the Expat API are available.
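        <p>
          A version comparison of this kind can be sketched as follows. The function
          <code>version_at_least</code> is a hypothetical helper, not part of Expat; it
          takes plain integers so that it can be fed either the three macros above or
          the fields of the structure returned by <code>XML_ExpatVersionInfo</code>.
        </p>

```c
/* Hypothetical helper: returns 1 if version major.minor.micro is greater
   than or equal to req_major.req_minor.req_micro, and 0 otherwise. */
int version_at_least(int major, int minor, int micro,
                     int req_major, int req_minor, int req_micro) {
  if (major != req_major)
    return major > req_major;
  if (minor != req_minor)
    return minor > req_minor;
  return micro >= req_micro;
}
```

        <p>
          For example, passing <code>XML_MAJOR_VERSION</code>,
          <code>XML_MINOR_VERSION</code> and <code>XML_MICRO_VERSION</code> together
          with <code>2, 6, 0</code> tells whether the headers being compiled against
          are recent enough to declare <code>XML_SetReparseDeferralEnabled</code>.
        </p>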
3661      </div>
3662
3663      <h4 id="XML_GetFeatureList">
3664        XML_GetFeatureList
3665      </h4>
3666
3667      <pre class="fcndec">
3668const XML_Feature * XMLCALL
3669XML_GetFeatureList();
3670</pre>
3671
3672      <pre class="signature">
3673enum XML_FeatureEnum {
3674  XML_FEATURE_END = 0,
3675  XML_FEATURE_UNICODE,
3676  XML_FEATURE_UNICODE_WCHAR_T,
3677  XML_FEATURE_DTD,
3678  XML_FEATURE_CONTEXT_BYTES,
3679  XML_FEATURE_MIN_SIZE,
3680  XML_FEATURE_SIZEOF_XML_CHAR,
3681  XML_FEATURE_SIZEOF_XML_LCHAR,
3682  XML_FEATURE_NS,
3683  XML_FEATURE_LARGE_SIZE
3684};
3685
3686typedef struct {
3687  enum XML_FeatureEnum  feature;
3688  XML_LChar            *name;
3689  long int              value;
3690} XML_Feature;
3691</pre>
3692      <div class="fcndef">
3693        <p>
3694          Returns a list of "feature" records, providing details on how Expat was
3695          configured at compile time. Most applications should not need to worry about
3696          this, but this information is otherwise not available from Expat. This function
3697          allows code that does need to check these features to do so at runtime.
3698        </p>
3699
3700        <p>
3701          The return value is an array of <code>XML_Feature</code>, terminated by a
3702          record with a <code>feature</code> of <code>XML_FEATURE_END</code> and
3703          <code>name</code> of <code>NULL</code>, identifying the feature-test macros
3704          Expat was compiled with. Since an application that requires this kind of
3705          information needs to determine the type of character the <code>name</code>
3706          points to, records for the <code>XML_FEATURE_SIZEOF_XML_CHAR</code> and
3707          <code>XML_FEATURE_SIZEOF_XML_LCHAR</code> will be located at the beginning of
3708          the list, followed by <code>XML_FEATURE_UNICODE</code> and
3709          <code>XML_FEATURE_UNICODE_WCHAR_T</code>, if they are present at all.
3710        </p>
3711
3712        <p>
3713          Some features have an associated value. If there isn't an associated value, the
3714          <code>value</code> field is set to 0. At this time, the following features have
3715          been defined to have values:
3716        </p>
3717
3718        <dl>
3719          <dt>
3720            <code>XML_FEATURE_SIZEOF_XML_CHAR</code>
3721          </dt>
3722
3723          <dd>
3724            The number of bytes occupied by one <code>XML_Char</code> character.
3725          </dd>
3726
3727          <dt>
3728            <code>XML_FEATURE_SIZEOF_XML_LCHAR</code>
3729          </dt>
3730
3731          <dd>
3732            The number of bytes occupied by one <code>XML_LChar</code> character.
3733          </dd>
3734
3735          <dt>
3736            <code>XML_FEATURE_CONTEXT_BYTES</code>
3737          </dt>
3738
3739          <dd>
3740            The maximum number of characters of context which can be reported by
3741            <code><a href="#XML_GetInputContext">XML_GetInputContext</a></code>.
3742          </dd>
3743        </dl>
3744      </div>
3745
3746      <h4 id="XML_FreeContentModel">
3747        XML_FreeContentModel
3748      </h4>
3749
3750      <pre class="fcndec">
3751void XMLCALL
3752XML_FreeContentModel(XML_Parser parser, XML_Content *model);
3753</pre>
3754      <div class="fcndef">
3755        Function to deallocate the <code>model</code> argument passed to the
        <code>XML_ElementDeclHandler</code> callback set using <code><a href=
        "#XML_SetElementDeclHandler">XML_SetElementDeclHandler</a></code>. This function
3758        should not be used for any other purpose.
3759      </div>
3760
3761      <p>
3762        The following functions allow external code to share the memory allocator an
3763        <code>XML_Parser</code> has been configured to use. This is especially useful for
3764        third-party libraries that interact with a parser object created by application
3765        code, or heavily layered applications. This can be essential when using
3766        dynamically loaded libraries which use different C standard libraries (this can
3767        happen on Windows, at least).
3768      </p>
3769
3770      <h4 id="XML_MemMalloc">
3771        XML_MemMalloc
3772      </h4>
3773
3774      <pre class="fcndec">
3775void * XMLCALL
3776XML_MemMalloc(XML_Parser parser, size_t size);
3777</pre>
3778      <div class="fcndef">
3779        Allocate <code>size</code> bytes of memory using the allocator the
3780        <code>parser</code> object has been configured to use. Returns a pointer to the
3781        memory or <code>NULL</code> on failure. Memory allocated in this way must be
3782        freed using <code><a href="#XML_MemFree">XML_MemFree</a></code>.
3783      </div>
3784
3785      <h4 id="XML_MemRealloc">
3786        XML_MemRealloc
3787      </h4>
3788
3789      <pre class="fcndec">
3790void * XMLCALL
3791XML_MemRealloc(XML_Parser parser, void *ptr, size_t size);
3792</pre>
3793      <div class="fcndef">
3794        Allocate <code>size</code> bytes of memory using the allocator the
3795        <code>parser</code> object has been configured to use. <code>ptr</code> must
3796        point to a block of memory allocated by <code><a href=
3797        "#XML_MemMalloc">XML_MemMalloc</a></code> or <code>XML_MemRealloc</code>, or be
3798        <code>NULL</code>. This function tries to expand the block pointed to by
3799        <code>ptr</code> if possible. Returns a pointer to the memory or
3800        <code>NULL</code> on failure. On success, the original block has either been
3801        expanded or freed. On failure, the original block has not been freed; the caller
3802        is responsible for freeing the original block. Memory allocated in this way must
3803        be freed using <code><a href="#XML_MemFree">XML_MemFree</a></code>.
3804      </div>
3805
3806      <h4 id="XML_MemFree">
3807        XML_MemFree
3808      </h4>
3809
3810      <pre class="fcndec">
3811void XMLCALL
3812XML_MemFree(XML_Parser parser, void *ptr);
3813</pre>
3814      <div class="fcndef">
3815        Free a block of memory pointed to by <code>ptr</code>. The block must have been
3816        allocated by <code><a href="#XML_MemMalloc">XML_MemMalloc</a></code> or
3817        <code>XML_MemRealloc</code>, or be <code>NULL</code>.
3818      </div>
3819
3820      <hr />
3821
3822      <div class="footer">
3823        Found a bug in the documentation? <a href=
3824        "https://github.com/libexpat/libexpat/issues">Please file a bug report.</a>
3825      </div>
3826    </div>
3827  </body>
3828</html>
3829