<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <!--
7                            __  __            _
8                         ___\ \/ /_ __   __ _| |_
9                        / _ \\  /| '_ \ / _` | __|
10                       |  __//  \| |_) | (_| | |_
11                        \___/_/\_\ .__/ \__,_|\__|
12                                 |_| XML parser
13
14   Copyright (c) 2000      Clark Cooper <coopercc@users.sourceforge.net>
15   Copyright (c) 2000-2004 Fred L. Drake, Jr. <fdrake@users.sourceforge.net>
16   Copyright (c) 2002-2012 Karl Waclawek <karl@waclawek.net>
17   Copyright (c) 2017-2026 Sebastian Pipping <sebastian@pipping.org>
18   Copyright (c) 2017      Jakub Wilk <jwilk@jwilk.net>
19   Copyright (c) 2021      Tomas Korbar <tkorbar@redhat.com>
20   Copyright (c) 2021      Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
21   Copyright (c) 2022      Thijs Schreijer <thijs@thijsschreijer.nl>
22   Copyright (c) 2023-2025 Hanno Böck <hanno@gentoo.org>
23   Copyright (c) 2023      Sony Corporation / Snild Dolkow <snild@sony.com>
24   Licensed under the MIT license:
25
26   Permission is  hereby granted,  free of charge,  to any  person obtaining
27   a  copy  of  this  software   and  associated  documentation  files  (the
28   "Software"),  to  deal in  the  Software  without restriction,  including
29   without  limitation the  rights  to use,  copy,  modify, merge,  publish,
30   distribute, sublicense, and/or sell copies of the Software, and to permit
31   persons  to whom  the Software  is  furnished to  do so,  subject to  the
32   following conditions:
33
34   The above copyright  notice and this permission notice  shall be included
35   in all copies or substantial portions of the Software.
36
37   THE  SOFTWARE  IS  PROVIDED  "AS  IS",  WITHOUT  WARRANTY  OF  ANY  KIND,
38   EXPRESS  OR IMPLIED,  INCLUDING  BUT  NOT LIMITED  TO  THE WARRANTIES  OF
39   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN
40   NO EVENT SHALL THE AUTHORS OR  COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
41   DAMAGES OR  OTHER LIABILITY, WHETHER  IN AN  ACTION OF CONTRACT,  TORT OR
42   OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
43   USE OR OTHER DEALINGS IN THE SOFTWARE.
44-->

    <title>
      Expat XML Parser
    </title>
    <meta name="author" content="Clark Cooper, coopercc@netheaven.com" />
    <link href="ok.min.css" rel="stylesheet" />
    <link href="style.css" rel="stylesheet" />
  </head>
  <body>
    <div>
      <h1>
        The Expat XML Parser <small>Release 2.8.2</small>
      </h1>
    </div>
59
60    <div class="content">
61      <p>
62        Expat is a library, written in C, for parsing XML documents. It's the underlying
63        XML parser for the open source Mozilla project, Perl's <code>XML::Parser</code>,
64        Python's <code>xml.parsers.expat</code>, and other open-source XML parsers.
65      </p>
66
      <p>
        This library is the creation of James Clark, who's also given us groff (an nroff
        look-alike), Jade (an implementation of ISO's DSSSL stylesheet language for
        SGML), XP (a Java XML parser package), and XT (a Java XSL engine). James was
        also the technical lead on the XML Working Group at W3C that produced the XML
        specification.
      </p>
74
75      <p>
76        This is free software, licensed under the <a href="../COPYING">MIT/X Consortium
77        license</a>. You may download it from <a href="https://libexpat.github.io/">the
78        Expat home page</a>.
79      </p>
80
81      <p>
82        The bulk of this document was originally commissioned as an article by <a href=
83        "https://www.xml.com/">XML.com</a>. They graciously allowed Clark Cooper to
84        retain copyright and to distribute it with Expat. This version has been
85        substantially extended to include documentation on features which have been added
86        since the original article was published, and additional information on using the
87        original interface.
88      </p>
89
90      <hr />
91
92      <h2>
93        Table of Contents
94      </h2>
95
96      <ul>
97        <li>
98          <a href="#overview">Overview</a>
99        </li>
100
101        <li>
102          <a href="#building">Building and Installing</a>
103        </li>
104
105        <li>
106          <a href="#using">Using Expat</a>
107        </li>
108
109        <li>
110          <a href="#reference">Reference</a>
111          <ul>
112            <li>
113              <a href="#creation">Parser Creation Functions</a>
114              <ul>
115                <li>
116                  <a href="#XML_ParserCreate">XML_ParserCreate</a>
117                </li>
118
119                <li>
120                  <a href="#XML_ParserCreateNS">XML_ParserCreateNS</a>
121                </li>
122
123                <li>
124                  <a href="#XML_ParserCreate_MM">XML_ParserCreate_MM</a>
125                </li>
126
127                <li>
128                  <a href=
129                  "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a>
130                </li>
131
132                <li>
133                  <a href="#XML_ParserFree">XML_ParserFree</a>
134                </li>
135
136                <li>
137                  <a href="#XML_ParserReset">XML_ParserReset</a>
138                </li>
139              </ul>
140            </li>
141
142            <li>
143              <a href="#parsing">Parsing Functions</a>
144              <ul>
145                <li>
146                  <a href="#XML_Parse">XML_Parse</a>
147                </li>
148
149                <li>
150                  <a href="#XML_ParseBuffer">XML_ParseBuffer</a>
151                </li>
152
153                <li>
154                  <a href="#XML_GetBuffer">XML_GetBuffer</a>
155                </li>
156
157                <li>
158                  <a href="#XML_StopParser">XML_StopParser</a>
159                </li>
160
161                <li>
162                  <a href="#XML_ResumeParser">XML_ResumeParser</a>
163                </li>
164
165                <li>
166                  <a href="#XML_GetParsingStatus">XML_GetParsingStatus</a>
167                </li>
168              </ul>
169            </li>
170
171            <li>
172              <a href="#setting">Handler Setting Functions</a>
173              <ul>
174                <li>
175                  <a href="#XML_SetStartElementHandler">XML_SetStartElementHandler</a>
176                </li>
177
178                <li>
179                  <a href="#XML_SetEndElementHandler">XML_SetEndElementHandler</a>
180                </li>
181
182                <li>
183                  <a href="#XML_SetElementHandler">XML_SetElementHandler</a>
184                </li>
185
186                <li>
187                  <a href="#XML_SetCharacterDataHandler">XML_SetCharacterDataHandler</a>
188                </li>
189
190                <li>
191                  <a href=
192                  "#XML_SetProcessingInstructionHandler">XML_SetProcessingInstructionHandler</a>
193                </li>
194
195                <li>
196                  <a href="#XML_SetCommentHandler">XML_SetCommentHandler</a>
197                </li>
198
199                <li>
200                  <a href=
201                  "#XML_SetStartCdataSectionHandler">XML_SetStartCdataSectionHandler</a>
202                </li>
203
204                <li>
205                  <a href=
206                  "#XML_SetEndCdataSectionHandler">XML_SetEndCdataSectionHandler</a>
207                </li>
208
209                <li>
210                  <a href="#XML_SetCdataSectionHandler">XML_SetCdataSectionHandler</a>
211                </li>
212
213                <li>
214                  <a href="#XML_SetDefaultHandler">XML_SetDefaultHandler</a>
215                </li>
216
217                <li>
218                  <a href="#XML_SetDefaultHandlerExpand">XML_SetDefaultHandlerExpand</a>
219                </li>
220
221                <li>
222                  <a href=
223                  "#XML_SetExternalEntityRefHandler">XML_SetExternalEntityRefHandler</a>
224                </li>
225
226                <li>
227                  <a href=
228                  "#XML_SetExternalEntityRefHandlerArg">XML_SetExternalEntityRefHandlerArg</a>
229                </li>
230
231                <li>
232                  <a href="#XML_SetSkippedEntityHandler">XML_SetSkippedEntityHandler</a>
233                </li>
234
235                <li>
236                  <a href=
237                  "#XML_SetUnknownEncodingHandler">XML_SetUnknownEncodingHandler</a>
238                </li>
239
240                <li>
241                  <a href=
242                  "#XML_SetStartNamespaceDeclHandler">XML_SetStartNamespaceDeclHandler</a>
243                </li>
244
245                <li>
246                  <a href=
247                  "#XML_SetEndNamespaceDeclHandler">XML_SetEndNamespaceDeclHandler</a>
248                </li>
249
250                <li>
251                  <a href="#XML_SetNamespaceDeclHandler">XML_SetNamespaceDeclHandler</a>
252                </li>
253
254                <li>
255                  <a href="#XML_SetXmlDeclHandler">XML_SetXmlDeclHandler</a>
256                </li>
257
258                <li>
259                  <a href=
260                  "#XML_SetStartDoctypeDeclHandler">XML_SetStartDoctypeDeclHandler</a>
261                </li>
262
263                <li>
264                  <a href=
265                  "#XML_SetEndDoctypeDeclHandler">XML_SetEndDoctypeDeclHandler</a>
266                </li>
267
268                <li>
269                  <a href="#XML_SetDoctypeDeclHandler">XML_SetDoctypeDeclHandler</a>
270                </li>
271
272                <li>
273                  <a href="#XML_SetElementDeclHandler">XML_SetElementDeclHandler</a>
274                </li>
275
276                <li>
277                  <a href="#XML_SetAttlistDeclHandler">XML_SetAttlistDeclHandler</a>
278                </li>
279
280                <li>
281                  <a href="#XML_SetEntityDeclHandler">XML_SetEntityDeclHandler</a>
282                </li>
283
284                <li>
285                  <a href=
286                  "#XML_SetUnparsedEntityDeclHandler">XML_SetUnparsedEntityDeclHandler</a>
287                </li>
288
289                <li>
290                  <a href="#XML_SetNotationDeclHandler">XML_SetNotationDeclHandler</a>
291                </li>
292
293                <li>
294                  <a href="#XML_SetNotStandaloneHandler">XML_SetNotStandaloneHandler</a>
295                </li>
296              </ul>
297            </li>
298
299            <li>
300              <a href="#position">Parse Position and Error Reporting Functions</a>
301              <ul>
302                <li>
303                  <a href="#XML_GetErrorCode">XML_GetErrorCode</a>
304                </li>
305
306                <li>
307                  <a href="#XML_ErrorString">XML_ErrorString</a>
308                </li>
309
310                <li>
311                  <a href="#XML_GetCurrentByteIndex">XML_GetCurrentByteIndex</a>
312                </li>
313
314                <li>
315                  <a href="#XML_GetCurrentLineNumber">XML_GetCurrentLineNumber</a>
316                </li>
317
318                <li>
319                  <a href="#XML_GetCurrentColumnNumber">XML_GetCurrentColumnNumber</a>
320                </li>
321
322                <li>
323                  <a href="#XML_GetCurrentByteCount">XML_GetCurrentByteCount</a>
324                </li>
325
326                <li>
327                  <a href="#XML_GetInputContext">XML_GetInputContext</a>
328                </li>
329              </ul>
330            </li>
331
332            <li>
333              <a href="#attack-protection">Attack Protection</a>
334              <ul>
335                <li>
336                  <a href=
337                  "#XML_SetBillionLaughsAttackProtectionMaximumAmplification">XML_SetBillionLaughsAttackProtectionMaximumAmplification</a>
338                </li>
339
340                <li>
341                  <a href=
342                  "#XML_SetBillionLaughsAttackProtectionActivationThreshold">XML_SetBillionLaughsAttackProtectionActivationThreshold</a>
343                </li>
344
345                <li>
346                  <a href=
347                  "#XML_SetAllocTrackerMaximumAmplification">XML_SetAllocTrackerMaximumAmplification</a>
348                </li>
349
350                <li>
351                  <a href=
352                  "#XML_SetAllocTrackerActivationThreshold">XML_SetAllocTrackerActivationThreshold</a>
353                </li>
354
355                <li>
356                  <a href=
357                  "#XML_SetReparseDeferralEnabled">XML_SetReparseDeferralEnabled</a>
358                </li>
359              </ul>
360            </li>
361
362            <li>
363              <a href="#miscellaneous">Miscellaneous Functions</a>
364              <ul>
365                <li>
366                  <a href="#XML_SetUserData">XML_SetUserData</a>
367                </li>
368
369                <li>
370                  <a href="#XML_GetUserData">XML_GetUserData</a>
371                </li>
372
373                <li>
374                  <a href="#XML_UseParserAsHandlerArg">XML_UseParserAsHandlerArg</a>
375                </li>
376
377                <li>
378                  <a href="#XML_SetBase">XML_SetBase</a>
379                </li>
380
381                <li>
382                  <a href="#XML_GetBase">XML_GetBase</a>
383                </li>
384
385                <li>
386                  <a href=
387                  "#XML_GetSpecifiedAttributeCount">XML_GetSpecifiedAttributeCount</a>
388                </li>
389
390                <li>
391                  <a href="#XML_GetIdAttributeIndex">XML_GetIdAttributeIndex</a>
392                </li>
393
394                <li>
395                  <a href="#XML_GetAttributeInfo">XML_GetAttributeInfo</a>
396                </li>
397
398                <li>
399                  <a href="#XML_SetEncoding">XML_SetEncoding</a>
400                </li>
401
402                <li>
403                  <a href="#XML_SetParamEntityParsing">XML_SetParamEntityParsing</a>
404                </li>
405
406                <li>
407                  <a href="#XML_SetHashSalt">XML_SetHashSalt</a> (deprecated)
408                </li>
409
410                <li>
411                  <a href="#XML_SetHashSalt16Bytes">XML_SetHashSalt16Bytes</a>
412                </li>
413
414                <li>
415                  <a href="#XML_UseForeignDTD">XML_UseForeignDTD</a>
416                </li>
417
418                <li>
419                  <a href="#XML_SetReturnNSTriplet">XML_SetReturnNSTriplet</a>
420                </li>
421
422                <li>
423                  <a href="#XML_DefaultCurrent">XML_DefaultCurrent</a>
424                </li>
425
426                <li>
427                  <a href="#XML_ExpatVersion">XML_ExpatVersion</a>
428                </li>
429
430                <li>
431                  <a href="#XML_ExpatVersionInfo">XML_ExpatVersionInfo</a>
432                </li>
433
434                <li>
435                  <a href="#XML_GetFeatureList">XML_GetFeatureList</a>
436                </li>
437
438                <li>
439                  <a href="#XML_FreeContentModel">XML_FreeContentModel</a>
440                </li>
441
442                <li>
443                  <a href="#XML_MemMalloc">XML_MemMalloc</a>
444                </li>
445
446                <li>
447                  <a href="#XML_MemRealloc">XML_MemRealloc</a>
448                </li>
449
450                <li>
451                  <a href="#XML_MemFree">XML_MemFree</a>
452                </li>
453              </ul>
454            </li>
455          </ul>
456        </li>
457      </ul>
458
459      <hr />
460
461      <h2>
462        <a id="overview" name="overview">Overview</a>
463      </h2>
464
      <p>
        Expat is a stream-oriented parser. You register callback (or handler) functions
        with the parser and then start feeding it the document. As the parser recognizes
        parts of the document, it will call the appropriate handler for that part (if
        you've registered one). The document is fed to the parser in pieces, so you can
        start parsing before you have the whole document. This also allows you to parse
        really huge documents that won't fit into memory.
      </p>
473
474      <p>
475        Expat can be intimidating due to the many kinds of handlers and options you can
476        set. But you only need to learn four functions in order to do 90% of what you'll
477        want to do with it:
478      </p>
479
480      <dl>
481        <dt>
482          <code><a href="#XML_ParserCreate">XML_ParserCreate</a></code>
483        </dt>
484
485        <dd>
486          Create a new parser object.
487        </dd>
488
489        <dt>
490          <code><a href="#XML_SetElementHandler">XML_SetElementHandler</a></code>
491        </dt>
492
493        <dd>
494          Set handlers for start and end tags.
495        </dd>
496
497        <dt>
498          <code><a href=
499          "#XML_SetCharacterDataHandler">XML_SetCharacterDataHandler</a></code>
500        </dt>
501
502        <dd>
503          Set handler for text.
504        </dd>
505
506        <dt>
507          <code><a href="#XML_Parse">XML_Parse</a></code>
508        </dt>
509
510        <dd>
          Pass a buffer full of document to the parser.
512        </dd>
513      </dl>
514
515      <p>
516        These functions and others are described in the <a href=
517        "#reference">reference</a> part of this document. The reference section also
518        describes in detail the parameters passed to the different types of handlers.
519      </p>
520
      <p>
        Let's look at a very simple example program that uses only three of the above
        functions (it doesn't need to set a character handler). The program <a href=
        "../examples/outline.c">outline.c</a> prints an element outline, indenting child
        elements to distinguish them from the parent element that contains them. The
        start handler does all the work. It prints two indenting spaces for every level
        of ancestor elements, then it prints the element and attribute information.
        Finally, it increments the global <code>Depth</code> variable.
      </p>
530
      <pre class="eg">
int Depth;

void XMLCALL
start(void *data, const char *el, const char **attr) {
  int i;

  for (i = 0; i &lt; Depth; i++)
    printf("  ");

  printf("%s", el);

  for (i = 0; attr[i]; i += 2) {
    printf(" %s='%s'", attr[i], attr[i + 1]);
  }

  printf("\n");
  Depth++;
}  /* End of start handler */
</pre>
      <p>
        The end handler simply does the bookkeeping work of decrementing
        <code>Depth</code>.
      </p>
554
      <pre class="eg">
void XMLCALL
end(void *data, const char *el) {
  Depth--;
}  /* End of end handler */
</pre>
      <p>
        Note the <code>XMLCALL</code> annotation used for the callbacks. This is used to
        ensure that Expat and the callbacks are using the same calling convention in
        case the compiler options used for Expat itself and the client code are
        different. Expat tries not to care what the default calling convention is, though
        it may require that it be compiled with a default convention of "cdecl" on some
        platforms. For code which uses Expat, however, the calling convention is
        specified by the <code>XMLCALL</code> annotation on most platforms; callbacks
        should be defined using this annotation.
      </p>
571
572      <p>
573        The <code>XMLCALL</code> annotation was added in Expat 1.95.7, but existing
574        working Expat applications don't need to add it (since they are already using the
575        "cdecl" calling convention, or they wouldn't be working). The annotation is only
576        needed if the default calling convention may be something other than "cdecl". To
577        use the annotation safely with older versions of Expat, you can conditionally
578        define it <em>after</em> including Expat's header file:
579      </p>
580
      <pre class="eg">
#include &lt;expat.h&gt;

#ifndef XMLCALL
#if defined(_MSC_VER) &amp;&amp; !defined(__BEOS__) &amp;&amp; !defined(__CYGWIN__)
#define XMLCALL __cdecl
#elif defined(__GNUC__)
#define XMLCALL __attribute__((cdecl))
#else
#define XMLCALL
#endif
#endif
</pre>
594      <p>
595        After creating the parser, the main program just has the job of shoveling the
596        document to the parser so that it can do its work.
597      </p>
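
      <p>
        The feeding loop described above might be sketched as follows. This is a minimal
        illustration and not the code of <code>outline.c</code>: the function name
        <code>parse_stream</code>, the handler names <code>on_start</code> and
        <code>on_end</code>, and the 4096-byte buffer size are assumptions made for this
        example, and the handlers only track the nesting depth (kept in user data rather
        than in a global variable). It also assumes a default build where
        <code>XML_Char</code> is <code>char</code>.
      </p>

      <pre class="eg">
#include &lt;stdio.h&gt;
#include &lt;expat.h&gt;

/* Hypothetical handlers for this sketch: they only track nesting depth,
   which is kept in user data instead of a global variable. */
static void XMLCALL
on_start(void *data, const XML_Char *el, const XML_Char **attr) {
  int *depth = (int *)data;
  (void)el;
  (void)attr;
  (*depth)++;
}

static void XMLCALL
on_end(void *data, const XML_Char *el) {
  int *depth = (int *)data;
  (void)el;
  (*depth)--;
}

/* Feed the stream to Expat chunk by chunk.
   Returns 0 on success and -1 on a read or parse error. */
int
parse_stream(FILE *in) {
  char buf[4096];
  int depth = 0;
  int done;
  XML_Parser parser = XML_ParserCreate(NULL);

  if (parser == NULL)
    return -1;
  XML_SetUserData(parser, &amp;depth);
  XML_SetElementHandler(parser, on_start, on_end);

  do {
    size_t len = fread(buf, 1, sizeof(buf), in);
    if (ferror(in)) {
      XML_ParserFree(parser);
      return -1;
    }
    done = feof(in); /* tell Expat when the last piece arrives */
    if (XML_Parse(parser, buf, (int)len, done) == XML_STATUS_ERROR) {
      fprintf(stderr, "Parse error at line %lu: %s\n",
              (unsigned long)XML_GetCurrentLineNumber(parser),
              XML_ErrorString(XML_GetErrorCode(parser)));
      XML_ParserFree(parser);
      return -1;
    }
  } while (!done);

  XML_ParserFree(parser);
  return 0;
}
</pre>
      <p>
        A caller could, for instance, invoke <code>parse_stream(stdin)</code> from
        <code>main</code> and turn a return value of -1 into a non-zero exit status.
      </p>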
598
599      <hr />
600
601      <h2>
602        <a id="building" name="building">Building and Installing Expat</a>
603      </h2>
604
      <p>
        The Expat distribution comes as a compressed (with GNU gzip) tar file. You may
        download the latest version from <a href=
        "https://sourceforge.net/projects/expat/">SourceForge</a>. After unpacking it,
        cd into the resulting directory. Then follow either the Win32 directions or the
        Unix directions below.
      </p>
612
613      <h3>
614        Building under Win32
615      </h3>
616
      <p>
        If you're using the GNU compiler under Cygwin, follow the Unix directions in the
        next section. Otherwise, if you have Microsoft Visual Studio installed, you
        can use CMake to generate a <code>.sln</code> file, e.g. <code>cmake -G"Visual
        Studio 17 2022" -DCMAKE_BUILD_TYPE=RelWithDebInfo .</code>, and then build Expat
        using <code>msbuild /m expat.sln</code>.
      </p>
624
625      <p>
626        Alternatively, you may download the Win32 binary package that contains the
627        "expat.h" include file and a pre-built DLL.
628      </p>
629
630      <h3>
631        Building under Unix (or GNU)
632      </h3>
633
634      <p>
635        First you'll need to run the configure shell script in order to configure the
636        Makefiles and headers for your system.
637      </p>
638
639      <p>
640        If you're happy with all the defaults that configure picks for you, and you have
641        permission on your system to install into /usr/local, you can install Expat with
642        this sequence of commands:
643      </p>
644
      <pre class="eg">
./configure
make
make install
</pre>
650      <p>
651        There are some options that you can provide to this script, but the only one
652        we'll mention here is the <code>--prefix</code> option. You can find out all the
653        options available by running configure with just the <code>--help</code> option.
654      </p>
655
656      <p>
657        By default, the configure script sets things up so that the library gets
658        installed in <code>/usr/local/lib</code> and the associated header file in
659        <code>/usr/local/include</code>. But if you were to give the option,
660        <code>--prefix=/home/me/mystuff</code>, then the library and header would get
661        installed in <code>/home/me/mystuff/lib</code> and
662        <code>/home/me/mystuff/include</code> respectively.
663      </p>
664
665      <h3>
666        Configuring Expat Using the Pre-Processor
667      </h3>
668
669      <p>
670        Expat's feature set can be configured using a small number of pre-processor
671        definitions. The symbols are:
672      </p>
673
674      <dl class="cpp-symbols">
675        <dt>
676          <a id="XML_GE" name="XML_GE">XML_GE</a>
677        </dt>
678
679        <dd>
680          Added in Expat 2.6.0. Include support for <a href=
681          "https://www.w3.org/TR/2006/REC-xml-20060816/#sec-physical-struct">general
682          entities</a> (syntax <code>&amp;e1;</code> to reference and syntax
683          <code>&lt;!ENTITY e1 'value1'&gt;</code> (an internal general entity) or
684          <code>&lt;!ENTITY e2 SYSTEM 'file2'&gt;</code> (an external general entity) to
685          declare). With <code>XML_GE</code> enabled, general entities will be replaced
686          by their declared replacement text; for this to work for <em>external</em>
687          general entities, in addition an <code><a href=
688          "#XML_SetExternalEntityRefHandler">XML_ExternalEntityRefHandler</a></code> must
689          be set using <code><a href=
690          "#XML_SetExternalEntityRefHandler">XML_SetExternalEntityRefHandler</a></code>.
691          Also, enabling <code>XML_GE</code> makes the functions <code><a href=
692          "#XML_SetBillionLaughsAttackProtectionMaximumAmplification">XML_SetBillionLaughsAttackProtectionMaximumAmplification</a></code>
693          and <code><a href=
694          "#XML_SetBillionLaughsAttackProtectionActivationThreshold">XML_SetBillionLaughsAttackProtectionActivationThreshold</a></code>
695          available.<br />
696          With <code>XML_GE</code> disabled, Expat has a smaller memory footprint and can
697          be faster, but will not load external general entities and will replace all
698          general entities (except the <a href=
699          "https://www.w3.org/TR/2006/REC-xml-20060816/#sec-predefined-ent">predefined
700          five</a>: <code>amp</code>, <code>apos</code>, <code>gt</code>,
701          <code>lt</code>, <code>quot</code>) with a self-reference: for example,
702          referencing an entity <code>e1</code> via <code>&amp;e1;</code> will be
703          replaced by text <code>&amp;e1;</code>.
704        </dd>
705
706        <dt>
707          <a id="XML_DTD" name="XML_DTD">XML_DTD</a>
708        </dt>
709
710        <dd>
711          Include support for using and reporting DTD-based content. If this is defined,
712          default attribute values from an external DTD subset are reported and attribute
713          value normalization occurs based on the type of attributes defined in the
714          external subset. Without this, Expat has a smaller memory footprint and can be
715          faster, but will not load external parameter entities or process conditional
716          sections. If defined, makes the functions <code><a href=
717          "#XML_SetBillionLaughsAttackProtectionMaximumAmplification">XML_SetBillionLaughsAttackProtectionMaximumAmplification</a></code>
718          and <code><a href=
719          "#XML_SetBillionLaughsAttackProtectionActivationThreshold">XML_SetBillionLaughsAttackProtectionActivationThreshold</a></code>
720          available.
721        </dd>
722
723        <dt>
724          <a id="XML_NS" name="XML_NS">XML_NS</a>
725        </dt>
726
727        <dd>
728          When defined, support for the <cite><a href=
729          "https://www.w3.org/TR/REC-xml-names/">Namespaces in XML</a></cite>
730          specification is included.
731        </dd>
732
733        <dt>
734          <a id="XML_UNICODE" name="XML_UNICODE">XML_UNICODE</a>
735        </dt>
736
737        <dd>
738          When defined, character data reported to the application is encoded in UTF-16
739          using wide characters of the type <code>XML_Char</code>. This is implied if
740          <code>XML_UNICODE_WCHAR_T</code> is defined.
741        </dd>
742
743        <dt>
744          <a id="XML_UNICODE_WCHAR_T" name="XML_UNICODE_WCHAR_T">XML_UNICODE_WCHAR_T</a>
745        </dt>
746
747        <dd>
748          If defined, causes the <code>XML_Char</code> character type to be defined using
749          the <code>wchar_t</code> type; otherwise, <code>unsigned short</code> is used.
750          Defining this implies <code>XML_UNICODE</code>.
751        </dd>
752
753        <dt>
754          <a id="XML_LARGE_SIZE" name="XML_LARGE_SIZE">XML_LARGE_SIZE</a>
755        </dt>
756
757        <dd>
758          If defined, causes the <code>XML_Size</code> and <code>XML_Index</code> integer
759          types to be at least 64 bits in size. This is intended to support processing of
760          very large input streams, where the return values of <code><a href=
761          "#XML_GetCurrentByteIndex">XML_GetCurrentByteIndex</a></code>, <code><a href=
762          "#XML_GetCurrentLineNumber">XML_GetCurrentLineNumber</a></code> and
763          <code><a href=
764          "#XML_GetCurrentColumnNumber">XML_GetCurrentColumnNumber</a></code> could
765          overflow. It may not be supported by all compilers, and is turned off by
766          default.
767        </dd>
768
769        <dt>
770          <a id="XML_CONTEXT_BYTES" name="XML_CONTEXT_BYTES">XML_CONTEXT_BYTES</a>
771        </dt>
772
773        <dd>
774          The number of input bytes of markup context which the parser will ensure are
775          available for reporting via <code><a href=
776          "#XML_GetInputContext">XML_GetInputContext</a></code>. This is normally set to
777          1024, and must be set to a positive integer to enable. If this is set to zero,
778          the input context will not be available and <code><a href=
779          "#XML_GetInputContext">XML_GetInputContext</a></code> will always report
780          <code>NULL</code>. Without this, Expat has a smaller memory footprint and can
781          be faster.
782        </dd>
783
784        <dt>
785          <a id="XML_STATIC" name="XML_STATIC">XML_STATIC</a>
786        </dt>
787
788        <dd>
          On Windows, this should be set if Expat is going to be linked statically with
          the code that calls it; this is required to get the MSVC-specific
          annotations right. This is ignored on other platforms.
792        </dd>
793
794        <dt>
795          <a id="XML_ATTR_INFO" name="XML_ATTR_INFO">XML_ATTR_INFO</a>
796        </dt>
797
798        <dd>
799          If defined, makes the additional function <code><a href=
800          "#XML_GetAttributeInfo">XML_GetAttributeInfo</a></code> available for reporting
801          attribute byte offsets.
802        </dd>
803      </dl>
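
      <p>
        Which of these symbols a given Expat library was built with can also be checked
        at run time through <code><a href=
        "#XML_GetFeatureList">XML_GetFeatureList</a></code>. The following is a minimal
        sketch: the helper name <code>print_expat_features</code> is hypothetical, and
        the code assumes a default build in which <code>XML_LChar</code> is
        <code>char</code>, so that feature names can be printed with <code>%s</code>.
      </p>

      <pre class="eg">
#include &lt;stdio.h&gt;
#include &lt;expat.h&gt;

/* Print the build-time features reported by the linked Expat library
   and return how many entries were listed. */
int
print_expat_features(FILE *out) {
  const XML_Feature *f = XML_GetFeatureList();
  int count = 0;

  /* The list is terminated by an entry tagged XML_FEATURE_END. */
  for (; f-&gt;feature != XML_FEATURE_END; f++) {
    fprintf(out, "%s", f-&gt;name);
    if (f-&gt;value != 0) /* only some features carry a numeric value */
      fprintf(out, " = %ld", f-&gt;value);
    fprintf(out, "\n");
    count++;
  }
  return count;
}
</pre>
      <p>
        Calling <code>print_expat_features(stdout)</code> prints one line per reported
        feature; the exact list depends on how the library was configured.
      </p>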
804
805      <hr />
806
807      <h2>
808        <a id="using" name="using">Using Expat</a>
809      </h2>
810
811      <h3>
812        Compiling and Linking Against Expat
813      </h3>
814
815      <p>
816        Unless you installed Expat in a location not expected by your compiler and
817        linker, all you have to do to use Expat in your programs is to include the Expat
818        header (<code>#include &lt;expat.h&gt;</code>) in your files that make calls to
819        it and to tell the linker that it needs to link against the Expat library. On
820        Unix systems, this would usually be done with the <code>-lexpat</code> argument.
821        Otherwise, you'll need to tell the compiler where to look for the Expat header
822        and the linker where to find the Expat library. You may also need to take steps
823        to tell the operating system where to find this library at run time.
824      </p>
825
826      <p>
827        On a Unix-based system, here's what a Makefile might look like when Expat is
828        installed in a standard location:
829      </p>
830
      <pre class="eg">
CC=cc
LDFLAGS=
LIBS= -lexpat
xmlapp: xmlapp.o
        $(CC) $(LDFLAGS) -o xmlapp xmlapp.o $(LIBS)
</pre>
838      <p>
839        If you installed Expat in, say, <code>/home/me/mystuff</code>, then the Makefile
840        would look like this:
841      </p>
842
      <pre class="eg">
CC=cc
CFLAGS= -I/home/me/mystuff/include
LDFLAGS=
LIBS= -L/home/me/mystuff/lib -lexpat
xmlapp: xmlapp.o
	$(CC) $(LDFLAGS) -o xmlapp xmlapp.o $(LIBS)
</pre>
      <p>
        You'd also have to set the environment variable <code>LD_LIBRARY_PATH</code> to
        <code>/home/me/mystuff/lib</code> (or to
        <code>${LD_LIBRARY_PATH}:/home/me/mystuff/lib</code> if
        <code>LD_LIBRARY_PATH</code> already has some directories in it) in order to run
        your application.
      </p>
857
858      <h3>
859        Expat Basics
860      </h3>
861
      <p>
        As we saw in the example in the overview, the first step in parsing an XML
        document with Expat is to create a parser object. There are <a href=
        "#creation">three functions</a> in the Expat API for creating a parser object.
        However, only two of these (<code><a href=
        "#XML_ParserCreate">XML_ParserCreate</a></code> and <code><a href=
        "#XML_ParserCreateNS">XML_ParserCreateNS</a></code>) can be used for constructing
        a parser for a top-level document. The object returned by these functions is an
        opaque pointer (<code>expat.h</code> declares <code>XML_Parser</code> as a
        pointer to an incomplete structure type) to data with further internal
        structure. In order to free the memory associated with this object you must call
        <code><a href="#XML_ParserFree">XML_ParserFree</a></code>. Note that if you have
        provided any <a href="#userdata">user data</a> that gets stored in the parser,
        then your application is responsible for freeing it prior to calling
        <code>XML_ParserFree</code>.
      </p>
877
      <p>
        The objects returned by the parser creation functions are good for parsing only
        one XML document or external parsed entity. If your application needs to parse
        many XML documents, then it needs to create a parser object for each one, or
        prepare a used parser for the next document with <code><a href=
        "#XML_ParserReset">XML_ParserReset</a></code>. The best way to deal with this is
        to create a higher level object that contains all the default initialization you
        want for your parser objects.
      </p>
885
      <p>
        Walking through a document hierarchy with a stream-oriented parser requires a
        good stack mechanism in order to keep track of the current context. For instance,
        answering the simple question, "What element does this text belong to?" requires
        a stack, since the parser may have descended into other elements that are
        children of the current one and has encountered this text on the way out.
      </p>
893
      <p>
        The things you're likely to want to keep on a stack are the currently opened
        element and its attributes. You push this information onto the stack in the
        start handler and you pop it off in the end handler.
      </p>
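
      <p>
        The push and pop steps described above might look like the following minimal
        sketch. The <code>ElementStack</code> type, the <code>MAX_DEPTH</code> limit and
        all function names are assumptions made for this illustration; none of them is
        part of the Expat API. To stay self-contained, the sketch assumes a build in
        which <code>XML_Char</code> is <code>char</code> and leaves out
        <code>expat.h</code>. In a real program the two handlers would be declared with
        <code>XMLCALL</code> and registered with <code>XML_SetElementHandler</code>, and
        a pointer to the stack would be passed with <code>XML_SetUserData</code>.
      </p>

```c
#include <stdlib.h>
#include <string.h>

#define MAX_DEPTH 64   /* arbitrary limit chosen for this sketch */

/* Hypothetical user-data structure: a stack of open element names. */
typedef struct {
  char *names[MAX_DEPTH];
  int depth;      /* number of currently open elements */
  int overflow;   /* set when the document nests deeper than MAX_DEPTH */
} ElementStack;

/* Copy a string into freshly allocated memory. */
static char *
copy_string(const char *s) {
  size_t n = strlen(s) + 1;
  char *copy = malloc(n);
  if (copy != NULL)
    memcpy(copy, s, n);
  return copy;
}

/* Start handler: push the name of the element being opened. */
void
stack_start(void *data, const char *el, const char **attr) {
  ElementStack *st = (ElementStack *)data;
  (void)attr;     /* attributes could be copied and pushed the same way */

  if (st->depth < MAX_DEPTH)
    st->names[st->depth] = copy_string(el);
  else
    st->overflow = 1;
  st->depth++;
}

/* End handler: pop the element being closed. */
void
stack_end(void *data, const char *el) {
  ElementStack *st = (ElementStack *)data;
  (void)el;

  st->depth--;
  if (st->depth < MAX_DEPTH) {
    free(st->names[st->depth]);
    st->names[st->depth] = NULL;
  }
}

/* Name of the innermost open element, or NULL if none is available. */
const char *
stack_current(const ElementStack *st) {
  if (st->depth == 0 || st->depth > MAX_DEPTH)
    return NULL;
  return st->names[st->depth - 1];
}
```

      <p>
        With handlers like these in place, a character data handler can call the
        hypothetical <code>stack_current</code> to find out which element the text it
        receives belongs to.
      </p>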
899
      <p>
        For some tasks, it is sufficient to just keep information on what the depth of
        the stack is (or would be if you had one). The outline program shown above
        presents one example. Another such task would be skipping over a complete
        element. When you see the start tag for the element you want to skip, you set a
        skip flag and record the depth at which the element started. When the end tag
        handler encounters the same depth, the skipped element has ended and the flag may
        be cleared. If you follow the convention that the root element starts at 1, then
        you can use the same variable for skip flag and skip depth.
      </p>
910
      <pre class="eg">
void
init_info(Parseinfo *info) {
  info-&gt;skip = 0;
  info-&gt;depth = 1;
  /* Other initializations here */
}  /* End of init_info */

void XMLCALL
rawstart(void *data, const char *el, const char **attr) {
  Parseinfo *inf = (Parseinfo *) data;

  if (! inf-&gt;skip) {
    if (should_skip(inf, el, attr)) {
      inf-&gt;skip = inf-&gt;depth;
    }
    else
      start(inf, el, attr);     /* This does rest of start handling */
  }

  inf-&gt;depth++;
}  /* End of rawstart */

void XMLCALL
rawend(void *data, const char *el) {
  Parseinfo *inf = (Parseinfo *) data;

  inf-&gt;depth--;

  if (! inf-&gt;skip)
    end(inf, el);              /* This does rest of end handling */

  if (inf-&gt;skip == inf-&gt;depth)
    inf-&gt;skip = 0;
}  /* End rawend */
</pre>
947      <p>
948        Notice in the above example the difference in how depth is manipulated in the
949        start and end handlers. The end tag handler should be the mirror image of the
950        start tag handler. This is necessary to properly model containment. Since, in the
951        start tag handler, we incremented depth <em>after</em> the main body of start tag
952        code, then in the end handler, we need to manipulate it <em>before</em> the main
953        body. If we'd decided to increment it first thing in the start handler, then we'd
954        have had to decrement it last thing in the end handler.
955      </p>
956
957      <h3 id="userdata">
958        Communicating between handlers
959      </h3>
960
      <p>
        In order to be able to pass information between different handlers without using
        globals, you'll need to define a data structure to hold the shared variables. You
        can then tell Expat (with the <code><a href=
        "#XML_SetUserData">XML_SetUserData</a></code> function) to pass a pointer to this
        structure to the handlers. This is the first argument received by most handlers.
        In the <a href="#reference">reference section</a>, an argument to a callback
        function is named <code>userData</code> and has type <code>void *</code> if the
        user data is passed; it will have the type <code>XML_Parser</code> if the parser
        itself is passed. When the parser is passed, the user data may be retrieved using
        <code><a href="#XML_GetUserData">XML_GetUserData</a></code>.
      </p>
973
974      <p>
975        One common case where multiple calls to a single handler may need to communicate
976        using an application data structure is the case when content passed to the
977        character data handler (set by <code><a href=
978        "#XML_SetCharacterDataHandler">XML_SetCharacterDataHandler</a></code>) needs to
979        be accumulated. A common first-time mistake with any of the event-oriented
980        interfaces to an XML parser is to expect all the text contained in an element to
981        be reported by a single call to the character data handler. Expat, like many
982        other XML parsers, reports such data as a sequence of calls; there's no way to
983        know when the end of the sequence is reached until a different callback is made.
984        A buffer referenced by the user data structure proves both an effective and
985        convenient place to accumulate character data.
986      </p>
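
      <p>
        One possible shape for such an accumulation buffer is sketched below. The
        <code>TextBuffer</code> type and the function names are assumptions made for this
        illustration, and the sketch assumes a build in which <code>XML_Char</code> is
        <code>char</code>. It leaves out <code>expat.h</code> to stay self-contained; in
        a real program <code>text_append</code> would be declared with
        <code>XMLCALL</code> and registered with
        <code>XML_SetCharacterDataHandler</code>.
      </p>

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-data structure holding the accumulated text. */
typedef struct {
  char *text;     /* NUL-terminated buffer, or NULL before the first chunk */
  size_t len;     /* bytes stored, not counting the terminator */
  size_t cap;     /* bytes allocated */
  int failed;     /* set when an allocation fails */
} TextBuffer;

/* Character data handler: append one chunk. The chunk is not
   NUL-terminated, so only len bytes of s are read. */
void
text_append(void *data, const char *s, int len) {
  TextBuffer *buf = (TextBuffer *)data;
  size_t need;

  if (buf->failed || len <= 0)
    return;
  need = buf->len + (size_t)len + 1;

  if (need > buf->cap) {
    size_t cap = buf->cap ? buf->cap : 64;
    char *grown;
    while (cap < need)
      cap *= 2;
    grown = realloc(buf->text, cap);
    if (grown == NULL) {
      buf->failed = 1;
      return;
    }
    buf->text = grown;
    buf->cap = cap;
  }
  memcpy(buf->text + buf->len, s, (size_t)len);
  buf->len += (size_t)len;
  buf->text[buf->len] = '\0';
}

/* Empty the buffer but keep its memory for the next run of text. */
void
text_reset(TextBuffer *buf) {
  buf->len = 0;
  if (buf->text != NULL)
    buf->text[0] = '\0';
}
```

      <p>
        The start and end element handlers are natural places to consume the collected
        text and then call the hypothetical <code>text_reset</code>, since a call to
        either of them means the current run of character data has ended.
      </p>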
987      <!-- XXX example needed here -->
988
989      <h3>
990        XML Version
991      </h3>
992
993      <p>
994        Expat is an XML 1.0 parser, and as such never complains based on the value of the
995        <code>version</code> pseudo-attribute in the XML declaration, if present.
996      </p>
997
998      <p>
999        If an application needs to check the version number (to support alternate
1000        processing), it should use the <code><a href=
1001        "#XML_SetXmlDeclHandler">XML_SetXmlDeclHandler</a></code> function to set a
1002        handler that uses the information in the XML declaration to determine what to do.
1003        This example shows how to check that only a version number of <code>"1.0"</code>
1004        is accepted:
1005      </p>
1006
      <pre class="eg">
static int wrong_version;
static XML_Parser parser;

static void XMLCALL
xmldecl_handler(void            *userData,
                const XML_Char  *version,
                const XML_Char  *encoding,
                int              standalone)
{
  static const XML_Char Version_1_0[] = {'1', '.', '0', 0};

  size_t i;

  /* The terminating 0 is compared too, so only an exact match passes. */
  for (i = 0; i &lt; (sizeof(Version_1_0) / sizeof(Version_1_0[0])); ++i) {
    if (version[i] != Version_1_0[i]) {
      wrong_version = 1;
      /* also clear all other handlers: */
      XML_SetCharacterDataHandler(parser, NULL);
      ...
      return;
    }
  }
  ...
}
</pre>
1033      <h3>
1034        Namespace Processing
1035      </h3>
1036
      <p>
        When the parser is created using the <code><a href=
        "#XML_ParserCreateNS">XML_ParserCreateNS</a></code> function, Expat performs
        namespace processing. Under namespace processing, Expat consumes
        <code>xmlns</code> and <code>xmlns:...</code> attributes, which declare
        namespaces for the scope of the element in which they occur. This means that your
        start handler will not see these attributes. Your application can still be
        informed of these declarations by setting namespace declaration handlers with
        <a href=
        "#XML_SetNamespaceDeclHandler"><code>XML_SetNamespaceDeclHandler</code></a>.
      </p>
1048
1049      <p>
1050        Element type and attribute names that belong to a given namespace are passed to
1051        the appropriate handler in expanded form. By default this expanded form is a
1052        concatenation of the namespace URI, the separator character (which is the 2nd
1053        argument to <code><a href="#XML_ParserCreateNS">XML_ParserCreateNS</a></code>),
1054        and the local name (i.e. the part after the colon). Names with undeclared
1055        prefixes are not well-formed when namespace processing is enabled, and will
1056        trigger an error. Unprefixed attribute names are never expanded, and unprefixed
1057        element names are only expanded when they are in the scope of a default
1058        namespace.
1059      </p>
1060
      <p>
        However, if <code><a href=
        "#XML_SetReturnNSTriplet">XML_SetReturnNSTriplet</a></code> has been called with
        a non-zero <code>do_nst</code> parameter, then the expanded form for names with
        an explicit prefix is a concatenation of: URI, separator, local name, separator,
        prefix.
      </p>
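
      <p>
        A handler that needs the individual parts back can split the expanded name at the
        separator. The following is a minimal sketch; the <code>NameParts</code> type and
        the <code>split_expanded_name</code> function are assumptions made for this
        illustration, not part of Expat. It assumes that <code>XML_Char</code> is
        <code>char</code>, that the separator is not the null character, and that the
        separator cannot occur inside a namespace URI.
      </p>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: pointers into the expanded name plus lengths. */
typedef struct {
  const char *uri;     /* NULL if the name is not in a namespace */
  size_t uri_len;
  const char *local;   /* always set */
  size_t local_len;
  const char *prefix;  /* NULL unless a triplet was reported */
  size_t prefix_len;
} NameParts;

/* Split "uri SEP local" or "uri SEP local SEP prefix". A name without
   a separator is an unexpanded name and is returned as the local part. */
void
split_expanded_name(const char *name, char sep, NameParts *out) {
  const char *first = (sep != '\0') ? strchr(name, sep) : NULL;
  const char *second;

  out->uri = NULL;
  out->uri_len = 0;
  out->prefix = NULL;
  out->prefix_len = 0;

  if (first == NULL) {
    out->local = name;
    out->local_len = strlen(name);
    return;
  }

  out->uri = name;
  out->uri_len = (size_t)(first - name);
  out->local = first + 1;

  second = strchr(out->local, sep);
  if (second == NULL) {
    out->local_len = strlen(out->local);
    return;
  }

  out->local_len = (size_t)(second - out->local);
  out->prefix = second + 1;
  out->prefix_len = strlen(out->prefix);
}
```

      <p>
        For example, with <code>'|'</code> as the separator, the name
        <code>http://example.org/ns|item|ex</code> splits into the URI
        <code>http://example.org/ns</code>, the local name <code>item</code> and the
        prefix <code>ex</code>. Only the last part is NUL-terminated, so the lengths must
        be used for the other two.
      </p>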
1068
1069      <p>
1070        You can set handlers for the start of a namespace declaration and for the end of
1071        a scope of a declaration with the <code><a href=
1072        "#XML_SetNamespaceDeclHandler">XML_SetNamespaceDeclHandler</a></code> function.
1073        The StartNamespaceDeclHandler is called prior to the start tag handler and the
1074        EndNamespaceDeclHandler is called after the corresponding end tag that ends the
1075        namespace's scope. The namespace start handler gets passed the prefix and URI for
1076        the namespace. For a default namespace declaration (xmlns='...'), the prefix will
1077        be <code>NULL</code>. The URI will be <code>NULL</code> for the case where the
1078        default namespace is being unset. The namespace end handler just gets the prefix
1079        for the closing scope.
1080      </p>
1081
1082      <p>
1083        These handlers are called for each declaration. So if, for instance, a start tag
1084        had three namespace declarations, then the StartNamespaceDeclHandler would be
1085        called three times before the start tag handler is called, once for each
1086        declaration.
1087      </p>
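
      <p>
        An application that wants to know which prefixes are currently in scope can
        record each declaration as it is reported. The sketch below keeps the bindings in
        a small stack; <code>BindingStack</code>, <code>MAX_BINDINGS</code> and the
        function names are assumptions made for this illustration. It assumes that
        <code>XML_Char</code> is <code>char</code> and leaves out <code>expat.h</code>
        to stay self-contained. In a real program <code>ns_start</code> and
        <code>ns_end</code> would be declared with <code>XMLCALL</code> and registered
        with <code>XML_SetNamespaceDeclHandler</code>.
      </p>

```c
#include <stdlib.h>
#include <string.h>

#define MAX_BINDINGS 32   /* arbitrary limit chosen for this sketch */

typedef struct {
  char *prefix;           /* NULL for the default namespace */
  char *uri;              /* NULL when the default namespace is unset */
} Binding;

/* Hypothetical user-data structure: the innermost binding is last. */
typedef struct {
  Binding items[MAX_BINDINGS];
  int count;
} BindingStack;

/* Duplicate a string; NULL stays NULL. */
static char *
copy_or_null(const char *s) {
  char *copy = NULL;
  if (s != NULL) {
    size_t n = strlen(s) + 1;
    copy = malloc(n);
    if (copy != NULL)
      memcpy(copy, s, n);
  }
  return copy;
}

/* Two prefixes match if both are NULL or both hold the same string. */
static int
same_prefix(const char *a, const char *b) {
  if (a == NULL || b == NULL)
    return a == b;
  return strcmp(a, b) == 0;
}

/* Start handler: called once per declaration. */
void
ns_start(void *data, const char *prefix, const char *uri) {
  BindingStack *st = (BindingStack *)data;
  Binding b;

  if (st->count == MAX_BINDINGS)
    return;               /* a real program would report this */
  b.prefix = copy_or_null(prefix);
  b.uri = copy_or_null(uri);
  if ((prefix != NULL && b.prefix == NULL) || (uri != NULL && b.uri == NULL)) {
    free(b.prefix);       /* out of memory: drop the binding */
    free(b.uri);
    return;
  }
  st->items[st->count++] = b;
}

/* End handler: remove the innermost binding recorded for this prefix. */
void
ns_end(void *data, const char *prefix) {
  BindingStack *st = (BindingStack *)data;
  int i;

  for (i = st->count - 1; i >= 0; i--) {
    if (same_prefix(st->items[i].prefix, prefix)) {
      free(st->items[i].prefix);
      free(st->items[i].uri);
      memmove(&st->items[i], &st->items[i + 1],
              (size_t)(st->count - 1 - i) * sizeof(Binding));
      st->count--;
      return;
    }
  }
}

/* URI currently bound to prefix, or NULL if there is none. */
const char *
ns_lookup(const BindingStack *st, const char *prefix) {
  int i;
  for (i = st->count - 1; i >= 0; i--)
    if (same_prefix(st->items[i].prefix, prefix))
      return st->items[i].uri;
  return NULL;
}
```

      <p>
        The end handler searches by prefix instead of blindly popping the top entry, so
        the sketch does not depend on the order in which the end calls for one element
        arrive.
      </p>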
1088
1089      <h3>
1090        Character Encodings
1091      </h3>
1092
      <p>
        While XML is based on Unicode, and every XML processor is required to recognize
        UTF-8 and UTF-16 (encodings of Unicode built from 8-bit and 16-bit code units,
        respectively), other encodings may be declared in XML documents or entities. For
        the main document, an XML declaration may contain an encoding declaration:
      </p>
1099
      <pre>
&lt;?xml version="1.0" encoding="ISO-8859-2"?&gt;
</pre>
1103      <p>
1104        External parsed entities may begin with a text declaration, which looks like an
1105        XML declaration with just an encoding declaration:
1106      </p>
1107
      <pre>
&lt;?xml encoding="Big5"?&gt;
</pre>
      <p>
        With Expat, you may also specify an encoding at the time of creating a parser.
        This is useful when the encoding information may come from a source outside the
        document itself (such as a higher-level protocol).
      </p>
1116
1117      <p>
1118        <a id="builtin_encodings" name="builtin_encodings"></a>There are four built-in
1119        encodings in Expat:
1120      </p>
1121
1122      <ul>
1123        <li>UTF-8
1124        </li>
1125
1126        <li>UTF-16
1127        </li>
1128
1129        <li>ISO-8859-1
1130        </li>
1131
1132        <li>US-ASCII
1133        </li>
1134      </ul>
1135
      <p>
        Anything else discovered in an encoding declaration, or in the protocol encoding
        specified in the parser constructor, triggers a call to the
        <code>UnknownEncodingHandler</code>. This handler gets passed the encoding name
        and a pointer to an <code>XML_Encoding</code> data structure. Your handler must
        fill in this structure and return <code>XML_STATUS_OK</code> if it knows how to
        deal with the encoding. Otherwise the handler should return
        <code>XML_STATUS_ERROR</code>. The handler also gets passed a pointer to an
        optional application data structure that you may specify when you set the
        handler.
      </p>
1147
      <p>
        Expat places the following restrictions on character encodings that it can
        support by filling in the <code>XML_Encoding</code> structure:
      </p>
1152
      <ol>
        <li>Every ASCII character that can appear in a well-formed XML document must be
        represented by a single byte, and that byte must correspond to its ASCII
        encoding (except for the characters $@\^'{}~).
        </li>

        <li>Characters must be encoded in 4 bytes or fewer.
        </li>

        <li>All characters encoded must have Unicode scalar values less than or equal to
        65535 (0xFFFF). <em>This does not apply to the built-in support for UTF-16 and
        UTF-8.</em>
        </li>

        <li>No character may be encoded by more than one distinct sequence of bytes.
        </li>
      </ol>
1170
1171      <p>
1172        <code>XML_Encoding</code> contains an array of integers that correspond to the
1173        1st byte of an encoding sequence. If the value in the array for a byte is zero or
1174        positive, then the byte is a single byte encoding that encodes the Unicode scalar
1175        value contained in the array. A -1 in this array indicates a malformed byte. If
1176        the value is -2, -3, or -4, then the byte is the beginning of a 2, 3, or 4 byte
1177        sequence respectively. Multi-byte sequences are sent to the convert function
1178        pointed at in the <code>XML_Encoding</code> structure. This function should
1179        return the Unicode scalar value for the sequence or -1 if the sequence is
1180        malformed.
1181      </p>
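
      <p>
        For a single-byte encoding, only the array needs to be filled in. As an
        illustration, the sketch below builds the table for ISO-8859-15, which matches
        ISO-8859-1 except in eight positions. The function name
        <code>fill_latin9_map</code> is an assumption made for this example, and the
        sketch works on a plain <code>int[256]</code> so that it does not depend on
        <code>expat.h</code>.
      </p>

```c
/* Fill a 256-entry byte-to-code-point table for ISO-8859-15.
   Every byte is a complete character, so no entry is negative. */
void
fill_latin9_map(int map[256]) {
  int i;

  for (i = 0; i < 256; i++)
    map[i] = i;          /* same code points as ISO-8859-1 */

  /* The eight positions where ISO-8859-15 differs from ISO-8859-1. */
  map[0xA4] = 0x20AC;    /* EURO SIGN */
  map[0xA6] = 0x0160;    /* LATIN CAPITAL LETTER S WITH CARON */
  map[0xA8] = 0x0161;    /* LATIN SMALL LETTER S WITH CARON */
  map[0xB4] = 0x017D;    /* LATIN CAPITAL LETTER Z WITH CARON */
  map[0xB8] = 0x017E;    /* LATIN SMALL LETTER Z WITH CARON */
  map[0xBC] = 0x0152;    /* LATIN CAPITAL LIGATURE OE */
  map[0xBD] = 0x0153;    /* LATIN SMALL LIGATURE OE */
  map[0xBE] = 0x0178;    /* LATIN CAPITAL LETTER Y WITH DIAERESIS */
}
```

      <p>
        In an unknown encoding handler, a table like this would be written into the
        <code>map</code> member of the <code>XML_Encoding</code> structure. Since no
        multi-byte sequences occur, the <code>convert</code> and <code>release</code>
        members can be set to <code>NULL</code> before the handler returns
        <code>XML_STATUS_OK</code>.
      </p>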
1182
1183      <p>
1184        One pitfall that novice Expat users are likely to fall into is that although
1185        Expat may accept input in various encodings, the strings that it passes to the
1186        handlers are always encoded in UTF-8 or UTF-16 (depending on how Expat was
1187        compiled). Your application is responsible for any translation of these strings
1188        into other encodings.
1189      </p>
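
      <p>
        As an example of such a translation, the following sketch converts a UTF-8 string
        of the kind delivered by a build in which <code>XML_Char</code> is
        <code>char</code> to ISO-8859-1. The function name <code>utf8_to_latin1</code>
        and the choice of <code>'?'</code> as the replacement for characters that
        ISO-8859-1 cannot represent are assumptions made for this illustration.
      </p>

```c
#include <stddef.h>

/* Convert NUL-terminated UTF-8 to ISO-8859-1. Characters above U+00FF
   are replaced by '?'. out must have room for strlen(in) + 1 bytes.
   Returns the number of bytes written, not counting the terminator. */
size_t
utf8_to_latin1(const char *in, char *out) {
  const unsigned char *s = (const unsigned char *)in;
  size_t n = 0;

  while (*s != 0) {
    if (*s < 0x80) {
      /* One byte: plain ASCII. */
      out[n++] = (char)*s++;
    } else if ((*s & 0xFE) == 0xC2 && (s[1] & 0xC0) == 0x80) {
      /* Lead byte 0xC2 or 0xC3 plus one continuation byte:
         U+0080 through U+00FF. */
      out[n++] = (char)(((*s & 0x03) << 6) | (s[1] & 0x3F));
      s += 2;
    } else {
      /* Anything else has no ISO-8859-1 equivalent. */
      out[n++] = '?';
      s++;
      while ((*s & 0xC0) == 0x80)   /* skip its continuation bytes */
        s++;
    }
  }
  out[n] = '\0';
  return n;
}
```

      <p>
        The output is never longer than the input, which is why a buffer of the input's
        size is sufficient here. Strings passed to the character data handler are not
        NUL-terminated, so a variant taking an explicit length would be needed there.
      </p>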
1190
1191      <h3>
1192        Handling External Entity References
1193      </h3>
1194
      <p>
        Expat does not read or parse external entities directly. Note that any external
        DTD is a special case of an external entity. If you've set no
        <code>ExternalEntityRefHandler</code>, then external entity references are
        silently ignored. Otherwise, Expat calls your handler with the information
        needed to read and parse the external entity.
      </p>
1202
      <p>
        Your handler isn't actually responsible for parsing the entity, but it is
        responsible for creating a subsidiary parser with <code><a href=
        "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code> that
        will do the job. This returns an instance of <code>XML_Parser</code> that has
        handlers and other data structures initialized from the parent parser. You may
        then use <code><a href="#XML_Parse">XML_Parse</a></code> or <code><a href=
        "#XML_ParseBuffer">XML_ParseBuffer</a></code> calls against this parser. Since
        external entities may refer to other external entities, your handler should be
        prepared to be called recursively.
      </p>
1214
1215      <h3>
1216        Parsing DTDs
1217      </h3>
1218
1219      <p>
1220        In order to parse parameter entities, before starting the parse, you must call
1221        <code><a href="#XML_SetParamEntityParsing">XML_SetParamEntityParsing</a></code>
1222        with one of the following arguments:
1223      </p>
1224
1225      <dl>
1226        <dt>
1227          <code>XML_PARAM_ENTITY_PARSING_NEVER</code>
1228        </dt>
1229
1230        <dd>
1231          Don't parse parameter entities or the external subset
1232        </dd>
1233
1234        <dt>
1235          <code>XML_PARAM_ENTITY_PARSING_UNLESS_STANDALONE</code>
1236        </dt>
1237
1238        <dd>
1239          Parse parameter entities and the external subset unless <code>standalone</code>
1240          was set to "yes" in the XML declaration.
1241        </dd>
1242
1243        <dt>
1244          <code>XML_PARAM_ENTITY_PARSING_ALWAYS</code>
1245        </dt>
1246
1247        <dd>
1248          Always parse parameter entities and the external subset
1249        </dd>
1250      </dl>
1251
1252      <p>
1253        In order to read an external DTD, you also have to set an external entity
1254        reference handler as described above.
1255      </p>
1256
1257      <h3 id="stop-resume">
1258        Temporarily Stopping Parsing
1259      </h3>
1260
      <p>
        Expat 1.95.8 introduced a new feature: it is now possible to stop parsing
        temporarily from within a handler function, even if more data has already been
        passed into the parser. Applications for this include:
      </p>
1266
1267      <ul>
1268        <li>Supporting the <a href="https://www.w3.org/TR/xinclude/">XInclude</a>
1269        specification.
1270        </li>
1271
1272        <li>Delaying further processing until additional information is available from
1273        some other source.
1274        </li>
1275
1276        <li>Adjusting processor load as task priorities shift within an application.
1277        </li>
1278
1279        <li>Stopping parsing completely (simply free or reset the parser instead of
1280        resuming in the outer parsing loop). This can be useful if an application-domain
1281        error is found in the XML being parsed or if the result of the parse is
1282        determined not to be useful after all.
1283        </li>
1284      </ul>
1285
1286      <p>
1287        To take advantage of this feature, the main parsing loop of an application needs
1288        to support this specifically. It cannot be supported with a parsing loop
1289        compatible with Expat 1.95.7 or earlier (though existing loops will continue to
1290        work without supporting the stop/resume feature).
1291      </p>
1292
1293      <p>
1294        An application that uses this feature for a single parser will have the rough
1295        structure (in pseudo-code):
1296      </p>
1297
      <pre class="pseudocode">
fd = open_input()
p = create_parser()

if parse_xml(p, fd) {
  /* suspended */

  int suspended = 1;

  while (suspended) {
    do_something_else()
    if ready_to_resume() {
      suspended = continue_parsing(p, fd);
    }
  }
}
</pre>
1315      <p>
1316        An application that may resume any of several parsers based on input (either from
1317        the XML being parsed or some other source) will certainly have more interesting
1318        control structures.
1319      </p>
1320
1321      <p>
1322        This C function could be used for the <code>parse_xml</code> function mentioned
1323        in the pseudo-code above:
1324      </p>
1325
      <pre class="eg">
#include &lt;unistd.h&gt;   /* read() */
#include &lt;expat.h&gt;

#define BUFF_SIZE 10240

/* Parse a document from the open file descriptor 'fd' until the parse
   is complete (the document has been completely parsed, or there's
   been an error), or the parse is stopped.  Return non-zero when
   the parse is merely suspended.
*/
int
parse_xml(XML_Parser p, int fd)
{
  for (;;) {
    int bytes_read;
    enum XML_Status status;

    void *buff = XML_GetBuffer(p, BUFF_SIZE);
    if (buff == NULL) {
      /* handle error... */
      return 0;
    }
    bytes_read = read(fd, buff, BUFF_SIZE);
    if (bytes_read &lt; 0) {
      /* handle error... */
      return 0;
    }
    /* A read of zero bytes means end of input: make the final call. */
    status = XML_ParseBuffer(p, bytes_read, bytes_read == 0);
    switch (status) {
      case XML_STATUS_ERROR:
        /* handle error... */
        return 0;
      case XML_STATUS_SUSPENDED:
        return 1;
      case XML_STATUS_OK:
        break;
    }
    if (bytes_read == 0)
      return 0;
  }
}
</pre>
      <p>
        The corresponding <code>continue_parsing</code> function is somewhat simpler,
        since it only needs to deal with the return code from <code><a href=
        "#XML_ResumeParser">XML_ResumeParser</a></code>; it can delegate the input
        handling to the <code>parse_xml</code> function:
      </p>
1371
      <pre class="eg">
/* Continue parsing a document which had been suspended.  The 'p' and
   'fd' arguments are the same as passed to parse_xml().  Return
   non-zero when the parse is suspended.
*/
int
continue_parsing(XML_Parser p, int fd)
{
  enum XML_Status status = XML_ResumeParser(p);
  switch (status) {
    case XML_STATUS_ERROR:
      if (XML_GetErrorCode(p) == XML_ERROR_NOT_SUSPENDED) {
        /* the parser was not suspended: handle error... */
      } else {
        /* handle parse error... */
      }
      return 0;
    case XML_STATUS_SUSPENDED:
      return 1;
    case XML_STATUS_OK:
      break;
  }
  return parse_xml(p, fd);
}
</pre>
1394      <p>
1395        Now that we've seen what a mess the top-level parsing loop can become, what have
1396        we gained? Very simply, we can now use the <code><a href=
1397        "#XML_StopParser">XML_StopParser</a></code> function to stop parsing, without
1398        having to go to great lengths to avoid additional processing that we're expecting
1399        to ignore. As a bonus, we get to stop parsing <em>temporarily</em>, and come back
1400        to it when we're ready.
1401      </p>
1402
      <p>
        To stop parsing from a handler function, use the <code><a href=
        "#XML_StopParser">XML_StopParser</a></code> function. This function takes two
        arguments: the parser being stopped and a flag indicating whether the parse can
        be resumed in the future.
      </p>
1409      <!-- XXX really need more here -->
1410
1411      <hr />
1412      <!-- ================================================================ -->
1413
1414      <h2>
1415        <a id="reference" name="reference">Expat Reference</a>
1416      </h2>
1417
1418      <h3>
1419        <a id="creation" name="creation">Parser Creation</a>
1420      </h3>
1421
1422      <h4 id="XML_ParserCreate">
1423        XML_ParserCreate
1424      </h4>
1425
      <pre class="fcndec">
XML_Parser XMLCALL
XML_ParserCreate(const XML_Char *encoding);
</pre>
1430      <div class="fcndef">
1431        <p>
1432          Construct a new parser. If encoding is non-<code>NULL</code>, it specifies a
1433          character encoding to use for the document. This overrides the document
1434          encoding declaration. There are four built-in encodings:
1435        </p>
1436
1437        <ul>
1438          <li>US-ASCII
1439          </li>
1440
1441          <li>UTF-8
1442          </li>
1443
1444          <li>UTF-16
1445          </li>
1446
1447          <li>ISO-8859-1
1448          </li>
1449        </ul>
1450
1451        <p>
1452          Any other value will invoke a call to the UnknownEncodingHandler.
1453        </p>
1454      </div>
1455
1456      <h4 id="XML_ParserCreateNS">
1457        XML_ParserCreateNS
1458      </h4>
1459
      <pre class="fcndec">
XML_Parser XMLCALL
XML_ParserCreateNS(const XML_Char *encoding,
                   XML_Char sep);
</pre>
      <div class="fcndef">
        Constructs a new parser that has namespace processing in effect.
        Namespace-expanded element names and attribute names are returned as a
        concatenation of the namespace URI, <em>sep</em>, and the local part of the name.
        This means that you should pick a character for <em>sep</em> that can't be part
        of a URI. Since Expat does not check namespace URIs for conformance, the only
        safe choice for a namespace separator is a character that is illegal in XML. For
        instance, <code>'\xFF'</code> is not legal in UTF-8, and <code>'\xFFFF'</code> is
        not legal in UTF-16. There is a special case when <em>sep</em> is the null
        character <code>'\0'</code>: the namespace URI and the local part will be
        concatenated without any separator; this is intended to support RDF processors.
        It is a programming error to use the null separator with <a href=
        "#XML_SetReturnNSTriplet">namespace triplets</a>.
      </div>
1479
1480      <p>
1481        <strong>Note:</strong> Expat does not validate namespace URIs (beyond encoding)
1482        against RFC 3986 today (and is not required to do so with regard to the XML 1.0
1483        namespaces specification) but it may start doing that in future releases. Before
1484        that, an application using Expat must be ready to receive namespace URIs
1485        containing non-URI characters.
1486      </p>
1487
1488      <h4 id="XML_ParserCreate_MM">
1489        XML_ParserCreate_MM
1490      </h4>
1491
      <pre class="fcndec">
XML_Parser XMLCALL
XML_ParserCreate_MM(const XML_Char *encoding,
                    const XML_Memory_Handling_Suite *ms,
                    const XML_Char *sep);
</pre>

      <pre class="signature">
typedef struct {
  void *(XMLCALL *malloc_fcn)(size_t size);
  void *(XMLCALL *realloc_fcn)(void *ptr, size_t size);
  void (XMLCALL *free_fcn)(void *ptr);
} XML_Memory_Handling_Suite;
</pre>
      <div class="fcndef">
        <p>
          Construct a new parser using the suite of memory handling functions specified
          in <code>ms</code>. If <code>ms</code> is <code>NULL</code>, then use the
          standard set of memory management functions. If <code>sep</code> is
          non-<code>NULL</code>, then namespace processing is enabled in the created
          parser and the character pointed at by <code>sep</code> is used as the
          separator between the namespace URI and the local part of the name.
        </p>
      </div>
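
      <p>
        A custom suite can be as simple as a set of wrappers around the standard
        allocator. The sketch below counts the blocks that are currently allocated, which
        can help with leak checks in tests. The names <code>live_blocks</code>,
        <code>counting_malloc</code>, <code>counting_realloc</code> and
        <code>counting_free</code> are assumptions made for this illustration, and
        <code>expat.h</code> is left out so that the sketch stays self-contained.
      </p>

```c
#include <stdlib.h>

/* Number of blocks handed out and not yet freed. */
static long live_blocks = 0;

void *
counting_malloc(size_t size) {
  void *p = malloc(size);
  if (p != NULL)
    live_blocks++;
  return p;
}

void *
counting_realloc(void *ptr, size_t size) {
  void *p = realloc(ptr, size);
  /* With a NULL pointer, realloc behaves like malloc and creates a
     new block. This sketch assumes size is never zero. */
  if (ptr == NULL && p != NULL)
    live_blocks++;
  return p;
}

void
counting_free(void *ptr) {
  if (ptr != NULL)
    live_blocks--;
  free(ptr);
}
```

      <p>
        To use wrappers like these with Expat, they would be declared with
        <code>XMLCALL</code>, stored in the <code>malloc_fcn</code>,
        <code>realloc_fcn</code> and <code>free_fcn</code> members of an
        <code>XML_Memory_Handling_Suite</code>, and a pointer to that structure would be
        passed as the <code>ms</code> argument. After the parser has been freed, the
        counter would be expected to return to its earlier value.
      </p>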
1516
1517      <h4 id="XML_ExternalEntityParserCreate">
1518        XML_ExternalEntityParserCreate
1519      </h4>
1520
      <pre class="fcndec">
XML_Parser XMLCALL
XML_ExternalEntityParserCreate(XML_Parser p,
                               const XML_Char *context,
                               const XML_Char *encoding);
</pre>
1527      <div class="fcndef">
        <p>
          Construct a new <code>XML_Parser</code> object for parsing an external general
          entity. <code>context</code> is the context argument passed in a call to an
          ExternalEntityRefHandler. Other state information such as handlers, user data
          and namespace processing is inherited from the parser passed as the 1st
          argument, so you shouldn't need to call any of the behavior changing functions
          on this parser (unless you want it to act differently than the parent parser).
        </p>
1536
1537        <p>
1538          <strong>Note:</strong> Please be sure to free subparsers created by
1539          <code><a href=
1540          "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code>
1541          <em>prior to</em> freeing their related parent parser, as subparsers reference
1542          and use parts of their respective parent parser, internally. Parent parsers
1543          must outlive subparsers.
1544        </p>
1545      </div>
1546
1547      <h4 id="XML_ParserFree">
1548        XML_ParserFree
1549      </h4>
1550
      <pre class="fcndec">
void XMLCALL
XML_ParserFree(XML_Parser p);
</pre>
1555      <div class="fcndef">
1556        <p>
1557          Free memory used by the parser.
1558        </p>
1559
1560        <p>
1561          <strong>Note:</strong> Your application is responsible for freeing any memory
1562          associated with <a href="#userdata">user data</a>.
1563        </p>
1564
1565        <p>
1566          <strong>Note:</strong> Please be sure to free subparsers created by
1567          <code><a href=
1568          "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code>
1569          <em>prior to</em> freeing their related parent parser, as subparsers reference
1570          and use parts of their respective parent parser, internally. Parent parsers
1571          must outlive subparsers.
1572        </p>
1573      </div>
1574
1575      <h4 id="XML_ParserReset">
1576        XML_ParserReset
1577      </h4>
1578
      <pre class="fcndec">
XML_Bool XMLCALL
XML_ParserReset(XML_Parser p,
                const XML_Char *encoding);
</pre>
      <div class="fcndef">
        Clean up the memory structures maintained by the parser so that it may be used
        again. After this has been called, <code>p</code> is ready to start parsing a new
        document. All handlers are cleared from the parser, except for the
        unknownEncodingHandler. The parser's external state is re-initialized except for
        the values of ns and ns_triplets. This function may not be used on a parser
        created using <code><a href=
        "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code>; it
        will return <code>XML_FALSE</code> in that case. Returns <code>XML_TRUE</code> on
        success. Your application is responsible for dealing with any memory associated
        with <a href="#userdata">user data</a>.
      </div>
1596
1597      <h3>
1598        <a id="parsing" name="parsing">Parsing</a>
1599      </h3>
1600
1601      <p>
1602        To state the obvious: the three parsing functions <code><a href=
1603        "#XML_Parse">XML_Parse</a></code>, <code><a href=
1604        "#XML_ParseBuffer">XML_ParseBuffer</a></code> and <code><a href=
1605        "#XML_GetBuffer">XML_GetBuffer</a></code> as well as the two cleanup functions
1606        <code><a href="#XML_ParserFree">XML_ParserFree</a></code> and <code><a href=
1607        "#XML_ParserReset">XML_ParserReset</a></code> must not be called from within a
1608        handler unless they operate on a separate parser instance, that is, one that did
1609        not call the handler. For example, it is OK to call the parsing functions from
1610        within an <code>XML_ExternalEntityRefHandler</code>, if they apply to the parser
1611        created by <code><a href=
1612        "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code>.
1613      </p>
1614
1615      <p>
1616        Note: The <code>len</code> argument passed to these functions should be
1617        considerably less than the maximum value for an integer, as it could create an
1618        integer overflow situation if the added lengths of a buffer and the unprocessed
1619        portion of the previous buffer exceed the maximum integer value. Input data at
1620        the end of a buffer will remain unprocessed if it is part of an XML token for
1621        which the end is not part of that buffer.
1622      </p>
1623
1624      <p>
1625        <a id="isFinal" name="isFinal"></a>The application <em>must</em> make a
1626        concluding <code><a href="#XML_Parse">XML_Parse</a></code> or <code><a href=
1627        "#XML_ParseBuffer">XML_ParseBuffer</a></code> call with <code>isFinal</code> set
1628        to <code>XML_TRUE</code>.
1629      </p>
1630
1631      <h4 id="XML_Parse">
1632        XML_Parse
1633      </h4>
1634
1635      <pre class="fcndec">
1636enum XML_Status XMLCALL
1637XML_Parse(XML_Parser p,
1638          const char *s,
1639          int len,
1640          int isFinal);
1641</pre>
1642
1643      <pre class="signature">
1644enum XML_Status {
1645  XML_STATUS_ERROR = 0,
1646  XML_STATUS_OK = 1
1647};
1648</pre>
1649      <div class="fcndef">
1650        <p>
          Parse some more of the document. The string <code>s</code> is a buffer
          containing part (or perhaps all) of the document. The number of bytes of
          <code>s</code> that are part of the document is indicated by <code>len</code>.
          This means that <code>s</code> doesn't have to be null-terminated. It also
          means that if <code>len</code> is larger than the number of bytes in the block
          of memory that <code>s</code> points at, then a memory fault is likely.
          Negative values for <code>len</code> are rejected since Expat 2.2.1. The
          <code>isFinal</code> parameter informs the parser that this is the last piece
          of the document. Frequently, the last piece is empty (i.e., <code>len</code> is
          zero).
1660        </p>
1661
1662        <p>
          If a parse error occurred, it returns <code>XML_STATUS_ERROR</code>. Otherwise
          it returns <code>XML_STATUS_OK</code>. Note that regardless of the return
          value, there is no guarantee that all provided input has been parsed; only
          after <a href="#isFinal">the concluding call</a> will all handler callbacks and
          parsing errors have happened.
1668        </p>
1669
1670        <p>
          Simplified, <code>XML_Parse</code> can be considered a convenience wrapper that
          pairs calls to <code><a href="#XML_GetBuffer">XML_GetBuffer</a></code> and
1673          <code><a href="#XML_ParseBuffer">XML_ParseBuffer</a></code> (when Expat is
1674          built with macro <code>XML_CONTEXT_BYTES</code> defined to a positive value,
1675          which is both common and default). <code>XML_Parse</code> is then functionally
1676          equivalent to calling <code><a href="#XML_GetBuffer">XML_GetBuffer</a></code>,
1677          <code>memcpy</code>, and <code><a href=
1678          "#XML_ParseBuffer">XML_ParseBuffer</a></code>.
1679        </p>
1680
1681        <p>
1682          To avoid double copying of the input, direct use of functions <code><a href=
1683          "#XML_GetBuffer">XML_GetBuffer</a></code> and <code><a href=
1684          "#XML_ParseBuffer">XML_ParseBuffer</a></code> is advised for most production
1685          use, e.g. if you're using <code>read</code> or similar functionality to fill
1686          your buffers, fill directly into the buffer from <code><a href=
1687          "#XML_GetBuffer">XML_GetBuffer</a></code>, then parse with <code><a href=
1688          "#XML_ParseBuffer">XML_ParseBuffer</a></code>.
1689        </p>
1690      </div>
1691
1692      <h4 id="XML_ParseBuffer">
1693        XML_ParseBuffer
1694      </h4>
1695
1696      <pre class="fcndec">
1697enum XML_Status XMLCALL
1698XML_ParseBuffer(XML_Parser p,
1699                int len,
1700                int isFinal);
1701</pre>
1702      <div class="fcndef">
1703        <p>
1704          This is just like <code><a href="#XML_Parse">XML_Parse</a></code>, except in
1705          this case Expat provides the buffer. By obtaining the buffer from Expat with
1706          the <code><a href="#XML_GetBuffer">XML_GetBuffer</a></code> function, the
1707          application can avoid double copying of the input.
1708        </p>
1709
1710        <p>
1711          Negative values for <code>len</code> are rejected since Expat 2.6.3.
1712        </p>
1713      </div>
1714
1715      <h4 id="XML_GetBuffer">
1716        XML_GetBuffer
1717      </h4>
1718
1719      <pre class="fcndec">
1720void * XMLCALL
1721XML_GetBuffer(XML_Parser p,
1722              int len);
1723</pre>
1724      <div class="fcndef">
1725        Obtain a buffer of size <code>len</code> to read a piece of the document into. A
1726        <code>NULL</code> value is returned if Expat can't allocate enough memory for
1727        this buffer. A <code>NULL</code> value may also be returned if <code>len</code>
1728        is zero. This has to be called prior to every call to <code><a href=
1729        "#XML_ParseBuffer">XML_ParseBuffer</a></code>. A typical use would look like
1730        this:
1731
        <pre class="eg">
for (;;) {
  int bytes_read;
  /* Ask Expat for a buffer so input can be read into it directly. */
  void *buff = XML_GetBuffer(p, BUFF_SIZE);
  if (buff == NULL) {
    /* handle error (e.g. out of memory) */
    break;
  }

  bytes_read = read(docfd, buff, BUFF_SIZE);
  if (bytes_read &lt; 0) {
    /* handle read error */
    break;
  }

  /* A read of zero bytes marks the end of input, so it is the final call. */
  if (! XML_ParseBuffer(p, bytes_read, bytes_read == 0)) {
    /* handle parse error */
    break;
  }

  if (bytes_read == 0)
    break;
}
</pre>
1753      </div>
1754
1755      <h4 id="XML_StopParser">
1756        XML_StopParser
1757      </h4>
1758
1759      <pre class="fcndec">
1760enum XML_Status XMLCALL
1761XML_StopParser(XML_Parser p,
1762               XML_Bool resumable);
1763</pre>
1764      <div class="fcndef">
1765        <p>
1766          Stops parsing, causing <code><a href="#XML_Parse">XML_Parse</a></code> or
1767          <code><a href="#XML_ParseBuffer">XML_ParseBuffer</a></code> to return. Must be
1768          called from within a call-back handler, except when aborting (when
1769          <code>resumable</code> is <code>XML_FALSE</code>) an already suspended parser.
1770          Some call-backs may still follow because they would otherwise get lost,
1771          including
1772        </p>
1773
1774        <ul>
1775          <li>the end element handler for empty elements when stopped in the start
1776          element handler,
1777          </li>
1778
1779          <li>the end namespace declaration handler when stopped in the end element
1780          handler,
1781          </li>
1782
1783          <li>the character data handler when stopped in the character data handler while
1784          making multiple call-backs on a contiguous chunk of characters,
1785          </li>
1786        </ul>
1787
1788        <p>
1789          and possibly others.
1790        </p>
1791
1792        <p>
1793          This can be called from most handlers, including DTD related call-backs, except
1794          when parsing an external parameter entity and <code>resumable</code> is
1795          <code>XML_TRUE</code>. Returns <code>XML_STATUS_OK</code> when successful,
1796          <code>XML_STATUS_ERROR</code> otherwise. The possible error codes are:
1797        </p>
1798
1799        <dl>
1800          <dt>
1801            <code>XML_ERROR_NOT_STARTED</code>
1802          </dt>
1803
1804          <dd>
1805            when stopping or suspending a parser before it has started, added in Expat
1806            2.6.4.
1807          </dd>
1808
1809          <dt>
1810            <code>XML_ERROR_SUSPENDED</code>
1811          </dt>
1812
1813          <dd>
1814            when suspending an already suspended parser.
1815          </dd>
1816
1817          <dt>
1818            <code>XML_ERROR_FINISHED</code>
1819          </dt>
1820
1821          <dd>
1822            when the parser has already finished.
1823          </dd>
1824
1825          <dt>
1826            <code>XML_ERROR_SUSPEND_PE</code>
1827          </dt>
1828
1829          <dd>
1830            when suspending while parsing an external PE.
1831          </dd>
1832        </dl>
1833
1834        <p>
1835          Since the stop/resume feature requires application support in the outer parsing
1836          loop, it is an error to call this function for a parser not being handled
1837          appropriately; see <a href="#stop-resume">Temporarily Stopping Parsing</a> for
1838          more information.
1839        </p>
1840
1841        <p>
1842          When <code>resumable</code> is <code>XML_TRUE</code> then parsing is
1843          <em>suspended</em>, that is, <code><a href="#XML_Parse">XML_Parse</a></code>
1844          and <code><a href="#XML_ParseBuffer">XML_ParseBuffer</a></code> return
1845          <code>XML_STATUS_SUSPENDED</code>. Otherwise, parsing is <em>aborted</em>, that
1846          is, <code><a href="#XML_Parse">XML_Parse</a></code> and <code><a href=
1847          "#XML_ParseBuffer">XML_ParseBuffer</a></code> return
1848          <code>XML_STATUS_ERROR</code> with error code <code>XML_ERROR_ABORTED</code>.
1849        </p>
1850
1851        <p>
1852          <strong>Note:</strong> This will be applied to the current parser instance
1853          only, that is, if there is a parent parser then it will continue parsing when
1854          the external entity reference handler returns. It is up to the implementation
1855          of that handler to call <code><a href=
1856          "#XML_StopParser">XML_StopParser</a></code> on the parent parser (recursively),
1857          if one wants to stop parsing altogether.
1858        </p>
1859
1860        <p>
1861          When suspended, parsing can be resumed by calling <code><a href=
1862          "#XML_ResumeParser">XML_ResumeParser</a></code>.
1863        </p>
1864
1865        <p>
1866          New in Expat 1.95.8.
1867        </p>
1868      </div>
1869
1870      <h4 id="XML_ResumeParser">
1871        XML_ResumeParser
1872      </h4>
1873
1874      <pre class="fcndec">
1875enum XML_Status XMLCALL
1876XML_ResumeParser(XML_Parser p);
1877</pre>
1878      <div class="fcndef">
1879        <p>
1880          Resumes parsing after it has been suspended with <code><a href=
1881          "#XML_StopParser">XML_StopParser</a></code>. Must not be called from within a
          handler call-back. Returns the same status codes as <code><a href=
1883          "#XML_Parse">XML_Parse</a></code> or <code><a href=
1884          "#XML_ParseBuffer">XML_ParseBuffer</a></code>. An additional error code,
1885          <code>XML_ERROR_NOT_SUSPENDED</code>, will be returned if the parser was not
1886          currently suspended.
1887        </p>
1888
1889        <p>
1890          <strong>Note:</strong> This must be called on the most deeply nested child
1891          parser instance first, and on its parent parser only after the child parser has
1892          finished, to be applied recursively until the document entity's parser is
1893          restarted. That is, the parent parser will not resume by itself and it is up to
1894          the application to call <code><a href=
1895          "#XML_ResumeParser">XML_ResumeParser</a></code> on it at the appropriate
1896          moment.
1897        </p>
1898
1899        <p>
1900          New in Expat 1.95.8.
1901        </p>
1902      </div>
1903
1904      <h4 id="XML_GetParsingStatus">
1905        XML_GetParsingStatus
1906      </h4>
1907
1908      <pre class="fcndec">
1909void XMLCALL
1910XML_GetParsingStatus(XML_Parser p,
1911                     XML_ParsingStatus *status);
1912</pre>
1913
1914      <pre class="signature">
1915enum XML_Parsing {
1916  XML_INITIALIZED,
1917  XML_PARSING,
1918  XML_FINISHED,
1919  XML_SUSPENDED
1920};
1921
1922typedef struct {
1923  enum XML_Parsing parsing;
1924  XML_Bool finalBuffer;
1925} XML_ParsingStatus;
1926</pre>
1927      <div class="fcndef">
1928        <p>
1929          Returns status of parser with respect to being initialized, parsing, finished,
1930          or suspended, and whether the final buffer is being processed. The
1931          <code>status</code> parameter <em>must not</em> be <code>NULL</code>.
1932        </p>
1933
1934        <p>
1935          New in Expat 1.95.8.
1936        </p>
1937      </div>
1938
1939      <h3>
1940        <a id="setting" name="setting">Handler Setting</a>
1941      </h3>
1942
1943      <p>
1944        Although handlers are typically set prior to parsing and left alone, an
1945        application may choose to set or change the handler for a parsing event while the
1946        parse is in progress. For instance, your application may choose to ignore all
1947        text not descended from a <code>para</code> element. One way it could do this is
1948        to set the character handler when a para start tag is seen, and unset it for the
1949        corresponding end tag.
1950      </p>
1951
1952      <p>
1953        A handler may be <em>unset</em> by providing a <code>NULL</code> pointer to the
1954        appropriate handler setter. None of the handler setting functions have a return
1955        value.
1956      </p>
1957
1958      <p>
1959        Your handlers will be receiving strings in arrays of type <code>XML_Char</code>.
1960        This type is conditionally defined in expat.h as either <code>char</code>,
1961        <code>wchar_t</code> or <code>unsigned short</code>. The former implies UTF-8
1962        encoding, the latter two imply UTF-16 encoding. Note that you'll receive them in
1963        this form independent of the original encoding of the document.
1964      </p>
1965
1966      <div class="handler">
1967        <h4 id="XML_SetStartElementHandler">
1968          XML_SetStartElementHandler
1969        </h4>
1970
1971        <pre class="setter">
1972void XMLCALL
1973XML_SetStartElementHandler(XML_Parser p,
1974                           XML_StartElementHandler start);
1975</pre>
1976
1977        <pre class="signature">
1978typedef void
1979(XMLCALL *XML_StartElementHandler)(void *userData,
1980                                   const XML_Char *name,
1981                                   const XML_Char **atts);
1982</pre>
1983        <p>
1984          Set handler for start (and empty) tags. Attributes are passed to the start
1985          handler as a pointer to a vector of char pointers. Each attribute seen in a
1986          start (or empty) tag occupies 2 consecutive places in this vector: the
1987          attribute name followed by the attribute value. These pairs are terminated by a
1988          <code>NULL</code> pointer.
1989        </p>
1990
1991        <p>
1992          Note that an empty tag generates a call to both start and end handlers (in that
1993          order).
1994        </p>
1995      </div>
1996
1997      <div class="handler">
1998        <h4 id="XML_SetEndElementHandler">
1999          XML_SetEndElementHandler
2000        </h4>
2001
        <pre class="setter">
void XMLCALL
XML_SetEndElementHandler(XML_Parser p,
                         XML_EndElementHandler end);
</pre>
2007
2008        <pre class="signature">
2009typedef void
2010(XMLCALL *XML_EndElementHandler)(void *userData,
2011                                 const XML_Char *name);
2012</pre>
2013        <p>
2014          Set handler for end (and empty) tags. As noted above, an empty tag generates a
2015          call to both start and end handlers.
2016        </p>
2017      </div>
2018
2019      <div class="handler">
2020        <h4 id="XML_SetElementHandler">
2021          XML_SetElementHandler
2022        </h4>
2023
2024        <pre class="setter">
2025void XMLCALL
2026XML_SetElementHandler(XML_Parser p,
2027                      XML_StartElementHandler start,
2028                      XML_EndElementHandler end);
2029</pre>
2030        <p>
2031          Set handlers for start and end tags with one call.
2032        </p>
2033      </div>
2034
2035      <div class="handler">
2036        <h4 id="XML_SetCharacterDataHandler">
2037          XML_SetCharacterDataHandler
2038        </h4>
2039
2040        <pre class="setter">
2041void XMLCALL
2042XML_SetCharacterDataHandler(XML_Parser p,
2043                            XML_CharacterDataHandler charhndl)
2044</pre>
2045
2046        <pre class="signature">
2047typedef void
2048(XMLCALL *XML_CharacterDataHandler)(void *userData,
2049                                    const XML_Char *s,
2050                                    int len);
2051</pre>
2052        <p>
2053          Set a text handler. The string your handler receives is <em>NOT
2054          null-terminated</em>. You have to use the length argument to deal with the end
2055          of the string. A single block of contiguous text free of markup may still
2056          result in a sequence of calls to this handler. In other words, if you're
2057          searching for a pattern in the text, it may be split across calls to this
2058          handler. Note: Setting this handler to <code>NULL</code> may <em>NOT
2059          immediately</em> terminate call-backs if the parser is currently processing
2060          such a single block of contiguous markup-free text, as the parser will continue
2061          calling back until the end of the block is reached.
2062        </p>
2063      </div>
2064
2065      <div class="handler">
2066        <h4 id="XML_SetProcessingInstructionHandler">
2067          XML_SetProcessingInstructionHandler
2068        </h4>
2069
2070        <pre class="setter">
2071void XMLCALL
2072XML_SetProcessingInstructionHandler(XML_Parser p,
2073                                    XML_ProcessingInstructionHandler proc)
2074</pre>
2075
2076        <pre class="signature">
2077typedef void
2078(XMLCALL *XML_ProcessingInstructionHandler)(void *userData,
2079                                            const XML_Char *target,
2080                                            const XML_Char *data);
2081
2082</pre>
2083        <p>
2084          Set a handler for processing instructions. The target is the first word in the
2085          processing instruction. The data is the rest of the characters in it after
2086          skipping all whitespace after the initial word.
2087        </p>
2088      </div>
2089
2090      <div class="handler">
2091        <h4 id="XML_SetCommentHandler">
2092          XML_SetCommentHandler
2093        </h4>
2094
2095        <pre class="setter">
2096void XMLCALL
2097XML_SetCommentHandler(XML_Parser p,
2098                      XML_CommentHandler cmnt)
2099</pre>
2100
2101        <pre class="signature">
2102typedef void
2103(XMLCALL *XML_CommentHandler)(void *userData,
2104                              const XML_Char *data);
2105</pre>
2106        <p>
2107          Set a handler for comments. The data is all text inside the comment delimiters.
2108        </p>
2109      </div>
2110
2111      <div class="handler">
2112        <h4 id="XML_SetStartCdataSectionHandler">
2113          XML_SetStartCdataSectionHandler
2114        </h4>
2115
2116        <pre class="setter">
2117void XMLCALL
2118XML_SetStartCdataSectionHandler(XML_Parser p,
2119                                XML_StartCdataSectionHandler start);
2120</pre>
2121
2122        <pre class="signature">
2123typedef void
2124(XMLCALL *XML_StartCdataSectionHandler)(void *userData);
2125</pre>
2126        <p>
2127          Set a handler that gets called at the beginning of a CDATA section.
2128        </p>
2129      </div>
2130
2131      <div class="handler">
2132        <h4 id="XML_SetEndCdataSectionHandler">
2133          XML_SetEndCdataSectionHandler
2134        </h4>
2135
2136        <pre class="setter">
2137void XMLCALL
2138XML_SetEndCdataSectionHandler(XML_Parser p,
2139                              XML_EndCdataSectionHandler end);
2140</pre>
2141
2142        <pre class="signature">
2143typedef void
2144(XMLCALL *XML_EndCdataSectionHandler)(void *userData);
2145</pre>
2146        <p>
2147          Set a handler that gets called at the end of a CDATA section.
2148        </p>
2149      </div>
2150
2151      <div class="handler">
2152        <h4 id="XML_SetCdataSectionHandler">
2153          XML_SetCdataSectionHandler
2154        </h4>
2155
2156        <pre class="setter">
2157void XMLCALL
2158XML_SetCdataSectionHandler(XML_Parser p,
2159                           XML_StartCdataSectionHandler start,
2160                           XML_EndCdataSectionHandler end)
2161</pre>
2162        <p>
2163          Sets both CDATA section handlers with one call.
2164        </p>
2165      </div>
2166
2167      <div class="handler">
2168        <h4 id="XML_SetDefaultHandler">
2169          XML_SetDefaultHandler
2170        </h4>
2171
2172        <pre class="setter">
2173void XMLCALL
2174XML_SetDefaultHandler(XML_Parser p,
2175                      XML_DefaultHandler hndl)
2176</pre>
2177
2178        <pre class="signature">
2179typedef void
2180(XMLCALL *XML_DefaultHandler)(void *userData,
2181                              const XML_Char *s,
2182                              int len);
2183</pre>
2184        <p>
2185          Sets a handler for any characters in the document which wouldn't otherwise be
2186          handled. This includes both data for which no handlers can be set (like some
2187          kinds of DTD declarations) and data which could be reported but which currently
2188          has no handler set. The characters are passed exactly as they were present in
2189          the XML document except that they will be encoded in UTF-8 or UTF-16. Line
2190          boundaries are not normalized. Note that a byte order mark character is not
2191          passed to the default handler. There are no guarantees about how characters are
2192          divided between calls to the default handler: for example, a comment might be
2193          split between multiple calls. Setting the handler with this call has the side
2194          effect of turning off expansion of references to internally defined general
2195          entities. Instead these references are passed to the default handler.
2196        </p>
2197
2198        <p>
2199          See also <code><a href="#XML_DefaultCurrent">XML_DefaultCurrent</a></code>.
2200        </p>
2201      </div>
2202
2203      <div class="handler">
2204        <h4 id="XML_SetDefaultHandlerExpand">
2205          XML_SetDefaultHandlerExpand
2206        </h4>
2207
2208        <pre class="setter">
2209void XMLCALL
2210XML_SetDefaultHandlerExpand(XML_Parser p,
2211                            XML_DefaultHandler hndl)
2212</pre>
2213
2214        <pre class="signature">
2215typedef void
2216(XMLCALL *XML_DefaultHandler)(void *userData,
2217                              const XML_Char *s,
2218                              int len);
2219</pre>
2220        <p>
2221          This sets a default handler, but doesn't inhibit the expansion of internal
2222          entity references. The entity reference will not be passed to the default
2223          handler.
2224        </p>
2225
2226        <p>
2227          See also <code><a href="#XML_DefaultCurrent">XML_DefaultCurrent</a></code>.
2228        </p>
2229      </div>
2230
2231      <div class="handler">
2232        <h4 id="XML_SetExternalEntityRefHandler">
2233          XML_SetExternalEntityRefHandler
2234        </h4>
2235
2236        <pre class="setter">
2237void XMLCALL
2238XML_SetExternalEntityRefHandler(XML_Parser p,
2239                                XML_ExternalEntityRefHandler hndl)
2240</pre>
2241
2242        <pre class="signature">
2243typedef int
2244(XMLCALL *XML_ExternalEntityRefHandler)(XML_Parser p,
2245                                        const XML_Char *context,
2246                                        const XML_Char *base,
2247                                        const XML_Char *systemId,
2248                                        const XML_Char *publicId);
2249</pre>
2250        <p>
2251          Set an external entity reference handler. This handler is also called for
2252          processing an external DTD subset if parameter entity parsing is in effect.
2253          (See <a href=
2254          "#XML_SetParamEntityParsing"><code>XML_SetParamEntityParsing</code></a>.)
2255        </p>
2256
2257        <p>
2258          <strong>Warning:</strong> Using an external entity reference handler can lead
2259          to <a href="https://libexpat.github.io/doc/xml-security/#external-entities">XXE
2260          vulnerabilities</a>. It should only be used in applications that do not parse
2261          untrusted XML input.
2262        </p>
2263
2264        <p>
2265          The <code>context</code> parameter specifies the parsing context in the format
2266          expected by the <code>context</code> argument to <code><a href=
2267          "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code>.
          <code>context</code> is valid only until the handler returns, so if the referenced
2269          entity is to be parsed later, it must be copied. <code>context</code> is
2270          <code>NULL</code> only when the entity is a parameter entity, which is how one
2271          can differentiate between general and parameter entities.
2272        </p>
2273
2274        <p>
2275          The <code>base</code> parameter is the base to use for relative system
2276          identifiers. It is set by <code><a href="#XML_SetBase">XML_SetBase</a></code>
2277          and may be <code>NULL</code>. The <code>publicId</code> parameter is the public
2278          id given in the entity declaration and may be <code>NULL</code>.
2279          <code>systemId</code> is the system identifier specified in the entity
2280          declaration and is never <code>NULL</code>.
2281        </p>
2282
2283        <p>
2284          There are a couple of ways in which this handler differs from others. First,
2285          this handler returns a status indicator (an integer).
2286          <code>XML_STATUS_OK</code> should be returned for successful handling of the
2287          external entity reference. Returning <code>XML_STATUS_ERROR</code> indicates
2288          failure, and causes the calling parser to return an
2289          <code>XML_ERROR_EXTERNAL_ENTITY_HANDLING</code> error.
2290        </p>
2291
2292        <p>
2293          Second, instead of having the user data as its first argument, it receives the
2294          parser that encountered the entity reference. This, along with the context
2295          parameter, may be used as arguments to a call to <code><a href=
2296          "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code>.
2297          Using the returned parser, the body of the external entity can be recursively
2298          parsed.
2299        </p>
2300
2301        <p>
          Since this handler may be called recursively, it should not save
          information into global or static variables.
2304        </p>
2305      </div>
2306
2307      <h4 id="XML_SetExternalEntityRefHandlerArg">
2308        XML_SetExternalEntityRefHandlerArg
2309      </h4>
2310
2311      <pre class="fcndec">
2312void XMLCALL
2313XML_SetExternalEntityRefHandlerArg(XML_Parser p,
2314                                   void *arg)
2315</pre>
2316      <div class="fcndef">
2317        <p>
2318          Set the argument passed to the ExternalEntityRefHandler. If <code>arg</code> is
2319          not <code>NULL</code>, it is the new value passed to the handler set using
2320          <code><a href=
2321          "#XML_SetExternalEntityRefHandler">XML_SetExternalEntityRefHandler</a></code>;
2322          if <code>arg</code> is <code>NULL</code>, the argument passed to the handler
2323          function will be the parser object itself.
2324        </p>
2325
2326        <p>
2327          <strong>Note:</strong> The type of <code>arg</code> and the type of the first
2328          argument to the ExternalEntityRefHandler do not match. This function takes a
2329          <code>void *</code> to be passed to the handler, while the handler accepts an
2330          <code>XML_Parser</code>. This is a historical accident, but will not be
2331          corrected before Expat 2.0 (at the earliest) to avoid causing compiler warnings
2332          for code that's known to work with this API. It is the responsibility of the
2333          application code to know the actual type of the argument passed to the handler
2334          and to manage it properly.
2335        </p>
2336      </div>
2337
2338      <div class="handler">
2339        <h4 id="XML_SetSkippedEntityHandler">
2340          XML_SetSkippedEntityHandler
2341        </h4>
2342
2343        <pre class="setter">
2344void XMLCALL
2345XML_SetSkippedEntityHandler(XML_Parser p,
2346                            XML_SkippedEntityHandler handler)
2347</pre>
2348
2349        <pre class="signature">
2350typedef void
2351(XMLCALL *XML_SkippedEntityHandler)(void *userData,
2352                                    const XML_Char *entityName,
2353                                    int is_parameter_entity);
2354</pre>
2355        <p>
2356          Set a skipped entity handler. This is called in two situations:
2357        </p>
2358
2359        <ol>
2360          <li>An entity reference is encountered for which no declaration has been read
2361          <em>and</em> this is not an error.
2362          </li>
2363
2364          <li>An internal entity reference is read, but not expanded, because <a href=
2365          "#XML_SetDefaultHandler"><code>XML_SetDefaultHandler</code></a> has been
2366          called.
2367          </li>
2368        </ol>
2369
2370        <p>
2371          The <code>is_parameter_entity</code> argument will be non-zero for a parameter
2372          entity and zero for a general entity.
2373        </p>
2374
2375        <p>
2376          Note: Skipped parameter entities in declarations and skipped general entities
2377          in attribute values cannot be reported, because the event would be out of sync
          with the reporting of the declarations or attribute values.
2379        </p>
2380      </div>
2381
2382      <div class="handler">
2383        <h4 id="XML_SetUnknownEncodingHandler">
2384          XML_SetUnknownEncodingHandler
2385        </h4>
2386
2387        <pre class="setter">
2388void XMLCALL
2389XML_SetUnknownEncodingHandler(XML_Parser p,
2390                              XML_UnknownEncodingHandler enchandler,
2391                              void *encodingHandlerData)
2392</pre>
2393
2394        <pre class="signature">
2395typedef int
2396(XMLCALL *XML_UnknownEncodingHandler)(void *encodingHandlerData,
2397                                      const XML_Char *name,
2398                                      XML_Encoding *info);
2399
2400typedef struct {
2401  int map[256];
2402  void *data;
2403  int (XMLCALL *convert)(void *data, const char *s);
2404  void (XMLCALL *release)(void *data);
2405} XML_Encoding;
2406</pre>
2407        <p>
          Set a handler to deal with encodings other than the <a href=
          "#builtin_encodings">built-in set</a>. This should be done before
          <code><a href="#XML_Parse">XML_Parse</a></code> or <code><a href=
          "#XML_ParseBuffer">XML_ParseBuffer</a></code> has been called on the given
          parser.
2413        </p>
2414
2415        <p>
2416          If the handler knows how to deal with an encoding with the given name, it
2417          should fill in the <code>info</code> data structure and return
2418          <code>XML_STATUS_OK</code>. Otherwise it should return
2419          <code>XML_STATUS_ERROR</code>. The handler will be called at most once per
2420          parsed (external) entity. The optional application data pointer
2421          <code>encodingHandlerData</code> will be passed back to the handler.
2422        </p>
2423
2424        <p>
          The <code>map</code> array contains information for every possible leading
          byte in a byte sequence. If the corresponding value is &gt;= 0, then it's a
          single-byte sequence and the byte encodes that Unicode value. If the value is
          -1, then that byte is invalid as the initial byte in a sequence. If the value
          is -n, where n is an integer &gt; 1, then n is the number of bytes in the
          sequence and the actual conversion is accomplished by a call to the function
          pointed at by <code>convert</code>. This function may return -1 if the sequence
          itself is invalid. The <code>convert</code> pointer may be <code>NULL</code> if
          there are only single-byte codes. The <code>data</code> parameter passed to the
          convert function is the <code>data</code> pointer from
          <code>XML_Encoding</code>. The string <code>s</code> is <em>NOT</em>
          null-terminated and points at the sequence of bytes to be converted.
2436        </p>
2437
2438        <p>
2439          The function pointed at by <code>release</code> is called by the parser when it
2440          is finished with the encoding. It may be <code>NULL</code>.
2441        </p>
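
        <p>
          As a rough illustration of the <code>map</code> rules above, the following
          sketch fills the array for a hypothetical single-byte encoding and interprets
          a single map entry. The helper names <code>fill_single_byte_map</code> and
          <code>sequence_length</code>, as well as the encoding itself, are assumptions
          made for this example and are not part of Expat; a real handler would write
          into <code>info-&gt;map</code>.
        </p>

```c
#include <stddef.h>

/* Hypothetical single-byte encoding (an assumption for this sketch):
   bytes 0x00-0xFE encode the Unicode value equal to the byte itself,
   and byte 0xFF is never valid. */
void
fill_single_byte_map(int *map) {
  size_t i;
  for (i = 0; i < 256; i++)
    map[i] = (int)i; /* >= 0: single byte that encodes this Unicode value */
  map[0xFF] = -1;    /* -1: invalid as the initial byte of a sequence */
}

/* Interpret one map entry as described above: returns 1 for a single-byte
   code, 0 for an invalid leading byte, or n for an n-byte sequence that
   would be decoded through the convert callback. */
int
sequence_length(int mapValue) {
  if (mapValue >= 0)
    return 1;
  if (mapValue == -1)
    return 0;
  return -mapValue;
}
```

        <p>
          Because this hypothetical encoding has only single-byte codes, the
          <code>convert</code> pointer could be left as <code>NULL</code>.
        </p>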
2442      </div>
2443
2444      <div class="handler">
2445        <h4 id="XML_SetStartNamespaceDeclHandler">
2446          XML_SetStartNamespaceDeclHandler
2447        </h4>
2448
2449        <pre class="setter">
2450void XMLCALL
2451XML_SetStartNamespaceDeclHandler(XML_Parser p,
2452                                 XML_StartNamespaceDeclHandler start);
2453</pre>
2454
2455        <pre class="signature">
2456typedef void
2457(XMLCALL *XML_StartNamespaceDeclHandler)(void *userData,
2458                                         const XML_Char *prefix,
2459                                         const XML_Char *uri);
2460</pre>
2461        <p>
          Set a handler to be called when a namespace is declared. Namespace declarations
          occur inside start tags, but the namespace declaration start handler is called
          before the start tag handler for each namespace declared in that start tag.
2465        </p>
2466      </div>
2467
2468      <div class="handler">
2469        <h4 id="XML_SetEndNamespaceDeclHandler">
2470          XML_SetEndNamespaceDeclHandler
2471        </h4>
2472
2473        <pre class="setter">
2474void XMLCALL
2475XML_SetEndNamespaceDeclHandler(XML_Parser p,
2476                               XML_EndNamespaceDeclHandler end);
2477</pre>
2478
2479        <pre class="signature">
2480typedef void
2481(XMLCALL *XML_EndNamespaceDeclHandler)(void *userData,
2482                                       const XML_Char *prefix);
2483</pre>
2484        <p>
2485          Set a handler to be called when leaving the scope of a namespace declaration.
2486          This will be called, for each namespace declaration, after the handler for the
2487          end tag of the element in which the namespace was declared.
2488        </p>
2489      </div>
2490
2491      <div class="handler">
2492        <h4 id="XML_SetNamespaceDeclHandler">
2493          XML_SetNamespaceDeclHandler
2494        </h4>
2495
2496        <pre class="setter">
2497void XMLCALL
2498XML_SetNamespaceDeclHandler(XML_Parser p,
2499                            XML_StartNamespaceDeclHandler start,
2500                            XML_EndNamespaceDeclHandler end)
2501</pre>
2502        <p>
2503          Sets both namespace declaration handlers with a single call.
2504        </p>
2505      </div>
2506
2507      <div class="handler">
2508        <h4 id="XML_SetXmlDeclHandler">
2509          XML_SetXmlDeclHandler
2510        </h4>
2511
2512        <pre class="setter">
2513void XMLCALL
2514XML_SetXmlDeclHandler(XML_Parser p,
2515                      XML_XmlDeclHandler xmldecl);
2516</pre>
2517
2518        <pre class="signature">
2519typedef void
2520(XMLCALL *XML_XmlDeclHandler)(void            *userData,
2521                              const XML_Char  *version,
2522                              const XML_Char  *encoding,
2523                              int             standalone);
2524</pre>
2525        <p>
2526          Sets a handler that is called for XML declarations and also for text
2527          declarations discovered in external entities. The way to distinguish is that
2528          the <code>version</code> parameter will be <code>NULL</code> for text
2529          declarations. The <code>encoding</code> parameter may be <code>NULL</code> for
2530          an XML declaration. The <code>standalone</code> argument will contain -1, 0, or
2531          1 indicating respectively that there was no standalone parameter in the
2532          declaration, that it was given as no, or that it was given as yes.
2533        </p>
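
        <p>
          The distinctions described above can be sketched with two small helpers that
          a handler might call on its arguments. The function names
          <code>decl_kind</code> and <code>standalone_text</code> are hypothetical, and
          the sketch assumes that <code>XML_Char</code> is plain <code>char</code>.
        </p>

```c
#include <stddef.h>

/* A NULL version identifies a text declaration found in an external entity;
   anything else is the XML declaration of the document. */
const char *
decl_kind(const char *version) {
  return (version == NULL) ? "text declaration" : "XML declaration";
}

/* Map the standalone argument (-1, 0 or 1) to a readable description. */
const char *
standalone_text(int standalone) {
  if (standalone == -1)
    return "unspecified"; /* no standalone parameter in the declaration */
  return (standalone == 0) ? "no" : "yes";
}
```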
2534      </div>
2535
2536      <div class="handler">
2537        <h4 id="XML_SetStartDoctypeDeclHandler">
2538          XML_SetStartDoctypeDeclHandler
2539        </h4>
2540
2541        <pre class="setter">
2542void XMLCALL
2543XML_SetStartDoctypeDeclHandler(XML_Parser p,
2544                               XML_StartDoctypeDeclHandler start);
2545</pre>
2546
2547        <pre class="signature">
2548typedef void
2549(XMLCALL *XML_StartDoctypeDeclHandler)(void           *userData,
2550                                       const XML_Char *doctypeName,
2551                                       const XML_Char *sysid,
2552                                       const XML_Char *pubid,
2553                                       int            has_internal_subset);
2554</pre>
2555        <p>
2556          Set a handler that is called at the start of a DOCTYPE declaration, before any
2557          external or internal subset is parsed. Both <code>sysid</code> and
2558          <code>pubid</code> may be <code>NULL</code>. The
2559          <code>has_internal_subset</code> will be non-zero if the DOCTYPE declaration
2560          has an internal subset.
2561        </p>
2562      </div>
2563
2564      <div class="handler">
2565        <h4 id="XML_SetEndDoctypeDeclHandler">
2566          XML_SetEndDoctypeDeclHandler
2567        </h4>
2568
2569        <pre class="setter">
2570void XMLCALL
2571XML_SetEndDoctypeDeclHandler(XML_Parser p,
2572                             XML_EndDoctypeDeclHandler end);
2573</pre>
2574
2575        <pre class="signature">
2576typedef void
2577(XMLCALL *XML_EndDoctypeDeclHandler)(void *userData);
2578</pre>
2579        <p>
2580          Set a handler that is called at the end of a DOCTYPE declaration, after parsing
2581          any external subset.
2582        </p>
2583      </div>
2584
2585      <div class="handler">
2586        <h4 id="XML_SetDoctypeDeclHandler">
2587          XML_SetDoctypeDeclHandler
2588        </h4>
2589
2590        <pre class="setter">
2591void XMLCALL
2592XML_SetDoctypeDeclHandler(XML_Parser p,
2593                          XML_StartDoctypeDeclHandler start,
2594                          XML_EndDoctypeDeclHandler end);
2595</pre>
2596        <p>
2597          Set both doctype handlers with one call.
2598        </p>
2599      </div>
2600
2601      <div class="handler">
2602        <h4 id="XML_SetElementDeclHandler">
2603          XML_SetElementDeclHandler
2604        </h4>
2605
2606        <pre class="setter">
2607void XMLCALL
2608XML_SetElementDeclHandler(XML_Parser p,
2609                          XML_ElementDeclHandler eldecl);
2610</pre>
2611
2612        <pre class="signature">
2613typedef void
2614(XMLCALL *XML_ElementDeclHandler)(void *userData,
2615                                  const XML_Char *name,
2616                                  XML_Content *model);
2617</pre>
2618
2619        <pre class="signature">
2620enum XML_Content_Type {
2621  XML_CTYPE_EMPTY = 1,
2622  XML_CTYPE_ANY,
2623  XML_CTYPE_MIXED,
2624  XML_CTYPE_NAME,
2625  XML_CTYPE_CHOICE,
2626  XML_CTYPE_SEQ
2627};
2628
2629enum XML_Content_Quant {
2630  XML_CQUANT_NONE,
2631  XML_CQUANT_OPT,
2632  XML_CQUANT_REP,
2633  XML_CQUANT_PLUS
2634};
2635
2636typedef struct XML_cp XML_Content;
2637
2638struct XML_cp {
2639  enum XML_Content_Type         type;
2640  enum XML_Content_Quant        quant;
2641  const XML_Char *              name;
2642  unsigned int                  numchildren;
2643  XML_Content *                 children;
2644};
2645</pre>
2646        <p>
          Sets a handler for element declarations in a DTD. The handler gets called with
          the name of the element in the declaration and a pointer to a structure that
          contains the element model. It is the responsibility of user code to free the
          model, once finished with it, via a call to <code><a href=
          "#XML_FreeContentModel">XML_FreeContentModel</a></code>. The model does not
          need to be freed from within the handler; it can be kept around and freed at a
          later stage.
2654        </p>
2655
2656        <p>
2657          The <code>model</code> argument is the root of a tree of
2658          <code>XML_Content</code> nodes. If <code>type</code> equals
2659          <code>XML_CTYPE_EMPTY</code> or <code>XML_CTYPE_ANY</code>, then
2660          <code>quant</code> will be <code>XML_CQUANT_NONE</code>, and the other fields
2661          will be zero or <code>NULL</code>. If <code>type</code> is
2662          <code>XML_CTYPE_MIXED</code>, then <code>quant</code> will be
2663          <code>XML_CQUANT_NONE</code> or <code>XML_CQUANT_REP</code> and
          <code>numchildren</code> will contain the number of elements that are allowed
          to be mixed in, and <code>children</code> points to an array of
          <code>XML_Content</code> structures that will all have type
          <code>XML_CTYPE_NAME</code> with no quantification. Only the root node can be
          of type <code>XML_CTYPE_EMPTY</code>,
2668          <code>XML_CTYPE_ANY</code>, or <code>XML_CTYPE_MIXED</code>.
2669        </p>
2670
2671        <p>
2672          For type <code>XML_CTYPE_NAME</code>, the <code>name</code> field points to the
2673          name and the <code>numchildren</code> and <code>children</code> fields will be
2674          zero and <code>NULL</code>. The <code>quant</code> field will indicate any
2675          quantifiers placed on the name.
2676        </p>
2677
2678        <p>
          Types <code>XML_CTYPE_CHOICE</code> and <code>XML_CTYPE_SEQ</code> indicate a
          choice or sequence respectively. The <code>numchildren</code> field indicates
          how many nodes are in the choice or sequence, and <code>children</code> points
          to the nodes.
2683        </p>
2684      </div>
2685
2686      <div class="handler">
2687        <h4 id="XML_SetAttlistDeclHandler">
2688          XML_SetAttlistDeclHandler
2689        </h4>
2690
2691        <pre class="setter">
2692void XMLCALL
2693XML_SetAttlistDeclHandler(XML_Parser p,
2694                          XML_AttlistDeclHandler attdecl);
2695</pre>
2696
2697        <pre class="signature">
2698typedef void
2699(XMLCALL *XML_AttlistDeclHandler)(void           *userData,
2700                                  const XML_Char *elname,
2701                                  const XML_Char *attname,
2702                                  const XML_Char *att_type,
2703                                  const XML_Char *dflt,
2704                                  int            isrequired);
2705</pre>
2706        <p>
2707          Set a handler for attlist declarations in the DTD. This handler is called for
2708          <em>each</em> attribute. So a single attlist declaration with multiple
2709          attributes declared will generate multiple calls to this handler. The
2710          <code>elname</code> parameter returns the name of the element for which the
2711          attribute is being declared. The attribute name is in the <code>attname</code>
2712          parameter. The attribute type is in the <code>att_type</code> parameter. It is
2713          the string representing the type in the declaration with whitespace removed.
2714        </p>
2715
2716        <p>
2717          The <code>dflt</code> parameter holds the default value. It will be
2718          <code>NULL</code> in the case of "#IMPLIED" or "#REQUIRED" attributes. You can
2719          distinguish these two cases by checking the <code>isrequired</code> parameter,
2720          which will be true in the case of "#REQUIRED" attributes. Attributes which are
          "#FIXED" will also have a true <code>isrequired</code>, but they will have
2722          the non-<code>NULL</code> fixed value in the <code>dflt</code> parameter.
2723        </p>
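
        <p>
          The four combinations of <code>dflt</code> and <code>isrequired</code>
          described above can be told apart as in the following sketch. The function
          name <code>default_kind</code> is hypothetical, and the sketch assumes that
          <code>XML_Char</code> is plain <code>char</code>.
        </p>

```c
#include <stddef.h>

/* Classify an attribute declaration from the dflt and isrequired
   arguments passed to the attlist declaration handler. */
const char *
default_kind(const char *dflt, int isrequired) {
  if (dflt == NULL)
    return isrequired ? "#REQUIRED" : "#IMPLIED";
  /* A non-NULL default that is also required is a "#FIXED" value. */
  return isrequired ? "#FIXED" : "default";
}
```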
2724      </div>
2725
2726      <div class="handler">
2727        <h4 id="XML_SetEntityDeclHandler">
2728          XML_SetEntityDeclHandler
2729        </h4>
2730
2731        <pre class="setter">
2732void XMLCALL
2733XML_SetEntityDeclHandler(XML_Parser p,
2734                         XML_EntityDeclHandler handler);
2735</pre>
2736
2737        <pre class="signature">
2738typedef void
2739(XMLCALL *XML_EntityDeclHandler)(void           *userData,
2740                                 const XML_Char *entityName,
2741                                 int            is_parameter_entity,
2742                                 const XML_Char *value,
2743                                 int            value_length,
2744                                 const XML_Char *base,
2745                                 const XML_Char *systemId,
2746                                 const XML_Char *publicId,
2747                                 const XML_Char *notationName);
2748</pre>
2749        <p>
2750          Sets a handler that will be called for all entity declarations. The
2751          <code>is_parameter_entity</code> argument will be non-zero in the case of
2752          parameter entities and zero otherwise.
2753        </p>
2754
2755        <p>
2756          For internal entities (<code>&lt;!ENTITY foo "bar"&gt;</code>),
2757          <code>value</code> will be non-<code>NULL</code> and <code>systemId</code>,
2758          <code>publicId</code>, and <code>notationName</code> will all be
2759          <code>NULL</code>. The value string is <em>not</em> null-terminated; the length
2760          is provided in the <code>value_length</code> parameter. Do not use
2761          <code>value_length</code> to test for internal entities, since it is legal to
2762          have zero-length values. Instead check for whether or not <code>value</code> is
2763          <code>NULL</code>.
2764        </p>
2765
2766        <p>
2767          The <code>notationName</code> argument will have a non-<code>NULL</code> value
2768          only for unparsed entity declarations.
2769        </p>
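
        <p>
          The following sketch shows one way a handler might apply these rules:
          classifying the declaration by testing <code>value</code> and
          <code>notationName</code> against <code>NULL</code>, and copying the value
          using <code>value_length</code> since it is not null-terminated. The names
          <code>classify_entity</code>, <code>copy_entity_value</code>, and the
          enumeration are hypothetical, and <code>XML_Char</code> is assumed to be plain
          <code>char</code>.
        </p>

```c
#include <stdlib.h>
#include <string.h>

enum entity_kind { ENTITY_INTERNAL, ENTITY_EXTERNAL_PARSED, ENTITY_UNPARSED };

/* Decide the kind of entity from the handler arguments. */
enum entity_kind
classify_entity(const char *value, const char *notationName) {
  if (value != NULL)
    return ENTITY_INTERNAL; /* value_length may still be zero */
  if (notationName != NULL)
    return ENTITY_UNPARSED;
  return ENTITY_EXTERNAL_PARSED;
}

/* Return a freshly allocated, null-terminated copy of the entity value,
   or NULL if there is no value or allocation fails. The caller frees it. */
char *
copy_entity_value(const char *value, int value_length) {
  char *copy;
  if (value == NULL || value_length < 0)
    return NULL;
  copy = malloc((size_t)value_length + 1);
  if (copy == NULL)
    return NULL;
  memcpy(copy, value, (size_t)value_length);
  copy[value_length] = '\0';
  return copy;
}
```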
2770      </div>
2771
2772      <div class="handler">
2773        <h4 id="XML_SetUnparsedEntityDeclHandler">
2774          XML_SetUnparsedEntityDeclHandler
2775        </h4>
2776
2777        <pre class="setter">
2778void XMLCALL
2779XML_SetUnparsedEntityDeclHandler(XML_Parser p,
2780                                 XML_UnparsedEntityDeclHandler h)
2781</pre>
2782
2783        <pre class="signature">
2784typedef void
2785(XMLCALL *XML_UnparsedEntityDeclHandler)(void *userData,
2786                                         const XML_Char *entityName,
2787                                         const XML_Char *base,
2788                                         const XML_Char *systemId,
2789                                         const XML_Char *publicId,
2790                                         const XML_Char *notationName);
2791</pre>
2792        <p>
2793          Set a handler that receives declarations of unparsed entities. These are entity
2794          declarations that have a notation (NDATA) field:
2795        </p>
2796
2797        <div id="eg">
2798          <pre>
2799&lt;!ENTITY logo SYSTEM "images/logo.gif" NDATA gif&gt;
2800</pre>
2801        </div>
2802
2803        <p>
          This handler is obsolete and is provided for backwards compatibility. Use
          <a href="#XML_SetEntityDeclHandler">XML_SetEntityDeclHandler</a> instead.
2806        </p>
2807      </div>
2808
2809      <div class="handler">
2810        <h4 id="XML_SetNotationDeclHandler">
2811          XML_SetNotationDeclHandler
2812        </h4>
2813
2814        <pre class="setter">
2815void XMLCALL
2816XML_SetNotationDeclHandler(XML_Parser p,
2817                           XML_NotationDeclHandler h)
2818</pre>
2819
2820        <pre class="signature">
2821typedef void
2822(XMLCALL *XML_NotationDeclHandler)(void *userData,
2823                                   const XML_Char *notationName,
2824                                   const XML_Char *base,
2825                                   const XML_Char *systemId,
2826                                   const XML_Char *publicId);
2827</pre>
2828        <p>
2829          Set a handler that receives notation declarations.
2830        </p>
2831      </div>
2832
2833      <div class="handler">
2834        <h4 id="XML_SetNotStandaloneHandler">
2835          XML_SetNotStandaloneHandler
2836        </h4>
2837
2838        <pre class="setter">
2839void XMLCALL
2840XML_SetNotStandaloneHandler(XML_Parser p,
2841                            XML_NotStandaloneHandler h)
2842</pre>
2843
2844        <pre class="signature">
2845typedef int
2846(XMLCALL *XML_NotStandaloneHandler)(void *userData);
2847</pre>
2848        <p>
          Set a handler that is called if the document is not "standalone". This happens
          when there is an external subset or a reference to a parameter entity, but the
          document does not have standalone set to "yes" in an XML declaration. If this
          handler returns <code>XML_STATUS_ERROR</code>, then parsing fails with an
          <code>XML_ERROR_NOT_STANDALONE</code> error.
2854        </p>
2855      </div>
2856
2857      <h3>
2858        <a id="position" name="position">Parse position and error reporting functions</a>
2859      </h3>
2860
2861      <p>
2862        These are the functions you'll want to call when the parse functions return
2863        <code>XML_STATUS_ERROR</code> (a parse error has occurred), although the position
2864        reporting functions are useful outside of errors. The position reported is the
2865        byte position (in the original document or entity encoding) of the first of the
2866        sequence of characters that generated the current event (or the error that caused
        the parse functions to return <code>XML_STATUS_ERROR</code>). The exceptions are
        callbacks triggered by declarations in the document prologue, in which case the
        exact position reported is somewhere in the relevant markup, but not necessarily
2870        as meaningful as for other events.
2871      </p>
2872
2873      <p>
2874        The position reporting functions are accurate only outside of the DTD. In other
2875        words, they usually return bogus information when called from within a DTD
2876        declaration handler.
2877      </p>
2878
2879      <h4 id="XML_GetErrorCode">
2880        XML_GetErrorCode
2881      </h4>
2882
2883      <pre class="fcndec">
2884enum XML_Error XMLCALL
2885XML_GetErrorCode(XML_Parser p);
2886</pre>
2887      <div class="fcndef">
2888        Return what type of error has occurred.
2889      </div>
2890
2891      <h4 id="XML_ErrorString">
2892        XML_ErrorString
2893      </h4>
2894
2895      <pre class="fcndec">
2896const XML_LChar * XMLCALL
2897XML_ErrorString(enum XML_Error code);
2898</pre>
2899      <div class="fcndef">
        Return a string describing the error corresponding to <code>code</code>. The
        code should be one of the enums that can be returned from <code><a href=
        "#XML_GetErrorCode">XML_GetErrorCode</a></code>.
2903      </div>
2904
2905      <h4 id="XML_GetCurrentByteIndex">
2906        XML_GetCurrentByteIndex
2907      </h4>
2908
2909      <pre class="fcndec">
2910XML_Index XMLCALL
2911XML_GetCurrentByteIndex(XML_Parser p);
2912</pre>
2913      <div class="fcndef">
2914        Return the byte offset of the position. This always corresponds to the values
2915        returned by <code><a href=
2916        "#XML_GetCurrentLineNumber">XML_GetCurrentLineNumber</a></code> and
2917        <code><a href="#XML_GetCurrentColumnNumber">XML_GetCurrentColumnNumber</a></code>.
2918      </div>
2919
2920      <h4 id="XML_GetCurrentLineNumber">
2921        XML_GetCurrentLineNumber
2922      </h4>
2923
2924      <pre class="fcndec">
2925XML_Size XMLCALL
2926XML_GetCurrentLineNumber(XML_Parser p);
2927</pre>
2928      <div class="fcndef">
2929        Return the line number of the position. The first line is reported as
2930        <code>1</code>.
2931      </div>
2932
2933      <h4 id="XML_GetCurrentColumnNumber">
2934        XML_GetCurrentColumnNumber
2935      </h4>
2936
2937      <pre class="fcndec">
2938XML_Size XMLCALL
2939XML_GetCurrentColumnNumber(XML_Parser p);
2940</pre>
2941      <div class="fcndef">
2942        Return the <em>offset</em>, from the beginning of the current line, of the
2943        position. The first column is reported as <code>0</code>.
2944      </div>
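
      <p>
        Since lines are reported starting at <code>1</code> but columns starting at
        <code>0</code>, code that prints positions for humans often adds one to the
        column. The helper below is a hypothetical sketch of that adjustment; it takes
        <code>unsigned long long</code> arguments on the assumption that this is wide
        enough for either definition of <code>XML_Size</code>.
      </p>

```c
#include <stdio.h>

/* Write "line:column" into out, with the column converted from the
   0-based value reported by the parser to a 1-based value for display.
   Returns the snprintf result (length that was or would be written). */
int
format_position(unsigned long long line, unsigned long long column0,
                char *out, size_t outSize) {
  return snprintf(out, outSize, "%llu:%llu", line, column0 + 1);
}
```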
2945
2946      <h4 id="XML_GetCurrentByteCount">
2947        XML_GetCurrentByteCount
2948      </h4>
2949
2950      <pre class="fcndec">
2951int XMLCALL
2952XML_GetCurrentByteCount(XML_Parser p);
2953</pre>
2954      <div class="fcndef">
2955        Return the number of bytes in the current event. Returns <code>0</code> if the
2956        event is inside a reference to an internal entity and for the end-tag event for
        empty element tags (the latter can be used to distinguish empty-element tags from
2958        empty elements using separate start and end tags).
2959      </div>
2960
2961      <h4 id="XML_GetInputContext">
2962        XML_GetInputContext
2963      </h4>
2964
2965      <pre class="fcndec">
2966const char * XMLCALL
2967XML_GetInputContext(XML_Parser p,
2968                    int *offset,
2969                    int *size);
2970</pre>
2971      <div class="fcndef">
2972        <p>
2973          Returns the parser's input buffer, sets the integer pointed at by
2974          <code>offset</code> to the offset within this buffer of the current parse
          position, and sets the integer pointed at by <code>size</code> to the size of
2976          the returned buffer.
2977        </p>
2978
2979        <p>
2980          This should only be called from within a handler during an active parse and the
2981          returned buffer should only be referred to from within the handler that made
2982          the call. This input buffer contains the untranslated bytes of the input.
2983        </p>
2984
2985        <p>
2986          Only a limited amount of context is kept, so if the event triggering a call
2987          spans over a very large amount of input, the actual parse position may be
2988          before the beginning of the buffer.
2989        </p>
2990
2991        <p>
2992          If <code>XML_CONTEXT_BYTES</code> is zero, this will always return
2993          <code>NULL</code>.
2994        </p>
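
        <p>
          A handler that wants to show a few bytes around the parse position has to
          cope with a <code>NULL</code> buffer and with an offset outside the buffer.
          The hypothetical helper <code>copy_context</code> below sketches such a
          defensive copy; its name, its <code>radius</code> parameter, and its behavior
          of returning nothing for out-of-range offsets are assumptions of this example.
          The copied bytes are untranslated input, so they are only readable as text in
          ASCII-compatible encodings.
        </p>

```c
#include <stddef.h>
#include <string.h>

/* Copy up to radius bytes on each side of offset from buf[0..size) into
   out and null-terminate it. Returns the number of bytes copied, or 0 if
   no context is available. */
size_t
copy_context(const char *buf, int offset, int size, int radius, char *out,
             size_t outSize) {
  int start, end;
  size_t n;
  if (outSize == 0)
    return 0;
  out[0] = '\0';
  if (buf == NULL || size <= 0 || radius < 0 || offset < 0 || offset > size)
    return 0;
  start = (offset > radius) ? offset - radius : 0;
  end = (size - offset > radius) ? offset + radius : size;
  n = (size_t)(end - start);
  if (n > outSize - 1)
    n = outSize - 1; /* leave room for the terminator */
  memcpy(out, buf + start, n);
  out[n] = '\0';
  return n;
}
```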
2995      </div>
2996
2997      <h3>
2998        <a id="attack-protection" name="attack-protection">Attack Protection</a><a id=
2999        "billion-laughs" name="billion-laughs"></a>
3000      </h3>
3001
3002      <h4 id="XML_SetBillionLaughsAttackProtectionMaximumAmplification">
3003        XML_SetBillionLaughsAttackProtectionMaximumAmplification
3004      </h4>
3005
3006      <pre class="fcndec">
3007/* Added in Expat 2.4.0. */
3008XML_Bool XMLCALL
3009XML_SetBillionLaughsAttackProtectionMaximumAmplification(XML_Parser p,
3010                                                         float maximumAmplificationFactor);
3011</pre>
3012      <div class="fcndef">
3013        <p>
3014          Sets the maximum tolerated amplification factor for protection against <a href=
3015          "https://en.wikipedia.org/wiki/Billion_laughs_attack">billion laughs
3016          attacks</a> (default: <code>100.0</code>) of parser <code>p</code> to
3017          <code>maximumAmplificationFactor</code>, and returns <code>XML_TRUE</code> upon
3018          success and <code>XML_FALSE</code> upon error.
3019        </p>
3020
3021        <p>
3022          Once the <a href=
3023          "#XML_SetBillionLaughsAttackProtectionActivationThreshold">threshold for
3024          activation</a> is reached, the amplification factor is calculated as ..
3025        </p>
3026
3027        <pre>amplification := (direct + indirect) / direct</pre>
3028        <p>
          .. while parsing, where <code>direct</code> is the number of bytes read from
3030          the primary document in parsing and <code>indirect</code> is the number of
3031          bytes added by expanding entities and reading of external DTD files, combined.
3032        </p>
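
        <p>
          The arithmetic can be sketched as follows. This mirrors only the documented
          formula and is not Expat's internal implementation; the function names, the
          treatment of <code>direct == 0</code>, and the reading of the activation
          threshold as a bound on <code>direct + indirect</code> are assumptions of this
          example.
        </p>

```c
/* amplification := (direct + indirect) / direct */
float
amplification(unsigned long long direct, unsigned long long indirect) {
  if (direct == 0)
    return 1.0f; /* assumption for this sketch: nothing read yet */
  return (float)(direct + indirect) / (float)direct;
}

/* Returns 1 if the input would be rejected under the given settings,
   0 otherwise. */
int
exceeds_limit(unsigned long long direct, unsigned long long indirect,
              unsigned long long activationThresholdBytes,
              float maximumAmplificationFactor) {
  if (direct + indirect < activationThresholdBytes)
    return 0; /* protection is not active yet */
  return amplification(direct, indirect) > maximumAmplificationFactor;
}
```

        <p>
          For example, 1,000 direct bytes that expand to 9,000,000 indirect bytes give
          an amplification of about 9,001, which is above the default factor of
          <code>100.0</code> and past the default activation threshold.
        </p>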
3033
3034        <p>
3035          For a call to
3036          <code>XML_SetBillionLaughsAttackProtectionMaximumAmplification</code> to
3037          succeed:
3038        </p>
3039
3040        <ul>
3041          <li>parser <code>p</code> must be a non-<code>NULL</code> root parser (without
3042          any parent parsers) and
3043          </li>
3044
3045          <li>
3046            <code>maximumAmplificationFactor</code> must be non-<code>NaN</code> and
3047            greater than or equal to <code>1.0</code>.
3048          </li>
3049        </ul>
3050
3051        <p>
3052          <strong>Note:</strong> If you ever need to increase this value for non-attack
3053          payload, please <a href="https://github.com/libexpat/libexpat/issues">file a
3054          bug report</a>.
3055        </p>
3056
3057        <p>
3058          <strong>Note:</strong> Peak amplifications of factor 15,000 for the entire
3059          payload and of factor 30,000 in the middle of parsing have been observed with
3060          small benign files in practice. So if you do reduce the maximum allowed
3061          amplification, please make sure that the activation threshold is still big
3062          enough to not end up with undesired false positives (i.e. benign files being
3063          rejected).
3064        </p>
3065      </div>
3066
3067      <h4 id="XML_SetBillionLaughsAttackProtectionActivationThreshold">
3068        XML_SetBillionLaughsAttackProtectionActivationThreshold
3069      </h4>
3070
3071      <pre class="fcndec">
3072/* Added in Expat 2.4.0. */
3073XML_Bool XMLCALL
3074XML_SetBillionLaughsAttackProtectionActivationThreshold(XML_Parser p,
3075                                                        unsigned long long activationThresholdBytes);
3076</pre>
3077      <div class="fcndef">
3078        <p>
          Sets the number of output bytes (including amplification from entity expansion and
3080          reading DTD files) needed to activate protection against <a href=
3081          "https://en.wikipedia.org/wiki/Billion_laughs_attack">billion laughs
3082          attacks</a> (default: <code>8 MiB</code>) of parser <code>p</code> to
3083          <code>activationThresholdBytes</code>, and returns <code>XML_TRUE</code> upon
3084          success and <code>XML_FALSE</code> upon error.
3085        </p>
3086
3087        <p>
3088          For a call to
3089          <code>XML_SetBillionLaughsAttackProtectionActivationThreshold</code> to
3090          succeed:
3091        </p>
3092
3093        <ul>
3094          <li>parser <code>p</code> must be a non-<code>NULL</code> root parser (without
3095          any parent parsers).
3096          </li>
3097        </ul>
3098
3099        <p>
3100          <strong>Note:</strong> If you ever need to increase this value for non-attack
3101          payload, please <a href="https://github.com/libexpat/libexpat/issues">file a
3102          bug report</a>.
3103        </p>
3104
3105        <p>
3106          <strong>Note:</strong> Activation thresholds below 4 MiB are known to break
3107          support for <a href=
3108          "https://en.wikipedia.org/wiki/Darwin_Information_Typing_Architecture">DITA</a>
3109          1.3 payload and are hence not recommended.
3110        </p>
3111      </div>
3112
3113      <h4 id="XML_SetAllocTrackerMaximumAmplification">
3114        XML_SetAllocTrackerMaximumAmplification
3115      </h4>
3116
3117      <pre class="fcndec">
3118/* Added in Expat 2.7.2. */
3119XML_Bool
3120XML_SetAllocTrackerMaximumAmplification(XML_Parser p,
3121                                        float maximumAmplificationFactor);
3122</pre>
3123      <div class="fcndef">
3124        <p>
3125          Sets the maximum tolerated amplification factor between direct input and bytes
3126          of dynamic memory allocated (default: <code>100.0</code>) of parser
3127          <code>p</code> to <code>maximumAmplificationFactor</code>, and returns
3128          <code>XML_TRUE</code> upon success and <code>XML_FALSE</code> upon error.
3129        </p>
3130
3131        <p>
3132          <strong>Note:</strong> There are three types of allocations that intentionally
3133          bypass tracking and limiting:
3134        </p>
3135
3136        <ul>
3137          <li>application calls to functions <code><a href=
3138          "#XML_MemMalloc">XML_MemMalloc</a></code> and <code><a href="#XML_MemRealloc">
3139            XML_MemRealloc</a></code> — <em>healthy</em> use of these two functions
3140            continues to be a responsibility of the application using Expat —,
3141          </li>
3142
3143          <li>the main character buffer used by functions <code><a href="#XML_GetBuffer">
3144            XML_GetBuffer</a></code> and <code><a href=
3145            "#XML_ParseBuffer">XML_ParseBuffer</a></code> (and thus also by plain
3146            <code><a href="#XML_Parse">XML_Parse</a></code>), and
3147          </li>
3148
3149          <li>the <a href="#XML_SetElementDeclHandler">content model memory</a> (that is
3150          passed to the <a href="#XML_SetElementDeclHandler">element declaration
3151          handler</a> and freed by a call to <code><a href=
3152          "#XML_FreeContentModel">XML_FreeContentModel</a></code>).
3153          </li>
3154        </ul>
3155
3156        <p>
3157          Once the <a href="#XML_SetAllocTrackerActivationThreshold">threshold for
3158          activation</a> is reached, the amplification factor is calculated as ..
3159        </p>
3160
3161        <pre>amplification := allocated / direct</pre>
3162        <p>
          .. while parsing, where <code>direct</code> is the number of bytes read from
3164          the primary document in parsing and <code>allocated</code> is the number of
3165          bytes of dynamic memory allocated in the parser hierarchy.
3166        </p>
3167
3168        <p>
3169          For a call to <code>XML_SetAllocTrackerMaximumAmplification</code> to succeed:
3170        </p>
3171
3172        <ul>
3173          <li>parser <code>p</code> must be a non-<code>NULL</code> root parser (without
3174          any parent parsers) and
3175          </li>
3176
3177          <li>
3178            <code>maximumAmplificationFactor</code> must be non-<code>NaN</code> and
3179            greater than or equal to <code>1.0</code>.
3180          </li>
3181        </ul>
3182
3183        <p>
3184          <strong>Note:</strong> If you ever need to increase this value for non-attack
3185          payload, please <a href="https://github.com/libexpat/libexpat/issues">file a
3186          bug report</a>.
3187        </p>
3188
3189        <p>
          <strong>Note:</strong> Amplification factors greater than <code>100.0</code>
          can be observed near the start of parsing even with benign files in practice.
3192          So if you do reduce the maximum allowed amplification, please make sure that
3193          the activation threshold is still big enough to not end up with undesired false
3194          positives (i.e. benign files being rejected).
3195        </p>
3196      </div>
3197
3198      <h4 id="XML_SetAllocTrackerActivationThreshold">
3199        XML_SetAllocTrackerActivationThreshold
3200      </h4>
3201
3202      <pre class="fcndec">
3203/* Added in Expat 2.7.2. */
3204XML_Bool
3205XML_SetAllocTrackerActivationThreshold(XML_Parser p,
3206                                       unsigned long long activationThresholdBytes);
3207</pre>
3208      <div class="fcndef">
3209        <p>
          Sets the number of allocated bytes of dynamic memory needed to activate protection
3211          against disproportionate use of RAM (default: <code>64 MiB</code>) of parser
3212          <code>p</code> to <code>activationThresholdBytes</code>, and returns
3213          <code>XML_TRUE</code> upon success and <code>XML_FALSE</code> upon error.
3214        </p>
3215
3216        <p>
3217          <strong>Note:</strong> For types of allocations that intentionally bypass
3218          tracking and limiting, please see <code><a href=
3219          "#XML_SetAllocTrackerMaximumAmplification">XML_SetAllocTrackerMaximumAmplification</a></code>
3220          above.
3221        </p>
3222
3223        <p>
3224          For a call to <code>XML_SetAllocTrackerActivationThreshold</code> to succeed:
3225        </p>
3226
3227        <ul>
3228          <li>parser <code>p</code> must be a non-<code>NULL</code> root parser (without
3229          any parent parsers).
3230          </li>
3231        </ul>
3232
3233        <p>
3234          <strong>Note:</strong> If you ever need to increase this value for non-attack
3235          payload, please <a href="https://github.com/libexpat/libexpat/issues">file a
3236          bug report</a>.
3237        </p>
3238      </div>
3239
3240      <h4 id="XML_SetReparseDeferralEnabled">
3241        XML_SetReparseDeferralEnabled
3242      </h4>
3243
3244      <pre class="fcndec">
3245/* Added in Expat 2.6.0. */
3246XML_Bool XMLCALL
3247XML_SetReparseDeferralEnabled(XML_Parser parser, XML_Bool enabled);
3248</pre>
3249      <div class="fcndef">
3250        <p>
3251          Large tokens may require many parse calls before enough data is available for
3252          Expat to parse it in full. If Expat retried parsing the token on every parse
3253          call, parsing could take quadratic time. To avoid this, Expat only retries once
3254          a significant amount of new data is available. This function allows disabling
3255          this behavior.
3256        </p>
3257
3258        <p>
3259          The <code>enabled</code> argument should be <code>XML_TRUE</code> or
3260          <code>XML_FALSE</code>.
3261        </p>
3262
3263        <p>
3264          Returns <code>XML_TRUE</code> on success, and <code>XML_FALSE</code> on error.
3265        </p>
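        <p>
          To illustrate why retrying on every parse call is quadratic, the following
          sketch counts how many bytes would be scanned for one large token that arrives
          in fixed-size chunks. It is a simplified model, not Expat's code: both helper
          functions are hypothetical, and the "retry only once the available data has at
          least doubled" rule is an assumption standing in for "a significant amount of
          new data".
        </p>

```c
#include <assert.h>
#include <stddef.h>

/* Bytes scanned if the unfinished token is retried after every chunk:
   each retry rescans everything received so far. */
static size_t
bytes_scanned_retry_always(size_t token_bytes, size_t chunk_bytes) {
  assert(chunk_bytes > 0);
  size_t scanned = 0;
  for (size_t have = chunk_bytes; have <= token_bytes; have += chunk_bytes)
    scanned += have;
  return scanned;
}

/* Bytes scanned if a retry only happens once the available data has at
   least doubled since the previous attempt (assumed heuristic). */
static size_t
bytes_scanned_retry_deferred(size_t token_bytes, size_t chunk_bytes) {
  assert(chunk_bytes > 0);
  size_t scanned = 0;
  size_t last_attempt = 0;
  for (size_t have = chunk_bytes; have <= token_bytes; have += chunk_bytes) {
    if (have >= 2 * last_attempt) {
      scanned += have;
      last_attempt = have;
    }
  }
  return scanned;
}
```

        <p>
          For a 1024-byte token fed one byte at a time, the first strategy scans
          524800 bytes in total, while the deferred one scans 2047.
        </p>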
3266      </div>
3267
3268      <h3>
3269        <a id="miscellaneous" name="miscellaneous">Miscellaneous functions</a>
3270      </h3>
3271
3272      <p>
3273        The functions in this section either obtain state information from the parser or
3274        can be used to dynamically set parser options.
3275      </p>
3276
3277      <h4 id="XML_SetUserData">
3278        XML_SetUserData
3279      </h4>
3280
3281      <pre class="fcndec">
3282void XMLCALL
3283XML_SetUserData(XML_Parser p,
3284                void *userData);
3285</pre>
3286      <div class="fcndef">
3287        This sets the user data pointer that gets passed to handlers. It overwrites any
3288        previous value for this pointer. Note that the application is responsible for
3289        freeing the memory associated with <code>userData</code> when it is finished with
3290        the parser. So if you call this when there's already a pointer there, and you
3291        haven't freed the memory associated with it, then you've probably just leaked
3292        memory.
3293      </div>
3294
3295      <h4 id="XML_GetUserData">
3296        XML_GetUserData
3297      </h4>
3298
3299      <pre class="fcndec">
3300void * XMLCALL
3301XML_GetUserData(XML_Parser p);
3302</pre>
3303      <div class="fcndef">
3304        This returns the user data pointer that gets passed to handlers. It is actually
3305        implemented as a macro.
3306      </div>
3307
3308      <h4 id="XML_UseParserAsHandlerArg">
3309        XML_UseParserAsHandlerArg
3310      </h4>
3311
3312      <pre class="fcndec">
3313void XMLCALL
3314XML_UseParserAsHandlerArg(XML_Parser p);
3315</pre>
3316      <div class="fcndef">
3317        After this is called, handlers receive the parser in their <code>userData</code>
3318        arguments. The user data can still be obtained using the <code><a href=
3319        "#XML_GetUserData">XML_GetUserData</a></code> function.
3320      </div>
3321
3322      <h4 id="XML_SetBase">
3323        XML_SetBase
3324      </h4>
3325
3326      <pre class="fcndec">
3327enum XML_Status XMLCALL
3328XML_SetBase(XML_Parser p,
3329            const XML_Char *base);
3330</pre>
3331      <div class="fcndef">
3332        Set the base to be used for resolving relative URIs in system identifiers. The
3333        return value is <code>XML_STATUS_ERROR</code> if there's no memory to store base,
3334        otherwise it's <code>XML_STATUS_OK</code>.
3335      </div>
3336
3337      <h4 id="XML_GetBase">
3338        XML_GetBase
3339      </h4>
3340
3341      <pre class="fcndec">
3342const XML_Char * XMLCALL
3343XML_GetBase(XML_Parser p);
3344</pre>
3345      <div class="fcndef">
3346        Return the base for resolving relative URIs.
3347      </div>
3348
3349      <h4 id="XML_GetSpecifiedAttributeCount">
3350        XML_GetSpecifiedAttributeCount
3351      </h4>
3352
3353      <pre class="fcndec">
3354int XMLCALL
3355XML_GetSpecifiedAttributeCount(XML_Parser p);
3356</pre>
3357      <div class="fcndef">
3358        When attributes are reported to the start handler in the atts vector, attributes
3359        that were explicitly set in the element occur before any attributes that receive
3360        their value from default information in an ATTLIST declaration. This function
3361        returns the number of attributes that were explicitly set times two, thus giving
3362        the offset in the <code>atts</code> array passed to the start tag handler of the
3363        first attribute set due to defaults. It supplies information for the last call to
3364        a start handler. If called inside a start handler, then that means the current
3365        call.
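        <p>
          As a sketch of how the returned offset relates to the <code>atts</code>
          vector, the hypothetical helper below splits the attribute count into
          specified and defaulted attributes. It assumes that <code>XML_Char</code> is
          <code>char</code> and that <code>atts</code> is a <code>NULL</code>-terminated
          vector of name/value pairs; <code>specified_offset</code> stands for the value
          returned by <code>XML_GetSpecifiedAttributeCount</code>.
        </p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: reports how many attributes were explicitly
   specified and how many came from ATTLIST defaults.
   specified_offset is the value XML_GetSpecifiedAttributeCount returns. */
static void
count_attributes(const char **atts, int specified_offset,
                 int *specified, int *defaulted) {
  assert(atts != NULL && specified_offset >= 0 && specified_offset % 2 == 0);
  int total_slots = 0;
  while (atts[total_slots] != NULL)
    total_slots += 2; /* skip one name and its value */
  assert(specified_offset <= total_slots);
  *specified = specified_offset / 2;
  *defaulted = (total_slots - specified_offset) / 2;
}
```

        <p>
          Inside a start handler, <code>atts[specified_offset]</code> would then be the
          name of the first defaulted attribute, if there is one.
        </p>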
3366      </div>
3367
3368      <h4 id="XML_GetIdAttributeIndex">
3369        XML_GetIdAttributeIndex
3370      </h4>
3371
3372      <pre class="fcndec">
3373int XMLCALL
3374XML_GetIdAttributeIndex(XML_Parser p);
3375</pre>
3376      <div class="fcndef">
3377        Returns the index of the ID attribute passed in the atts array in the last call
3378        to <code><a href="#XML_StartElementHandler">XML_StartElementHandler</a></code>,
3379        or -1 if there is no ID attribute. If called inside a start handler, then that
3380        means the current call.
3381      </div>
3382
3383      <h4 id="XML_GetAttributeInfo">
3384        XML_GetAttributeInfo
3385      </h4>
3386
3387      <pre class="fcndec">
3388const XML_AttrInfo * XMLCALL
3389XML_GetAttributeInfo(XML_Parser parser);
3390</pre>
3391
3392      <pre class="signature">
3393typedef struct {
3394  XML_Index  nameStart;  /* Offset to beginning of the attribute name. */
3395  XML_Index  nameEnd;    /* Offset after the attribute name's last byte. */
3396  XML_Index  valueStart; /* Offset to beginning of the attribute value. */
3397  XML_Index  valueEnd;   /* Offset after the attribute value's last byte. */
3398} XML_AttrInfo;
3399</pre>
3400      <div class="fcndef">
3401        Returns an array of <code>XML_AttrInfo</code> structures for the attribute/value
3402        pairs passed in the last call to the <code>XML_StartElementHandler</code> that
3403        were specified in the start-tag rather than defaulted. Each attribute/value pair
3404        counts as 1; thus the number of entries in the array is
3405        <code>XML_GetSpecifiedAttributeCount(parser) / 2</code>.
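        <p>
          The end offsets are exclusive, so a name or value spans
          <code>[start, end)</code>. The sketch below copies such a range out of a
          document buffer. It assumes that the offsets are byte offsets from the start
          of the document and that the application still holds the document in memory;
          <code>attr_info</code> is a hypothetical stand-in for
          <code>XML_AttrInfo</code> with <code>XML_Index</code> modeled as
          <code>long</code>.
        </p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for XML_AttrInfo; XML_Index is modeled as long. */
typedef struct {
  long nameStart, nameEnd, valueStart, valueEnd;
} attr_info;

/* Copies doc[start, end) into out as a NUL-terminated string.
   Returns 0 on success, -1 if the range is invalid or does not fit. */
static int
copy_range(const char *doc, long start, long end, char *out, size_t out_size) {
  assert(doc != NULL && out != NULL);
  if (start < 0 || end < start || (size_t)(end - start) >= out_size)
    return -1;
  memcpy(out, doc + start, (size_t)(end - start));
  out[end - start] = '\0';
  return 0;
}
```

        <p>
          For the document <code>&lt;a id="7"/&gt;</code>, hand-computed offsets of
          3, 5, 7 and 8 select the name <code>id</code> and the value <code>7</code>.
        </p>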
3406      </div>
3407
3408      <h4 id="XML_SetEncoding">
3409        XML_SetEncoding
3410      </h4>
3411
3412      <pre class="fcndec">
3413enum XML_Status XMLCALL
3414XML_SetEncoding(XML_Parser p,
3415                const XML_Char *encoding);
3416</pre>
3417      <div class="fcndef">
3418        Set the encoding to be used by the parser. It is equivalent to passing a
3419        non-<code>NULL</code> encoding argument to the parser creation functions. It must
3420        not be called after <code><a href="#XML_Parse">XML_Parse</a></code> or
        <code><a href="#XML_ParseBuffer">XML_ParseBuffer</a></code> has been called on
3422        the given parser. Returns <code>XML_STATUS_OK</code> on success or
3423        <code>XML_STATUS_ERROR</code> on error.
3424      </div>
3425
3426      <h4 id="XML_SetParamEntityParsing">
3427        XML_SetParamEntityParsing
3428      </h4>
3429
3430      <pre class="fcndec">
3431int XMLCALL
3432XML_SetParamEntityParsing(XML_Parser p,
3433                          enum XML_ParamEntityParsing code);
3434</pre>
3435      <div class="fcndef">
3436        This enables parsing of parameter entities, including the external parameter
3437        entity that is the external DTD subset, according to <code>code</code>. The
3438        choices for <code>code</code> are:
3439        <ul>
3440          <li>
3441            <code>XML_PARAM_ENTITY_PARSING_NEVER</code>
3442          </li>
3443
3444          <li>
3445            <code>XML_PARAM_ENTITY_PARSING_UNLESS_STANDALONE</code>
3446          </li>
3447
3448          <li>
3449            <code>XML_PARAM_ENTITY_PARSING_ALWAYS</code>
3450          </li>
3451        </ul>
3452        <b>Note:</b> If <code>XML_SetParamEntityParsing</code> is called after
3453        <code>XML_Parse</code> or <code>XML_ParseBuffer</code>, then it has no effect and
3454        will always return 0.
3455      </div>
3456
3457      <h4 id="XML_SetHashSalt">
3458        XML_SetHashSalt (deprecated)
3459      </h4>
3460
3461      <pre class="fcndec">
3462int XMLCALL
3463XML_SetHashSalt(XML_Parser parser,
3464                unsigned long hash_salt);
3465</pre>
3466      <div class="fcndef">
3467        Sets the hash salt to use for internal hash calculations. Helps in preventing DoS
3468        attacks based on predicting hash function behavior. In order to have an effect
3469        this must be called before parsing has started. Returns 1 if successful, 0 when
3470        called after <code>XML_Parse</code> or <code>XML_ParseBuffer</code> or when
3471        <code>parser</code> is <code>NULL</code>.
3472        <p>
3473          <b>Note:</b> Function <code>XML_SetHashSalt</code> is
3474          <strong>deprecated</strong>. Please use function <code><a href=
3475          "#XML_SetHashSalt16Bytes">XML_SetHashSalt16Bytes</a></code> instead for better
3476          security. <code>XML_SetHashSalt</code> only provides 4 to 8 bytes of entropy
3477          (depending on the size of type <code>unsigned long</code>) while the SipHash
3478          implementation used by Expat can leverage up to 16 bytes of entropy — at least
3479          twice as much. Function <code><a href=
3480          "#XML_SetHashSalt16Bytes">XML_SetHashSalt16Bytes</a></code> of Expat &gt;=2.8.0
3481          (and where backported) matches the amount of entropy supported by SipHash.
3482        </p>
3483
3484        <p>
3485          <b>Note:</b> This call is optional, as the parser will auto-generate a new
3486          random salt value internally if no value has been set by the start of parsing.
3487        </p>
3488
3489        <p>
3490          <b>Note:</b> One should not call <code>XML_SetHashSalt</code> with a hash salt
3491          value of 0, as this value is used as sentinel value to indicate that
3492          <code>XML_SetHashSalt</code> has <b>not</b> been called. Consequently such a
3493          call will have no effect, even if it returns 1.
3494        </p>
3495      </div>
3496
3497      <h4 id="XML_SetHashSalt16Bytes">
3498        XML_SetHashSalt16Bytes
3499      </h4>
3500
3501      <pre class="fcndec">
3502/* Added in Expat 2.8.0. */
3503XML_Bool XMLCALL
3504XML_SetHashSalt16Bytes(XML_Parser parser,
3505                       const uint8_t entropy[16]);
3506</pre>
3507      <div class="fcndef">
3508        Sets the hash salt to use for internal hash calculations. Helps in preventing DoS
3509        attacks based on predicting hash function behavior. In order to have an effect
3510        this must be called before parsing has started. Returns <code>XML_TRUE</code> if
3511        successful, <code>XML_FALSE</code> when called after <code>XML_Parse</code> or
3512        <code>XML_ParseBuffer</code> or when <code>parser</code> is <code>NULL</code>.
3513        <p>
3514          <b>Note:</b> Setting a salt that is <em>not</em> from a source of high quality
3515          entropy (like <code>getentropy(3)</code>) will make the parser vulnerable to
3516          hash flooding attacks.
3517        </p>
3518
3519        <p>
3520          <b>Note:</b> This call is optional, as the parser will auto-generate a new
3521          random salt value internally if no value has been set by the start of parsing.
3522        </p>
3523      </div>
3524
3525      <h4 id="XML_UseForeignDTD">
3526        XML_UseForeignDTD
3527      </h4>
3528
3529      <pre class="fcndec">
3530enum XML_Error XMLCALL
3531XML_UseForeignDTD(XML_Parser parser, XML_Bool useDTD);
3532</pre>
3533      <div class="fcndef">
3534        <p>
3535          This function allows an application to provide an external subset for the
3536          document type declaration for documents which do not specify an external subset
3537          of their own. For documents which specify an external subset in their DOCTYPE
3538          declaration, the application-provided subset will be ignored. If the document
3539          does not contain a DOCTYPE declaration at all and <code>useDTD</code> is true,
3540          the application-provided subset will be parsed, but the
3541          <code>startDoctypeDeclHandler</code> and <code>endDoctypeDeclHandler</code>
3542          functions, if set, will not be called. The setting of parameter entity parsing,
3543          controlled using <code><a href=
3544          "#XML_SetParamEntityParsing">XML_SetParamEntityParsing</a></code>, will be
3545          honored.
3546        </p>
3547
3548        <p>
3549          The application-provided external subset is read by calling the external entity
3550          reference handler set via <code><a href=
3551          "#XML_SetExternalEntityRefHandler">XML_SetExternalEntityRefHandler</a></code>
3552          with both <code>publicId</code> and <code>systemId</code> set to
3553          <code>NULL</code>.
3554        </p>
3555
3556        <p>
3557          If this function is called after parsing has begun, it returns
3558          <code>XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING</code> and ignores
3559          <code>useDTD</code>. If called when Expat has been compiled without DTD
3560          support, it returns <code>XML_ERROR_FEATURE_REQUIRES_XML_DTD</code>. Otherwise,
3561          it returns <code>XML_ERROR_NONE</code>.
3562        </p>
3563
3564        <p>
3565          <b>Note:</b> For the purpose of checking WFC: Entity Declared, passing
3566          <code>useDTD == XML_TRUE</code> will make the parser behave as if the document
3567          had a DTD with an external subset. This holds true even if the external entity
3568          reference handler returns without action.
3569        </p>
3570      </div>
3571
3572      <h4 id="XML_SetReturnNSTriplet">
3573        XML_SetReturnNSTriplet
3574      </h4>
3575
3576      <pre class="fcndec">
3577void XMLCALL
3578XML_SetReturnNSTriplet(XML_Parser parser,
3579                       int        do_nst);
3580</pre>
3581      <div class="fcndef">
3582        <p>
3583          This function only has an effect when using a parser created with
3584          <code><a href="#XML_ParserCreateNS">XML_ParserCreateNS</a></code>, i.e. when
          namespace processing is in effect. The <code>do_nst</code> argument sets whether or not
3586          prefixes are returned with names qualified with a namespace prefix. If this
3587          function is called with <code>do_nst</code> non-zero, then afterwards namespace
3588          qualified names (that is qualified with a prefix as opposed to belonging to a
3589          default namespace) are returned as a triplet with the three parts separated by
3590          the namespace separator specified when the parser was created. The order of
3591          returned parts is URI, local name, and prefix.
3592        </p>
3593
3594        <p>
          If <code>do_nst</code> is zero, then namespaces are reported in the default
          manner: URI, then local name, separated by the namespace separator.
3597        </p>
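        <p>
          A handler that receives such names could split them as sketched below. The
          helper <code>split_ns_name</code> is hypothetical; it assumes that
          <code>XML_Char</code> is <code>char</code>, that the separator character does
          not occur inside the URI, and that the caller passes a writable copy of the
          name, since names handed to handlers must not be modified.
        </p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits "URI<sep>local<sep>prefix" in place.
   uri and prefix are set to NULL when the corresponding part is absent. */
static void
split_ns_name(char *name, char sep, char **uri, char **local, char **prefix) {
  assert(name != NULL);
  *uri = NULL;
  *prefix = NULL;
  *local = name;
  char *first = strchr(name, sep);
  if (first == NULL)
    return; /* name is not in any namespace */
  *first = '\0';
  *uri = name;
  *local = first + 1;
  char *second = strchr(*local, sep);
  if (second != NULL) { /* triplet form: a prefix follows the local name */
    *second = '\0';
    *prefix = second + 1;
  }
}
```

        <p>
          With separator <code>|</code>, the name
          <code>http://example.org/ns|item|ex</code> splits into the URI
          <code>http://example.org/ns</code>, the local name <code>item</code> and the
          prefix <code>ex</code>.
        </p>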
3598      </div>
3599
3600      <h4 id="XML_DefaultCurrent">
3601        XML_DefaultCurrent
3602      </h4>
3603
3604      <pre class="fcndec">
3605void XMLCALL
3606XML_DefaultCurrent(XML_Parser parser);
3607</pre>
3608      <div class="fcndef">
3609        This can be called within a handler for a start element, end element, processing
3610        instruction or character data. It causes the corresponding markup to be passed to
3611        the default handler set by <code><a href=
3612        "#XML_SetDefaultHandler">XML_SetDefaultHandler</a></code> or <code><a href=
3613        "#XML_SetDefaultHandlerExpand">XML_SetDefaultHandlerExpand</a></code>. It does
3614        nothing if there is not a default handler.
3615      </div>
3616
3617      <h4 id="XML_ExpatVersion">
3618        XML_ExpatVersion
3619      </h4>
3620
3621      <pre class="fcndec">
3622XML_LChar * XMLCALL
3623XML_ExpatVersion();
3624</pre>
3625      <div class="fcndef">
3626        Return the library version as a string (e.g. <code>"expat_1.95.1"</code>).
3627      </div>
3628
3629      <h4 id="XML_ExpatVersionInfo">
3630        XML_ExpatVersionInfo
3631      </h4>
3632
3633      <pre class="fcndec">
3634struct XML_Expat_Version XMLCALL
3635XML_ExpatVersionInfo();
3636</pre>
3637
3638      <pre class="signature">
3639typedef struct {
3640  int major;
3641  int minor;
3642  int micro;
3643} XML_Expat_Version;
3644</pre>
3645      <div class="fcndef">
3646        Return the library version information as a structure. Some macros are also
3647        defined that support compile-time tests of the library version:
3648        <ul>
3649          <li>
3650            <code>XML_MAJOR_VERSION</code>
3651          </li>
3652
3653          <li>
3654            <code>XML_MINOR_VERSION</code>
3655          </li>
3656
3657          <li>
3658            <code>XML_MICRO_VERSION</code>
3659          </li>
3660        </ul>
3661        Testing these constants is currently the best way to determine if particular
3662        parts of the Expat API are available.
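        <p>
          Where a runtime check is preferred, for example because the library is loaded
          dynamically, the returned structure can be compared field by field. The
          following is a minimal sketch: <code>version_triple</code> is a hypothetical
          mirror of <code>XML_Expat_Version</code> and <code>version_at_least</code> is
          not part of Expat.
        </p>

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of XML_Expat_Version. */
typedef struct {
  int major;
  int minor;
  int micro;
} version_triple;

/* Returns true if v is the wanted version or a later one. */
static bool
version_at_least(version_triple v, int want_major, int want_minor,
                 int want_micro) {
  assert(want_major >= 0 && want_minor >= 0 && want_micro >= 0);
  if (v.major != want_major)
    return v.major > want_major;
  if (v.minor != want_minor)
    return v.minor > want_minor;
  return v.micro >= want_micro;
}
```

        <p>
          The same lexicographic comparison can be written at compile time with
          <code>XML_MAJOR_VERSION</code>, <code>XML_MINOR_VERSION</code> and
          <code>XML_MICRO_VERSION</code> in an <code>#if</code> expression.
        </p>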
3663      </div>
3664
3665      <h4 id="XML_GetFeatureList">
3666        XML_GetFeatureList
3667      </h4>
3668
3669      <pre class="fcndec">
3670const XML_Feature * XMLCALL
3671XML_GetFeatureList();
3672</pre>
3673
3674      <pre class="signature">
3675enum XML_FeatureEnum {
3676  XML_FEATURE_END = 0,
3677  XML_FEATURE_UNICODE,
3678  XML_FEATURE_UNICODE_WCHAR_T,
3679  XML_FEATURE_DTD,
3680  XML_FEATURE_CONTEXT_BYTES,
3681  XML_FEATURE_MIN_SIZE,
3682  XML_FEATURE_SIZEOF_XML_CHAR,
3683  XML_FEATURE_SIZEOF_XML_LCHAR,
3684  XML_FEATURE_NS,
3685  XML_FEATURE_LARGE_SIZE
3686};
3687
3688typedef struct {
3689  enum XML_FeatureEnum  feature;
3690  XML_LChar            *name;
3691  long int              value;
3692} XML_Feature;
3693</pre>
3694      <div class="fcndef">
3695        <p>
3696          Returns a list of "feature" records, providing details on how Expat was
3697          configured at compile time. Most applications should not need to worry about
3698          this, but this information is otherwise not available from Expat. This function
3699          allows code that does need to check these features to do so at runtime.
3700        </p>
3701
3702        <p>
3703          The return value is an array of <code>XML_Feature</code>, terminated by a
3704          record with a <code>feature</code> of <code>XML_FEATURE_END</code> and
3705          <code>name</code> of <code>NULL</code>, identifying the feature-test macros
3706          Expat was compiled with. Since an application that requires this kind of
3707          information needs to determine the type of character the <code>name</code>
3708          points to, records for the <code>XML_FEATURE_SIZEOF_XML_CHAR</code> and
3709          <code>XML_FEATURE_SIZEOF_XML_LCHAR</code> will be located at the beginning of
3710          the list, followed by <code>XML_FEATURE_UNICODE</code> and
3711          <code>XML_FEATURE_UNICODE_WCHAR_T</code>, if they are present at all.
3712        </p>
3713
3714        <p>
3715          Some features have an associated value. If there isn't an associated value, the
3716          <code>value</code> field is set to 0. At this time, the following features have
3717          been defined to have values:
3718        </p>
3719
3720        <dl>
3721          <dt>
3722            <code>XML_FEATURE_SIZEOF_XML_CHAR</code>
3723          </dt>
3724
3725          <dd>
3726            The number of bytes occupied by one <code>XML_Char</code> character.
3727          </dd>
3728
3729          <dt>
3730            <code>XML_FEATURE_SIZEOF_XML_LCHAR</code>
3731          </dt>
3732
3733          <dd>
3734            The number of bytes occupied by one <code>XML_LChar</code> character.
3735          </dd>
3736
3737          <dt>
3738            <code>XML_FEATURE_CONTEXT_BYTES</code>
3739          </dt>
3740
3741          <dd>
3742            The maximum number of characters of context which can be reported by
3743            <code><a href="#XML_GetInputContext">XML_GetInputContext</a></code>.
3744          </dd>
3745        </dl>
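        <p>
          Walking the returned array amounts to scanning until the terminating
          <code>XML_FEATURE_END</code> record. The sketch below shows that loop on
          hypothetical stand-in types (<code>feature_id</code>,
          <code>feature_record</code>) that mirror a subset of
          <code>XML_FeatureEnum</code> and <code>XML_Feature</code>; real code would use
          the Expat types and the array returned by <code>XML_GetFeatureList</code>.
        </p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins mirroring part of XML_FeatureEnum / XML_Feature. */
enum feature_id {
  FEATURE_END = 0,
  FEATURE_DTD = 3,
  FEATURE_CONTEXT_BYTES = 4,
  FEATURE_NS = 8
};

typedef struct {
  enum feature_id feature;
  const char *name;
  long value;
} feature_record;

/* Returns the record for wanted, or NULL if the list does not contain it. */
static const feature_record *
find_feature(const feature_record *list, enum feature_id wanted) {
  assert(list != NULL);
  for (; list->feature != FEATURE_END; list++) {
    if (list->feature == wanted)
      return list;
  }
  return NULL;
}
```

        <p>
          A missing record means the corresponding feature was not compiled in, so a
          <code>NULL</code> result is a normal outcome rather than an error.
        </p>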
3746      </div>
3747
3748      <h4 id="XML_FreeContentModel">
3749        XML_FreeContentModel
3750      </h4>
3751
3752      <pre class="fcndec">
3753void XMLCALL
3754XML_FreeContentModel(XML_Parser parser, XML_Content *model);
3755</pre>
3756      <div class="fcndef">
3757        Function to deallocate the <code>model</code> argument passed to the
3758        <code>XML_ElementDeclHandler</code> callback set using <code><a href=
        "#XML_SetElementDeclHandler">XML_SetElementDeclHandler</a></code>. This function
3760        should not be used for any other purpose.
3761      </div>
3762
3763      <p>
3764        The following functions allow external code to share the memory allocator an
3765        <code>XML_Parser</code> has been configured to use. This is especially useful for
3766        third-party libraries that interact with a parser object created by application
3767        code, or heavily layered applications. This can be essential when using
3768        dynamically loaded libraries which use different C standard libraries (this can
3769        happen on Windows, at least).
3770      </p>
3771
3772      <h4 id="XML_MemMalloc">
3773        XML_MemMalloc
3774      </h4>
3775
3776      <pre class="fcndec">
3777void * XMLCALL
3778XML_MemMalloc(XML_Parser parser, size_t size);
3779</pre>
3780      <div class="fcndef">
3781        Allocate <code>size</code> bytes of memory using the allocator the
3782        <code>parser</code> object has been configured to use. Returns a pointer to the
3783        memory or <code>NULL</code> on failure. Memory allocated in this way must be
3784        freed using <code><a href="#XML_MemFree">XML_MemFree</a></code>.
3785      </div>
3786
3787      <h4 id="XML_MemRealloc">
3788        XML_MemRealloc
3789      </h4>
3790
3791      <pre class="fcndec">
3792void * XMLCALL
3793XML_MemRealloc(XML_Parser parser, void *ptr, size_t size);
3794</pre>
3795      <div class="fcndef">
3796        Allocate <code>size</code> bytes of memory using the allocator the
3797        <code>parser</code> object has been configured to use. <code>ptr</code> must
3798        point to a block of memory allocated by <code><a href=
3799        "#XML_MemMalloc">XML_MemMalloc</a></code> or <code>XML_MemRealloc</code>, or be
3800        <code>NULL</code>. This function tries to expand the block pointed to by
3801        <code>ptr</code> if possible. Returns a pointer to the memory or
3802        <code>NULL</code> on failure. On success, the original block has either been
3803        expanded or freed. On failure, the original block has not been freed; the caller
3804        is responsible for freeing the original block. Memory allocated in this way must
3805        be freed using <code><a href="#XML_MemFree">XML_MemFree</a></code>.
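        <p>
          Because the original block survives a failed call, the result should go into
          a temporary pointer first. The sketch below shows that pattern with the
          standard <code>realloc</code> as a stand-in so that it compiles without Expat;
          in application code the call would be
          <code>XML_MemRealloc(parser, *buf, new_size)</code>. The helper
          <code>grow_buffer</code> is hypothetical.
        </p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: grows *buf to new_size bytes.
   Returns 1 on success. On failure returns 0 and leaves *buf untouched,
   so the caller can keep using the original block or free it. */
static int
grow_buffer(char **buf, size_t new_size) {
  assert(buf != NULL && new_size > 0);
  /* Stand-in for XML_MemRealloc(parser, *buf, new_size). */
  char *grown = realloc(*buf, new_size);
  if (grown == NULL)
    return 0; /* *buf still points to the original, unfreed block */
  *buf = grown;
  return 1;
}
```

        <p>
          Assigning the result directly to <code>*buf</code> would lose the only
          pointer to the original block on failure and leak it.
        </p>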
3806      </div>
3807
3808      <h4 id="XML_MemFree">
3809        XML_MemFree
3810      </h4>
3811
3812      <pre class="fcndec">
3813void XMLCALL
3814XML_MemFree(XML_Parser parser, void *ptr);
3815</pre>
3816      <div class="fcndef">
3817        Free a block of memory pointed to by <code>ptr</code>. The block must have been
3818        allocated by <code><a href="#XML_MemMalloc">XML_MemMalloc</a></code> or
3819        <code>XML_MemRealloc</code>, or be <code>NULL</code>.
3820      </div>
3821
3822      <hr />
3823
3824      <div class="footer">
3825        Found a bug in the documentation? <a href=
3826        "https://github.com/libexpat/libexpat/issues">Please file a bug report.</a>
3827      </div>
3828    </div>
3829  </body>
3830</html>
3831