<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <!--
                            __  __            _
                         ___\ \/ /_ __   __ _| |_
                        / _ \\  /| '_ \ / _` | __|
                       |  __//  \| |_) | (_| | |_
                        \___/_/\_\ .__/ \__,_|\__|
                                 |_| XML parser

   Copyright (c) 2000      Clark Cooper <coopercc@users.sourceforge.net>
   Copyright (c) 2000-2004 Fred L. Drake, Jr. <fdrake@users.sourceforge.net>
   Copyright (c) 2002-2012 Karl Waclawek <karl@waclawek.net>
   Copyright (c) 2017-2026 Sebastian Pipping <sebastian@pipping.org>
   Copyright (c) 2017      Jakub Wilk <jwilk@jwilk.net>
   Copyright (c) 2021      Tomas Korbar <tkorbar@redhat.com>
   Copyright (c) 2021      Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
   Copyright (c) 2022      Thijs Schreijer <thijs@thijsschreijer.nl>
   Copyright (c) 2023-2025 Hanno Böck <hanno@gentoo.org>
   Copyright (c) 2023      Sony Corporation / Snild Dolkow <snild@sony.com>
   Licensed under the MIT license:

   Permission is  hereby granted,  free of charge,  to any  person obtaining
   a  copy  of  this  software   and  associated  documentation  files  (the
   "Software"),  to  deal in  the  Software  without restriction,  including
   without  limitation the  rights  to use,  copy,  modify, merge,  publish,
   distribute, sublicense, and/or sell copies of the Software, and to permit
   persons  to whom  the Software  is  furnished to  do so,  subject to  the
   following conditions:

   The above copyright  notice and this permission notice  shall be included
   in all copies or substantial portions of the Software.

   THE  SOFTWARE  IS  PROVIDED  "AS  IS",  WITHOUT  WARRANTY  OF  ANY  KIND,
   EXPRESS  OR IMPLIED,  INCLUDING  BUT  NOT LIMITED  TO  THE WARRANTIES  OF
   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN
   NO EVENT SHALL THE AUTHORS OR  COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
   DAMAGES OR  OTHER LIABILITY, WHETHER  IN AN  ACTION OF CONTRACT,  TORT OR
   OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
   USE OR OTHER DEALINGS IN THE SOFTWARE.
-->

    <title>
      Expat XML Parser
    </title>
    <meta name="author" content="Clark Cooper, coopercc@netheaven.com" />
    <link href="ok.min.css" rel="stylesheet" />
    <link href="style.css" rel="stylesheet" />
  </head>
  <body>
    <div>
      <h1>
        The Expat XML Parser <small>Release 2.7.5</small>
      </h1>
    </div>

    <div class="content">
      <p>
        Expat is a library, written in C, for parsing XML documents. It's the underlying
        XML parser for the open source Mozilla project, Perl's <code>XML::Parser</code>,
        Python's <code>xml.parsers.expat</code>, and other open-source XML parsers.
      </p>

      <p>
        This library is the creation of James Clark, who's also given us groff (an nroff
        look-alike), Jade (an implementation of ISO's DSSSL stylesheet language for
        SGML), XP (a Java XML parser package), and XT (a Java XSL engine). James was also
        the technical lead on the XML Working Group at W3C that produced the XML
        specification.
      </p>

      <p>
        This is free software, licensed under the <a href="../COPYING">MIT/X Consortium
        license</a>. You may download it from <a href="https://libexpat.github.io/">the
        Expat home page</a>.
      </p>

      <p>
        The bulk of this document was originally commissioned as an article by <a href=
        "https://www.xml.com/">XML.com</a>. They graciously allowed Clark Cooper to
        retain copyright and to distribute it with Expat. This version has been
        substantially extended to include documentation on features which have been added
        since the original article was published, and additional information on using the
        original interface.
      </p>

      <hr />

      <h2>
        Table of Contents
      </h2>

      <ul>
        <li>
          <a href="#overview">Overview</a>
        </li>

        <li>
          <a href="#building">Building and Installing</a>
        </li>

        <li>
          <a href="#using">Using Expat</a>
        </li>

        <li>
          <a href="#reference">Reference</a>
          <ul>
            <li>
              <a href="#creation">Parser Creation Functions</a>
              <ul>
                <li>
                  <a href="#XML_ParserCreate">XML_ParserCreate</a>
                </li>

                <li>
                  <a href="#XML_ParserCreateNS">XML_ParserCreateNS</a>
                </li>

                <li>
                  <a href="#XML_ParserCreate_MM">XML_ParserCreate_MM</a>
                </li>

                <li>
                  <a href=
                  "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a>
                </li>

                <li>
                  <a href="#XML_ParserFree">XML_ParserFree</a>
                </li>

                <li>
                  <a href="#XML_ParserReset">XML_ParserReset</a>
                </li>
              </ul>
            </li>

            <li>
              <a href="#parsing">Parsing Functions</a>
              <ul>
                <li>
                  <a href="#XML_Parse">XML_Parse</a>
                </li>

                <li>
                  <a href="#XML_ParseBuffer">XML_ParseBuffer</a>
                </li>

                <li>
                  <a href="#XML_GetBuffer">XML_GetBuffer</a>
                </li>

                <li>
                  <a href="#XML_StopParser">XML_StopParser</a>
                </li>

                <li>
                  <a href="#XML_ResumeParser">XML_ResumeParser</a>
                </li>

                <li>
                  <a href="#XML_GetParsingStatus">XML_GetParsingStatus</a>
                </li>
              </ul>
            </li>

            <li>
              <a href="#setting">Handler Setting Functions</a>
              <ul>
                <li>
                  <a href="#XML_SetStartElementHandler">XML_SetStartElementHandler</a>
                </li>

                <li>
                  <a href="#XML_SetEndElementHandler">XML_SetEndElementHandler</a>
                </li>

                <li>
                  <a href="#XML_SetElementHandler">XML_SetElementHandler</a>
                </li>

                <li>
                  <a href="#XML_SetCharacterDataHandler">XML_SetCharacterDataHandler</a>
                </li>

                <li>
                  <a href=
                  "#XML_SetProcessingInstructionHandler">XML_SetProcessingInstructionHandler</a>
                </li>

                <li>
                  <a href="#XML_SetCommentHandler">XML_SetCommentHandler</a>
                </li>

                <li>
                  <a href=
                  "#XML_SetStartCdataSectionHandler">XML_SetStartCdataSectionHandler</a>
                </li>

                <li>
                  <a href=
                  "#XML_SetEndCdataSectionHandler">XML_SetEndCdataSectionHandler</a>
                </li>

                <li>
                  <a href="#XML_SetCdataSectionHandler">XML_SetCdataSectionHandler</a>
                </li>

                <li>
                  <a href="#XML_SetDefaultHandler">XML_SetDefaultHandler</a>
                </li>

                <li>
                  <a href="#XML_SetDefaultHandlerExpand">XML_SetDefaultHandlerExpand</a>
                </li>

                <li>
                  <a href=
                  "#XML_SetExternalEntityRefHandler">XML_SetExternalEntityRefHandler</a>
                </li>

                <li>
                  <a href=
                  "#XML_SetExternalEntityRefHandlerArg">XML_SetExternalEntityRefHandlerArg</a>
                </li>

                <li>
                  <a href="#XML_SetSkippedEntityHandler">XML_SetSkippedEntityHandler</a>
                </li>

                <li>
                  <a href=
                  "#XML_SetUnknownEncodingHandler">XML_SetUnknownEncodingHandler</a>
                </li>

                <li>
                  <a href=
                  "#XML_SetStartNamespaceDeclHandler">XML_SetStartNamespaceDeclHandler</a>
                </li>

                <li>
                  <a href=
                  "#XML_SetEndNamespaceDeclHandler">XML_SetEndNamespaceDeclHandler</a>
                </li>

                <li>
                  <a href="#XML_SetNamespaceDeclHandler">XML_SetNamespaceDeclHandler</a>
                </li>

                <li>
                  <a href="#XML_SetXmlDeclHandler">XML_SetXmlDeclHandler</a>
                </li>

                <li>
                  <a href=
                  "#XML_SetStartDoctypeDeclHandler">XML_SetStartDoctypeDeclHandler</a>
                </li>

                <li>
                  <a href=
                  "#XML_SetEndDoctypeDeclHandler">XML_SetEndDoctypeDeclHandler</a>
                </li>

                <li>
                  <a href="#XML_SetDoctypeDeclHandler">XML_SetDoctypeDeclHandler</a>
                </li>

                <li>
                  <a href="#XML_SetElementDeclHandler">XML_SetElementDeclHandler</a>
                </li>

                <li>
                  <a href="#XML_SetAttlistDeclHandler">XML_SetAttlistDeclHandler</a>
                </li>

                <li>
                  <a href="#XML_SetEntityDeclHandler">XML_SetEntityDeclHandler</a>
                </li>

                <li>
                  <a href=
                  "#XML_SetUnparsedEntityDeclHandler">XML_SetUnparsedEntityDeclHandler</a>
                </li>

                <li>
                  <a href="#XML_SetNotationDeclHandler">XML_SetNotationDeclHandler</a>
                </li>

                <li>
                  <a href="#XML_SetNotStandaloneHandler">XML_SetNotStandaloneHandler</a>
                </li>
              </ul>
            </li>

            <li>
              <a href="#position">Parse Position and Error Reporting Functions</a>
              <ul>
                <li>
                  <a href="#XML_GetErrorCode">XML_GetErrorCode</a>
                </li>

                <li>
                  <a href="#XML_ErrorString">XML_ErrorString</a>
                </li>

                <li>
                  <a href="#XML_GetCurrentByteIndex">XML_GetCurrentByteIndex</a>
                </li>

                <li>
                  <a href="#XML_GetCurrentLineNumber">XML_GetCurrentLineNumber</a>
                </li>

                <li>
                  <a href="#XML_GetCurrentColumnNumber">XML_GetCurrentColumnNumber</a>
                </li>

                <li>
                  <a href="#XML_GetCurrentByteCount">XML_GetCurrentByteCount</a>
                </li>

                <li>
                  <a href="#XML_GetInputContext">XML_GetInputContext</a>
                </li>
              </ul>
            </li>

            <li>
              <a href="#attack-protection">Attack Protection</a>
              <ul>
                <li>
                  <a href=
                  "#XML_SetBillionLaughsAttackProtectionMaximumAmplification">XML_SetBillionLaughsAttackProtectionMaximumAmplification</a>
                </li>

                <li>
                  <a href=
                  "#XML_SetBillionLaughsAttackProtectionActivationThreshold">XML_SetBillionLaughsAttackProtectionActivationThreshold</a>
                </li>

                <li>
                  <a href=
                  "#XML_SetAllocTrackerMaximumAmplification">XML_SetAllocTrackerMaximumAmplification</a>
                </li>

                <li>
                  <a href=
                  "#XML_SetAllocTrackerActivationThreshold">XML_SetAllocTrackerActivationThreshold</a>
                </li>

                <li>
                  <a href=
                  "#XML_SetReparseDeferralEnabled">XML_SetReparseDeferralEnabled</a>
                </li>
              </ul>
            </li>

            <li>
              <a href="#miscellaneous">Miscellaneous Functions</a>
              <ul>
                <li>
                  <a href="#XML_SetUserData">XML_SetUserData</a>
                </li>

                <li>
                  <a href="#XML_GetUserData">XML_GetUserData</a>
                </li>

                <li>
                  <a href="#XML_UseParserAsHandlerArg">XML_UseParserAsHandlerArg</a>
                </li>

                <li>
                  <a href="#XML_SetBase">XML_SetBase</a>
                </li>

                <li>
                  <a href="#XML_GetBase">XML_GetBase</a>
                </li>

                <li>
                  <a href=
                  "#XML_GetSpecifiedAttributeCount">XML_GetSpecifiedAttributeCount</a>
                </li>

                <li>
                  <a href="#XML_GetIdAttributeIndex">XML_GetIdAttributeIndex</a>
                </li>

                <li>
                  <a href="#XML_GetAttributeInfo">XML_GetAttributeInfo</a>
                </li>

                <li>
                  <a href="#XML_SetEncoding">XML_SetEncoding</a>
                </li>

                <li>
                  <a href="#XML_SetParamEntityParsing">XML_SetParamEntityParsing</a>
                </li>

                <li>
                  <a href="#XML_SetHashSalt">XML_SetHashSalt</a>
                </li>

                <li>
                  <a href="#XML_UseForeignDTD">XML_UseForeignDTD</a>
                </li>

                <li>
                  <a href="#XML_SetReturnNSTriplet">XML_SetReturnNSTriplet</a>
                </li>

                <li>
                  <a href="#XML_DefaultCurrent">XML_DefaultCurrent</a>
                </li>

                <li>
                  <a href="#XML_ExpatVersion">XML_ExpatVersion</a>
                </li>

                <li>
                  <a href="#XML_ExpatVersionInfo">XML_ExpatVersionInfo</a>
                </li>

                <li>
                  <a href="#XML_GetFeatureList">XML_GetFeatureList</a>
                </li>

                <li>
                  <a href="#XML_FreeContentModel">XML_FreeContentModel</a>
                </li>

                <li>
                  <a href="#XML_MemMalloc">XML_MemMalloc</a>
                </li>

                <li>
                  <a href="#XML_MemRealloc">XML_MemRealloc</a>
                </li>

                <li>
                  <a href="#XML_MemFree">XML_MemFree</a>
                </li>
              </ul>
            </li>
          </ul>
        </li>
      </ul>

      <hr />

      <h2>
        <a id="overview" name="overview">Overview</a>
      </h2>

      <p>
        Expat is a stream-oriented parser. You register callback (or handler) functions
        with the parser and then start feeding it the document. As the parser recognizes
        parts of the document, it will call the appropriate handler for that part (if
        you've registered one). The document is fed to the parser in pieces, so you can
        start parsing before you have the whole document. This also allows you to parse
        really huge documents that won't fit into memory.
      </p>

      <p>
        Expat can be intimidating due to the many kinds of handlers and options you can
        set. But you only need to learn four functions in order to do 90% of what you'll
        want to do with it:
      </p>

      <dl>
        <dt>
          <code><a href="#XML_ParserCreate">XML_ParserCreate</a></code>
        </dt>

        <dd>
          Create a new parser object.
        </dd>

        <dt>
          <code><a href="#XML_SetElementHandler">XML_SetElementHandler</a></code>
        </dt>

        <dd>
          Set handlers for start and end tags.
        </dd>

        <dt>
          <code><a href=
          "#XML_SetCharacterDataHandler">XML_SetCharacterDataHandler</a></code>
        </dt>

        <dd>
          Set handler for text.
        </dd>

        <dt>
          <code><a href="#XML_Parse">XML_Parse</a></code>
        </dt>

        <dd>
          Pass a buffer full of document to the parser.
        </dd>
      </dl>

      <p>
        These functions and others are described in the <a href=
        "#reference">reference</a> part of this document. The reference section also
        describes in detail the parameters passed to the different types of handlers.
      </p>

      <p>
        Let's look at a very simple example program that uses only three of the above
        functions (it doesn't need to set a character data handler). The program <a href=
        "../examples/outline.c">outline.c</a> prints an element outline, indenting child
        elements to distinguish them from the parent element that contains them. The
        start handler does all the work. It prints two indenting spaces for every level
        of ancestor elements, then it prints the element and attribute information.
        Finally, it increments the global <code>Depth</code> variable.
      </p>

      <pre class="eg">
int Depth;

void XMLCALL
start(void *data, const char *el, const char **attr) {
  int i;

  for (i = 0; i &lt; Depth; i++)
    printf("  ");

  printf("%s", el);

  for (i = 0; attr[i]; i += 2) {
    printf(" %s='%s'", attr[i], attr[i + 1]);
  }

  printf("\n");
  Depth++;
}  /* End of start handler */
</pre>
      <p>
        The end handler simply does the bookkeeping work of decrementing
        <code>Depth</code>.
      </p>

      <pre class="eg">
void XMLCALL
end(void *data, const char *el) {
  Depth--;
}  /* End of end handler */
</pre>
      <p>
        Note the <code>XMLCALL</code> annotation used for the callbacks. It ensures
        that Expat and the callbacks use the same calling convention in case the
        compiler options used for Expat itself and for the client code are different.
        Expat tries not to care what the default calling convention is, though it may
        require that it be compiled with a default convention of "cdecl" on some
        platforms. For code which uses Expat, however, the calling convention is
        specified by the <code>XMLCALL</code> annotation on most platforms; callbacks
        should be defined using this annotation.
      </p>

      <p>
        The <code>XMLCALL</code> annotation was added in Expat 1.95.7, but existing
        working Expat applications don't need to add it (since they are already using the
        "cdecl" calling convention, or they wouldn't be working). The annotation is only
        needed if the default calling convention may be something other than "cdecl". To
        use the annotation safely with older versions of Expat, you can conditionally
        define it <em>after</em> including Expat's header file:
      </p>

      <pre class="eg">
#include &lt;expat.h&gt;

#ifndef XMLCALL
#if defined(_MSC_VER) &amp;&amp; !defined(__BEOS__) &amp;&amp; !defined(__CYGWIN__)
#define XMLCALL __cdecl
#elif defined(__GNUC__)
#define XMLCALL __attribute__((cdecl))
#else
#define XMLCALL
#endif
#endif
</pre>
      <p>
        After creating the parser, the main program just has the job of shoveling the
        document to the parser so that it can do its work.
      </p>

      <hr />

      <h2>
        <a id="building" name="building">Building and Installing Expat</a>
      </h2>

      <p>
        The Expat distribution comes as a compressed (with GNU gzip) tar file. You may
        download the latest version from <a href=
        "https://sourceforge.net/projects/expat/">SourceForge</a>. After unpacking this,
        <code>cd</code> into the resulting directory. Then follow either the Win32
        directions or the Unix directions below.
      </p>

      <h3>
        Building under Win32
      </h3>

      <p>
        If you're using the GNU compiler under Cygwin, follow the Unix directions in the
        next section. Otherwise, if you have Microsoft Visual Studio installed, you can
        use CMake to generate a <code>.sln</code> file, e.g.
        <code>cmake -G"Visual Studio 17 2022" -DCMAKE_BUILD_TYPE=RelWithDebInfo .</code>,
        and then build Expat using <code>msbuild /m expat.sln</code>.
      </p>

      <p>
        Alternatively, you may download the Win32 binary package that contains the
        "expat.h" include file and a pre-built DLL.
      </p>

      <h3>
        Building under Unix (or GNU)
      </h3>

      <p>
        First you'll need to run the configure shell script in order to configure the
        Makefiles and headers for your system.
      </p>

      <p>
        If you're happy with all the defaults that configure picks for you, and you have
        permission on your system to install into /usr/local, you can install Expat with
        this sequence of commands:
      </p>

      <pre class="eg">
./configure
make
make install
</pre>
      <p>
        There are some options that you can provide to this script, but the only one
        we'll mention here is the <code>--prefix</code> option. You can find out all the
        options available by running configure with just the <code>--help</code> option.
      </p>

      <p>
        By default, the configure script sets things up so that the library gets
        installed in <code>/usr/local/lib</code> and the associated header file in
        <code>/usr/local/include</code>. But if you were to give the option
        <code>--prefix=/home/me/mystuff</code>, then the library and header would get
        installed in <code>/home/me/mystuff/lib</code> and
        <code>/home/me/mystuff/include</code> respectively.
      </p>

      <h3>
        Configuring Expat Using the Pre-Processor
      </h3>

      <p>
        Expat's feature set can be configured using a small number of pre-processor
        definitions. The symbols are:
      </p>

      <dl class="cpp-symbols">
        <dt>
          <a id="XML_GE" name="XML_GE">XML_GE</a>
        </dt>

        <dd>
          Added in Expat 2.6.0. Include support for <a href=
          "https://www.w3.org/TR/2006/REC-xml-20060816/#sec-physical-struct">general
          entities</a> (syntax <code>&amp;e1;</code> to reference and syntax
          <code>&lt;!ENTITY e1 'value1'&gt;</code> (an internal general entity) or
          <code>&lt;!ENTITY e2 SYSTEM 'file2'&gt;</code> (an external general entity) to
          declare). With <code>XML_GE</code> enabled, general entities will be replaced
          by their declared replacement text; for this to work for <em>external</em>
          general entities, in addition an <code><a href=
          "#XML_SetExternalEntityRefHandler">XML_ExternalEntityRefHandler</a></code> must
          be set using <code><a href=
          "#XML_SetExternalEntityRefHandler">XML_SetExternalEntityRefHandler</a></code>.
          Also, enabling <code>XML_GE</code> makes the functions <code><a href=
          "#XML_SetBillionLaughsAttackProtectionMaximumAmplification">XML_SetBillionLaughsAttackProtectionMaximumAmplification</a></code>
          and <code><a href=
          "#XML_SetBillionLaughsAttackProtectionActivationThreshold">XML_SetBillionLaughsAttackProtectionActivationThreshold</a></code>
          available.<br />
          With <code>XML_GE</code> disabled, Expat has a smaller memory footprint and can
          be faster, but will not load external general entities and will replace all
          general entities (except the <a href=
          "https://www.w3.org/TR/2006/REC-xml-20060816/#sec-predefined-ent">predefined
          five</a>: <code>amp</code>, <code>apos</code>, <code>gt</code>,
          <code>lt</code>, <code>quot</code>) with a self-reference: for example,
          referencing an entity <code>e1</code> via <code>&amp;e1;</code> will be
          replaced by text <code>&amp;e1;</code>.
        </dd>

        <dt>
          <a id="XML_DTD" name="XML_DTD">XML_DTD</a>
        </dt>

        <dd>
          Include support for using and reporting DTD-based content. If this is defined,
          default attribute values from an external DTD subset are reported and attribute
          value normalization occurs based on the type of attributes defined in the
          external subset. Without this, Expat has a smaller memory footprint and can be
          faster, but will not load external parameter entities or process conditional
          sections. If defined, makes the functions <code><a href=
          "#XML_SetBillionLaughsAttackProtectionMaximumAmplification">XML_SetBillionLaughsAttackProtectionMaximumAmplification</a></code>
          and <code><a href=
          "#XML_SetBillionLaughsAttackProtectionActivationThreshold">XML_SetBillionLaughsAttackProtectionActivationThreshold</a></code>
          available.
        </dd>

        <dt>
          <a id="XML_NS" name="XML_NS">XML_NS</a>
        </dt>

        <dd>
          When defined, support for the <cite><a href=
          "https://www.w3.org/TR/REC-xml-names/">Namespaces in XML</a></cite>
          specification is included.
        </dd>

        <dt>
          <a id="XML_UNICODE" name="XML_UNICODE">XML_UNICODE</a>
        </dt>

        <dd>
          When defined, character data reported to the application is encoded in UTF-16
          using wide characters of the type <code>XML_Char</code>. This is implied if
          <code>XML_UNICODE_WCHAR_T</code> is defined.
        </dd>

        <dt>
          <a id="XML_UNICODE_WCHAR_T" name="XML_UNICODE_WCHAR_T">XML_UNICODE_WCHAR_T</a>
        </dt>

        <dd>
          If defined, causes the <code>XML_Char</code> character type to be defined using
          the <code>wchar_t</code> type; otherwise, <code>unsigned short</code> is used.
          Defining this implies <code>XML_UNICODE</code>.
        </dd>

        <dt>
          <a id="XML_LARGE_SIZE" name="XML_LARGE_SIZE">XML_LARGE_SIZE</a>
        </dt>

        <dd>
          If defined, causes the <code>XML_Size</code> and <code>XML_Index</code> integer
          types to be at least 64 bits in size. This is intended to support processing of
          very large input streams, where the return values of <code><a href=
          "#XML_GetCurrentByteIndex">XML_GetCurrentByteIndex</a></code>, <code><a href=
          "#XML_GetCurrentLineNumber">XML_GetCurrentLineNumber</a></code> and
          <code><a href=
          "#XML_GetCurrentColumnNumber">XML_GetCurrentColumnNumber</a></code> could
          overflow. It may not be supported by all compilers, and is turned off by
          default.
        </dd>

        <dt>
          <a id="XML_CONTEXT_BYTES" name="XML_CONTEXT_BYTES">XML_CONTEXT_BYTES</a>
        </dt>

        <dd>
          The number of input bytes of markup context which the parser will ensure are
          available for reporting via <code><a href=
          "#XML_GetInputContext">XML_GetInputContext</a></code>. This is normally set to
          1024, and must be set to a positive integer to enable. If this is set to zero,
          the input context will not be available and <code><a href=
          "#XML_GetInputContext">XML_GetInputContext</a></code> will always report
          <code>NULL</code>. Without this, Expat has a smaller memory footprint and can
          be faster.
        </dd>

        <dt>
          <a id="XML_STATIC" name="XML_STATIC">XML_STATIC</a>
        </dt>

        <dd>
          On Windows, this should be set if Expat is going to be linked statically with
          the code that calls it; this is required to get all the MSVC magic annotations
          right. This is ignored on other platforms.
        </dd>

        <dt>
          <a id="XML_ATTR_INFO" name="XML_ATTR_INFO">XML_ATTR_INFO</a>
        </dt>

        <dd>
          If defined, makes the additional function <code><a href=
          "#XML_GetAttributeInfo">XML_GetAttributeInfo</a></code> available for reporting
          attribute byte offsets.
        </dd>
      </dl>
800
801      <hr />
802
803      <h2>
804        <a id="using" name="using">Using Expat</a>
805      </h2>
806
807      <h3>
808        Compiling and Linking Against Expat
809      </h3>
810
811      <p>
812        Unless you installed Expat in a location not expected by your compiler and
813        linker, all you have to do to use Expat in your programs is to include the Expat
814        header (<code>#include &lt;expat.h&gt;</code>) in your files that make calls to
815        it and to tell the linker that it needs to link against the Expat library. On
816        Unix systems, this would usually be done with the <code>-lexpat</code> argument.
817        Otherwise, you'll need to tell the compiler where to look for the Expat header
818        and the linker where to find the Expat library. You may also need to take steps
819        to tell the operating system where to find this library at run time.
820      </p>
821
822      <p>
823        On a Unix-based system, here's what a Makefile might look like when Expat is
824        installed in a standard location:
825      </p>
826
827      <pre class="eg">
828CC=cc
829LDFLAGS=
830LIBS= -lexpat
831xmlapp: xmlapp.o
832        $(CC) $(LDFLAGS) -o xmlapp xmlapp.o $(LIBS)
833</pre>
834      <p>
835        If you installed Expat in, say, <code>/home/me/mystuff</code>, then the Makefile
836        would look like this:
837      </p>
838
      <pre class="eg">
CC=cc
CFLAGS= -I/home/me/mystuff/include
LDFLAGS=
LIBS= -L/home/me/mystuff/lib -lexpat
xmlapp: xmlapp.o
        $(CC) $(LDFLAGS) -o xmlapp xmlapp.o $(LIBS)
</pre>
847      <p>
848        You'd also have to set the environment variable <code>LD_LIBRARY_PATH</code> to
849        <code>/home/me/mystuff/lib</code> (or to
850        <code>${LD_LIBRARY_PATH}:/home/me/mystuff/lib</code> if LD_LIBRARY_PATH already
851        has some directories in it) in order to run your application.
852      </p>
853
854      <h3>
855        Expat Basics
856      </h3>
857
858      <p>
859        As we saw in the example in the overview, the first step in parsing an XML
860        document with Expat is to create a parser object. There are <a href=
861        "#creation">three functions</a> in the Expat API for creating a parser object.
862        However, only two of these (<code><a href=
863        "#XML_ParserCreate">XML_ParserCreate</a></code> and <code><a href=
864        "#XML_ParserCreateNS">XML_ParserCreateNS</a></code>) can be used for constructing
865        a parser for a top-level document. The object returned by these functions is an
        opaque pointer (i.e. <code>expat.h</code> declares <code>XML_Parser</code> as a pointer to an incomplete structure type) to data with further
867        internal structure. In order to free the memory associated with this object you
868        must call <code><a href="#XML_ParserFree">XML_ParserFree</a></code>. Note that if
869        you have provided any <a href="#userdata">user data</a> that gets stored in the
870        parser, then your application is responsible for freeing it prior to calling
871        <code>XML_ParserFree</code>.
872      </p>
873
874      <p>
875        The objects returned by the parser creation functions are good for parsing only
876        one XML document or external parsed entity. If your application needs to parse
877        many XML documents, then it needs to create a parser object for each one. The
878        best way to deal with this is to create a higher level object that contains all
879        the default initialization you want for your parser objects.
880      </p>
881
882      <p>
883        Walking through a document hierarchy with a stream oriented parser will require a
884        good stack mechanism in order to keep track of current context. For instance, to
885        answer the simple question, "What element does this text belong to?" requires a
886        stack, since the parser may have descended into other elements that are children
887        of the current one and has encountered this text on the way out.
888      </p>
889
890      <p>
891        The things you're likely to want to keep on a stack are the currently opened
        element and its attributes. You push this information onto the stack in the
893        start handler and you pop it off in the end handler.
894      </p>
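      <p>
        The push and pop steps described above might be sketched as follows. This is
        only an illustration under stated assumptions: <code>ElementStack</code>,
        <code>MAX_DEPTH</code>, and the three function names are hypothetical, the
        handlers use <code>char</code> in place of <code>XML_Char</code> (which
        assumes a build without <code>XML_UNICODE</code>), and <code>XMLCALL</code>
        is defined as empty when <code>expat.h</code> has not been included. The
        sketch copies each name rather than keeping the pointer it was handed.
      </p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#ifndef XMLCALL
#define XMLCALL /* empty stand-in so this sketch builds without expat.h */
#endif

#define MAX_DEPTH 64 /* hypothetical limit for this sketch */

/* Hypothetical user data: copies of the names of the open elements. */
typedef struct {
  char *names[MAX_DEPTH];
  int depth; /* number of open elements, even beyond MAX_DEPTH */
} ElementStack;

/* Start handler: push a copy of the element name. */
void XMLCALL
stack_start(void *data, const char *el, const char **attr) {
  ElementStack *st = (ElementStack *)data;
  (void)attr; /* attributes could be copied the same way */
  assert(st != NULL && el != NULL);
  if (st->depth < MAX_DEPTH) {
    size_t size = strlen(el) + 1;
    char *copy = (char *)malloc(size);
    if (copy != NULL)
      memcpy(copy, el, size);
    st->names[st->depth] = copy; /* NULL records an allocation failure */
  }
  st->depth++;
}

/* End handler: pop the entry pushed by the matching start handler. */
void XMLCALL
stack_end(void *data, const char *el) {
  ElementStack *st = (ElementStack *)data;
  (void)el;
  assert(st != NULL && st->depth > 0);
  st->depth--;
  if (st->depth < MAX_DEPTH) {
    free(st->names[st->depth]);
    st->names[st->depth] = NULL;
  }
}

/* Name of the innermost open element, or NULL if none is recorded. */
const char *
stack_current(const ElementStack *st) {
  assert(st != NULL);
  if (st->depth == 0 || st->depth > MAX_DEPTH)
    return NULL;
  return st->names[st->depth - 1];
}
```

      <p>
        A character data handler that receives the same user data pointer could call
        <code>stack_current</code> to answer the question of which element a piece of
        text belongs to.
      </p>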
895
896      <p>
897        For some tasks, it is sufficient to just keep information on what the depth of
        the stack is (or would be if you had one). The outline program shown above
899        presents one example. Another such task would be skipping over a complete
900        element. When you see the start tag for the element you want to skip, you set a
901        skip flag and record the depth at which the element started. When the end tag
902        handler encounters the same depth, the skipped element has ended and the flag may
903        be cleared. If you follow the convention that the root element starts at 1, then
904        you can use the same variable for skip flag and skip depth.
905      </p>
906
      <pre class="eg">
void
init_info(Parseinfo *info) {
  info-&gt;skip = 0;
  info-&gt;depth = 1;
  /* Other initializations here */
}  /* End of init_info */

void XMLCALL
rawstart(void *data, const char *el, const char **attr) {
  Parseinfo *inf = (Parseinfo *) data;

  if (! inf-&gt;skip) {
    if (should_skip(inf, el, attr)) {
      inf-&gt;skip = inf-&gt;depth;
    }
    else
      start(inf, el, attr);     /* This does rest of start handling */
  }

  inf-&gt;depth++;
}  /* End of rawstart */

void XMLCALL
rawend(void *data, const char *el) {
  Parseinfo *inf = (Parseinfo *) data;

  inf-&gt;depth--;

  if (! inf-&gt;skip)
    end(inf, el);              /* This does rest of end handling */

  if (inf-&gt;skip == inf-&gt;depth)
    inf-&gt;skip = 0;
}  /* End rawend */
</pre>
943      <p>
944        Notice in the above example the difference in how depth is manipulated in the
945        start and end handlers. The end tag handler should be the mirror image of the
        start tag handler. This is necessary to properly model containment. Because
        the start tag handler increments depth <em>after</em> the main body of its
        code, the end tag handler must decrement it <em>before</em> its main body.
        If we'd decided to increment it first thing in the start handler, then we'd
        have had to decrement it last thing in the end handler.
951      </p>
952
953      <h3 id="userdata">
954        Communicating between handlers
955      </h3>
956
957      <p>
958        In order to be able to pass information between different handlers without using
959        globals, you'll need to define a data structure to hold the shared variables. You
960        can then tell Expat (with the <code><a href=
961        "#XML_SetUserData">XML_SetUserData</a></code> function) to pass a pointer to this
962        structure to the handlers. This is the first argument received by most handlers.
        In the <a href="#reference">reference section</a>, an argument to a callback
        function is named <code>userData</code> and has type <code>void *</code> if the
965        user data is passed; it will have the type <code>XML_Parser</code> if the parser
966        itself is passed. When the parser is passed, the user data may be retrieved using
967        <code><a href="#XML_GetUserData">XML_GetUserData</a></code>.
968      </p>
969
970      <p>
971        One common case where multiple calls to a single handler may need to communicate
972        using an application data structure is the case when content passed to the
973        character data handler (set by <code><a href=
974        "#XML_SetCharacterDataHandler">XML_SetCharacterDataHandler</a></code>) needs to
975        be accumulated. A common first-time mistake with any of the event-oriented
976        interfaces to an XML parser is to expect all the text contained in an element to
977        be reported by a single call to the character data handler. Expat, like many
978        other XML parsers, reports such data as a sequence of calls; there's no way to
979        know when the end of the sequence is reached until a different callback is made.
980        A buffer referenced by the user data structure proves both an effective and
981        convenient place to accumulate character data.
982      </p>
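      <p>
        One possible shape for such an accumulation buffer is sketched below. The
        names <code>TextBuffer</code>, <code>accumulate_text</code>, and
        <code>reset_text</code> are hypothetical, the handler uses <code>char</code>
        in place of <code>XML_Char</code> (assuming a build without
        <code>XML_UNICODE</code>), and <code>XMLCALL</code> is defined as empty when
        <code>expat.h</code> has not been included.
      </p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#ifndef XMLCALL
#define XMLCALL /* empty stand-in so this sketch builds without expat.h */
#endif

/* Hypothetical user data: a growable, null-terminated text buffer. */
typedef struct {
  char *text;
  size_t len;
  size_t cap;
  int failed; /* set once an allocation has failed */
} TextBuffer;

/* Character data handler: append the len bytes at s, which are not
   null-terminated, to the buffer. */
void XMLCALL
accumulate_text(void *userData, const char *s, int len) {
  TextBuffer *buf = (TextBuffer *)userData;
  size_t needed;
  assert(buf != NULL && len >= 0);
  if (buf->failed)
    return;
  needed = buf->len + (size_t)len + 1;
  if (needed > buf->cap) {
    size_t cap = buf->cap ? buf->cap : 64;
    char *grown;
    while (cap < needed)
      cap *= 2;
    grown = (char *)realloc(buf->text, cap);
    if (grown == NULL) {
      buf->failed = 1;
      return;
    }
    buf->text = grown;
    buf->cap = cap;
  }
  memcpy(buf->text + buf->len, s, (size_t)len);
  buf->len += (size_t)len;
  buf->text[buf->len] = '\0';
}

/* Empty the buffer, keeping its storage; meant to be called from a
   start or end handler once the accumulated text has been used. */
void
reset_text(TextBuffer *buf) {
  assert(buf != NULL);
  buf->len = 0;
  if (buf->text != NULL)
    buf->text[0] = '\0';
}
```

      <p>
        Because the end of a run of character data is only known when a different
        callback arrives, the start and end element handlers are the natural places
        to consume <code>buf-&gt;text</code> and then call <code>reset_text</code>.
      </p>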
983      <!-- XXX example needed here -->
984
985      <h3>
986        XML Version
987      </h3>
988
989      <p>
990        Expat is an XML 1.0 parser, and as such never complains based on the value of the
991        <code>version</code> pseudo-attribute in the XML declaration, if present.
992      </p>
993
994      <p>
995        If an application needs to check the version number (to support alternate
996        processing), it should use the <code><a href=
997        "#XML_SetXmlDeclHandler">XML_SetXmlDeclHandler</a></code> function to set a
998        handler that uses the information in the XML declaration to determine what to do.
999        This example shows how to check that only a version number of <code>"1.0"</code>
1000        is accepted:
1001      </p>
1002
      <pre class="eg">
static int wrong_version;
static XML_Parser parser;

static void XMLCALL
xmldecl_handler(void            *userData,
                const XML_Char  *version,
                const XML_Char  *encoding,
                int              standalone)
{
  static const XML_Char Version_1_0[] = {'1', '.', '0', 0};

  size_t i;

  /* version is NULL for a text declaration in an external entity */
  if (version == NULL)
    return;

  for (i = 0; i &lt; (sizeof(Version_1_0) / sizeof(Version_1_0[0])); ++i) {
    if (version[i] != Version_1_0[i]) {
      wrong_version = 1;
      /* also clear all other handlers: */
      XML_SetCharacterDataHandler(parser, NULL);
      ...
      return;
    }
  }
  ...
}
</pre>
1029      <h3>
1030        Namespace Processing
1031      </h3>
1032
1033      <p>
        When the parser is created using the <code><a href=
        "#XML_ParserCreateNS">XML_ParserCreateNS</a></code> function, Expat performs
1036        namespace processing. Under namespace processing, Expat consumes
1037        <code>xmlns</code> and <code>xmlns:...</code> attributes, which declare
1038        namespaces for the scope of the element in which they occur. This means that your
1039        start handler will not see these attributes. Your application can still be
1040        informed of these declarations by setting namespace declaration handlers with
1041        <a href=
1042        "#XML_SetNamespaceDeclHandler"><code>XML_SetNamespaceDeclHandler</code></a>.
1043      </p>
1044
1045      <p>
1046        Element type and attribute names that belong to a given namespace are passed to
1047        the appropriate handler in expanded form. By default this expanded form is a
1048        concatenation of the namespace URI, the separator character (which is the 2nd
1049        argument to <code><a href="#XML_ParserCreateNS">XML_ParserCreateNS</a></code>),
1050        and the local name (i.e. the part after the colon). Names with undeclared
1051        prefixes are not well-formed when namespace processing is enabled, and will
1052        trigger an error. Unprefixed attribute names are never expanded, and unprefixed
1053        element names are only expanded when they are in the scope of a default
1054        namespace.
1055      </p>
1056
1057      <p>
1058        However if <code><a href=
1059        "#XML_SetReturnNSTriplet">XML_SetReturnNSTriplet</a></code> has been called with
1060        a non-zero <code>do_nst</code> parameter, then the expanded form for names with
1061        an explicit prefix is a concatenation of: URI, separator, local name, separator,
1062        prefix.
1063      </p>
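      <p>
        A handler that needs the individual parts of an expanded name can split it at
        the separator. The sketch below handles both the default form and the triplet
        form; <code>ExpandedName</code> and <code>split_expanded_name</code> are
        hypothetical names, <code>char</code> stands in for <code>XML_Char</code>,
        and the code assumes that the separator occurs in neither the URI nor the
        local name.
      </p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view into an expanded name. The uri and local parts
   point into the original string and are delimited by their lengths,
   not by null terminators. */
typedef struct {
  const char *uri;    /* NULL if the name was not expanded */
  size_t uri_len;
  const char *local;
  size_t local_len;
  const char *prefix; /* NULL unless the triplet form was received */
  size_t prefix_len;
} ExpandedName;

/* Split "URI sep local" or "URI sep local sep prefix" at sep.
   Assumes sep is non-null and appears only as the separator. */
void
split_expanded_name(const char *name, char sep, ExpandedName *out) {
  const char *first;
  const char *second;
  assert(name != NULL && out != NULL && sep != '\0');

  out->uri = NULL;
  out->uri_len = 0;
  out->prefix = NULL;
  out->prefix_len = 0;

  first = strchr(name, sep);
  if (first == NULL) { /* unexpanded name: no namespace applies */
    out->local = name;
    out->local_len = strlen(name);
    return;
  }
  out->uri = name;
  out->uri_len = (size_t)(first - name);
  out->local = first + 1;

  second = strchr(out->local, sep);
  if (second == NULL) { /* default form: URI, separator, local name */
    out->local_len = strlen(out->local);
    return;
  }
  out->local_len = (size_t)(second - out->local);
  out->prefix = second + 1; /* runs to the end of the string */
  out->prefix_len = strlen(out->prefix);
}
```

      <p>
        For example, with <code>'|'</code> chosen as the separator purely for
        readability, <code>"http://example.org/ns|item|ex"</code> splits into the URI
        <code>http://example.org/ns</code>, the local name <code>item</code>, and the
        prefix <code>ex</code>.
      </p>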
1064
1065      <p>
1066        You can set handlers for the start of a namespace declaration and for the end of
1067        a scope of a declaration with the <code><a href=
1068        "#XML_SetNamespaceDeclHandler">XML_SetNamespaceDeclHandler</a></code> function.
1069        The StartNamespaceDeclHandler is called prior to the start tag handler and the
1070        EndNamespaceDeclHandler is called after the corresponding end tag that ends the
1071        namespace's scope. The namespace start handler gets passed the prefix and URI for
1072        the namespace. For a default namespace declaration (xmlns='...'), the prefix will
1073        be <code>NULL</code>. The URI will be <code>NULL</code> for the case where the
1074        default namespace is being unset. The namespace end handler just gets the prefix
1075        for the closing scope.
1076      </p>
1077
1078      <p>
1079        These handlers are called for each declaration. So if, for instance, a start tag
1080        had three namespace declarations, then the StartNamespaceDeclHandler would be
1081        called three times before the start tag handler is called, once for each
1082        declaration.
1083      </p>
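      <p>
        The <code>NULL</code> conventions described above can be illustrated with a
        pair of handlers that merely count the declarations currently in scope. This
        is a minimal sketch: <code>NsScope</code>, <code>ns_start</code>, and
        <code>ns_end</code> are hypothetical names, <code>char</code> stands in for
        <code>XML_Char</code>, and <code>XMLCALL</code> is defined as empty when
        <code>expat.h</code> has not been included.
      </p>

```c
#include <assert.h>
#include <stddef.h>

#ifndef XMLCALL
#define XMLCALL /* empty stand-in so this sketch builds without expat.h */
#endif

/* Hypothetical user data: counts of namespace declarations in scope. */
typedef struct {
  int in_scope;         /* all declarations currently in scope */
  int default_in_scope; /* those that set or unset the default namespace */
} NsScope;

/* Start handler: called once per declaration, before the start tag
   handler of the element carrying the declaration. */
void XMLCALL
ns_start(void *userData, const char *prefix, const char *uri) {
  NsScope *ns = (NsScope *)userData;
  assert(ns != NULL);
  (void)uri; /* NULL here means the default namespace is being unset */
  ns->in_scope++;
  if (prefix == NULL) /* a default namespace declaration: xmlns='...' */
    ns->default_in_scope++;
}

/* End handler: called after the end tag that closes the scope. */
void XMLCALL
ns_end(void *userData, const char *prefix) {
  NsScope *ns = (NsScope *)userData;
  assert(ns != NULL && ns->in_scope > 0);
  ns->in_scope--;
  if (prefix == NULL)
    ns->default_in_scope--;
}
```

      <p>
        With <code>expat.h</code> included, the two functions would be registered
        with <code>XML_SetNamespaceDeclHandler(parser, ns_start, ns_end)</code>.
      </p>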
1084
1085      <h3>
1086        Character Encodings
1087      </h3>
1088
1089      <p>
        While XML is based on Unicode, and every XML processor is required to recognize
        UTF-8 and UTF-16 (encodings of Unicode built on 8-bit and 16-bit code units),
        other encodings may be declared in XML documents or entities. For the main
        document, an XML declaration may contain an encoding declaration:
1094      </p>
1095
      <pre>
&lt;?xml version="1.0" encoding="ISO-8859-2"?&gt;
</pre>
1099      <p>
1100        External parsed entities may begin with a text declaration, which looks like an
1101        XML declaration with just an encoding declaration:
1102      </p>
1103
      <pre>
&lt;?xml encoding="Big5"?&gt;
</pre>
1107      <p>
1108        With Expat, you may also specify an encoding at the time of creating a parser.
1109        This is useful when the encoding information may come from a source outside the
        document itself (like a higher-level protocol).
1111      </p>
1112
1113      <p>
1114        <a id="builtin_encodings" name="builtin_encodings"></a>There are four built-in
1115        encodings in Expat:
1116      </p>
1117
1118      <ul>
1119        <li>UTF-8
1120        </li>
1121
1122        <li>UTF-16
1123        </li>
1124
1125        <li>ISO-8859-1
1126        </li>
1127
1128        <li>US-ASCII
1129        </li>
1130      </ul>
1131
1132      <p>
        Anything else discovered in an encoding declaration or in the protocol encoding
        specified in the parser constructor triggers a call to the
1135        <code>UnknownEncodingHandler</code>. This handler gets passed the encoding name
1136        and a pointer to an <code>XML_Encoding</code> data structure. Your handler must
1137        fill in this structure and return <code>XML_STATUS_OK</code> if it knows how to
1138        deal with the encoding. Otherwise the handler should return
1139        <code>XML_STATUS_ERROR</code>. The handler also gets passed a pointer to an
1140        optional application data structure that you may indicate when you set the
1141        handler.
1142      </p>
1143
1144      <p>
        Expat places restrictions on the character encodings that it can support by
        filling in the <code>XML_Encoding</code> structure:
1147      </p>
1148
1149      <ol>
        <li>Every ASCII character that can appear in a well-formed XML document must be
        represented by a single byte, and that byte must correspond to its ASCII
        encoding (except for the characters $@\^'{}~).
        </li>

        <li>Characters must be encoded in 4 bytes or less.
        </li>

        <li>All characters encoded must have Unicode scalar values less than or equal to
        65535 (0xFFFF). <em>This does not apply to the built-in support for UTF-16 and
        UTF-8.</em>
        </li>

        <li>No character may be encoded by more than one distinct sequence of bytes.
        </li>
1165      </ol>
1166
1167      <p>
1168        <code>XML_Encoding</code> contains an array of integers that correspond to the
1169        1st byte of an encoding sequence. If the value in the array for a byte is zero or
1170        positive, then the byte is a single byte encoding that encodes the Unicode scalar
1171        value contained in the array. A -1 in this array indicates a malformed byte. If
1172        the value is -2, -3, or -4, then the byte is the beginning of a 2, 3, or 4 byte
1173        sequence respectively. Multi-byte sequences are sent to the convert function
1174        pointed at in the <code>XML_Encoding</code> structure. This function should
1175        return the Unicode scalar value for the sequence or -1 if the sequence is
1176        malformed.
1177      </p>
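      <p>
        As an illustration of the single-byte case, the sketch below fills a 256-entry
        map for ISO-8859-15, which differs from ISO-8859-1 in eight positions. The
        function name <code>fill_latin9_map</code> is hypothetical; it works on a
        plain <code>int</code> array so that it can be compiled without
        <code>expat.h</code>.
      </p>

```c
#include <assert.h>
#include <stddef.h>

/* Fill a 256-entry map, laid out like the map member of XML_Encoding,
   for ISO-8859-15. Every byte is a complete character, so no entry is
   negative and no convert function is needed. */
void
fill_latin9_map(int map[256]) {
  /* The eight positions where ISO-8859-15 differs from ISO-8859-1. */
  static const struct {
    unsigned char byte;
    int scalar;
  } diffs[] = {
    {0xA4, 0x20AC}, {0xA6, 0x0160}, {0xA8, 0x0161}, {0xB4, 0x017D},
    {0xB8, 0x017E}, {0xBC, 0x0152}, {0xBD, 0x0153}, {0xBE, 0x0178}
  };
  int i;
  assert(map != NULL);

  /* Start from the identity mapping: byte value == Unicode scalar value. */
  for (i = 0; i < 256; i++)
    map[i] = i;

  /* Then override the positions that differ. */
  for (i = 0; i < (int)(sizeof(diffs) / sizeof(diffs[0])); i++)
    map[diffs[i].byte] = diffs[i].scalar;
}
```

      <p>
        An unknown encoding handler that recognizes the name
        <code>ISO-8859-15</code> could call <code>fill_latin9_map(info-&gt;map)</code>
        on the <code>XML_Encoding</code> pointer it receives, set the
        <code>data</code>, <code>convert</code>, and <code>release</code> members to
        <code>NULL</code>, and return <code>XML_STATUS_OK</code>. A multi-byte
        encoding would instead store -2, -3, or -4 for its lead bytes and supply a
        convert function.
      </p>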
1178
1179      <p>
1180        One pitfall that novice Expat users are likely to fall into is that although
1181        Expat may accept input in various encodings, the strings that it passes to the
1182        handlers are always encoded in UTF-8 or UTF-16 (depending on how Expat was
1183        compiled). Your application is responsible for any translation of these strings
1184        into other encodings.
1185      </p>
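      <p>
        As an example of such a translation, the following sketch converts UTF-8 text
        received from a handler into ISO-8859-1. It assumes an Expat build that
        passes UTF-8 (<code>char</code>) strings to handlers and that the input is
        well-formed; <code>utf8_to_latin1</code> is a hypothetical helper, and
        replacing unrepresentable characters with <code>'?'</code> is merely one
        possible policy.
      </p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Convert len bytes of well-formed UTF-8 at src into ISO-8859-1.
   dst has room for dst_cap bytes including the terminator. Characters
   above U+00FF are replaced by '?'. Returns the number of bytes
   written, not counting the terminator. */
size_t
utf8_to_latin1(const char *src, size_t len, char *dst, size_t dst_cap) {
  size_t in = 0;
  size_t out = 0;
  assert(src != NULL && dst != NULL && dst_cap > 0);

  while (in < len && out + 1 < dst_cap) {
    unsigned char b = (unsigned char)src[in];
    unsigned long cp;
    size_t extra; /* number of continuation bytes that follow */

    if (b < 0x80) {
      cp = b;
      extra = 0;
    } else if (b < 0xE0) {
      cp = b & 0x1F;
      extra = 1;
    } else if (b < 0xF0) {
      cp = b & 0x0F;
      extra = 2;
    } else {
      cp = b & 0x07;
      extra = 3;
    }
    if (in + extra >= len)
      break; /* sequence is cut off at the end of the input */

    in++;
    for (; extra > 0; extra--)
      cp = (cp << 6) | ((unsigned char)src[in++] & 0x3F);

    dst[out++] = (cp <= 0xFF) ? (char)(unsigned char)cp : '?';
  }
  dst[out] = '\0';
  return out;
}
```

      <p>
        A character data handler would pass its <code>s</code> and <code>len</code>
        arguments straight through, since the data it receives is not
        null-terminated.
      </p>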
1186
1187      <h3>
1188        Handling External Entity References
1189      </h3>
1190
1191      <p>
1192        Expat does not read or parse external entities directly. Note that any external
1193        DTD is a special case of an external entity. If you've set no
1194        <code>ExternalEntityRefHandler</code>, then external entity references are
        silently ignored. Otherwise, Expat calls your handler with the information needed to
1196        read and parse the external entity.
1197      </p>
1198
1199      <p>
1200        Your handler isn't actually responsible for parsing the entity, but it is
1201        responsible for creating a subsidiary parser with <code><a href=
1202        "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code> that
1203        will do the job. This returns an instance of <code>XML_Parser</code> that has
1204        handlers and other data structures initialized from the parent parser. You may
1205        then use <code><a href="#XML_Parse">XML_Parse</a></code> or <code><a href=
1206        "#XML_ParseBuffer">XML_ParseBuffer</a></code> calls against this parser. Since
        external entities may refer to other external entities, your handler should be
1208        prepared to be called recursively.
1209      </p>
1210
1211      <h3>
1212        Parsing DTDs
1213      </h3>
1214
1215      <p>
1216        In order to parse parameter entities, before starting the parse, you must call
1217        <code><a href="#XML_SetParamEntityParsing">XML_SetParamEntityParsing</a></code>
1218        with one of the following arguments:
1219      </p>
1220
1221      <dl>
1222        <dt>
1223          <code>XML_PARAM_ENTITY_PARSING_NEVER</code>
1224        </dt>
1225
1226        <dd>
          Don't parse parameter entities or the external subset.
1228        </dd>
1229
1230        <dt>
1231          <code>XML_PARAM_ENTITY_PARSING_UNLESS_STANDALONE</code>
1232        </dt>
1233
1234        <dd>
1235          Parse parameter entities and the external subset unless <code>standalone</code>
1236          was set to "yes" in the XML declaration.
1237        </dd>
1238
1239        <dt>
1240          <code>XML_PARAM_ENTITY_PARSING_ALWAYS</code>
1241        </dt>
1242
1243        <dd>
          Always parse parameter entities and the external subset.
1245        </dd>
1246      </dl>
1247
1248      <p>
1249        In order to read an external DTD, you also have to set an external entity
1250        reference handler as described above.
1251      </p>
1252
1253      <h3 id="stop-resume">
1254        Temporarily Stopping Parsing
1255      </h3>
1256
1257      <p>
        Expat 1.95.8 introduces a new feature: it is now possible to stop parsing
        temporarily from within a handler function, even if more data has already been
        passed into the parser. Applications for this include:
1261      </p>
1262
1263      <ul>
1264        <li>Supporting the <a href="https://www.w3.org/TR/xinclude/">XInclude</a>
1265        specification.
1266        </li>
1267
1268        <li>Delaying further processing until additional information is available from
1269        some other source.
1270        </li>
1271
1272        <li>Adjusting processor load as task priorities shift within an application.
1273        </li>
1274
1275        <li>Stopping parsing completely (simply free or reset the parser instead of
1276        resuming in the outer parsing loop). This can be useful if an application-domain
1277        error is found in the XML being parsed or if the result of the parse is
1278        determined not to be useful after all.
1279        </li>
1280      </ul>
1281
1282      <p>
1283        To take advantage of this feature, the main parsing loop of an application needs
1284        to support this specifically. It cannot be supported with a parsing loop
1285        compatible with Expat 1.95.7 or earlier (though existing loops will continue to
1286        work without supporting the stop/resume feature).
1287      </p>
1288
1289      <p>
1290        An application that uses this feature for a single parser will have the rough
1291        structure (in pseudo-code):
1292      </p>
1293
      <pre class="pseudocode">
fd = open_input()
p = create_parser()

if parse_xml(p, fd) {
  /* suspended */

  int suspended = 1;

  while (suspended) {
    do_something_else()
    if ready_to_resume() {
      suspended = continue_parsing(p, fd);
    }
  }
}
</pre>
1311      <p>
1312        An application that may resume any of several parsers based on input (either from
1313        the XML being parsed or some other source) will certainly have more interesting
1314        control structures.
1315      </p>
1316
1317      <p>
1318        This C function could be used for the <code>parse_xml</code> function mentioned
1319        in the pseudo-code above:
1320      </p>
1321
      <pre class="eg">
#define BUFF_SIZE 10240

/* Parse a document from the open file descriptor 'fd' until the parse
   is complete (the document has been completely parsed, or there's
   been an error), or the parse is stopped.  Return non-zero when
   the parse is merely suspended.
*/
int
parse_xml(XML_Parser p, int fd)
{
  for (;;) {
    int last_chunk;
    int bytes_read;
    enum XML_Status status;

    void *buff = XML_GetBuffer(p, BUFF_SIZE);
    if (buff == NULL) {
      /* handle error... */
      return 0;
    }
    bytes_read = read(fd, buff, BUFF_SIZE);
    if (bytes_read &lt; 0) {
      /* handle error... */
      return 0;
    }
    status = XML_ParseBuffer(p, bytes_read, bytes_read == 0);
    switch (status) {
      case XML_STATUS_ERROR:
        /* handle error... */
        return 0;
      case XML_STATUS_SUSPENDED:
        return 1;
    }
    if (bytes_read == 0)
      return 0;
  }
}
</pre>
1361      <p>
        The corresponding <code>continue_parsing</code> function is somewhat simpler,
        since it only needs to deal with the return code from <code><a href=
1364        "#XML_ResumeParser">XML_ResumeParser</a></code>; it can delegate the input
1365        handling to the <code>parse_xml</code> function:
1366      </p>
1367
      <pre class="eg">
/* Continue parsing a document which had been suspended.  The 'p' and
   'fd' arguments are the same as passed to parse_xml().  Return
   non-zero when the parse is suspended.
*/
int
continue_parsing(XML_Parser p, int fd)
{
  enum XML_Status status = XML_ResumeParser(p);
  switch (status) {
    case XML_STATUS_ERROR:
      /* XML_GetErrorCode(p) yields XML_ERROR_NOT_SUSPENDED if the
         parser was not suspended; handle error... */
      return 0;
    case XML_STATUS_SUSPENDED:
      return 1;
    case XML_STATUS_OK:
      break;
  }
  return parse_xml(p, fd);
}
</pre>
1390      <p>
1391        Now that we've seen what a mess the top-level parsing loop can become, what have
1392        we gained? Very simply, we can now use the <code><a href=
1393        "#XML_StopParser">XML_StopParser</a></code> function to stop parsing, without
1394        having to go to great lengths to avoid additional processing that we're expecting
1395        to ignore. As a bonus, we get to stop parsing <em>temporarily</em>, and come back
1396        to it when we're ready.
1397      </p>
1398
1399      <p>
1400        To stop parsing from a handler function, use the <code><a href=
1401        "#XML_StopParser">XML_StopParser</a></code> function. This function takes two
        arguments: the parser being stopped and a flag indicating whether the parse can
1403        be resumed in the future.
1404      </p>
1405      <!-- XXX really need more here -->
1406
1407      <hr />
1408      <!-- ================================================================ -->
1409
1410      <h2>
1411        <a id="reference" name="reference">Expat Reference</a>
1412      </h2>
1413
1414      <h3>
1415        <a id="creation" name="creation">Parser Creation</a>
1416      </h3>
1417
1418      <h4 id="XML_ParserCreate">
1419        XML_ParserCreate
1420      </h4>
1421
      <pre class="fcndec">
XML_Parser XMLCALL
XML_ParserCreate(const XML_Char *encoding);
</pre>
1426      <div class="fcndef">
1427        <p>
1428          Construct a new parser. If encoding is non-<code>NULL</code>, it specifies a
1429          character encoding to use for the document. This overrides the document
1430          encoding declaration. There are four built-in encodings:
1431        </p>
1432
1433        <ul>
1434          <li>US-ASCII
1435          </li>
1436
1437          <li>UTF-8
1438          </li>
1439
1440          <li>UTF-16
1441          </li>
1442
1443          <li>ISO-8859-1
1444          </li>
1445        </ul>
1446
1447        <p>
1448          Any other value will invoke a call to the UnknownEncodingHandler.
1449        </p>
1450      </div>
1451
1452      <h4 id="XML_ParserCreateNS">
1453        XML_ParserCreateNS
1454      </h4>
1455
      <pre class="fcndec">
XML_Parser XMLCALL
XML_ParserCreateNS(const XML_Char *encoding,
                   XML_Char sep);
</pre>
1461      <div class="fcndef">
1462        Constructs a new parser that has namespace processing in effect. Namespace
1463        expanded element names and attribute names are returned as a concatenation of the
1464        namespace URI, <em>sep</em>, and the local part of the name. This means that you
        should pick a character for <em>sep</em> that can't be part of a URI. Since
1466        Expat does not check namespace URIs for conformance, the only safe choice for a
1467        namespace separator is a character that is illegal in XML. For instance,
1468        <code>'\xFF'</code> is not legal in UTF-8, and <code>'\xFFFF'</code> is not legal
1469        in UTF-16. There is a special case when <em>sep</em> is the null character
1470        <code>'\0'</code>: the namespace URI and the local part will be concatenated
1471        without any separator - this is intended to support RDF processors. It is a
1472        programming error to use the null separator with <a href=
1473        "#XML_SetReturnNSTriplet">namespace triplets</a>.
1474      </div>
1475
1476      <p>
1477        <strong>Note:</strong> Expat does not validate namespace URIs (beyond encoding)
1478        against RFC 3986 today (and is not required to do so with regard to the XML 1.0
1479        namespaces specification) but it may start doing that in future releases. Before
1480        that, an application using Expat must be ready to receive namespace URIs
1481        containing non-URI characters.
1482      </p>
1483
1484      <h4 id="XML_ParserCreate_MM">
1485        XML_ParserCreate_MM
1486      </h4>
1487
      <pre class="fcndec">
XML_Parser XMLCALL
XML_ParserCreate_MM(const XML_Char *encoding,
                    const XML_Memory_Handling_Suite *ms,
                    const XML_Char *sep);
</pre>

      <pre class="signature">
typedef struct {
  void *(XMLCALL *malloc_fcn)(size_t size);
  void *(XMLCALL *realloc_fcn)(void *ptr, size_t size);
  void (XMLCALL *free_fcn)(void *ptr);
} XML_Memory_Handling_Suite;
</pre>
1502      <div class="fcndef">
1503        <p>
1504          Construct a new parser using the suite of memory handling functions specified
1505          in <code>ms</code>. If <code>ms</code> is <code>NULL</code>, then use the
1506          standard set of memory management functions. If <code>sep</code> is
1507          non-<code>NULL</code>, then namespace processing is enabled in the created
1508          parser and the character pointed at by sep is used as the separator between the
1509          namespace URI and the local part of the name.
1510        </p>
1511      </div>
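      <p>
        A memory handling suite might, for instance, wrap the standard allocator to
        count live allocations. The sketch below is an illustration only:
        <code>live_blocks</code> and the three <code>counting_*</code> functions are
        hypothetical, <code>XMLCALL</code> is defined as empty when
        <code>expat.h</code> has not been included, and the counter is not
        thread-safe.
      </p>

```c
#include <assert.h>
#include <stdlib.h>

#ifndef XMLCALL
#define XMLCALL /* empty stand-in so this sketch builds without expat.h */
#endif

/* Hypothetical counter of blocks allocated and not yet freed. */
long live_blocks = 0;

void *XMLCALL
counting_malloc(size_t size) {
  void *p = malloc(size);
  if (p != NULL)
    live_blocks++;
  return p;
}

/* Only the case where realloc acts like malloc changes the count.
   A size of 0 is not given special treatment in this sketch. */
void *XMLCALL
counting_realloc(void *ptr, size_t size) {
  void *p = realloc(ptr, size);
  if (ptr == NULL && p != NULL)
    live_blocks++;
  return p;
}

void XMLCALL
counting_free(void *ptr) {
  if (ptr != NULL) {
    assert(live_blocks > 0);
    live_blocks--;
  }
  free(ptr);
}
```

      <p>
        With <code>expat.h</code> included, these functions would be placed, in this
        order, into an <code>XML_Memory_Handling_Suite</code> whose address is passed
        as <code>ms</code>. After <code>XML_ParserFree</code>, a non-zero
        <code>live_blocks</code> would then point at a leak somewhere in the
        application's use of the parser.
      </p>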
1512
1513      <h4 id="XML_ExternalEntityParserCreate">
1514        XML_ExternalEntityParserCreate
1515      </h4>
1516
      <pre class="fcndec">
XML_Parser XMLCALL
XML_ExternalEntityParserCreate(XML_Parser p,
                               const XML_Char *context,
                               const XML_Char *encoding);
</pre>
1523      <div class="fcndef">
1524        <p>
1525          Construct a new <code>XML_Parser</code> object for parsing an external general
          entity. Context is the context argument passed in a call to an
          ExternalEntityRefHandler. Other state information such as handlers, user data,
          and namespace processing is inherited from the parser passed as the 1st argument.
1529          So you shouldn't need to call any of the behavior changing functions on this
1530          parser (unless you want it to act differently than the parent parser).
1531        </p>
1532
1533        <p>
1534          <strong>Note:</strong> Please be sure to free subparsers created by
1535          <code><a href=
1536          "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code>
1537          <em>prior to</em> freeing their related parent parser, as subparsers reference
1538          and use parts of their respective parent parser, internally. Parent parsers
1539          must outlive subparsers.
1540        </p>
1541      </div>
1542
1543      <h4 id="XML_ParserFree">
1544        XML_ParserFree
1545      </h4>
1546
      <pre class="fcndec">
void XMLCALL
XML_ParserFree(XML_Parser p);
</pre>
1551      <div class="fcndef">
1552        <p>
1553          Free memory used by the parser.
1554        </p>
1555
1556        <p>
1557          <strong>Note:</strong> Your application is responsible for freeing any memory
1558          associated with <a href="#userdata">user data</a>.
1559        </p>
1560
1561        <p>
1562          <strong>Note:</strong> Please be sure to free subparsers created by
1563          <code><a href=
1564          "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code>
1565          <em>prior to</em> freeing their related parent parser, as subparsers reference
1566          and use parts of their respective parent parser, internally. Parent parsers
1567          must outlive subparsers.
1568        </p>
1569      </div>
1570
1571      <h4 id="XML_ParserReset">
1572        XML_ParserReset
1573      </h4>
1574
      <pre class="fcndec">
XML_Bool XMLCALL
XML_ParserReset(XML_Parser p,
                const XML_Char *encoding);
</pre>
1580      <div class="fcndef">
1581        Clean up the memory structures maintained by the parser so that it may be used
        again. After this has been called, <code>p</code> is ready to start parsing
1583        a new document. All handlers are cleared from the parser, except for the
1584        unknownEncodingHandler. The parser's external state is re-initialized except for
1585        the values of ns and ns_triplets. This function may not be used on a parser
1586        created using <code><a href=
1587        "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code>; it
1588        will return <code>XML_FALSE</code> in that case. Returns <code>XML_TRUE</code> on
1589        success. Your application is responsible for dealing with any memory associated
1590        with <a href="#userdata">user data</a>.
1591      </div>
1592
1593      <h3>
1594        <a id="parsing" name="parsing">Parsing</a>
1595      </h3>
1596
1597      <p>
1598        To state the obvious: the three parsing functions <code><a href=
1599        "#XML_Parse">XML_Parse</a></code>, <code><a href=
1600        "#XML_ParseBuffer">XML_ParseBuffer</a></code> and <code><a href=
1601        "#XML_GetBuffer">XML_GetBuffer</a></code> must not be called from within a
1602        handler unless they operate on a separate parser instance, that is, one that did
1603        not call the handler. For example, it is OK to call the parsing functions from
1604        within an <code>XML_ExternalEntityRefHandler</code>, if they apply to the parser
1605        created by <code><a href=
1606        "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code>.
1607      </p>
1608
      <p>
        Note: The <code>len</code> argument passed to these functions should be
        considerably less than the maximum value of an <code>int</code>, because the
        sum of a buffer's length and the length of the unprocessed portion of the
        previous buffer could otherwise overflow. Input data at
        the end of a buffer will remain unprocessed if it is part of an XML token for
        which the end is not part of that buffer.
      </p>
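
      <p>
        As a rough illustration of keeping <code>len</code> small, the sketch below
        caps each chunk at a fixed size. The names <code>MAX_CHUNK</code> and
        <code>next_chunk_len</code> are hypothetical and not part of Expat, and the
        64 KiB cap is an arbitrary assumption; any value far below
        <code>INT_MAX</code> would serve.
      </p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cap, chosen far below INT_MAX; not an Expat constant. */
#define MAX_CHUNK (64 * 1024)

/* Return how many bytes of the remaining input to hand to the next
   XML_Parse or XML_ParseBuffer call. */
static int
next_chunk_len(size_t remaining) {
  int len = remaining < (size_t)MAX_CHUNK ? (int)remaining : MAX_CHUNK;
  assert(len >= 0 && len <= MAX_CHUNK);
  return len;
}
```

      <p>
        A feeding loop could pass the returned value as <code>len</code> and set
        <code>isFinal</code> once no input remains.
      </p>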
1617
1618      <p>
1619        <a id="isFinal" name="isFinal"></a>The application <em>must</em> make a
1620        concluding <code><a href="#XML_Parse">XML_Parse</a></code> or <code><a href=
1621        "#XML_ParseBuffer">XML_ParseBuffer</a></code> call with <code>isFinal</code> set
1622        to <code>XML_TRUE</code>.
1623      </p>
1624
1625      <h4 id="XML_Parse">
1626        XML_Parse
1627      </h4>
1628
1629      <pre class="fcndec">
1630enum XML_Status XMLCALL
1631XML_Parse(XML_Parser p,
1632          const char *s,
1633          int len,
1634          int isFinal);
1635</pre>
1636
      <pre class="signature">
enum XML_Status {
  XML_STATUS_ERROR = 0,
  XML_STATUS_OK = 1,
  XML_STATUS_SUSPENDED = 2
};
</pre>
1643      <div class="fcndef">
        <p>
          Parse some more of the document. The string <code>s</code> is a buffer
          containing part (or perhaps all) of the document. The number of bytes of
          <code>s</code> that are part of the document is indicated by <code>len</code>.
          This means that <code>s</code> doesn't have to be null-terminated. It also
          means that if <code>len</code> is larger than the number of bytes in the block
          of memory that <code>s</code> points at, then a memory fault is likely.
          Negative values for <code>len</code> are rejected since Expat 2.2.1. The
          <code>isFinal</code> parameter informs the parser that this is the last piece
          of the document. Frequently, the last piece is empty (i.e. <code>len</code> is
          zero).
        </p>
1655
        <p>
          If a parse error occurred, <code>XML_Parse</code> returns
          <code>XML_STATUS_ERROR</code>. Otherwise it returns
          <code>XML_STATUS_OK</code>. Note that regardless of the return
          value, there is no guarantee that all provided input has been parsed; only
          after <a href="#isFinal">the concluding call</a> will all handler callbacks and
          parsing errors have happened.
        </p>
1663
        <p>
          Simplified, <code>XML_Parse</code> can be considered a convenience wrapper that
          pairs calls to <code><a href="#XML_GetBuffer">XML_GetBuffer</a></code> and
          <code><a href="#XML_ParseBuffer">XML_ParseBuffer</a></code> (when Expat is
          built with macro <code>XML_CONTEXT_BYTES</code> defined to a positive value,
          which is both common and default). <code>XML_Parse</code> is then functionally
          equivalent to calling <code><a href="#XML_GetBuffer">XML_GetBuffer</a></code>,
          <code>memcpy</code>, and <code><a href=
          "#XML_ParseBuffer">XML_ParseBuffer</a></code>.
        </p>
1674
1675        <p>
1676          To avoid double copying of the input, direct use of functions <code><a href=
1677          "#XML_GetBuffer">XML_GetBuffer</a></code> and <code><a href=
1678          "#XML_ParseBuffer">XML_ParseBuffer</a></code> is advised for most production
1679          use, e.g. if you're using <code>read</code> or similar functionality to fill
1680          your buffers, fill directly into the buffer from <code><a href=
1681          "#XML_GetBuffer">XML_GetBuffer</a></code>, then parse with <code><a href=
1682          "#XML_ParseBuffer">XML_ParseBuffer</a></code>.
1683        </p>
1684      </div>
1685
1686      <h4 id="XML_ParseBuffer">
1687        XML_ParseBuffer
1688      </h4>
1689
1690      <pre class="fcndec">
1691enum XML_Status XMLCALL
1692XML_ParseBuffer(XML_Parser p,
1693                int len,
1694                int isFinal);
1695</pre>
1696      <div class="fcndef">
1697        <p>
1698          This is just like <code><a href="#XML_Parse">XML_Parse</a></code>, except in
1699          this case Expat provides the buffer. By obtaining the buffer from Expat with
1700          the <code><a href="#XML_GetBuffer">XML_GetBuffer</a></code> function, the
1701          application can avoid double copying of the input.
1702        </p>
1703
1704        <p>
1705          Negative values for <code>len</code> are rejected since Expat 2.6.3.
1706        </p>
1707      </div>
1708
1709      <h4 id="XML_GetBuffer">
1710        XML_GetBuffer
1711      </h4>
1712
1713      <pre class="fcndec">
1714void * XMLCALL
1715XML_GetBuffer(XML_Parser p,
1716              int len);
1717</pre>
1718      <div class="fcndef">
1719        Obtain a buffer of size <code>len</code> to read a piece of the document into. A
1720        <code>NULL</code> value is returned if Expat can't allocate enough memory for
1721        this buffer. A <code>NULL</code> value may also be returned if <code>len</code>
1722        is zero. This has to be called prior to every call to <code><a href=
1723        "#XML_ParseBuffer">XML_ParseBuffer</a></code>. A typical use would look like
1724        this:
1725
        <pre class="eg">
for (;;) {
  int bytes_read;
  void *buff = XML_GetBuffer(p, BUFF_SIZE);
  if (buff == NULL) {
    /* handle error (e.g. out of memory); do not use buff */
  }

  /* fill Expat's buffer directly, avoiding an extra copy */
  bytes_read = read(docfd, buff, BUFF_SIZE);
  if (bytes_read &lt; 0) {
    /* handle error */
  }

  /* a read of zero bytes marks the end of input, so isFinal is true */
  if (XML_ParseBuffer(p, bytes_read, bytes_read == 0) == XML_STATUS_ERROR) {
    /* handle parse error */
  }

  if (bytes_read == 0)
    break;
}
</pre>
1747      </div>
1748
1749      <h4 id="XML_StopParser">
1750        XML_StopParser
1751      </h4>
1752
1753      <pre class="fcndec">
1754enum XML_Status XMLCALL
1755XML_StopParser(XML_Parser p,
1756               XML_Bool resumable);
1757</pre>
1758      <div class="fcndef">
1759        <p>
1760          Stops parsing, causing <code><a href="#XML_Parse">XML_Parse</a></code> or
1761          <code><a href="#XML_ParseBuffer">XML_ParseBuffer</a></code> to return. Must be
1762          called from within a call-back handler, except when aborting (when
1763          <code>resumable</code> is <code>XML_FALSE</code>) an already suspended parser.
1764          Some call-backs may still follow because they would otherwise get lost,
1765          including
1766        </p>
1767
1768        <ul>
1769          <li>the end element handler for empty elements when stopped in the start
1770          element handler,
1771          </li>
1772
1773          <li>the end namespace declaration handler when stopped in the end element
1774          handler,
1775          </li>
1776
1777          <li>the character data handler when stopped in the character data handler while
1778          making multiple call-backs on a contiguous chunk of characters,
1779          </li>
1780        </ul>
1781
1782        <p>
1783          and possibly others.
1784        </p>
1785
1786        <p>
1787          This can be called from most handlers, including DTD related call-backs, except
1788          when parsing an external parameter entity and <code>resumable</code> is
1789          <code>XML_TRUE</code>. Returns <code>XML_STATUS_OK</code> when successful,
1790          <code>XML_STATUS_ERROR</code> otherwise. The possible error codes are:
1791        </p>
1792
1793        <dl>
1794          <dt>
1795            <code>XML_ERROR_NOT_STARTED</code>
1796          </dt>
1797
1798          <dd>
1799            when stopping or suspending a parser before it has started, added in Expat
1800            2.6.4.
1801          </dd>
1802
1803          <dt>
1804            <code>XML_ERROR_SUSPENDED</code>
1805          </dt>
1806
1807          <dd>
1808            when suspending an already suspended parser.
1809          </dd>
1810
1811          <dt>
1812            <code>XML_ERROR_FINISHED</code>
1813          </dt>
1814
1815          <dd>
1816            when the parser has already finished.
1817          </dd>
1818
1819          <dt>
1820            <code>XML_ERROR_SUSPEND_PE</code>
1821          </dt>
1822
1823          <dd>
1824            when suspending while parsing an external PE.
1825          </dd>
1826        </dl>
1827
1828        <p>
1829          Since the stop/resume feature requires application support in the outer parsing
1830          loop, it is an error to call this function for a parser not being handled
1831          appropriately; see <a href="#stop-resume">Temporarily Stopping Parsing</a> for
1832          more information.
1833        </p>
1834
1835        <p>
1836          When <code>resumable</code> is <code>XML_TRUE</code> then parsing is
1837          <em>suspended</em>, that is, <code><a href="#XML_Parse">XML_Parse</a></code>
1838          and <code><a href="#XML_ParseBuffer">XML_ParseBuffer</a></code> return
1839          <code>XML_STATUS_SUSPENDED</code>. Otherwise, parsing is <em>aborted</em>, that
1840          is, <code><a href="#XML_Parse">XML_Parse</a></code> and <code><a href=
1841          "#XML_ParseBuffer">XML_ParseBuffer</a></code> return
1842          <code>XML_STATUS_ERROR</code> with error code <code>XML_ERROR_ABORTED</code>.
1843        </p>
1844
1845        <p>
1846          <strong>Note:</strong> This will be applied to the current parser instance
1847          only, that is, if there is a parent parser then it will continue parsing when
1848          the external entity reference handler returns. It is up to the implementation
1849          of that handler to call <code><a href=
1850          "#XML_StopParser">XML_StopParser</a></code> on the parent parser (recursively),
1851          if one wants to stop parsing altogether.
1852        </p>
1853
1854        <p>
1855          When suspended, parsing can be resumed by calling <code><a href=
1856          "#XML_ResumeParser">XML_ResumeParser</a></code>.
1857        </p>
1858
1859        <p>
1860          New in Expat 1.95.8.
1861        </p>
1862      </div>
1863
1864      <h4 id="XML_ResumeParser">
1865        XML_ResumeParser
1866      </h4>
1867
1868      <pre class="fcndec">
1869enum XML_Status XMLCALL
1870XML_ResumeParser(XML_Parser p);
1871</pre>
1872      <div class="fcndef">
        <p>
          Resumes parsing after it has been suspended with <code><a href=
          "#XML_StopParser">XML_StopParser</a></code>. Must not be called from within a
          handler call-back. Returns the same status codes as <code><a href=
          "#XML_Parse">XML_Parse</a></code> or <code><a href=
          "#XML_ParseBuffer">XML_ParseBuffer</a></code>. An additional error code,
          <code>XML_ERROR_NOT_SUSPENDED</code>, will be reported if the parser was not
          currently suspended.
        </p>
1882
1883        <p>
1884          <strong>Note:</strong> This must be called on the most deeply nested child
1885          parser instance first, and on its parent parser only after the child parser has
1886          finished, to be applied recursively until the document entity's parser is
1887          restarted. That is, the parent parser will not resume by itself and it is up to
1888          the application to call <code><a href=
1889          "#XML_ResumeParser">XML_ResumeParser</a></code> on it at the appropriate
1890          moment.
1891        </p>
1892
1893        <p>
1894          New in Expat 1.95.8.
1895        </p>
1896      </div>
1897
1898      <h4 id="XML_GetParsingStatus">
1899        XML_GetParsingStatus
1900      </h4>
1901
1902      <pre class="fcndec">
1903void XMLCALL
1904XML_GetParsingStatus(XML_Parser p,
1905                     XML_ParsingStatus *status);
1906</pre>
1907
1908      <pre class="signature">
1909enum XML_Parsing {
1910  XML_INITIALIZED,
1911  XML_PARSING,
1912  XML_FINISHED,
1913  XML_SUSPENDED
1914};
1915
1916typedef struct {
1917  enum XML_Parsing parsing;
1918  XML_Bool finalBuffer;
1919} XML_ParsingStatus;
1920</pre>
1921      <div class="fcndef">
        <p>
          Reports the status of the parser with respect to being initialized, parsing,
          finished, or suspended, and whether the final buffer is being processed. The
          result is written to <code>*status</code>; the <code>status</code> parameter
          <em>must not</em> be <code>NULL</code>.
        </p>
1927
1928        <p>
1929          New in Expat 1.95.8.
1930        </p>
1931      </div>
1932
1933      <h3>
1934        <a id="setting" name="setting">Handler Setting</a>
1935      </h3>
1936
1937      <p>
1938        Although handlers are typically set prior to parsing and left alone, an
1939        application may choose to set or change the handler for a parsing event while the
1940        parse is in progress. For instance, your application may choose to ignore all
1941        text not descended from a <code>para</code> element. One way it could do this is
1942        to set the character handler when a para start tag is seen, and unset it for the
1943        corresponding end tag.
1944      </p>
1945
1946      <p>
1947        A handler may be <em>unset</em> by providing a <code>NULL</code> pointer to the
1948        appropriate handler setter. None of the handler setting functions have a return
1949        value.
1950      </p>
1951
      <p>
        Your handlers will be receiving strings in arrays of type <code>XML_Char</code>.
        This type is conditionally defined in <code>expat.h</code> as either
        <code>char</code>, <code>wchar_t</code> or <code>unsigned short</code>. The
        first implies UTF-8 encoding, the latter two imply UTF-16 encoding. Note that
        you'll receive them in this form independent of the original encoding of the
        document.
      </p>
1959
1960      <div class="handler">
1961        <h4 id="XML_SetStartElementHandler">
1962          XML_SetStartElementHandler
1963        </h4>
1964
1965        <pre class="setter">
1966void XMLCALL
1967XML_SetStartElementHandler(XML_Parser p,
1968                           XML_StartElementHandler start);
1969</pre>
1970
1971        <pre class="signature">
1972typedef void
1973(XMLCALL *XML_StartElementHandler)(void *userData,
1974                                   const XML_Char *name,
1975                                   const XML_Char **atts);
1976</pre>
1977        <p>
1978          Set handler for start (and empty) tags. Attributes are passed to the start
1979          handler as a pointer to a vector of char pointers. Each attribute seen in a
1980          start (or empty) tag occupies 2 consecutive places in this vector: the
1981          attribute name followed by the attribute value. These pairs are terminated by a
1982          <code>NULL</code> pointer.
1983        </p>
1984
1985        <p>
1986          Note that an empty tag generates a call to both start and end handlers (in that
1987          order).
1988        </p>
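
        <p>
          The name/value layout of <code>atts</code> can be walked two entries at a
          time. The following is a minimal sketch, assuming a build in which
          <code>XML_Char</code> is <code>char</code> (UTF-8 output);
          <code>find_attr</code> is a hypothetical helper rather than an Expat
          function, and <code>expat.h</code> is left out so the fragment stands alone.
        </p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Look up an attribute value in the NULL-terminated name/value vector
   that a start element handler receives as atts.
   Assumption: XML_Char is char. find_attr is hypothetical, not Expat API. */
static const char *
find_attr(const char **atts, const char *name) {
  assert(name != NULL);
  if (atts == NULL)
    return NULL;
  for (size_t i = 0; atts[i] != NULL; i += 2) {
    if (strcmp(atts[i], name) == 0)
      return atts[i + 1]; /* the value follows its name */
  }
  return NULL; /* attribute not present */
}
```

        <p>
          A start element handler could call <code>find_attr(atts, "id")</code> and
          treat a <code>NULL</code> result as "attribute absent".
        </p>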
1989      </div>
1990
1991      <div class="handler">
1992        <h4 id="XML_SetEndElementHandler">
1993          XML_SetEndElementHandler
1994        </h4>
1995
        <pre class="setter">
void XMLCALL
XML_SetEndElementHandler(XML_Parser p,
                         XML_EndElementHandler end);
</pre>
2001
2002        <pre class="signature">
2003typedef void
2004(XMLCALL *XML_EndElementHandler)(void *userData,
2005                                 const XML_Char *name);
2006</pre>
2007        <p>
2008          Set handler for end (and empty) tags. As noted above, an empty tag generates a
2009          call to both start and end handlers.
2010        </p>
2011      </div>
2012
2013      <div class="handler">
2014        <h4 id="XML_SetElementHandler">
2015          XML_SetElementHandler
2016        </h4>
2017
2018        <pre class="setter">
2019void XMLCALL
2020XML_SetElementHandler(XML_Parser p,
2021                      XML_StartElementHandler start,
2022                      XML_EndElementHandler end);
2023</pre>
2024        <p>
2025          Set handlers for start and end tags with one call.
2026        </p>
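
        <p>
          Because an empty tag produces a start call followed by an end call, a pair
          of handlers can keep a balanced nesting count. The sketch below is one
          possible shape, not the only one: <code>struct depth_state</code>,
          <code>on_start</code> and <code>on_end</code> are hypothetical names, and
          the signatures use <code>char</code> in place of <code>XML_Char</code> and
          omit <code>XMLCALL</code> so that the fragment compiles without
          <code>expat.h</code>.
        </p>

```c
#include <assert.h>

/* Hypothetical user data; not part of Expat. */
struct depth_state {
  int depth;     /* current element nesting level */
  int max_depth; /* deepest level seen so far */
};

/* Shaped like XML_StartElementHandler, assuming XML_Char is char. */
static void
on_start(void *userData, const char *name, const char **atts) {
  struct depth_state *st = userData;
  (void)name;
  (void)atts;
  st->depth++;
  if (st->depth > st->max_depth)
    st->max_depth = st->depth;
}

/* Shaped like XML_EndElementHandler, assuming XML_Char is char. */
static void
on_end(void *userData, const char *name) {
  struct depth_state *st = userData;
  (void)name;
  assert(st->depth > 0); /* every end call follows a start call */
  st->depth--;
}
```

        <p>
          In real code the two functions would be declared with the Expat types and
          passed to <code>XML_SetElementHandler</code> in a single call.
        </p>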
2027      </div>
2028
2029      <div class="handler">
2030        <h4 id="XML_SetCharacterDataHandler">
2031          XML_SetCharacterDataHandler
2032        </h4>
2033
        <pre class="setter">
void XMLCALL
XML_SetCharacterDataHandler(XML_Parser p,
                            XML_CharacterDataHandler charhndl);
</pre>
2039
2040        <pre class="signature">
2041typedef void
2042(XMLCALL *XML_CharacterDataHandler)(void *userData,
2043                                    const XML_Char *s,
2044                                    int len);
2045</pre>
2046        <p>
2047          Set a text handler. The string your handler receives is <em>NOT
2048          null-terminated</em>. You have to use the length argument to deal with the end
2049          of the string. A single block of contiguous text free of markup may still
2050          result in a sequence of calls to this handler. In other words, if you're
2051          searching for a pattern in the text, it may be split across calls to this
2052          handler. Note: Setting this handler to <code>NULL</code> may <em>NOT
2053          immediately</em> terminate call-backs if the parser is currently processing
2054          such a single block of contiguous markup-free text, as the parser will continue
2055          calling back until the end of the block is reached.
2056        </p>
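
        <p>
          Since one run of text may arrive in several pieces, a handler that needs
          the whole run usually appends each piece to a buffer of its own. The
          following sketch assumes <code>XML_Char</code> is <code>char</code>;
          <code>struct text_buf</code> and <code>on_text</code> are hypothetical
          names, and <code>XMLCALL</code> is omitted so the fragment stands alone.
        </p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer passed as user data; not part of Expat. */
struct text_buf {
  char *data;  /* null-terminated once non-NULL */
  size_t used; /* bytes stored, excluding the terminator */
  int failed;  /* set when an allocation fails */
};

/* Shaped like XML_CharacterDataHandler, assuming XML_Char is char. */
static void
on_text(void *userData, const char *s, int len) {
  struct text_buf *buf = userData;
  assert(len >= 0);
  if (buf->failed)
    return;
  char *grown = realloc(buf->data, buf->used + (size_t)len + 1);
  if (grown == NULL) {
    buf->failed = 1; /* keep the old data; the caller checks this flag */
    return;
  }
  memcpy(grown + buf->used, s, (size_t)len); /* s is not null-terminated */
  buf->used += (size_t)len;
  grown[buf->used] = '\0';
  buf->data = grown;
}
```

        <p>
          Searching the accumulated buffer, rather than each piece, avoids missing a
          pattern that straddles two calls.
        </p>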
2057      </div>
2058
2059      <div class="handler">
2060        <h4 id="XML_SetProcessingInstructionHandler">
2061          XML_SetProcessingInstructionHandler
2062        </h4>
2063
        <pre class="setter">
void XMLCALL
XML_SetProcessingInstructionHandler(XML_Parser p,
                                    XML_ProcessingInstructionHandler proc);
</pre>

        <pre class="signature">
typedef void
(XMLCALL *XML_ProcessingInstructionHandler)(void *userData,
                                            const XML_Char *target,
                                            const XML_Char *data);
</pre>
2077        <p>
2078          Set a handler for processing instructions. The target is the first word in the
2079          processing instruction. The data is the rest of the characters in it after
2080          skipping all whitespace after the initial word.
2081        </p>
2082      </div>
2083
2084      <div class="handler">
2085        <h4 id="XML_SetCommentHandler">
2086          XML_SetCommentHandler
2087        </h4>
2088
        <pre class="setter">
void XMLCALL
XML_SetCommentHandler(XML_Parser p,
                      XML_CommentHandler cmnt);
</pre>
2094
2095        <pre class="signature">
2096typedef void
2097(XMLCALL *XML_CommentHandler)(void *userData,
2098                              const XML_Char *data);
2099</pre>
2100        <p>
2101          Set a handler for comments. The data is all text inside the comment delimiters.
2102        </p>
2103      </div>
2104
2105      <div class="handler">
2106        <h4 id="XML_SetStartCdataSectionHandler">
2107          XML_SetStartCdataSectionHandler
2108        </h4>
2109
2110        <pre class="setter">
2111void XMLCALL
2112XML_SetStartCdataSectionHandler(XML_Parser p,
2113                                XML_StartCdataSectionHandler start);
2114</pre>
2115
2116        <pre class="signature">
2117typedef void
2118(XMLCALL *XML_StartCdataSectionHandler)(void *userData);
2119</pre>
2120        <p>
2121          Set a handler that gets called at the beginning of a CDATA section.
2122        </p>
2123      </div>
2124
2125      <div class="handler">
2126        <h4 id="XML_SetEndCdataSectionHandler">
2127          XML_SetEndCdataSectionHandler
2128        </h4>
2129
2130        <pre class="setter">
2131void XMLCALL
2132XML_SetEndCdataSectionHandler(XML_Parser p,
2133                              XML_EndCdataSectionHandler end);
2134</pre>
2135
2136        <pre class="signature">
2137typedef void
2138(XMLCALL *XML_EndCdataSectionHandler)(void *userData);
2139</pre>
2140        <p>
2141          Set a handler that gets called at the end of a CDATA section.
2142        </p>
2143      </div>
2144
2145      <div class="handler">
2146        <h4 id="XML_SetCdataSectionHandler">
2147          XML_SetCdataSectionHandler
2148        </h4>
2149
        <pre class="setter">
void XMLCALL
XML_SetCdataSectionHandler(XML_Parser p,
                           XML_StartCdataSectionHandler start,
                           XML_EndCdataSectionHandler end);
</pre>
2156        <p>
2157          Sets both CDATA section handlers with one call.
2158        </p>
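
        <p>
          One common use of the two CDATA call-backs is to set a flag that a text
          handler consults. The sketch below assumes that text inside a CDATA section
          is delivered through the character data handler between the start and end
          call-backs, and that <code>XML_Char</code> is <code>char</code>. All names
          in it are hypothetical.
        </p>

```c
#include <assert.h>

/* Hypothetical user data; not part of Expat. */
struct cdata_state {
  int in_cdata;     /* non-zero between the start and end call-backs */
  long cdata_bytes; /* text bytes reported inside CDATA sections */
  long other_bytes; /* text bytes reported elsewhere */
};

/* Shaped like XML_StartCdataSectionHandler. */
static void
on_cdata_start(void *userData) {
  struct cdata_state *st = userData;
  st->in_cdata = 1;
}

/* Shaped like XML_EndCdataSectionHandler. */
static void
on_cdata_end(void *userData) {
  struct cdata_state *st = userData;
  assert(st->in_cdata); /* an end call is preceded by a start call */
  st->in_cdata = 0;
}

/* Shaped like XML_CharacterDataHandler, assuming XML_Char is char. */
static void
count_text(void *userData, const char *s, int len) {
  struct cdata_state *st = userData;
  (void)s;
  if (st->in_cdata)
    st->cdata_bytes += len;
  else
    st->other_bytes += len;
}
```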
2159      </div>
2160
2161      <div class="handler">
2162        <h4 id="XML_SetDefaultHandler">
2163          XML_SetDefaultHandler
2164        </h4>
2165
        <pre class="setter">
void XMLCALL
XML_SetDefaultHandler(XML_Parser p,
                      XML_DefaultHandler hndl);
</pre>
2171
2172        <pre class="signature">
2173typedef void
2174(XMLCALL *XML_DefaultHandler)(void *userData,
2175                              const XML_Char *s,
2176                              int len);
2177</pre>
2178        <p>
2179          Sets a handler for any characters in the document which wouldn't otherwise be
2180          handled. This includes both data for which no handlers can be set (like some
2181          kinds of DTD declarations) and data which could be reported but which currently
2182          has no handler set. The characters are passed exactly as they were present in
2183          the XML document except that they will be encoded in UTF-8 or UTF-16. Line
2184          boundaries are not normalized. Note that a byte order mark character is not
2185          passed to the default handler. There are no guarantees about how characters are
2186          divided between calls to the default handler: for example, a comment might be
2187          split between multiple calls. Setting the handler with this call has the side
2188          effect of turning off expansion of references to internally defined general
2189          entities. Instead these references are passed to the default handler.
2190        </p>
2191
2192        <p>
2193          See also <code><a href="#XML_DefaultCurrent">XML_DefaultCurrent</a></code>.
2194        </p>
2195      </div>
2196
2197      <div class="handler">
2198        <h4 id="XML_SetDefaultHandlerExpand">
2199          XML_SetDefaultHandlerExpand
2200        </h4>
2201
        <pre class="setter">
void XMLCALL
XML_SetDefaultHandlerExpand(XML_Parser p,
                            XML_DefaultHandler hndl);
</pre>
2207
2208        <pre class="signature">
2209typedef void
2210(XMLCALL *XML_DefaultHandler)(void *userData,
2211                              const XML_Char *s,
2212                              int len);
2213</pre>
2214        <p>
2215          This sets a default handler, but doesn't inhibit the expansion of internal
2216          entity references. The entity reference will not be passed to the default
2217          handler.
2218        </p>
2219
2220        <p>
2221          See also <code><a href="#XML_DefaultCurrent">XML_DefaultCurrent</a></code>.
2222        </p>
2223      </div>
2224
2225      <div class="handler">
2226        <h4 id="XML_SetExternalEntityRefHandler">
2227          XML_SetExternalEntityRefHandler
2228        </h4>
2229
        <pre class="setter">
void XMLCALL
XML_SetExternalEntityRefHandler(XML_Parser p,
                                XML_ExternalEntityRefHandler hndl);
</pre>
2235
2236        <pre class="signature">
2237typedef int
2238(XMLCALL *XML_ExternalEntityRefHandler)(XML_Parser p,
2239                                        const XML_Char *context,
2240                                        const XML_Char *base,
2241                                        const XML_Char *systemId,
2242                                        const XML_Char *publicId);
2243</pre>
2244        <p>
2245          Set an external entity reference handler. This handler is also called for
2246          processing an external DTD subset if parameter entity parsing is in effect.
2247          (See <a href=
2248          "#XML_SetParamEntityParsing"><code>XML_SetParamEntityParsing</code></a>.)
2249        </p>
2250
2251        <p>
2252          <strong>Warning:</strong> Using an external entity reference handler can lead
2253          to <a href="https://libexpat.github.io/doc/xml-security/#external-entities">XXE
2254          vulnerabilities</a>. It should only be used in applications that do not parse
2255          untrusted XML input.
2256        </p>
2257
        <p>
          The <code>context</code> parameter specifies the parsing context in the format
          expected by the <code>context</code> argument to <code><a href=
          "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code>.
          <code>context</code> is valid only until the handler returns, so if the referenced
          entity is to be parsed later, it must be copied. <code>context</code> is
          <code>NULL</code> only when the entity is a parameter entity, which is how one
          can differentiate between general and parameter entities.
        </p>
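
        <p>
          If the entity is to be parsed after the handler has returned, the parsing
          context has to be duplicated first. A minimal sketch of such a copy follows,
          assuming <code>XML_Char</code> is <code>char</code>;
          <code>copy_context</code> is a hypothetical helper, not an Expat function.
          It also records whether the entity is a parameter entity, using the
          <code>NULL</code> convention described above.
        </p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Duplicate the context string received by an external entity reference
   handler. Sets *is_param_entity to 1 when context is NULL.
   Assumption: XML_Char is char. copy_context is hypothetical. */
static char *
copy_context(const char *context, int *is_param_entity) {
  assert(is_param_entity != NULL);
  *is_param_entity = (context == NULL);
  if (context == NULL)
    return NULL; /* parameter entity: nothing to copy */
  size_t size = strlen(context) + 1;
  char *copy = malloc(size);
  if (copy != NULL)
    memcpy(copy, context, size);
  return copy; /* caller frees; NULL here means out of memory */
}
```

        <p>
          The caller distinguishes "parameter entity" from "allocation failure" by
          checking the flag rather than the returned pointer alone.
        </p>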
2267
2268        <p>
2269          The <code>base</code> parameter is the base to use for relative system
2270          identifiers. It is set by <code><a href="#XML_SetBase">XML_SetBase</a></code>
2271          and may be <code>NULL</code>. The <code>publicId</code> parameter is the public
2272          id given in the entity declaration and may be <code>NULL</code>.
2273          <code>systemId</code> is the system identifier specified in the entity
2274          declaration and is never <code>NULL</code>.
2275        </p>
2276
2277        <p>
2278          There are a couple of ways in which this handler differs from others. First,
2279          this handler returns a status indicator (an integer).
2280          <code>XML_STATUS_OK</code> should be returned for successful handling of the
2281          external entity reference. Returning <code>XML_STATUS_ERROR</code> indicates
2282          failure, and causes the calling parser to return an
2283          <code>XML_ERROR_EXTERNAL_ENTITY_HANDLING</code> error.
2284        </p>
2285
2286        <p>
2287          Second, instead of having the user data as its first argument, it receives the
2288          parser that encountered the entity reference. This, along with the context
2289          parameter, may be used as arguments to a call to <code><a href=
2290          "#XML_ExternalEntityParserCreate">XML_ExternalEntityParserCreate</a></code>.
2291          Using the returned parser, the body of the external entity can be recursively
2292          parsed.
2293        </p>
2294
2295        <p>
2296          Since this handler may be called recursively, it should not be saving
2297          information into global or static variables.
2298        </p>
2299      </div>
2300
2301      <h4 id="XML_SetExternalEntityRefHandlerArg">
2302        XML_SetExternalEntityRefHandlerArg
2303      </h4>
2304
      <pre class="fcndec">
void XMLCALL
XML_SetExternalEntityRefHandlerArg(XML_Parser p,
                                   void *arg);
</pre>
2310      <div class="fcndef">
2311        <p>
2312          Set the argument passed to the ExternalEntityRefHandler. If <code>arg</code> is
2313          not <code>NULL</code>, it is the new value passed to the handler set using
2314          <code><a href=
2315          "#XML_SetExternalEntityRefHandler">XML_SetExternalEntityRefHandler</a></code>;
2316          if <code>arg</code> is <code>NULL</code>, the argument passed to the handler
2317          function will be the parser object itself.
2318        </p>
2319
2320        <p>
2321          <strong>Note:</strong> The type of <code>arg</code> and the type of the first
2322          argument to the ExternalEntityRefHandler do not match. This function takes a
2323          <code>void *</code> to be passed to the handler, while the handler accepts an
2324          <code>XML_Parser</code>. This is a historical accident, but will not be
2325          corrected before Expat 2.0 (at the earliest) to avoid causing compiler warnings
2326          for code that's known to work with this API. It is the responsibility of the
2327          application code to know the actual type of the argument passed to the handler
2328          and to manage it properly.
2329        </p>
2330      </div>
2331
2332      <div class="handler">
2333        <h4 id="XML_SetSkippedEntityHandler">
2334          XML_SetSkippedEntityHandler
2335        </h4>
2336
        <pre class="setter">
void XMLCALL
XML_SetSkippedEntityHandler(XML_Parser p,
                            XML_SkippedEntityHandler handler);
</pre>
2342
2343        <pre class="signature">
2344typedef void
2345(XMLCALL *XML_SkippedEntityHandler)(void *userData,
2346                                    const XML_Char *entityName,
2347                                    int is_parameter_entity);
2348</pre>
2349        <p>
2350          Set a skipped entity handler. This is called in two situations:
2351        </p>
2352
2353        <ol>
2354          <li>An entity reference is encountered for which no declaration has been read
2355          <em>and</em> this is not an error.
2356          </li>
2357
2358          <li>An internal entity reference is read, but not expanded, because <a href=
2359          "#XML_SetDefaultHandler"><code>XML_SetDefaultHandler</code></a> has been
2360          called.
2361          </li>
2362        </ol>
2363
2364        <p>
2365          The <code>is_parameter_entity</code> argument will be non-zero for a parameter
2366          entity and zero for a general entity.
2367        </p>
2368
        <p>
          Note: Skipped parameter entities in declarations and skipped general entities
          in attribute values cannot be reported, because the event would be out of sync
          with the reporting of the declarations or attribute values.
        </p>
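
        <p>
          A skipped entity handler that wants to write the reference back out can use
          <code>is_parameter_entity</code> to pick the sigil. The sketch below assumes
          that <code>entityName</code> arrives without the leading
          <code>&amp;</code> or <code>%</code> and without the trailing
          <code>;</code>, and that <code>XML_Char</code> is <code>char</code>;
          <code>format_skipped_entity</code> is a hypothetical helper.
        </p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write "&name;" (general entity) or "%name;" (parameter entity) into out.
   Returns 1 on success, 0 if out is too small.
   Assumption: entityName carries no sigil. Hypothetical helper. */
static int
format_skipped_entity(char *out, size_t cap, const char *entityName,
                      int is_parameter_entity) {
  assert(out != NULL && entityName != NULL);
  size_t need = strlen(entityName) + 3; /* sigil + name + ';' + '\0' */
  if (need > cap)
    return 0;
  snprintf(out, cap, "%c%s;", is_parameter_entity ? '%' : '&', entityName);
  return 1;
}
```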
2374      </div>
2375
2376      <div class="handler">
2377        <h4 id="XML_SetUnknownEncodingHandler">
2378          XML_SetUnknownEncodingHandler
2379        </h4>
2380
        <pre class="setter">
void XMLCALL
XML_SetUnknownEncodingHandler(XML_Parser p,
                              XML_UnknownEncodingHandler enchandler,
                              void *encodingHandlerData);
</pre>
2387
2388        <pre class="signature">
2389typedef int
2390(XMLCALL *XML_UnknownEncodingHandler)(void *encodingHandlerData,
2391                                      const XML_Char *name,
2392                                      XML_Encoding *info);
2393
2394typedef struct {
2395  int map[256];
2396  void *data;
2397  int (XMLCALL *convert)(void *data, const char *s);
2398  void (XMLCALL *release)(void *data);
2399} XML_Encoding;
2400</pre>
2401        <p>
2402          Set a handler to deal with encodings other than the <a href=
2403          "#builtin_encodings">built in set</a>. This should be done before
2404          <code><a href="#XML_Parse">XML_Parse</a></code> or <code><a href=
2405          "#XML_ParseBuffer">XML_ParseBuffer</a></code> have been called on the given
2406          parser.
2407        </p>
2408
2409        <p>
2410          If the handler knows how to deal with an encoding with the given name, it
2411          should fill in the <code>info</code> data structure and return
2412          <code>XML_STATUS_OK</code>. Otherwise it should return
2413          <code>XML_STATUS_ERROR</code>. The handler will be called at most once per
2414          parsed (external) entity. The optional application data pointer
2415          <code>encodingHandlerData</code> will be passed back to the handler.
2416        </p>
2417
2418        <p>
          The <code>map</code> array contains information for every possible leading
          byte in a byte sequence. If the corresponding value is &gt;= 0, then it's a
          single-byte sequence and the byte encodes that Unicode value. If the value is
          -1, then that byte is invalid as the initial byte in a sequence. If the value
          is -n, where n is an integer &gt; 1, then n is the number of bytes in the
          sequence and the actual conversion is accomplished by a call to the function
          pointed at by <code>convert</code>. This function may return -1 if the
          sequence itself is invalid. The <code>convert</code> pointer may be
          <code>NULL</code> if there are only single-byte codes. The <code>data</code>
          parameter passed to the convert function is the <code>data</code> pointer from
          <code>XML_Encoding</code>. The string <code>s</code> is <em>NOT</em>
          null-terminated and points at the sequence of bytes to be converted.
2430        </p>
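        <p>
          The meaning of the map values can be sketched in plain C. The following is
          only an illustration: the helper names <code>eg_sequence_length</code> and
          <code>eg_fill_map</code> and the encoding itself are made up for this example
          and are not part of Expat. The <code>map</code> parameter is assumed to have
          the same layout as the <code>map</code> member of <code>XML_Encoding</code>.
        </p>

```c
/* Hypothetical helper, not part of Expat: number of bytes in the sequence
   started by `lead`, or 0 if `lead` cannot start a sequence. `map` uses the
   layout of the map member of XML_Encoding described above. */
static int eg_sequence_length(const int map[256], unsigned char lead) {
  int v = map[lead];
  if (v >= 0)
    return 1; /* single byte that encodes Unicode value v */
  if (v == -1)
    return 0; /* invalid as the initial byte of a sequence */
  return -v;  /* -n with n > 1: n bytes, decoded through convert */
}

/* Fill `map` for a made-up encoding (an assumption of this sketch):
   bytes 0x00-0x7F are ASCII, bytes 0x81-0xFE lead a two-byte sequence,
   and bytes 0x80 and 0xFF are invalid as leading bytes. */
static void eg_fill_map(int map[256]) {
  int i;
  for (i = 0; i < 256; i++)
    map[i] = (i < 0x80) ? i : -1;
  for (i = 0x81; i <= 0xFE; i++)
    map[i] = -2;
}
```

        <p>
          An unknown encoding handler for such an encoding would fill
          <code>info-&gt;map</code> this way and would also have to supply a
          <code>convert</code> function, because the map contains multi-byte entries.
        </p>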
2431
2432        <p>
2433          The function pointed at by <code>release</code> is called by the parser when it
2434          is finished with the encoding. It may be <code>NULL</code>.
2435        </p>
2436      </div>
2437
2438      <div class="handler">
2439        <h4 id="XML_SetStartNamespaceDeclHandler">
2440          XML_SetStartNamespaceDeclHandler
2441        </h4>
2442
2443        <pre class="setter">
2444void XMLCALL
2445XML_SetStartNamespaceDeclHandler(XML_Parser p,
2446                                 XML_StartNamespaceDeclHandler start);
2447</pre>
2448
2449        <pre class="signature">
2450typedef void
2451(XMLCALL *XML_StartNamespaceDeclHandler)(void *userData,
2452                                         const XML_Char *prefix,
2453                                         const XML_Char *uri);
2454</pre>
2455        <p>
          Set a handler to be called when a namespace is declared. Namespace declarations
          occur inside start tags, but the namespace declaration start handler is called
          before the start tag handler for each namespace declared in that start tag.
2459        </p>
2460      </div>
2461
2462      <div class="handler">
2463        <h4 id="XML_SetEndNamespaceDeclHandler">
2464          XML_SetEndNamespaceDeclHandler
2465        </h4>
2466
2467        <pre class="setter">
2468void XMLCALL
2469XML_SetEndNamespaceDeclHandler(XML_Parser p,
2470                               XML_EndNamespaceDeclHandler end);
2471</pre>
2472
2473        <pre class="signature">
2474typedef void
2475(XMLCALL *XML_EndNamespaceDeclHandler)(void *userData,
2476                                       const XML_Char *prefix);
2477</pre>
2478        <p>
2479          Set a handler to be called when leaving the scope of a namespace declaration.
2480          This will be called, for each namespace declaration, after the handler for the
2481          end tag of the element in which the namespace was declared.
2482        </p>
2483      </div>
2484
2485      <div class="handler">
2486        <h4 id="XML_SetNamespaceDeclHandler">
2487          XML_SetNamespaceDeclHandler
2488        </h4>
2489
2490        <pre class="setter">
2491void XMLCALL
2492XML_SetNamespaceDeclHandler(XML_Parser p,
2493                            XML_StartNamespaceDeclHandler start,
2494                            XML_EndNamespaceDeclHandler end)
2495</pre>
2496        <p>
2497          Sets both namespace declaration handlers with a single call.
2498        </p>
2499      </div>
2500
2501      <div class="handler">
2502        <h4 id="XML_SetXmlDeclHandler">
2503          XML_SetXmlDeclHandler
2504        </h4>
2505
2506        <pre class="setter">
2507void XMLCALL
2508XML_SetXmlDeclHandler(XML_Parser p,
2509                      XML_XmlDeclHandler xmldecl);
2510</pre>
2511
2512        <pre class="signature">
2513typedef void
2514(XMLCALL *XML_XmlDeclHandler)(void            *userData,
2515                              const XML_Char  *version,
2516                              const XML_Char  *encoding,
2517                              int             standalone);
2518</pre>
2519        <p>
2520          Sets a handler that is called for XML declarations and also for text
2521          declarations discovered in external entities. The way to distinguish is that
2522          the <code>version</code> parameter will be <code>NULL</code> for text
2523          declarations. The <code>encoding</code> parameter may be <code>NULL</code> for
2524          an XML declaration. The <code>standalone</code> argument will contain -1, 0, or
2525          1 indicating respectively that there was no standalone parameter in the
2526          declaration, that it was given as no, or that it was given as yes.
2527        </p>
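        <p>
          As a rough sketch of how a handler might interpret these arguments, the
          helper below maps them to a description. The name
          <code>eg_describe_decl</code> is hypothetical, and the sketch assumes a build
          in which <code>XML_Char</code> is <code>char</code>.
        </p>

```c
#include <stddef.h>

/* Hypothetical helper, not part of Expat: describe what a call to an
   XML_XmlDeclHandler reported. Assumes XML_Char is char. */
static const char *eg_describe_decl(const char *version, int standalone) {
  if (version == NULL)
    return "text declaration"; /* found in an external entity */
  switch (standalone) {
  case 1:
    return "XML declaration, standalone=yes";
  case 0:
    return "XML declaration, standalone=no";
  default: /* -1: no standalone parameter in the declaration */
    return "XML declaration, no standalone parameter";
  }
}
```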
2528      </div>
2529
2530      <div class="handler">
2531        <h4 id="XML_SetStartDoctypeDeclHandler">
2532          XML_SetStartDoctypeDeclHandler
2533        </h4>
2534
2535        <pre class="setter">
2536void XMLCALL
2537XML_SetStartDoctypeDeclHandler(XML_Parser p,
2538                               XML_StartDoctypeDeclHandler start);
2539</pre>
2540
2541        <pre class="signature">
2542typedef void
2543(XMLCALL *XML_StartDoctypeDeclHandler)(void           *userData,
2544                                       const XML_Char *doctypeName,
2545                                       const XML_Char *sysid,
2546                                       const XML_Char *pubid,
2547                                       int            has_internal_subset);
2548</pre>
2549        <p>
2550          Set a handler that is called at the start of a DOCTYPE declaration, before any
2551          external or internal subset is parsed. Both <code>sysid</code> and
2552          <code>pubid</code> may be <code>NULL</code>. The
2553          <code>has_internal_subset</code> will be non-zero if the DOCTYPE declaration
2554          has an internal subset.
2555        </p>
2556      </div>
2557
2558      <div class="handler">
2559        <h4 id="XML_SetEndDoctypeDeclHandler">
2560          XML_SetEndDoctypeDeclHandler
2561        </h4>
2562
2563        <pre class="setter">
2564void XMLCALL
2565XML_SetEndDoctypeDeclHandler(XML_Parser p,
2566                             XML_EndDoctypeDeclHandler end);
2567</pre>
2568
2569        <pre class="signature">
2570typedef void
2571(XMLCALL *XML_EndDoctypeDeclHandler)(void *userData);
2572</pre>
2573        <p>
2574          Set a handler that is called at the end of a DOCTYPE declaration, after parsing
2575          any external subset.
2576        </p>
2577      </div>
2578
2579      <div class="handler">
2580        <h4 id="XML_SetDoctypeDeclHandler">
2581          XML_SetDoctypeDeclHandler
2582        </h4>
2583
2584        <pre class="setter">
2585void XMLCALL
2586XML_SetDoctypeDeclHandler(XML_Parser p,
2587                          XML_StartDoctypeDeclHandler start,
2588                          XML_EndDoctypeDeclHandler end);
2589</pre>
2590        <p>
2591          Set both doctype handlers with one call.
2592        </p>
2593      </div>
2594
2595      <div class="handler">
2596        <h4 id="XML_SetElementDeclHandler">
2597          XML_SetElementDeclHandler
2598        </h4>
2599
2600        <pre class="setter">
2601void XMLCALL
2602XML_SetElementDeclHandler(XML_Parser p,
2603                          XML_ElementDeclHandler eldecl);
2604</pre>
2605
2606        <pre class="signature">
2607typedef void
2608(XMLCALL *XML_ElementDeclHandler)(void *userData,
2609                                  const XML_Char *name,
2610                                  XML_Content *model);
2611</pre>
2612
2613        <pre class="signature">
2614enum XML_Content_Type {
2615  XML_CTYPE_EMPTY = 1,
2616  XML_CTYPE_ANY,
2617  XML_CTYPE_MIXED,
2618  XML_CTYPE_NAME,
2619  XML_CTYPE_CHOICE,
2620  XML_CTYPE_SEQ
2621};
2622
2623enum XML_Content_Quant {
2624  XML_CQUANT_NONE,
2625  XML_CQUANT_OPT,
2626  XML_CQUANT_REP,
2627  XML_CQUANT_PLUS
2628};
2629
2630typedef struct XML_cp XML_Content;
2631
2632struct XML_cp {
2633  enum XML_Content_Type         type;
2634  enum XML_Content_Quant        quant;
2635  const XML_Char *              name;
2636  unsigned int                  numchildren;
2637  XML_Content *                 children;
2638};
2639</pre>
2640        <p>
          Sets a handler for element declarations in a DTD. The handler gets called with
          the name of the element in the declaration and a pointer to a structure that
          contains the element model. It's the user code's responsibility to free the
          model when finished with it via a call to <code><a href=
          "#XML_FreeContentModel">XML_FreeContentModel</a></code>. There is no need to
          free the model from within the handler; it can be kept around and freed at a
          later stage.
2648        </p>
2649
2650        <p>
2651          The <code>model</code> argument is the root of a tree of
2652          <code>XML_Content</code> nodes. If <code>type</code> equals
2653          <code>XML_CTYPE_EMPTY</code> or <code>XML_CTYPE_ANY</code>, then
2654          <code>quant</code> will be <code>XML_CQUANT_NONE</code>, and the other fields
2655          will be zero or <code>NULL</code>. If <code>type</code> is
2656          <code>XML_CTYPE_MIXED</code>, then <code>quant</code> will be
2657          <code>XML_CQUANT_NONE</code> or <code>XML_CQUANT_REP</code> and
2658          <code>numchildren</code> will contain the number of elements that are allowed
2659          to be mixed in and <code>children</code> points to an array of
          <code>XML_Content</code> structures that will all have type
          <code>XML_CTYPE_NAME</code> with
2661          no quantification. Only the root node can be type <code>XML_CTYPE_EMPTY</code>,
2662          <code>XML_CTYPE_ANY</code>, or <code>XML_CTYPE_MIXED</code>.
2663        </p>
2664
2665        <p>
2666          For type <code>XML_CTYPE_NAME</code>, the <code>name</code> field points to the
2667          name and the <code>numchildren</code> and <code>children</code> fields will be
2668          zero and <code>NULL</code>. The <code>quant</code> field will indicate any
2669          quantifiers placed on the name.
2670        </p>
2671
2672        <p>
          Types <code>XML_CTYPE_CHOICE</code> and <code>XML_CTYPE_SEQ</code> indicate a
          choice or sequence respectively. The <code>numchildren</code> field indicates
          how many nodes are in the choice or sequence and <code>children</code> points
          to the nodes.
2677        </p>
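        <p>
          The tree structure described above lends itself to a recursive walk. The
          following sketch renders a content model back into DTD-like text. So that it
          compiles without the Expat headers, it declares stand-in types
          (<code>EgContent</code>, <code>EgType</code>, <code>EgQuant</code>) that
          mirror the declarations above; these names and the helper functions are
          assumptions of this example. Real code would include <code>expat.h</code>
          and use <code>XML_Content</code> directly.
        </p>

```c
#include <stdio.h>
#include <string.h>

/* Stand-ins mirroring XML_Content_Type, XML_Content_Quant and XML_Content,
   so that this sketch compiles on its own. Not part of Expat. */
enum EgType { EG_EMPTY = 1, EG_ANY, EG_MIXED, EG_NAME, EG_CHOICE, EG_SEQ };
enum EgQuant { EG_NONE, EG_OPT, EG_REP, EG_PLUS };

typedef struct EgContent {
  enum EgType type;
  enum EgQuant quant;
  const char *name;
  unsigned int numchildren;
  struct EgContent *children;
} EgContent;

/* Append src to the NUL-terminated string in dst, truncating at cap. */
static void eg_append(char *dst, size_t cap, const char *src) {
  size_t used = strlen(dst);
  if (used + 1 < cap)
    snprintf(dst + used, cap - used, "%s", src);
}

/* Render the model rooted at m into dst, which must already hold a
   NUL-terminated string (for example an empty one). */
static void eg_render(const EgContent *m, char *dst, size_t cap) {
  static const char *const suffix[] = {"", "?", "*", "+"};
  unsigned int i;

  switch (m->type) {
  case EG_EMPTY:
    eg_append(dst, cap, "EMPTY");
    return;
  case EG_ANY:
    eg_append(dst, cap, "ANY");
    return;
  case EG_NAME:
    eg_append(dst, cap, m->name);
    break;
  case EG_MIXED:
  case EG_CHOICE:
  case EG_SEQ:
    eg_append(dst, cap, "(");
    if (m->type == EG_MIXED)
      eg_append(dst, cap, m->numchildren ? "#PCDATA|" : "#PCDATA");
    for (i = 0; i < m->numchildren; i++) {
      if (i > 0)
        eg_append(dst, cap, m->type == EG_SEQ ? "," : "|");
      eg_render(&m->children[i], dst, cap);
    }
    eg_append(dst, cap, ")");
    break;
  }
  eg_append(dst, cap, suffix[m->quant]); /* quantifier, if any */
}
```

        <p>
          Truncating on overflow keeps the sketch short; an application that needs
          the complete text would grow the buffer instead.
        </p>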
2678      </div>
2679
2680      <div class="handler">
2681        <h4 id="XML_SetAttlistDeclHandler">
2682          XML_SetAttlistDeclHandler
2683        </h4>
2684
2685        <pre class="setter">
2686void XMLCALL
2687XML_SetAttlistDeclHandler(XML_Parser p,
2688                          XML_AttlistDeclHandler attdecl);
2689</pre>
2690
2691        <pre class="signature">
2692typedef void
2693(XMLCALL *XML_AttlistDeclHandler)(void           *userData,
2694                                  const XML_Char *elname,
2695                                  const XML_Char *attname,
2696                                  const XML_Char *att_type,
2697                                  const XML_Char *dflt,
2698                                  int            isrequired);
2699</pre>
2700        <p>
2701          Set a handler for attlist declarations in the DTD. This handler is called for
2702          <em>each</em> attribute. So a single attlist declaration with multiple
2703          attributes declared will generate multiple calls to this handler. The
2704          <code>elname</code> parameter returns the name of the element for which the
2705          attribute is being declared. The attribute name is in the <code>attname</code>
2706          parameter. The attribute type is in the <code>att_type</code> parameter. It is
2707          the string representing the type in the declaration with whitespace removed.
2708        </p>
2709
2710        <p>
2711          The <code>dflt</code> parameter holds the default value. It will be
2712          <code>NULL</code> in the case of "#IMPLIED" or "#REQUIRED" attributes. You can
2713          distinguish these two cases by checking the <code>isrequired</code> parameter,
2714          which will be true in the case of "#REQUIRED" attributes. Attributes which are
          "#FIXED" will also have a true <code>isrequired</code>, but they will have
2716          the non-<code>NULL</code> fixed value in the <code>dflt</code> parameter.
2717        </p>
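        <p>
          The four combinations of <code>dflt</code> and <code>isrequired</code> can
          be told apart as in the sketch below. The enum and function names are
          hypothetical, and <code>XML_Char</code> is assumed to be <code>char</code>.
        </p>

```c
#include <stddef.h>

/* Hypothetical classification, not part of Expat. */
enum eg_default_kind { EG_IMPLIED, EG_REQUIRED, EG_FIXED, EG_DEFAULT };

/* Classify an attribute declaration from the dflt and isrequired
   arguments passed to an XML_AttlistDeclHandler. */
static enum eg_default_kind eg_classify_default(const char *dflt,
                                                int isrequired) {
  if (dflt == NULL) /* no default value: #IMPLIED or #REQUIRED */
    return isrequired ? EG_REQUIRED : EG_IMPLIED;
  /* default value present: #FIXED if required, plain default otherwise */
  return isrequired ? EG_FIXED : EG_DEFAULT;
}
```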
2718      </div>
2719
2720      <div class="handler">
2721        <h4 id="XML_SetEntityDeclHandler">
2722          XML_SetEntityDeclHandler
2723        </h4>
2724
2725        <pre class="setter">
2726void XMLCALL
2727XML_SetEntityDeclHandler(XML_Parser p,
2728                         XML_EntityDeclHandler handler);
2729</pre>
2730
2731        <pre class="signature">
2732typedef void
2733(XMLCALL *XML_EntityDeclHandler)(void           *userData,
2734                                 const XML_Char *entityName,
2735                                 int            is_parameter_entity,
2736                                 const XML_Char *value,
2737                                 int            value_length,
2738                                 const XML_Char *base,
2739                                 const XML_Char *systemId,
2740                                 const XML_Char *publicId,
2741                                 const XML_Char *notationName);
2742</pre>
2743        <p>
2744          Sets a handler that will be called for all entity declarations. The
2745          <code>is_parameter_entity</code> argument will be non-zero in the case of
2746          parameter entities and zero otherwise.
2747        </p>
2748
2749        <p>
2750          For internal entities (<code>&lt;!ENTITY foo "bar"&gt;</code>),
2751          <code>value</code> will be non-<code>NULL</code> and <code>systemId</code>,
2752          <code>publicId</code>, and <code>notationName</code> will all be
2753          <code>NULL</code>. The value string is <em>not</em> null-terminated; the length
2754          is provided in the <code>value_length</code> parameter. Do not use
2755          <code>value_length</code> to test for internal entities, since it is legal to
2756          have zero-length values. Instead check for whether or not <code>value</code> is
2757          <code>NULL</code>.
2758        </p>
2759
2760        <p>
2761          The <code>notationName</code> argument will have a non-<code>NULL</code> value
2762          only for unparsed entity declarations.
2763        </p>
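        <p>
          A handler might sort declarations into kinds and copy the value as
          sketched below. The helper names are hypothetical, <code>XML_Char</code> is
          assumed to be <code>char</code>, and the sketch treats a declaration with
          neither a value nor a notation name as an external parsed entity.
        </p>

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical classification, not part of Expat. */
enum eg_entity_kind { EG_INTERNAL, EG_EXTERNAL_PARSED, EG_UNPARSED };

/* Decide the kind from the value and notationName arguments. The test is
   on the pointer, not on value_length, because empty values are legal. */
static enum eg_entity_kind eg_entity_kind_of(const char *value,
                                             const char *notationName) {
  if (value != NULL)
    return EG_INTERNAL;
  return notationName != NULL ? EG_UNPARSED : EG_EXTERNAL_PARSED;
}

/* value is not null-terminated, so copy exactly value_length characters
   and terminate the copy. Returns NULL if value is NULL or allocation
   fails; the caller frees the result. */
static char *eg_copy_value(const char *value, int value_length) {
  char *copy;
  if (value == NULL || value_length < 0)
    return NULL;
  copy = malloc((size_t)value_length + 1);
  if (copy == NULL)
    return NULL;
  memcpy(copy, value, (size_t)value_length);
  copy[value_length] = '\0';
  return copy;
}
```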
2764      </div>
2765
2766      <div class="handler">
2767        <h4 id="XML_SetUnparsedEntityDeclHandler">
2768          XML_SetUnparsedEntityDeclHandler
2769        </h4>
2770
2771        <pre class="setter">
2772void XMLCALL
2773XML_SetUnparsedEntityDeclHandler(XML_Parser p,
2774                                 XML_UnparsedEntityDeclHandler h)
2775</pre>
2776
2777        <pre class="signature">
2778typedef void
2779(XMLCALL *XML_UnparsedEntityDeclHandler)(void *userData,
2780                                         const XML_Char *entityName,
2781                                         const XML_Char *base,
2782                                         const XML_Char *systemId,
2783                                         const XML_Char *publicId,
2784                                         const XML_Char *notationName);
2785</pre>
2786        <p>
2787          Set a handler that receives declarations of unparsed entities. These are entity
2788          declarations that have a notation (NDATA) field:
2789        </p>
2790
2791        <div id="eg">
2792          <pre>
2793&lt;!ENTITY logo SYSTEM "images/logo.gif" NDATA gif&gt;
2794</pre>
2795        </div>
2796
2797        <p>
          This handler is obsolete and is provided for backwards compatibility. Use
          <code><a href="#XML_SetEntityDeclHandler">XML_SetEntityDeclHandler</a></code>
          instead.
2800        </p>
2801      </div>
2802
2803      <div class="handler">
2804        <h4 id="XML_SetNotationDeclHandler">
2805          XML_SetNotationDeclHandler
2806        </h4>
2807
2808        <pre class="setter">
2809void XMLCALL
2810XML_SetNotationDeclHandler(XML_Parser p,
2811                           XML_NotationDeclHandler h)
2812</pre>
2813
2814        <pre class="signature">
2815typedef void
2816(XMLCALL *XML_NotationDeclHandler)(void *userData,
2817                                   const XML_Char *notationName,
2818                                   const XML_Char *base,
2819                                   const XML_Char *systemId,
2820                                   const XML_Char *publicId);
2821</pre>
2822        <p>
2823          Set a handler that receives notation declarations.
2824        </p>
2825      </div>
2826
2827      <div class="handler">
2828        <h4 id="XML_SetNotStandaloneHandler">
2829          XML_SetNotStandaloneHandler
2830        </h4>
2831
2832        <pre class="setter">
2833void XMLCALL
2834XML_SetNotStandaloneHandler(XML_Parser p,
2835                            XML_NotStandaloneHandler h)
2836</pre>
2837
2838        <pre class="signature">
2839typedef int
2840(XMLCALL *XML_NotStandaloneHandler)(void *userData);
2841</pre>
2842        <p>
          Set a handler that is called if the document is not "standalone". This happens
          when there is an external subset or a reference to a parameter entity, but the
          document does not have standalone set to "yes" in an XML declaration. If this
          handler returns <code>XML_STATUS_ERROR</code>, then the parser will fail with an
          <code>XML_ERROR_NOT_STANDALONE</code> error.
2848        </p>
2849      </div>
2850
2851      <h3>
2852        <a id="position" name="position">Parse position and error reporting functions</a>
2853      </h3>
2854
2855      <p>
        These are the functions you'll want to call when the parse functions return
        <code>XML_STATUS_ERROR</code> (a parse error has occurred), although the position
        reporting functions are useful outside of errors. The position reported is the
        byte position (in the original document or entity encoding) of the first of the
        sequence of characters that generated the current event (or the error that caused
        the parse functions to return <code>XML_STATUS_ERROR</code>). The exceptions are
        callbacks triggered by declarations in the document prologue, in which case the
        exact position reported is somewhere in the relevant markup, but not necessarily
        as meaningful as for other events.
2865      </p>
2866
2867      <p>
2868        The position reporting functions are accurate only outside of the DTD. In other
2869        words, they usually return bogus information when called from within a DTD
2870        declaration handler.
2871      </p>
2872
2873      <h4 id="XML_GetErrorCode">
2874        XML_GetErrorCode
2875      </h4>
2876
2877      <pre class="fcndec">
2878enum XML_Error XMLCALL
2879XML_GetErrorCode(XML_Parser p);
2880</pre>
2881      <div class="fcndef">
2882        Return what type of error has occurred.
2883      </div>
2884
2885      <h4 id="XML_ErrorString">
2886        XML_ErrorString
2887      </h4>
2888
2889      <pre class="fcndec">
2890const XML_LChar * XMLCALL
2891XML_ErrorString(enum XML_Error code);
2892</pre>
2893      <div class="fcndef">
        Return a string describing the error corresponding to <code>code</code>. The code should be
2895        one of the enums that can be returned from <code><a href=
2896        "#XML_GetErrorCode">XML_GetErrorCode</a></code>.
2897      </div>
2898
2899      <h4 id="XML_GetCurrentByteIndex">
2900        XML_GetCurrentByteIndex
2901      </h4>
2902
2903      <pre class="fcndec">
2904XML_Index XMLCALL
2905XML_GetCurrentByteIndex(XML_Parser p);
2906</pre>
2907      <div class="fcndef">
2908        Return the byte offset of the position. This always corresponds to the values
2909        returned by <code><a href=
2910        "#XML_GetCurrentLineNumber">XML_GetCurrentLineNumber</a></code> and
2911        <code><a href="#XML_GetCurrentColumnNumber">XML_GetCurrentColumnNumber</a></code>.
2912      </div>
2913
2914      <h4 id="XML_GetCurrentLineNumber">
2915        XML_GetCurrentLineNumber
2916      </h4>
2917
2918      <pre class="fcndec">
2919XML_Size XMLCALL
2920XML_GetCurrentLineNumber(XML_Parser p);
2921</pre>
2922      <div class="fcndef">
2923        Return the line number of the position. The first line is reported as
2924        <code>1</code>.
2925      </div>
2926
2927      <h4 id="XML_GetCurrentColumnNumber">
2928        XML_GetCurrentColumnNumber
2929      </h4>
2930
2931      <pre class="fcndec">
2932XML_Size XMLCALL
2933XML_GetCurrentColumnNumber(XML_Parser p);
2934</pre>
2935      <div class="fcndef">
2936        Return the <em>offset</em>, from the beginning of the current line, of the
2937        position. The first column is reported as <code>0</code>.
2938      </div>
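      <p>
        Because lines are counted from <code>1</code> but columns from <code>0</code>,
        a diagnostic in the common <code>file:line:column</code> style needs the column
        adjusted by one. A minimal sketch follows; the helper name
        <code>eg_format_position</code> is hypothetical, and the caller is assumed to
        cast the <code>XML_Size</code> values to <code>unsigned long</code>.
      </p>

```c
#include <stdio.h>

/* Hypothetical helper, not part of Expat. `line` is the 1-based value from
   XML_GetCurrentLineNumber, `column0` the 0-based value from
   XML_GetCurrentColumnNumber, and `message` for example the result of
   XML_ErrorString. Returns what snprintf returns. */
static int eg_format_position(char *buf, size_t cap, const char *file,
                              unsigned long line, unsigned long column0,
                              const char *message) {
  /* Show a 1-based column so that it matches most editors. */
  return snprintf(buf, cap, "%s:%lu:%lu: %s", file, line, column0 + 1,
                  message);
}
```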
2939
2940      <h4 id="XML_GetCurrentByteCount">
2941        XML_GetCurrentByteCount
2942      </h4>
2943
2944      <pre class="fcndec">
2945int XMLCALL
2946XML_GetCurrentByteCount(XML_Parser p);
2947</pre>
2948      <div class="fcndef">
2949        Return the number of bytes in the current event. Returns <code>0</code> if the
2950        event is inside a reference to an internal entity and for the end-tag event for
        empty element tags (the latter can be used to distinguish empty-element tags from
2952        empty elements using separate start and end tags).
2953      </div>
2954
2955      <h4 id="XML_GetInputContext">
2956        XML_GetInputContext
2957      </h4>
2958
2959      <pre class="fcndec">
2960const char * XMLCALL
2961XML_GetInputContext(XML_Parser p,
2962                    int *offset,
2963                    int *size);
2964</pre>
2965      <div class="fcndef">
2966        <p>
2967          Returns the parser's input buffer, sets the integer pointed at by
2968          <code>offset</code> to the offset within this buffer of the current parse
          position, and sets the integer pointed at by <code>size</code> to the size of
2970          the returned buffer.
2971        </p>
2972
2973        <p>
2974          This should only be called from within a handler during an active parse and the
2975          returned buffer should only be referred to from within the handler that made
2976          the call. This input buffer contains the untranslated bytes of the input.
2977        </p>
2978
2979        <p>
2980          Only a limited amount of context is kept, so if the event triggering a call
2981          spans over a very large amount of input, the actual parse position may be
2982          before the beginning of the buffer.
2983        </p>
2984
2985        <p>
2986          If <code>XML_CONTEXT_BYTES</code> is zero, this will always return
2987          <code>NULL</code>.
2988        </p>
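        <p>
          When showing an excerpt of the input around the parse position, the
          returned values should be checked before use. The sketch below picks a
          window of bytes around the offset; the helper name
          <code>eg_context_window</code> and the <code>radius</code> parameter are
          assumptions of this example. It rejects a <code>NULL</code> buffer and an
          offset that lies outside the buffer.
        </p>

```c
#include <stddef.h>

/* Hypothetical helper, not part of Expat. Given the buffer, offset and size
   obtained from XML_GetInputContext, select up to `radius` bytes on each
   side of the parse position. On success stores the window in *start and
   *length and returns 1; returns 0 if no usable context is available. */
static int eg_context_window(const char *buf, int offset, int size,
                             int radius, int *start, int *length) {
  int begin, end;
  if (buf == NULL || size <= 0 || radius < 0)
    return 0; /* no context kept, or nothing to show */
  if (offset < 0 || offset > size)
    return 0; /* parse position lies outside the buffer */
  begin = offset > radius ? offset - radius : 0;
  end = size - offset > radius ? offset + radius : size;
  *start = begin;
  *length = end - begin;
  return 1;
}
```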
2989      </div>
2990
2991      <h3>
2992        <a id="attack-protection" name="attack-protection">Attack Protection</a><a id=
2993        "billion-laughs" name="billion-laughs"></a>
2994      </h3>
2995
2996      <h4 id="XML_SetBillionLaughsAttackProtectionMaximumAmplification">
2997        XML_SetBillionLaughsAttackProtectionMaximumAmplification
2998      </h4>
2999
3000      <pre class="fcndec">
3001/* Added in Expat 2.4.0. */
3002XML_Bool XMLCALL
3003XML_SetBillionLaughsAttackProtectionMaximumAmplification(XML_Parser p,
3004                                                         float maximumAmplificationFactor);
3005</pre>
3006      <div class="fcndef">
3007        <p>
3008          Sets the maximum tolerated amplification factor for protection against <a href=
3009          "https://en.wikipedia.org/wiki/Billion_laughs_attack">billion laughs
3010          attacks</a> (default: <code>100.0</code>) of parser <code>p</code> to
3011          <code>maximumAmplificationFactor</code>, and returns <code>XML_TRUE</code> upon
3012          success and <code>XML_FALSE</code> upon error.
3013        </p>
3014
3015        <p>
3016          Once the <a href=
3017          "#XML_SetBillionLaughsAttackProtectionActivationThreshold">threshold for
3018          activation</a> is reached, the amplification factor is calculated as ..
3019        </p>
3020
3021        <pre>amplification := (direct + indirect) / direct</pre>
3022        <p>
          .. while parsing, where <code>direct</code> is the number of bytes read from
3024          the primary document in parsing and <code>indirect</code> is the number of
3025          bytes added by expanding entities and reading of external DTD files, combined.
3026        </p>
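        <p>
          The rule can be written out as a small function. This is a sketch of the
          documented formula, not Expat's actual implementation: the function name
          <code>eg_amplification_tolerated</code> is hypothetical, and treating a
          <code>direct</code> count of zero as a factor of <code>1.0</code> is an
          assumption made here to avoid a division by zero.
        </p>

```c
/* Sketch of the documented rule, not Expat's implementation. Returns 1 if
   the input would be tolerated and 0 if it would be rejected. */
static int eg_amplification_tolerated(unsigned long long direct,
                                      unsigned long long indirect,
                                      unsigned long long activationThresholdBytes,
                                      float maximumAmplificationFactor) {
  unsigned long long output = direct + indirect;
  double amplification;

  if (output < activationThresholdBytes)
    return 1; /* threshold for activation not reached yet */

  /* amplification := (direct + indirect) / direct */
  amplification = direct ? (double)output / (double)direct
                         : 1.0; /* assumption of this sketch */
  return amplification <= (double)maximumAmplificationFactor;
}
```

        <p>
          With the defaults of <code>100.0</code> and 8 MiB, a document of 1,000
          direct bytes that expands to 50,000 indirect bytes passes because the
          threshold has not been reached, while 1,000 direct bytes expanding to
          10,000,000 indirect bytes would be rejected.
        </p>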
3027
3028        <p>
3029          For a call to
3030          <code>XML_SetBillionLaughsAttackProtectionMaximumAmplification</code> to
3031          succeed:
3032        </p>
3033
3034        <ul>
3035          <li>parser <code>p</code> must be a non-<code>NULL</code> root parser (without
3036          any parent parsers) and
3037          </li>
3038
3039          <li>
3040            <code>maximumAmplificationFactor</code> must be non-<code>NaN</code> and
3041            greater than or equal to <code>1.0</code>.
3042          </li>
3043        </ul>
3044
3045        <p>
3046          <strong>Note:</strong> If you ever need to increase this value for non-attack
3047          payload, please <a href="https://github.com/libexpat/libexpat/issues">file a
3048          bug report</a>.
3049        </p>
3050
3051        <p>
3052          <strong>Note:</strong> Peak amplifications of factor 15,000 for the entire
3053          payload and of factor 30,000 in the middle of parsing have been observed with
3054          small benign files in practice. So if you do reduce the maximum allowed
3055          amplification, please make sure that the activation threshold is still big
3056          enough to not end up with undesired false positives (i.e. benign files being
3057          rejected).
3058        </p>
3059      </div>
3060
3061      <h4 id="XML_SetBillionLaughsAttackProtectionActivationThreshold">
3062        XML_SetBillionLaughsAttackProtectionActivationThreshold
3063      </h4>
3064
3065      <pre class="fcndec">
3066/* Added in Expat 2.4.0. */
3067XML_Bool XMLCALL
3068XML_SetBillionLaughsAttackProtectionActivationThreshold(XML_Parser p,
3069                                                        unsigned long long activationThresholdBytes);
3070</pre>
3071      <div class="fcndef">
3072        <p>
          Sets the number of output bytes (including amplification from entity expansion and
3074          reading DTD files) needed to activate protection against <a href=
3075          "https://en.wikipedia.org/wiki/Billion_laughs_attack">billion laughs
3076          attacks</a> (default: <code>8 MiB</code>) of parser <code>p</code> to
3077          <code>activationThresholdBytes</code>, and returns <code>XML_TRUE</code> upon
3078          success and <code>XML_FALSE</code> upon error.
3079        </p>
3080
3081        <p>
3082          For a call to
3083          <code>XML_SetBillionLaughsAttackProtectionActivationThreshold</code> to
3084          succeed:
3085        </p>
3086
3087        <ul>
3088          <li>parser <code>p</code> must be a non-<code>NULL</code> root parser (without
3089          any parent parsers).
3090          </li>
3091        </ul>
3092
3093        <p>
3094          <strong>Note:</strong> If you ever need to increase this value for non-attack
3095          payload, please <a href="https://github.com/libexpat/libexpat/issues">file a
3096          bug report</a>.
3097        </p>
3098
3099        <p>
3100          <strong>Note:</strong> Activation thresholds below 4 MiB are known to break
3101          support for <a href=
3102          "https://en.wikipedia.org/wiki/Darwin_Information_Typing_Architecture">DITA</a>
3103          1.3 payload and are hence not recommended.
3104        </p>
3105      </div>
3106
3107      <h4 id="XML_SetAllocTrackerMaximumAmplification">
3108        XML_SetAllocTrackerMaximumAmplification
3109      </h4>
3110
3111      <pre class="fcndec">
3112/* Added in Expat 2.7.2. */
3113XML_Bool
3114XML_SetAllocTrackerMaximumAmplification(XML_Parser p,
3115                                        float maximumAmplificationFactor);
3116</pre>
3117      <div class="fcndef">
3118        <p>
3119          Sets the maximum tolerated amplification factor between direct input and bytes
3120          of dynamic memory allocated (default: <code>100.0</code>) of parser
3121          <code>p</code> to <code>maximumAmplificationFactor</code>, and returns
3122          <code>XML_TRUE</code> upon success and <code>XML_FALSE</code> upon error.
3123        </p>
3124
3125        <p>
3126          <strong>Note:</strong> There are three types of allocations that intentionally
3127          bypass tracking and limiting:
3128        </p>
3129
3130        <ul>
3131          <li>application calls to functions <code><a href=
3132          "#XML_MemMalloc">XML_MemMalloc</a></code> and <code><a href="#XML_MemRealloc">
3133            XML_MemRealloc</a></code> — <em>healthy</em> use of these two functions
3134            continues to be a responsibility of the application using Expat —,
3135          </li>
3136
3137          <li>the main character buffer used by functions <code><a href="#XML_GetBuffer">
3138            XML_GetBuffer</a></code> and <code><a href=
3139            "#XML_ParseBuffer">XML_ParseBuffer</a></code> (and thus also by plain
3140            <code><a href="#XML_Parse">XML_Parse</a></code>), and
3141          </li>
3142
3143          <li>the <a href="#XML_SetElementDeclHandler">content model memory</a> (that is
3144          passed to the <a href="#XML_SetElementDeclHandler">element declaration
3145          handler</a> and freed by a call to <code><a href=
3146          "#XML_FreeContentModel">XML_FreeContentModel</a></code>).
3147          </li>
3148        </ul>
3149
3150        <p>
3151          Once the <a href="#XML_SetAllocTrackerActivationThreshold">threshold for
3152          activation</a> is reached, the amplification factor is calculated as ..
3153        </p>
3154
3155        <pre>amplification := allocated / direct</pre>
3156        <p>
          .. while parsing, where <code>direct</code> is the number of bytes read from
3158          the primary document in parsing and <code>allocated</code> is the number of
3159          bytes of dynamic memory allocated in the parser hierarchy.
3160        </p>
3161
3162        <p>
3163          For a call to <code>XML_SetAllocTrackerMaximumAmplification</code> to succeed:
3164        </p>
3165
3166        <ul>
3167          <li>parser <code>p</code> must be a non-<code>NULL</code> root parser (without
3168          any parent parsers) and
3169          </li>
3170
3171          <li>
3172            <code>maximumAmplificationFactor</code> must be non-<code>NaN</code> and
3173            greater than or equal to <code>1.0</code>.
3174          </li>
3175        </ul>
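        <p>
          An application that takes the factor from configuration may want to check
          the second condition itself before calling the setter. A minimal sketch,
          with the hypothetical helper name
          <code>eg_valid_max_amplification</code>:
        </p>

```c
/* Hypothetical pre-check, not part of Expat: 1 if `factor` satisfies the
   documented requirement (not NaN and at least 1.0), 0 otherwise. */
static int eg_valid_max_amplification(float factor) {
  /* Any comparison with NaN is false, so this single test also
     rejects NaN. */
  return factor >= 1.0f;
}
```

        <p>
          The return value of the setter should still be checked, since the call can
          also fail for a parser that is not a root parser.
        </p>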
3176
3177        <p>
3178          <strong>Note:</strong> If you ever need to increase this value for non-attack
3179          payload, please <a href="https://github.com/libexpat/libexpat/issues">file a
3180          bug report</a>.
3181        </p>
3182
3183        <p>
          <strong>Note:</strong> Amplification factors greater than <code>100.0</code>
          can be observed near the start of parsing even with benign files in practice.
          So if you do reduce the maximum allowed amplification, please make sure that
          the activation threshold is still big enough to not end up with undesired false
          positives (i.e. benign files being rejected).
3189        </p>
3190      </div>
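      <p>
        The interplay of activation threshold and maximum amplification described above
        can be sketched in plain C. This is a simplified model for illustration only,
        not Expat's actual implementation: the function names
        <code>is_valid_maximum_amplification</code> and
        <code>exceeds_amplification_limit</code> are hypothetical, and both the strict
        <code>&gt;</code> comparison and the handling of <code>direct == 0</code> are
        assumptions.
      </p>

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: mirrors the documented argument check, i.e. the
   factor must be non-NaN and greater than or equal to 1.0. */
static bool
is_valid_maximum_amplification(float factor) {
  return ! isnan(factor) && factor >= 1.0f;
}

/* Hypothetical model of the limit check (not Expat's real code).
   Below the activation threshold nothing is rejected; at or above it,
   amplification := allocated / direct is compared with the maximum. */
static bool
exceeds_amplification_limit(unsigned long long allocated,
                            unsigned long long direct,
                            unsigned long long activationThresholdBytes,
                            float maximumAmplificationFactor) {
  if (allocated < activationThresholdBytes)
    return false; /* protection not active yet */
  if (direct == 0)
    return true; /* assumption: no input read yet counts as unbounded */
  const double amplification = (double)allocated / (double)direct;
  return amplification > (double)maximumAmplificationFactor;
}
```

      <p>
        With the default activation threshold of <code>64 MiB</code>, a small document
        that briefly shows a high ratio is never rejected in this model, which is why
        lowering the maximum amplification should go hand in hand with checking the
        threshold.
      </p>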
3191
3192      <h4 id="XML_SetAllocTrackerActivationThreshold">
3193        XML_SetAllocTrackerActivationThreshold
3194      </h4>
3195
3196      <pre class="fcndec">
3197/* Added in Expat 2.7.2. */
3198XML_Bool
3199XML_SetAllocTrackerActivationThreshold(XML_Parser p,
3200                                       unsigned long long activationThresholdBytes);
3201</pre>
3202      <div class="fcndef">
3203        <p>
          Sets the number of allocated bytes of dynamic memory needed to activate
          protection against disproportionate use of RAM (default: <code>64 MiB</code>)
          for parser <code>p</code> to <code>activationThresholdBytes</code>, and returns
          <code>XML_TRUE</code> upon success and <code>XML_FALSE</code> upon error.
3208        </p>
3209
3210        <p>
3211          <strong>Note:</strong> For types of allocations that intentionally bypass
3212          tracking and limiting, please see <code><a href=
3213          "#XML_SetAllocTrackerMaximumAmplification">XML_SetAllocTrackerMaximumAmplification</a></code>
3214          above.
3215        </p>
3216
3217        <p>
3218          For a call to <code>XML_SetAllocTrackerActivationThreshold</code> to succeed:
3219        </p>
3220
3221        <ul>
3222          <li>parser <code>p</code> must be a non-<code>NULL</code> root parser (without
3223          any parent parsers).
3224          </li>
3225        </ul>
3226
3227        <p>
3228          <strong>Note:</strong> If you ever need to increase this value for non-attack
3229          payload, please <a href="https://github.com/libexpat/libexpat/issues">file a
3230          bug report</a>.
3231        </p>
3232      </div>
3233
3234      <h4 id="XML_SetReparseDeferralEnabled">
3235        XML_SetReparseDeferralEnabled
3236      </h4>
3237
3238      <pre class="fcndec">
3239/* Added in Expat 2.6.0. */
3240XML_Bool XMLCALL
3241XML_SetReparseDeferralEnabled(XML_Parser parser, XML_Bool enabled);
3242</pre>
3243      <div class="fcndef">
3244        <p>
3245          Large tokens may require many parse calls before enough data is available for
3246          Expat to parse it in full. If Expat retried parsing the token on every parse
3247          call, parsing could take quadratic time. To avoid this, Expat only retries once
3248          a significant amount of new data is available. This function allows disabling
3249          this behavior.
3250        </p>
3251
3252        <p>
3253          The <code>enabled</code> argument should be <code>XML_TRUE</code> or
3254          <code>XML_FALSE</code>.
3255        </p>
3256
3257        <p>
3258          Returns <code>XML_TRUE</code> on success, and <code>XML_FALSE</code> on error.
3259        </p>
3260      </div>
3261
3262      <h3>
3263        <a id="miscellaneous" name="miscellaneous">Miscellaneous functions</a>
3264      </h3>
3265
3266      <p>
3267        The functions in this section either obtain state information from the parser or
3268        can be used to dynamically set parser options.
3269      </p>
3270
3271      <h4 id="XML_SetUserData">
3272        XML_SetUserData
3273      </h4>
3274
3275      <pre class="fcndec">
3276void XMLCALL
3277XML_SetUserData(XML_Parser p,
3278                void *userData);
3279</pre>
3280      <div class="fcndef">
3281        This sets the user data pointer that gets passed to handlers. It overwrites any
3282        previous value for this pointer. Note that the application is responsible for
3283        freeing the memory associated with <code>userData</code> when it is finished with
3284        the parser. So if you call this when there's already a pointer there, and you
3285        haven't freed the memory associated with it, then you've probably just leaked
3286        memory.
3287      </div>
3288
3289      <h4 id="XML_GetUserData">
3290        XML_GetUserData
3291      </h4>
3292
3293      <pre class="fcndec">
3294void * XMLCALL
3295XML_GetUserData(XML_Parser p);
3296</pre>
3297      <div class="fcndef">
3298        This returns the user data pointer that gets passed to handlers. It is actually
3299        implemented as a macro.
3300      </div>
3301
3302      <h4 id="XML_UseParserAsHandlerArg">
3303        XML_UseParserAsHandlerArg
3304      </h4>
3305
3306      <pre class="fcndec">
3307void XMLCALL
3308XML_UseParserAsHandlerArg(XML_Parser p);
3309</pre>
3310      <div class="fcndef">
3311        After this is called, handlers receive the parser in their <code>userData</code>
3312        arguments. The user data can still be obtained using the <code><a href=
3313        "#XML_GetUserData">XML_GetUserData</a></code> function.
3314      </div>
3315
3316      <h4 id="XML_SetBase">
3317        XML_SetBase
3318      </h4>
3319
3320      <pre class="fcndec">
3321enum XML_Status XMLCALL
3322XML_SetBase(XML_Parser p,
3323            const XML_Char *base);
3324</pre>
3325      <div class="fcndef">
3326        Set the base to be used for resolving relative URIs in system identifiers. The
3327        return value is <code>XML_STATUS_ERROR</code> if there's no memory to store base,
3328        otherwise it's <code>XML_STATUS_OK</code>.
3329      </div>
3330
3331      <h4 id="XML_GetBase">
3332        XML_GetBase
3333      </h4>
3334
3335      <pre class="fcndec">
3336const XML_Char * XMLCALL
3337XML_GetBase(XML_Parser p);
3338</pre>
3339      <div class="fcndef">
3340        Return the base for resolving relative URIs.
3341      </div>
3342
3343      <h4 id="XML_GetSpecifiedAttributeCount">
3344        XML_GetSpecifiedAttributeCount
3345      </h4>
3346
3347      <pre class="fcndec">
3348int XMLCALL
3349XML_GetSpecifiedAttributeCount(XML_Parser p);
3350</pre>
3351      <div class="fcndef">
        When attributes are reported to the start handler in the <code>atts</code> vector, attributes
3353        that were explicitly set in the element occur before any attributes that receive
3354        their value from default information in an ATTLIST declaration. This function
3355        returns the number of attributes that were explicitly set times two, thus giving
3356        the offset in the <code>atts</code> array passed to the start tag handler of the
3357        first attribute set due to defaults. It supplies information for the last call to
3358        a start handler. If called inside a start handler, then that means the current
3359        call.
3360      </div>
3361
3362      <h4 id="XML_GetIdAttributeIndex">
3363        XML_GetIdAttributeIndex
3364      </h4>
3365
3366      <pre class="fcndec">
3367int XMLCALL
3368XML_GetIdAttributeIndex(XML_Parser p);
3369</pre>
3370      <div class="fcndef">
3371        Returns the index of the ID attribute passed in the atts array in the last call
3372        to <code><a href="#XML_StartElementHandler">XML_StartElementHandler</a></code>,
3373        or -1 if there is no ID attribute. If called inside a start handler, then that
3374        means the current call.
3375      </div>
3376
3377      <h4 id="XML_GetAttributeInfo">
3378        XML_GetAttributeInfo
3379      </h4>
3380
3381      <pre class="fcndec">
3382const XML_AttrInfo * XMLCALL
3383XML_GetAttributeInfo(XML_Parser parser);
3384</pre>
3385
3386      <pre class="signature">
3387typedef struct {
3388  XML_Index  nameStart;  /* Offset to beginning of the attribute name. */
3389  XML_Index  nameEnd;    /* Offset after the attribute name's last byte. */
3390  XML_Index  valueStart; /* Offset to beginning of the attribute value. */
3391  XML_Index  valueEnd;   /* Offset after the attribute value's last byte. */
3392} XML_AttrInfo;
3393</pre>
3394      <div class="fcndef">
3395        Returns an array of <code>XML_AttrInfo</code> structures for the attribute/value
3396        pairs passed in the last call to the <code>XML_StartElementHandler</code> that
3397        were specified in the start-tag rather than defaulted. Each attribute/value pair
3398        counts as 1; thus the number of entries in the array is
3399        <code>XML_GetSpecifiedAttributeCount(parser) / 2</code>.
3400      </div>
3401
3402      <h4 id="XML_SetEncoding">
3403        XML_SetEncoding
3404      </h4>
3405
3406      <pre class="fcndec">
3407enum XML_Status XMLCALL
3408XML_SetEncoding(XML_Parser p,
3409                const XML_Char *encoding);
3410</pre>
3411      <div class="fcndef">
3412        Set the encoding to be used by the parser. It is equivalent to passing a
3413        non-<code>NULL</code> encoding argument to the parser creation functions. It must
3414        not be called after <code><a href="#XML_Parse">XML_Parse</a></code> or
        <code><a href="#XML_ParseBuffer">XML_ParseBuffer</a></code> has been called on
3416        the given parser. Returns <code>XML_STATUS_OK</code> on success or
3417        <code>XML_STATUS_ERROR</code> on error.
3418      </div>
3419
3420      <h4 id="XML_SetParamEntityParsing">
3421        XML_SetParamEntityParsing
3422      </h4>
3423
3424      <pre class="fcndec">
3425int XMLCALL
3426XML_SetParamEntityParsing(XML_Parser p,
3427                          enum XML_ParamEntityParsing code);
3428</pre>
3429      <div class="fcndef">
3430        This enables parsing of parameter entities, including the external parameter
3431        entity that is the external DTD subset, according to <code>code</code>. The
3432        choices for <code>code</code> are:
3433        <ul>
3434          <li>
3435            <code>XML_PARAM_ENTITY_PARSING_NEVER</code>
3436          </li>
3437
3438          <li>
3439            <code>XML_PARAM_ENTITY_PARSING_UNLESS_STANDALONE</code>
3440          </li>
3441
3442          <li>
3443            <code>XML_PARAM_ENTITY_PARSING_ALWAYS</code>
3444          </li>
3445        </ul>
3446        <b>Note:</b> If <code>XML_SetParamEntityParsing</code> is called after
3447        <code>XML_Parse</code> or <code>XML_ParseBuffer</code>, then it has no effect and
3448        will always return 0.
3449      </div>
3450
3451      <h4 id="XML_SetHashSalt">
3452        XML_SetHashSalt
3453      </h4>
3454
3455      <pre class="fcndec">
3456int XMLCALL
3457XML_SetHashSalt(XML_Parser p,
3458                unsigned long hash_salt);
3459</pre>
3460      <div class="fcndef">
3461        Sets the hash salt to use for internal hash calculations. Helps in preventing DoS
3462        attacks based on predicting hash function behavior. In order to have an effect
3463        this must be called before parsing has started. Returns 1 if successful, 0 when
3464        called after <code>XML_Parse</code> or <code>XML_ParseBuffer</code>.
3465        <p>
3466          <b>Note:</b> This call is optional, as the parser will auto-generate a new
3467          random salt value if no value has been set at the start of parsing.
3468        </p>
3469
3470        <p>
3471          <b>Note:</b> One should not call <code>XML_SetHashSalt</code> with a hash salt
3472          value of 0, as this value is used as sentinel value to indicate that
3473          <code>XML_SetHashSalt</code> has <b>not</b> been called. Consequently such a
3474          call will have no effect, even if it returns 1.
3475        </p>
3476      </div>
3477
3478      <h4 id="XML_UseForeignDTD">
3479        XML_UseForeignDTD
3480      </h4>
3481
3482      <pre class="fcndec">
3483enum XML_Error XMLCALL
3484XML_UseForeignDTD(XML_Parser parser, XML_Bool useDTD);
3485</pre>
3486      <div class="fcndef">
3487        <p>
3488          This function allows an application to provide an external subset for the
3489          document type declaration for documents which do not specify an external subset
3490          of their own. For documents which specify an external subset in their DOCTYPE
3491          declaration, the application-provided subset will be ignored. If the document
3492          does not contain a DOCTYPE declaration at all and <code>useDTD</code> is true,
3493          the application-provided subset will be parsed, but the
3494          <code>startDoctypeDeclHandler</code> and <code>endDoctypeDeclHandler</code>
3495          functions, if set, will not be called. The setting of parameter entity parsing,
3496          controlled using <code><a href=
3497          "#XML_SetParamEntityParsing">XML_SetParamEntityParsing</a></code>, will be
3498          honored.
3499        </p>
3500
3501        <p>
3502          The application-provided external subset is read by calling the external entity
3503          reference handler set via <code><a href=
3504          "#XML_SetExternalEntityRefHandler">XML_SetExternalEntityRefHandler</a></code>
3505          with both <code>publicId</code> and <code>systemId</code> set to
3506          <code>NULL</code>.
3507        </p>
3508
3509        <p>
3510          If this function is called after parsing has begun, it returns
3511          <code>XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING</code> and ignores
3512          <code>useDTD</code>. If called when Expat has been compiled without DTD
3513          support, it returns <code>XML_ERROR_FEATURE_REQUIRES_XML_DTD</code>. Otherwise,
3514          it returns <code>XML_ERROR_NONE</code>.
3515        </p>
3516
3517        <p>
3518          <b>Note:</b> For the purpose of checking WFC: Entity Declared, passing
3519          <code>useDTD == XML_TRUE</code> will make the parser behave as if the document
3520          had a DTD with an external subset. This holds true even if the external entity
3521          reference handler returns without action.
3522        </p>
3523      </div>
3524
3525      <h4 id="XML_SetReturnNSTriplet">
3526        XML_SetReturnNSTriplet
3527      </h4>
3528
3529      <pre class="fcndec">
3530void XMLCALL
3531XML_SetReturnNSTriplet(XML_Parser parser,
3532                       int        do_nst);
3533</pre>
3534      <div class="fcndef">
3535        <p>
3536          This function only has an effect when using a parser created with
3537          <code><a href="#XML_ParserCreateNS">XML_ParserCreateNS</a></code>, i.e. when
          namespace processing is in effect. The <code>do_nst</code> argument sets whether or not
3539          prefixes are returned with names qualified with a namespace prefix. If this
3540          function is called with <code>do_nst</code> non-zero, then afterwards namespace
3541          qualified names (that is qualified with a prefix as opposed to belonging to a
3542          default namespace) are returned as a triplet with the three parts separated by
3543          the namespace separator specified when the parser was created. The order of
3544          returned parts is URI, local name, and prefix.
3545        </p>
3546
3547        <p>
          If <code>do_nst</code> is zero, then namespaces are reported in the default
          manner: URI followed by local name, separated by the namespace separator.
3550        </p>
3551      </div>
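      <p>
        An application typically has to split such names itself. The following is a
        sketch of one way to do that, not part of the Expat API: the names
        <code>QName</code> and <code>split_qualified_name</code> are hypothetical, a
        <code>char</code>-based <code>XML_Char</code> is assumed, and the separator is
        assumed not to occur inside the URI. Names passed to handlers are
        <code>const</code>, so the sketch works on a writable copy.
      </p>

```c
#include <stddef.h>
#include <string.h>

/* Parts of a name as reported with namespace processing enabled.
   uri and prefix are NULL when the corresponding part is absent. */
struct QName {
  const char *uri;
  const char *local;
  const char *prefix;
};

/* Splits "URI<sep>local[<sep>prefix]" in place by overwriting the
   separators with terminators; a name without separator is treated as
   a plain local name. */
static void
split_qualified_name(char *name, char sep, struct QName *out) {
  out->uri = NULL;
  out->local = name;
  out->prefix = NULL;
  char *first = strchr(name, sep);
  if (first == NULL)
    return; /* element or attribute is in no namespace */
  *first = '\0';
  out->uri = name;
  out->local = first + 1;
  char *second = strchr(first + 1, sep);
  if (second != NULL) {
    *second = '\0';
    out->prefix = second + 1; /* only present with do_nst non-zero */
  }
}
```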
3552
3553      <h4 id="XML_DefaultCurrent">
3554        XML_DefaultCurrent
3555      </h4>
3556
3557      <pre class="fcndec">
3558void XMLCALL
3559XML_DefaultCurrent(XML_Parser parser);
3560</pre>
3561      <div class="fcndef">
3562        This can be called within a handler for a start element, end element, processing
3563        instruction or character data. It causes the corresponding markup to be passed to
3564        the default handler set by <code><a href=
3565        "#XML_SetDefaultHandler">XML_SetDefaultHandler</a></code> or <code><a href=
3566        "#XML_SetDefaultHandlerExpand">XML_SetDefaultHandlerExpand</a></code>. It does
3567        nothing if there is not a default handler.
3568      </div>
3569
3570      <h4 id="XML_ExpatVersion">
3571        XML_ExpatVersion
3572      </h4>
3573
3574      <pre class="fcndec">
3575XML_LChar * XMLCALL
3576XML_ExpatVersion();
3577</pre>
3578      <div class="fcndef">
3579        Return the library version as a string (e.g. <code>"expat_1.95.1"</code>).
3580      </div>
3581
3582      <h4 id="XML_ExpatVersionInfo">
3583        XML_ExpatVersionInfo
3584      </h4>
3585
3586      <pre class="fcndec">
XML_Expat_Version XMLCALL
3588XML_ExpatVersionInfo();
3589</pre>
3590
3591      <pre class="signature">
3592typedef struct {
3593  int major;
3594  int minor;
3595  int micro;
3596} XML_Expat_Version;
3597</pre>
3598      <div class="fcndef">
3599        Return the library version information as a structure. Some macros are also
3600        defined that support compile-time tests of the library version:
3601        <ul>
3602          <li>
3603            <code>XML_MAJOR_VERSION</code>
3604          </li>
3605
3606          <li>
3607            <code>XML_MINOR_VERSION</code>
3608          </li>
3609
3610          <li>
3611            <code>XML_MICRO_VERSION</code>
3612          </li>
3613        </ul>
3614        Testing these constants is currently the best way to determine if particular
3615        parts of the Expat API are available.
3616      </div>
3617
3618      <h4 id="XML_GetFeatureList">
3619        XML_GetFeatureList
3620      </h4>
3621
3622      <pre class="fcndec">
3623const XML_Feature * XMLCALL
3624XML_GetFeatureList();
3625</pre>
3626
3627      <pre class="signature">
3628enum XML_FeatureEnum {
3629  XML_FEATURE_END = 0,
3630  XML_FEATURE_UNICODE,
3631  XML_FEATURE_UNICODE_WCHAR_T,
3632  XML_FEATURE_DTD,
3633  XML_FEATURE_CONTEXT_BYTES,
3634  XML_FEATURE_MIN_SIZE,
3635  XML_FEATURE_SIZEOF_XML_CHAR,
3636  XML_FEATURE_SIZEOF_XML_LCHAR,
3637  XML_FEATURE_NS,
3638  XML_FEATURE_LARGE_SIZE
3639};
3640
3641typedef struct {
3642  enum XML_FeatureEnum  feature;
3643  XML_LChar            *name;
3644  long int              value;
3645} XML_Feature;
3646</pre>
3647      <div class="fcndef">
3648        <p>
3649          Returns a list of "feature" records, providing details on how Expat was
3650          configured at compile time. Most applications should not need to worry about
3651          this, but this information is otherwise not available from Expat. This function
3652          allows code that does need to check these features to do so at runtime.
3653        </p>
3654
3655        <p>
3656          The return value is an array of <code>XML_Feature</code>, terminated by a
3657          record with a <code>feature</code> of <code>XML_FEATURE_END</code> and
3658          <code>name</code> of <code>NULL</code>, identifying the feature-test macros
3659          Expat was compiled with. Since an application that requires this kind of
3660          information needs to determine the type of character the <code>name</code>
3661          points to, records for the <code>XML_FEATURE_SIZEOF_XML_CHAR</code> and
3662          <code>XML_FEATURE_SIZEOF_XML_LCHAR</code> will be located at the beginning of
3663          the list, followed by <code>XML_FEATURE_UNICODE</code> and
3664          <code>XML_FEATURE_UNICODE_WCHAR_T</code>, if they are present at all.
3665        </p>
3666
3667        <p>
3668          Some features have an associated value. If there isn't an associated value, the
3669          <code>value</code> field is set to 0. At this time, the following features have
3670          been defined to have values:
3671        </p>
3672
3673        <dl>
3674          <dt>
3675            <code>XML_FEATURE_SIZEOF_XML_CHAR</code>
3676          </dt>
3677
3678          <dd>
3679            The number of bytes occupied by one <code>XML_Char</code> character.
3680          </dd>
3681
3682          <dt>
3683            <code>XML_FEATURE_SIZEOF_XML_LCHAR</code>
3684          </dt>
3685
3686          <dd>
3687            The number of bytes occupied by one <code>XML_LChar</code> character.
3688          </dd>
3689
3690          <dt>
3691            <code>XML_FEATURE_CONTEXT_BYTES</code>
3692          </dt>
3693
3694          <dd>
3695            The maximum number of characters of context which can be reported by
3696            <code><a href="#XML_GetInputContext">XML_GetInputContext</a></code>.
3697          </dd>
3698        </dl>
3699      </div>
3700
3701      <h4 id="XML_FreeContentModel">
3702        XML_FreeContentModel
3703      </h4>
3704
3705      <pre class="fcndec">
3706void XMLCALL
3707XML_FreeContentModel(XML_Parser parser, XML_Content *model);
3708</pre>
3709      <div class="fcndef">
        Function to deallocate the <code>model</code> argument passed to the
        <code>XML_ElementDeclHandler</code> callback set using <code><a href=
        "#XML_SetElementDeclHandler">XML_SetElementDeclHandler</a></code>. This function
        should not be used for any other purpose.
3714      </div>
3715
3716      <p>
3717        The following functions allow external code to share the memory allocator an
3718        <code>XML_Parser</code> has been configured to use. This is especially useful for
3719        third-party libraries that interact with a parser object created by application
3720        code, or heavily layered applications. This can be essential when using
3721        dynamically loaded libraries which use different C standard libraries (this can
3722        happen on Windows, at least).
3723      </p>
3724
3725      <h4 id="XML_MemMalloc">
3726        XML_MemMalloc
3727      </h4>
3728
3729      <pre class="fcndec">
3730void * XMLCALL
3731XML_MemMalloc(XML_Parser parser, size_t size);
3732</pre>
3733      <div class="fcndef">
3734        Allocate <code>size</code> bytes of memory using the allocator the
3735        <code>parser</code> object has been configured to use. Returns a pointer to the
3736        memory or <code>NULL</code> on failure. Memory allocated in this way must be
3737        freed using <code><a href="#XML_MemFree">XML_MemFree</a></code>.
3738      </div>
3739
3740      <h4 id="XML_MemRealloc">
3741        XML_MemRealloc
3742      </h4>
3743
3744      <pre class="fcndec">
3745void * XMLCALL
3746XML_MemRealloc(XML_Parser parser, void *ptr, size_t size);
3747</pre>
3748      <div class="fcndef">
3749        Allocate <code>size</code> bytes of memory using the allocator the
3750        <code>parser</code> object has been configured to use. <code>ptr</code> must
3751        point to a block of memory allocated by <code><a href=
3752        "#XML_MemMalloc">XML_MemMalloc</a></code> or <code>XML_MemRealloc</code>, or be
3753        <code>NULL</code>. This function tries to expand the block pointed to by
3754        <code>ptr</code> if possible. Returns a pointer to the memory or
3755        <code>NULL</code> on failure. On success, the original block has either been
3756        expanded or freed. On failure, the original block has not been freed; the caller
3757        is responsible for freeing the original block. Memory allocated in this way must
3758        be freed using <code><a href="#XML_MemFree">XML_MemFree</a></code>.
3759      </div>
3760
3761      <h4 id="XML_MemFree">
3762        XML_MemFree
3763      </h4>
3764
3765      <pre class="fcndec">
3766void XMLCALL
3767XML_MemFree(XML_Parser parser, void *ptr);
3768</pre>
3769      <div class="fcndef">
3770        Free a block of memory pointed to by <code>ptr</code>. The block must have been
3771        allocated by <code><a href="#XML_MemMalloc">XML_MemMalloc</a></code> or
3772        <code>XML_MemRealloc</code>, or be <code>NULL</code>.
3773      </div>
3774
3775      <hr />
3776
3777      <div class="footer">
3778        Found a bug in the documentation? <a href=
3779        "https://github.com/libexpat/libexpat/issues">Please file a bug report.</a>
3780      </div>
3781    </div>
3782  </body>
3783</html>
3784