Lines Matching +full:pre +full:- +full:processing

1 <?xml version="1.0" encoding="iso-8859-1"?>
2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
3 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
6 <!--
15 Copyright (c) 2000-2004 Fred L. Drake, Jr. <fdrake@users.sourceforge.net>
16 Copyright (c) 2002-2012 Karl Waclawek <karl@waclawek.net>
17 Copyright (c) 2017-2025 Sebastian Pipping <sebastian@pipping.org>
20 Copyright (c) 2021 Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
44 -->
47 <meta http-equiv="Content-Style-Type" content="text/css" />
63 other open-source XML parsers.</p>
66 groff (an nroff look-alike), Jade (an implementation of ISO's DSSSL
156 <a href="#attack-protection">Attack Protection</a>
195 <p>Expat is a stream-oriented parser. You register callback (or
241 <pre class="eg">
260 </pre>
264 <pre class="eg">
267 Depth--;
269 </pre>
289 <pre class="eg">
301 </pre>
322 cmake -G"Visual Studio 16 2019" -DCMAKE_BUILD_TYPE=RelWithDebInfo .
326 contains the "expat.h" include file and a pre-built DLL.</p>
337 <pre class="eg">
341 </pre>
344 only one we'll mention here is the <code>--prefix</code> option. You
346 the <code>--help</code> option.</p>
351 give the option, <code>--prefix=/home/me/mystuff</code>, then the
356 <h3>Configuring Expat Using the Pre-Processor</h3>
359 pre-processor definitions. The symbols are:</p>
361 <dl class="cpp-symbols">
366 <a href="https://www.w3.org/TR/2006/REC-xml-20060816/#sec-physical-struct">general entities</a>
382 (except the <a href="https://www.w3.org/TR/2006/REC-xml-20060816/#sec-predefined-ent">predefined fi…
384 with a self-reference:
390 <dd>Include support for using and reporting DTD-based content. If
404 "https://www.w3.org/TR/REC-xml-names/" >Namespaces in XML</a></cite>
409 encoded in UTF-16 using wide characters of the type
422 processing of very large input streams, where the return values of
461 usually be done with the <code>-lexpat</code> argument. Otherwise,
467 <p>On a Unix-based system, here's what a Makefile might look like when
470 <pre class="eg">
473 LIBS= -lexpat
475 $(CC) $(LDFLAGS) -o xmlapp xmlapp.o $(LIBS)
476 </pre>
481 <pre class="eg">
483 CFLAGS= -I/home/me/mystuff/include
485 LIBS= -L/home/me/mystuff/lib -lexpat
487 $(CC) $(LDFLAGS) -o xmlapp xmlapp.o $(LIBS)
488 </pre>
504 constructing a parser for a top-level document. The object returned
543 <pre class="eg">
546 info->skip = 0;
547 info->depth = 1;
555 if (! inf->skip) {
557 inf->skip = inf->depth;
563 inf->depth++;
570 inf->depth--;
572 if (! inf->skip)
575 if (inf->skip == inf->depth)
576 inf->skip = 0;
578 </pre>
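<p>The depth and skip bookkeeping above can be exercised without a parser. The following is a minimal sketch, not Expat's own code: the <code>ParseInfo</code> struct, the <code>handled</code> and <code>closed</code> counters, and the rule "skip any element named <code>ignore</code>" are assumptions made for illustration. Only the <code>skip</code> and <code>depth</code> logic follows the fragment above.</p>

```c
#include <assert.h>
#include <string.h>

/* Hypothetical bookkeeping struct; skip/depth follow the fragment above. */
typedef struct {
  int skip;     /* depth at which skipping began, 0 when not skipping */
  int depth;    /* current element depth, starts at 1 */
  int handled;  /* illustrative counter: start tags actually processed */
  int closed;   /* illustrative counter: end tags actually processed */
} ParseInfo;

static void init_info(ParseInfo *info) {
  info->skip = 0;
  info->depth = 1;
  info->handled = 0;
  info->closed = 0;
}

/* Assumed policy for this sketch: skip any element named "ignore". */
static int should_skip(const char *el) {
  return strcmp(el, "ignore") == 0;
}

/* Same shape as a start handler, minus the attribute array. */
static void raw_start(ParseInfo *inf, const char *el) {
  if (! inf->skip) {
    if (should_skip(el))
      inf->skip = inf->depth;   /* remember where skipping began */
    else
      inf->handled++;           /* real start handling would go here */
  }
  inf->depth++;
}

/* Same shape as an end handler. */
static void raw_end(ParseInfo *inf) {
  assert(inf->depth > 1);       /* an end tag must match an open element */
  inf->depth--;
  if (! inf->skip)
    inf->closed++;              /* real end handling would go here */
  if (inf->skip == inf->depth)
    inf->skip = 0;              /* we just left the skipped subtree */
}
```

<p>In a real application the two functions would be registered as the start and end element handlers, with a pointer to the struct passed as user data.</p>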
610 common first-time mistake with any of the event-oriented interfaces to
619 <!-- XXX example needed here -->
625 the value of the <code>version</code> pseudo-attribute in the XML
629 alternate processing), it should use the <code><a href=
635 <pre class="eg">
660 </pre>
662 <h3>Namespace Processing</h3>
666 performs namespace processing. Under namespace processing, Expat
681 are not well-formed when namespace processing is enabled, and will
687 >XML_SetReturnNSTriplet</a></code> has been called with a non-zero
713 to recognize UTF-8 and UTF-16 (1 and 2 byte encodings of Unicode),
717 <pre>
718 &lt;?xml version="1.0" encoding="ISO-8859-2"?&gt;
719 </pre>
723 <pre>
725 </pre>
732 <p><a name="builtin_encodings"></a>There are four built-in encodings
735 <li>UTF-8</li>
736 <li>UTF-16</li>
737 <li>ISO-8859-1</li>
738 <li>US-ASCII</li>
756 <li>Every ASCII character that can appear in a well-formed XML document
761 equal to 65535 (0xFFFF) <em>This does not apply to the built-in support
762 for UTF-16 and UTF-8</em></li>
771 array. A -1 in this array indicates a malformed byte. If the value is
772 -2, -3, or -4, then the byte is the beginning of a 2, 3, or 4 byte
773 sequence respectively. Multi-byte sequences are sent to the convert
775 function should return the Unicode scalar value for the sequence or -1
780 it passes to the handlers are always encoded in UTF-8 or UTF-16
822 <h3 id="stop-resume">Temporarily Stopping Parsing</h3>
833 <li>Delaying further processing until additional information is
841 if an application-domain error is found in the XML being parsed or if
853 the rough structure (in pseudo-code):</p>
855 <pre class="pseudocode">
871 </pre>
878 function mentioned in the pseudo-code above:</p>
880 <pre class="eg">
885 been an error), or the parse is stopped. Return non-zero when
918 </pre>
926 <pre class="eg">
929 non-zero when the parse is suspended.
947 </pre>
949 <p>Now that we've seen what a mess the top-level parsing loop can
953 processing that we're expecting to ignore. As a bonus, we get to stop
962 <!-- XXX really need more here -->
966 <!-- ================================================================ -->
973 <pre class="fcndec">
976 </pre>
979 Construct a new parser. If encoding is non-<code>NULL</code>, it specifies a
981 encoding declaration. There are four built-in encodings:
984 <li>US-ASCII</li>
985 <li>UTF-8</li>
986 <li>UTF-16</li>
987 <li>ISO-8859-1</li>
995 <pre class="fcndec">
999 </pre>
1001 Constructs a new parser that has namespace processing in effect. Namespace
1007 in XML. For instance, <code>'\xFF'</code> is not legal in UTF-8, and
1008 <code>'\xFFFF'</code> is not legal in UTF-16. There is a special case when
1010 the local part will be concatenated without any separator - this is intended
1019 be ready to receive namespace URIs containing non-URI characters.
1023 <pre class="fcndec">
1028 </pre>
1029 <pre class="signature">
1035 </pre>
1040 non-<code>NULL</code>, then namespace processing is enabled in the created parser
1046 <pre class="fcndec">
1051 </pre>
1056 user data, namespace processing is inherited from the parser passed as
1063 <pre class="fcndec">
1066 </pre>
1073 <pre class="fcndec">
1077 </pre>
1083 state is re-initialized except for the values of ns and ns_triplets.
1118 <pre class="fcndec">
1124 </pre>
1125 <pre class="signature">
1130 </pre>
1136 that <code>s</code> doesn't have to be null-terminated. It also means that
1181 <pre class="fcndec">
1186 </pre>
1202 <pre class="fcndec">
1206 </pre>
1215 <pre class="eg">
1235 </pre>
1239 <pre class="fcndec">
1243 </pre>
1249 call-back handler, except when aborting (when <code>resumable</code>
1251 call-backs may still follow because they would otherwise get
1259 while making multiple call-backs on a contiguous chunk of characters,</li>
1264 call-backs, except when parsing an external parameter entity and
1285 not being handled appropriately; see <a href= "#stop-resume"
1313 <pre class="fcndec">
1316 </pre>
1320 within a handler call-back. Returns same status codes as <code><a
1339 <pre class="fcndec">
1343 </pre>
1344 <pre class="signature">
1356 </pre>
1383 The former implies UTF-8 encoding, the latter two imply UTF-16 encoding.
1389 <pre class="setter">
1393 </pre>
1394 <pre class="signature">
1399 </pre>
1411 <pre class="setter">
1415 </pre>
1416 <pre class="signature">
1420 </pre>
1427 <pre class="setter">
1432 </pre>
1438 <pre class="setter">
1442 </pre>
1443 <pre class="signature">
1448 </pre>
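<p>Because the text passed to a character data handler is not null-terminated and may arrive in several chunks, a handler typically copies exactly <code>len</code> bytes per call. The sketch below is one way to accumulate text under stated assumptions: the <code>TextBuf</code> type and the name <code>on_char_data</code> are hypothetical, plain <code>char</code> stands in for <code>XML_Char</code> (a UTF-8 build), and the calling convention macro is left out so the block compiles without Expat headers.</p>

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer handed to the parser as user data. */
typedef struct {
  char *buf;    /* accumulated text, null-terminated once non-empty */
  size_t len;   /* bytes stored, not counting the terminator */
  size_t cap;   /* bytes allocated */
} TextBuf;

/* Same parameter list as a character data handler: user data, a
   pointer to text that is NOT null-terminated, and its length. */
static void on_char_data(void *userData, const char *s, int len) {
  TextBuf *tb = (TextBuf *)userData;
  size_t need;

  if (len <= 0)
    return;
  need = tb->len + (size_t)len + 1;      /* +1 for our own terminator */
  if (need > tb->cap) {
    size_t cap = tb->cap ? tb->cap : 64;
    char *grown;
    while (cap < need)
      cap *= 2;
    grown = (char *)realloc(tb->buf, cap);
    if (grown == NULL)
      return;                            /* sketch: drop the chunk */
    tb->buf = grown;
    tb->cap = cap;
  }
  memcpy(tb->buf + tb->len, s, (size_t)len);  /* copy exactly len bytes */
  tb->len += (size_t)len;
  tb->buf[tb->len] = '\0';
}
```

<p>Only the first <code>len</code> bytes of each chunk are read, so whatever follows them in the parser's buffer never leaks into the result.</p>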
1450 is <em>NOT null-terminated</em>. You have to use the length argument
1455 may <em>NOT immediately</em> terminate call-backs if the parser is currently
1456 processing such a single block of contiguous markup-free text, as the parser
1462 <pre class="setter">
1466 </pre>
1467 <pre class="signature">
1473 </pre>
1474 <p>Set a handler for processing instructions. The target is the first word
1475 in the processing instruction. The data is the rest of the characters in
1481 <pre class="setter">
1485 </pre>
1486 <pre class="signature">
1490 </pre>
1497 <pre class="setter">
1501 </pre>
1502 <pre class="signature">
1505 </pre>
1511 <pre class="setter">
1515 </pre>
1516 <pre class="signature">
1519 </pre>
1525 <pre class="setter">
1530 </pre>
1536 <pre class="setter">
1540 </pre>
1541 <pre class="signature">
1546 </pre>
1553 that they will be encoded in UTF-8 or UTF-16. Line boundaries are not
1568 <pre class="setter">
1572 </pre>
1573 <pre class="signature">
1578 </pre>
1589 <pre class="setter">
1593 </pre>
1594 <pre class="signature">
1601 </pre>
1603 called for processing an external DTD subset if parameter entity parsing
1643 <pre class="fcndec">
1647 </pre>
1670 <pre class="setter">
1674 </pre>
1675 <pre class="signature">
1680 </pre>
1689 <p>The <code>is_parameter_entity</code> argument will be non-zero for
1698 <pre class="setter">
1703 </pre>
1704 <pre class="signature">
1716 </pre>
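<p>As a rough illustration of how the <code>map</code> array and the <code>convert</code> function fit together, the sketch below fills in a made-up two-byte encoding. Everything about the encoding is an assumption: bytes 0x00 to 0x7F map to themselves, bytes 0x80 to 0xBF are invalid as a first byte, and bytes 0xC0 to 0xFF start a two-byte sequence. <code>ToyEncoding</code> is a local stand-in with the same four members as <code>XML_Encoding</code>, so the block compiles without Expat headers. Registering a handler with <code>XML_SetUnknownEncodingHandler</code> is left out.</p>

```c
#include <stddef.h>

/* Local stand-in mirroring the members of XML_Encoding. */
typedef struct {
  int map[256];
  void *data;
  int (*convert)(void *data, const char *s);
  void (*release)(void *data);
} ToyEncoding;

/* Convert one two-byte sequence of the toy encoding to a Unicode
   scalar value, or return -1 if the second byte is out of range.
   s is not null-terminated; only the two bytes announced by the
   map entry (-2) are read. */
static int toy_convert(void *data, const char *s) {
  const unsigned char *p = (const unsigned char *)s;
  (void)data;                       /* no per-encoding state needed */
  if (p[1] < 0x80 || p[1] > 0xBF)
    return -1;                      /* malformed sequence */
  return 0x100 + ((p[0] - 0xC0) << 6) + (p[1] - 0x80);
}

/* Fill the structure the way an unknown encoding handler would. */
static void fill_toy_encoding(ToyEncoding *info) {
  int i;
  for (i = 0x00; i < 0x80; i++)
    info->map[i] = i;               /* single byte, same value as ASCII */
  for (i = 0x80; i < 0xC0; i++)
    info->map[i] = -1;              /* invalid as an initial byte */
  for (i = 0xC0; i < 0x100; i++)
    info->map[i] = -2;              /* first byte of a 2-byte sequence */
  info->data = NULL;
  info->convert = toy_convert;
  info->release = NULL;             /* nothing to free */
}
```

<p>The largest value this toy scheme can produce is 0x10FF, which stays below the 0xFFFF limit stated earlier for user-supplied encodings.</p>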
1732 value is -1, then that byte is invalid as the initial byte in a sequence.
1733 If the value is -n, where n is an integer &gt; 1, then n is the number of
1735 call to the function pointed at by convert. This function may return -1
1739 string s is <em>NOT</em> null-terminated and points at the sequence of
1748 <pre class="setter">
1752 </pre>
1753 <pre class="signature">
1758 </pre>
1767 <pre class="setter">
1771 </pre>
1772 <pre class="signature">
1776 </pre>
1785 <pre class="setter">
1790 </pre>
1796 <pre class="setter">
1800 </pre>
1801 <pre class="signature">
1807 </pre>
1813 contain -1, 0, or 1 indicating respectively that there was no
1820 <pre class="setter">
1824 </pre>
1825 <pre class="signature">
1832 </pre>
1836 will be non-zero if the DOCTYPE declaration has an internal subset.</p>
1841 <pre class="setter">
1845 </pre>
1846 <pre class="signature">
1849 </pre>
1856 <pre class="setter">
1861 </pre>
1867 <pre class="setter">
1871 </pre>
1872 <pre class="signature">
1877 </pre>
1878 <pre class="signature">
1904 </pre>
1941 <pre class="setter">
1945 </pre>
1946 <pre class="signature">
1954 </pre>
1969 <code>isrequired</code>, but they will have the non-<code>NULL</code> fixed value
1975 <pre class="setter">
1979 </pre>
1980 <pre class="signature">
1991 </pre>
1993 The <code>is_parameter_entity</code> argument will be non-zero in the
1997 <code>value</code> will be non-<code>NULL</code> and <code>systemId</code>,
1999 The value string is <em>not</em> null-terminated; the length is
2002 legal to have zero-length values. Instead check for whether or not
2004 argument will have a non-<code>NULL</code> value only for unparsed entity
2010 <pre class="setter">
2014 </pre>
2015 <pre class="signature">
2023 </pre>
2027 <div id="eg"><pre>
2029 </pre></div>
2037 <pre class="setter">
2041 </pre>
2042 <pre class="signature">
2049 </pre>
2055 <pre class="setter">
2059 </pre>
2060 <pre class="signature">
2063 </pre>
2091 <pre class="fcndec">
2094 </pre>
2100 <pre class="fcndec">
2103 </pre>
2111 <pre class="fcndec">
2114 </pre>
2123 <pre class="fcndec">
2126 </pre>
2133 <pre class="fcndec">
2136 </pre>
2143 <pre class="fcndec">
2146 </pre>
2150 entity and for the end-tag event for empty element tags (the latter can
2151 be used to distinguish empty-element tags from empty elements using
2156 <pre class="fcndec">
2161 </pre>
2182 <h3><a name="attack-protection">Attack Protection</a><a name="billion-laughs"></a></h3>
2185 <pre class="fcndec">
2190 </pre>
2202 <pre>
2204 </pre>
2211 …<li>parser <code>p</code> must be a non-<code>NULL</code> root parser (without any parent parsers)…
2212 …<li><code>maximumAmplificationFactor</code> must be non-<code>NaN</code> and greater than or equal…
2217 If you ever need to increase this value for non-attack payload,
2235 <pre class="fcndec">
2240 </pre>
2253 …<li>parser <code>p</code> must be a non-<code>NULL</code> root parser (without any parent parsers)…
2258 If you ever need to increase this value for non-attack payload,
2271 <pre class="fcndec">
2275 </pre>
2297 <pre class="fcndec">
2301 </pre>
2313 <pre class="fcndec">
2316 </pre>
2323 <pre class="fcndec">
2326 </pre>
2335 <pre class="fcndec">
2339 </pre>
2348 <pre class="fcndec">
2351 </pre>
2357 <pre class="fcndec">
2360 </pre>
2374 <pre class="fcndec">
2377 </pre>
2381 >XML_StartElementHandler</a></code>, or -1 if there is no ID
2387 <pre class="fcndec">
2390 </pre>
2391 <pre class="signature">
2398 </pre>
2403 in the start-tag rather than defaulted. Each attribute/value pair counts
2409 <pre class="fcndec">
2413 </pre>
2416 passing a non-<code>NULL</code> encoding argument to the parser creation functions.
2425 <pre class="fcndec">
2429 </pre>
2446 <pre class="fcndec">
2450 </pre>
2457 <p><b>Note:</b> This call is optional, as the parser will auto-generate
2466 <pre class="fcndec">
2469 </pre>
2474 external subset in their DOCTYPE declaration, the application-provided
2477 application-provided subset will be parsed, but the
2484 <p>The application-provided external subset is read by calling the
2504 <pre class="fcndec">
2508 </pre>
2513 i.e. when namespace processing is in effect. The <code>do_nst</code>
2516 non-zero, then afterwards namespace qualified names (that is qualified
2527 <pre class="fcndec">
2530 </pre>
2533 processing instruction or character data. It causes the corresponding
2542 <pre class="fcndec">
2545 </pre>
2551 <pre class="fcndec">
2554 </pre>
2555 <pre class="signature">
2561 </pre>
2564 Some macros are also defined that support compile-time tests of the
2576 <pre class="fcndec">
2579 </pre>
2580 <pre class="signature">
2599 </pre>
2610 identifying the feature-test macros Expat was compiled with. Since an
2638 <pre class="fcndec">
2641 </pre>
2651 is especially useful for third-party libraries that interact with a
2658 <pre class="fcndec">
2661 </pre>
2671 <pre class="fcndec">
2674 </pre>
2691 <pre class="fcndec">
2694 </pre>