| 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606 |
- <html>
- <body>
- <h1 align='right'><a name='BASICS'><img src="2.gif" align="right"
- hspace="10" width="100" height="100" alt="2"></a>Getting Started
- with Mini-XML</h1>
- <p>This chapter describes how to write programs that use Mini-XML to
- access data in an XML file. Mini-XML provides the following
- functionality:</p>
- <ul>
- <li>Functions for creating and managing XML documents
- in memory.</li>
- <li>Reading of UTF-8 and UTF-16 encoded XML files and
- strings.</li>
- <li>Writing of UTF-8 encoded XML files and strings.</li>
- <li>Support for arbitrary element names, attributes, and
- attribute values with no preset limits, just available
- memory.</li>
- <li>Support for integer, real, opaque ("CDATA"), and text
- data types in "leaf" nodes.</li>
- <li>"Find", "index", and "walk" functions for easily
- accessing data in an XML document.</li>
- </ul>
- <p>Mini-XML doesn't do validation or other types of processing
- on the data based upon schema files or other sources of
- definition information, nor does it support character entities
- other than those required by the XML specification.</p>
- <h2>The Basics</h2>
- <p>Mini-XML provides a single header file which you include:</p>
- <pre>
- #include <mxml.h>
- </pre>
- <p>The Mini-XML library is included with your program using the
- <kbd>-lmxml</kbd> option:</p>
- <pre>
- <kbd>gcc -o myprogram myprogram.c -lmxml ENTER</kbd>
- </pre>
- <p>If you have the <tt>pkg-config(1)</tt> software installed,
- you can use it to determine the proper compiler and linker options
- for your installation:</p>
- <pre>
- <kbd>pkg-config --cflags mxml ENTER</kbd>
- <kbd>pkg-config --libs mxml ENTER</kbd>
- </pre>
- <h2>Nodes</h2>
- <p>Every piece of information in an XML file is stored in memory in "nodes".
- Nodes are defined by the <a href='#mxml_node_t'><tt>mxml_node_t</tt></a>
- structure. Each node has a typed value, optional user data, a parent node,
- sibling nodes (previous and next), and potentially child nodes.</p>
- <p>For example, if you have an XML file like the following:</p>
- <pre>
- <?xml version="1.0" encoding="utf-8"?>
- <data>
- <node>val1</node>
- <node>val2</node>
- <node>val3</node>
- <group>
- <node>val4</node>
- <node>val5</node>
- <node>val6</node>
- </group>
- <node>val7</node>
- <node>val8</node>
- </data>
- </pre>
- <p>the node tree for the file would look like the following in memory:</p>
- <pre>
- ?xml version="1.0" encoding="utf-8"?
- |
- data
- |
- node - node - node - group - node - node
- | | | | | |
- val1 val2 val3 | val7 val8
- |
- node - node - node
- | | |
- val4 val5 val6
- </pre>
- <p>where "-" is a pointer to the sibling node and "|" is a pointer
- to the first child or parent node.</p>
- <p>The <a href="#mxmlGetType"><tt>mxmlGetType</tt></a> function gets the type of
- a node, one of <tt>MXML_CUSTOM</tt>, <tt>MXML_ELEMENT</tt>,
- <tt>MXML_INTEGER</tt>, <tt>MXML_OPAQUE</tt>, <tt>MXML_REAL</tt>, or
- <tt>MXML_TEXT</tt>. The parent and sibling nodes are accessed using the
- <a href="#mxmlGetParent"><tt>mxmlGetParent</tt></a>,
- <a href="#mxmlGetNext"><tt>mxmlGetNext</tt></a>, and
- <a href="#mxmlGetPrevious"><tt>mxmlGetPrevious</tt></a> functions. The
- <a href="#mxmlGetUserData"><tt>mxmlGetUserData</tt></a> function gets any user
- data associated with the node.</p>
- <h3>CDATA Nodes</h3>
- <p>CDATA (<tt>MXML_ELEMENT</tt>) nodes are created using the
- <a href="#mxmlNewCDATA"><tt>mxmlNewCDATA</tt></a> function. The
- <a href="#mxmlGetCDATA"><tt>mxmlGetCDATA</tt></a> function retrieves the
- CDATA string pointer for a node.</p>
- <blockquote><b>Note:</b>
- <p>CDATA nodes are currently stored in memory as special elements. This will
- be changed in a future major release of Mini-XML.</p>
- </blockquote>
- <h3>Custom Nodes</h3>
- <p>Custom (<tt>MXML_CUSTOM</tt>) nodes are created using the
- <a href="#mxmlNewCustom"><tt>mxmlNewCustom</tt></a> function or using a custom
- load callback specified using the
- <a href="#mxmlSetCustomHandlers"><tt>mxmlSetCustomHandlers</tt></a> function.
- The <a href="#mxmlGetCustom"><tt>mxmlGetCustom</tt></a> function retrieves the
- custom value pointer for a node.</p>
- <h3>Comment Nodes</h3>
- <p>Comment (<tt>MXML_ELEMENT</tt>) nodes are created using the
- <a href="#mxmlNewElement"><tt>mxmlNewElement</tt></a> function. The
- <a href="#mxmlGetElement"><tt>mxmlGetElement</tt></a> function retrieves the
- comment string pointer for a node, including the surrounding "!--" and "--"
- characters.</p>
- <blockquote><b>Note:</b>
- <p>Comment nodes are currently stored in memory as special elements. This will
- be changed in a future major release of Mini-XML.</p>
- </blockquote>
- <h3>Element Nodes</h3>
- <p>Element (<tt>MXML_ELEMENT</tt>) nodes are created using the
- <a href="#mxmlNewElement"><tt>mxmlNewElement</tt></a> function. The
- <a href="#mxmlGetElement"><tt>mxmlGetElement</tt></a> function retrieves the
- element name, the
- <a href="#mxmlElementGetAttr"><tt>mxmlElementGetAttr</tt></a> function retrieves
- the value string for a named attribute associated with the element, and the
- <a href="#mxmlGetFirstChild"><tt>mxmlGetFirstChild</tt></a> and
- <a href="#mxmlGetLastChild"><tt>mxmlGetLastChild</tt></a> functions retrieve the
- first and last child nodes for the element, respectively.</p>
- <h3>Integer Nodes</h3>
- <p>Integer (<tt>MXML_INTEGER</tt>) nodes are created using the
- <a href="#mxmlNewInteger"><tt>mxmlNewInteger</tt></a> function. The
- <a href="#mxmlGetInteger"><tt>mxmlGetInteger</tt></a> function retrieves the
- integer value for a node.</p>
- <h3>Opaque Nodes</h3>
- <p>Opaque (<tt>MXML_OPAQUE</tt>) nodes are created using the
- <a href="#mxmlNewOpaque"><tt>mxmlNewOpaque</tt></a> function. The
- <a href="#mxmlGetOpaque"><tt>mxmlGetOpaque</tt></a> function retrieves the
- opaque string pointer for a node. Opaque nodes are like string nodes but
- preserve all whitespace between nodes.</p>
- <h3>Text Nodes</h3>
- <p>Text (<tt>MXML_TEXT</tt>) nodes are created using the
- <a href="#mxmlNewText"><tt>mxmlNewText</tt></a> and
- <a href="#mxmlNewTextf"><tt>mxmlNewTextf</tt></a> functions. Each text node
- consists of a text string and (leading) whitespace value - the
- <a href="#mxmlGetText"><tt>mxmlGetText</tt></a> function retrieves the
- text string pointer and whitespace value for a node.</p>
- <!-- NEED 12 -->
- <h3>Processing Instruction Nodes</h3>
- <p>Processing instruction (<tt>MXML_ELEMENT</tt>) nodes are created using the
- <a href="#mxmlNewElement"><tt>mxmlNewElement</tt></a> function. The
- <a href="#mxmlGetElement"><tt>mxmlGetElement</tt></a> function retrieves the
- processing instruction string for a node, including the surrounding "?"
- characters.</p>
- <blockquote><b>Note:</b>
- <p>Processing instruction nodes are currently stored in memory as special
- elements. This will be changed in a future major release of Mini-XML.</p>
- </blockquote>
- <h3>Real Number Nodes</h3>
- <p>Real number (<tt>MXML_REAL</tt>) nodes are created using the
- <a href="#mxmlNewReal"><tt>mxmlNewReal</tt></a> function. The
- <a href="#mxmlGetReal"><tt>mxmlGetReal</tt></a> function retrieves the
- CDATA string pointer for a node.</p>
- <!-- NEED 15 -->
- <h3>XML Declaration Nodes</h3>
- <p>XML declaration (<tt>MXML_ELEMENT</tt>) nodes are created using the
- <a href="#mxmlNewXML"><tt>mxmlNewXML</tt></a> function. The
- <a href="#mxmlGetElement"><tt>mxmlGetElement</tt></a> function retrieves the
- XML declaration string for a node, including the surrounding "?" characters.</p>
- <blockquote><b>Note:</b>
- <p>XML declaration nodes are currently stored in memory as special elements.
- This will be changed in a future major release of Mini-XML.</p>
- </blockquote>
- <!-- NEW PAGE -->
- <h2>Creating XML Documents</h2>
- <p>You can create and update XML documents in memory using the
- various <tt>mxmlNew</tt> functions. The following code will
- create the XML document described in the previous section:</p>
- <pre>
- mxml_node_t *xml; /* <?xml ... ?> */
- mxml_node_t *data; /* <data> */
- mxml_node_t *node; /* <node> */
- mxml_node_t *group; /* <group> */
- xml = mxmlNewXML("1.0");
- data = mxmlNewElement(xml, "data");
- node = mxmlNewElement(data, "node");
- mxmlNewText(node, 0, "val1");
- node = mxmlNewElement(data, "node");
- mxmlNewText(node, 0, "val2");
- node = mxmlNewElement(data, "node");
- mxmlNewText(node, 0, "val3");
- group = mxmlNewElement(data, "group");
- node = mxmlNewElement(group, "node");
- mxmlNewText(node, 0, "val4");
- node = mxmlNewElement(group, "node");
- mxmlNewText(node, 0, "val5");
- node = mxmlNewElement(group, "node");
- mxmlNewText(node, 0, "val6");
- node = mxmlNewElement(data, "node");
- mxmlNewText(node, 0, "val7");
- node = mxmlNewElement(data, "node");
- mxmlNewText(node, 0, "val8");
- </pre>
- <!-- NEED 6 -->
- <p>We start by creating the declaration node common to all XML files using the
- <a href="#mxmlNewXML"><tt>mxmlNewXML</tt></a> function:</p>
- <pre>
- xml = mxmlNewXML("1.0");
- </pre>
- <p>We then create the <tt><data></tt> node used for this document using
- the <a href="#mxmlNewElement"><tt>mxmlNewElement</tt></a> function. The first
- argument specifies the parent node (<tt>xml</tt>) while the second specifies the
- element name (<tt>data</tt>):</p>
- <pre>
- data = mxmlNewElement(xml, "data");
- </pre>
- <p>Each <tt><node>...</node></tt> in the file is created using the
- <tt>mxmlNewElement</tt> and <a href="#mxmlNewText"><tt>mxmlNewText</tt></a>
- functions. The first argument of <tt>mxmlNewText</tt> specifies the parent node
- (<tt>node</tt>). The second argument specifies whether whitespace appears before
- the text - 0 or false in this case. The last argument specifies the actual text
- to add:</p>
- <pre>
- node = mxmlNewElement(data, "node");
- mxmlNewText(node, 0, "val1");
- </pre>
- <p>The resulting in-memory XML document can then be saved or processed just like
- one loaded from disk or a string.</p>
- <!-- NEED 15 -->
- <h2>Loading XML</h2>
- <p>You load an XML file using the <a
- href='#mxmlLoadFile'><tt>mxmlLoadFile</tt></a>
- function:</p>
- <pre>
- FILE *fp;
- mxml_node_t *tree;
- fp = fopen("filename.xml", "r");
- tree = mxmlLoadFile(NULL, fp,
- MXML_TEXT_CALLBACK);
- fclose(fp);
- </pre>
- <p>The first argument specifies an existing XML parent node, if
- any. Normally you will pass <tt>NULL</tt> for this argument
- unless you are combining multiple XML sources. The XML file must
- contain a complete XML document including the <tt>?xml</tt>
- element if the parent node is <tt>NULL</tt>.</p>
- <p>The second argument specifies the stdio file to read from, as
- opened by <tt>fopen()</tt> or <tt>popen()</tt>. You can also use
- <tt>stdin</tt> if you are implementing an XML filter
- program.</p>
- <p>The third argument specifies a callback function which returns
- the value type of the immediate children for a new element node:
- <tt>MXML_CUSTOM</tt>, <tt>MXML_IGNORE</tt>,
- <tt>MXML_INTEGER</tt>, <tt>MXML_OPAQUE</tt>, <tt>MXML_REAL</tt>,
- or <tt>MXML_TEXT</tt>. Load callbacks are described in detail in
- <a href='#LOAD_CALLBACKS'>Chapter 3</a>. The example code uses
- the <tt>MXML_TEXT_CALLBACK</tt> constant which specifies that all
- data nodes in the document contain whitespace-separated text
- values. Other standard callbacks include
- <tt>MXML_IGNORE_CALLBACK</tt>, <tt>MXML_INTEGER_CALLBACK</tt>,
- <tt>MXML_OPAQUE_CALLBACK</tt>, and
- <tt>MXML_REAL_CALLBACK</tt>.</p>
- <p>The <a href='#mxmlLoadString'><tt>mxmlLoadString</tt></a>
- function loads XML node trees from a string:</p>
- <!-- NEED 10 -->
- <pre>
- char buffer[8192];
- mxml_node_t *tree;
- ...
- tree = mxmlLoadString(NULL, buffer,
- MXML_TEXT_CALLBACK);
- </pre>
- <p>The first and third arguments are the same as used for
- <tt>mxmlLoadFile()</tt>. The second argument specifies the
- string or character buffer to load and must be a complete XML
- document including the <tt>?xml</tt> element if the parent node
- is <tt>NULL</tt>.</p>
- <!-- NEED 15 -->
- <h2>Saving XML</h2>
- <p>You save an XML file using the <a
- href='#mxmlSaveFile'><tt>mxmlSaveFile</tt></a> function:</p>
- <pre>
- FILE *fp;
- mxml_node_t *tree;
- fp = fopen("filename.xml", "w");
- mxmlSaveFile(tree, fp, MXML_NO_CALLBACK);
- fclose(fp);
- </pre>
- <p>The first argument is the XML node tree to save. It should
- normally be a pointer to the top-level <tt>?xml</tt> node in
- your XML document.</p>
- <p>The second argument is the stdio file to write to, as opened
- by <tt>fopen()</tt> or <tt>popen()</tt>. You can also use
- <tt>stdout</tt> if you are implementing an XML filter
- program.</p>
- <p>The third argument is the whitespace callback to use when
- saving the file. Whitespace callbacks are covered in detail in <a
- href='SAVE_CALLBACKS'>Chapter 3</a>. The previous example code
- uses the <tt>MXML_NO_CALLBACK</tt> constant to specify that no
- special whitespace handling is required.</p>
- <p>The <a
- href='#mxmlSaveAllocString'><tt>mxmlSaveAllocString</tt></a>,
- and <a href='#mxmlSaveString'><tt>mxmlSaveString</tt></a>
- functions save XML node trees to strings:</p>
- <pre>
- char buffer[8192];
- char *ptr;
- mxml_node_t *tree;
- ...
- mxmlSaveString(tree, buffer, sizeof(buffer),
- MXML_NO_CALLBACK);
- ...
- ptr = mxmlSaveAllocString(tree, MXML_NO_CALLBACK);
- </pre>
- <p>The first and last arguments are the same as used for
- <tt>mxmlSaveFile()</tt>. The <tt>mxmlSaveString</tt> function
- takes pointer and size arguments for saving the XML document to
- a fixed-size buffer, while <tt>mxmlSaveAllocString()</tt>
- returns a string buffer that was allocated using
- <tt>malloc()</tt>.</p>
- <!-- NEED 15 -->
- <h3>Controlling Line Wrapping</h3>
- <p>When saving XML documents, Mini-XML normally wraps output
- lines at column 75 so that the text is readable in terminal
- windows. The <a
- href='#mxmlSetWrapMargin'><tt>mxmlSetWrapMargin</tt></a> function
- overrides the default wrap margin:</p>
- <pre>
- /* Set the margin to 132 columns */
- mxmlSetWrapMargin(132);
- /* Disable wrapping */
- mxmlSetWrapMargin(0);
- </pre>
- <h2>Memory Management</h2>
- <p>Once you are done with the XML data, use the <a
- href="#mxmlDelete"><tt>mxmlDelete</tt></a> function to recursively
- free the memory that is used for a particular node or the entire
- tree:</p>
- <pre>
- mxmlDelete(tree);
- </pre>
- <p>You can also use reference counting to manage memory usage. The
- <a href="#mxmlRetain"><tt>mxmlRetain</tt></a> and
- <a href="#mxmlRelease"><tt>mxmlRelease</tt></a> functions increment and
- decrement a node's use count, respectively. When the use count goes to 0,
- <tt>mxmlRelease</tt> will automatically call <tt>mxmlDelete</tt> to actually
- free the memory used by the node tree. New nodes automatically start with a
- use count of 1.</p>
- <!-- NEW PAGE-->
- <h2>Finding and Iterating Nodes</h2>
- <p>The <a
- href='#mxmlWalkPrev'><tt>mxmlWalkPrev</tt></a>
- and <a
- href='#mxmlWalkNext'><tt>mxmlWalkNext</tt></a>functions
- can be used to iterate through the XML node tree:</p>
- <pre>
- mxml_node_t *node;
-
- node = mxmlWalkPrev(current, tree,
- MXML_DESCEND);
- node = mxmlWalkNext(current, tree,
- MXML_DESCEND);
- </pre>
- <p>In addition, you can find a named element/node using the <a
- href='#mxmlFindElement'><tt>mxmlFindElement</tt></a>
- function:</p>
- <pre>
- mxml_node_t *node;
-
- node = mxmlFindElement(tree, tree, "name",
- "attr", "value",
- MXML_DESCEND);
- </pre>
- <p>The <tt>name</tt>, <tt>attr</tt>, and <tt>value</tt>
- arguments can be passed as <tt>NULL</tt> to act as wildcards,
- e.g.:</p>
- <!-- NEED 4 -->
- <pre>
- /* Find the first "a" element */
- node = mxmlFindElement(tree, tree, "a",
- NULL, NULL,
- MXML_DESCEND);
- </pre>
- <!-- NEED 5 -->
- <pre>
- /* Find the first "a" element with "href"
- attribute */
- node = mxmlFindElement(tree, tree, "a",
- "href", NULL,
- MXML_DESCEND);
- </pre>
- <!-- NEED 6 -->
- <pre>
- /* Find the first "a" element with "href"
- to a URL */
- node = mxmlFindElement(tree, tree, "a",
- "href",
- "http://www.easysw.com/",
- MXML_DESCEND);
- </pre>
- <!-- NEED 5 -->
- <pre>
- /* Find the first element with a "src"
- attribute */
- node = mxmlFindElement(tree, tree, NULL,
- "src", NULL,
- MXML_DESCEND);
- </pre>
- <!-- NEED 5 -->
- <pre>
- /* Find the first element with a "src"
- = "foo.jpg" */
- node = mxmlFindElement(tree, tree, NULL,
- "src", "foo.jpg",
- MXML_DESCEND);
- </pre>
- <p>You can also iterate with the same function:</p>
- <pre>
- mxml_node_t *node;
- for (node = mxmlFindElement(tree, tree,
- "name",
- NULL, NULL,
- MXML_DESCEND);
- node != NULL;
- node = mxmlFindElement(node, tree,
- "name",
- NULL, NULL,
- MXML_DESCEND))
- {
- ... do something ...
- }
- </pre>
- <!-- NEED 10 -->
- <p>The <tt>MXML_DESCEND</tt> argument can actually be one of
- three constants:</p>
- <ul>
- <li><tt>MXML_NO_DESCEND</tt> means to not to look at any
- child nodes in the element hierarchy, just look at
- siblings at the same level or parent nodes until the top
- node or top-of-tree is reached.
- <p>The previous node from "group" would be the "node"
- element to the left, while the next node from "group" would
- be the "node" element to the right.<br><br></p></li>
- <li><tt>MXML_DESCEND_FIRST</tt> means that it is OK to
- descend to the first child of a node, but not to descend
- further when searching. You'll normally use this when
- iterating through direct children of a parent node, e.g. all
- of the "node" and "group" elements under the "?xml" parent
- node in the example above.
- <p>This mode is only applicable to the search function; the
- walk functions treat this as <tt>MXML_DESCEND</tt> since
- every call is a first time.<br><br></p></li>
- <li><tt>MXML_DESCEND</tt> means to keep descending until
- you hit the bottom of the tree. The previous node from
- "group" would be the "val3" node and the next node would
- be the first node element under "group".
- <p>If you were to walk from the root node "?xml" to the end
- of the tree with <tt>mxmlWalkNext()</tt>, the order would
- be:</p>
- <p><tt>?xml data node val1 node val2 node val3 group node
- val4 node val5 node val6 node val7 node val8</tt></p>
- <p>If you started at "val8" and walked using
- <tt>mxmlWalkPrev()</tt>, the order would be reversed,
- ending at "?xml".</p></li>
- </ul>
- <h2>Finding Specific Nodes</h2>
- <p>You can find specific nodes in the tree using the <a
- href='#mxmlFindValue'><tt>mxmlFindPath</tt></a>, for example:
- <pre>
- mxml_node_t *value;
- value = mxmlFindPath(tree, "path/to/*/foo/bar");
- </pre>
- <p>The second argument is a "path" to the parent node. Each component of the
- path is separated by a slash (/) and represents a named element in the document
- tree or a wildcard (*) path representing 0 or more intervening nodes.</p>
- </body>
- </html>
|