The XML C parser and toolkit of Gnome
The parser interfaces
This section is intended to help programmers get started with the XML
toolkit from the C language. It is not intended to be exhaustive. I hope the
automatically generated documents will provide the completeness required, but
as a separate set of documents. The interfaces of the XML parser are, by
design, low level. Those interested in a higher-level API should look at DOM
(#DOM).
The parser interfaces for XML (html/libxml-parser.html) are separated from
the HTML parser interfaces (html/libxml-htmlparser.html). Let's have a look
at how the XML parser can be called:
Invoking the parser: the pull method
Usually, the first thing to do is to read an XML input. The parser accepts
documents either from in-memory buffers or from files. The functions are
declared in "parser.h":
xmlDocPtr xmlParseMemory(const char *buffer, int size);
    Parse an in-memory buffer of size bytes containing the document.
xmlDocPtr xmlParseFile(const char *filename);
    Parse an XML document contained in a (possibly compressed) file.
The parser returns a pointer to the document structure (or NULL in case of
failure).
Invoking the parser: the push method
So that the application can keep control while the document is being
fetched (which is common for GUI-based programs), libxml2 has also provided
a push interface since version 1.8.3. Here are the interface functions:
xmlParserCtxtPtr xmlCreatePushParserCtxt(xmlSAXHandlerPtr sax,
                                         void *user_data,
                                         const char *chunk,
                                         int size,
                                         const char *filename);
int              xmlParseChunk          (xmlParserCtxtPtr ctxt,
                                         const char *chunk,
                                         int size,
                                         int terminate);
and here is a simple example showing how to use the interface:
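    #include <stdio.h>
    #include <libxml/parser.h>

    xmlDocPtr doc = NULL;
    FILE *f;

    f = fopen(filename, "rb");
    if (f != NULL) {
        int res, size = 1024;
        char chars[1024];
        xmlParserCtxtPtr ctxt;

        /* read the first 4 bytes so the parser can detect the encoding */
        res = fread(chars, 1, 4, f);
        if (res > 0) {
            ctxt = xmlCreatePushParserCtxt(NULL, NULL,
                                           chars, res, filename);
            if (ctxt != NULL) {
                /* feed the rest of the file chunk by chunk */
                while ((res = fread(chars, 1, size, f)) > 0) {
                    xmlParseChunk(ctxt, chars, res, 0);
                }
                /* terminate = 1 signals the end of the document */
                xmlParseChunk(ctxt, chars, 0, 1);
                doc = ctxt->myDoc;
                xmlFreeParserCtxt(ctxt);
            }
        }
        fclose(f);
    }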
The HTML parser embedded into libxml2 also has a push interface; the
functions are just prefixed by "html" rather than "xml".
Invoking the parser: the SAX interface
The tree-building interface makes the parser memory-hungry, first loading
the document in memory and then building the tree itself. Reading a document
without building the tree is possible using the SAX interfaces (see SAX.h and
James Henstridge's documentation at
http://www.daa.com.au/~james/gnome/xml-sax/xml-sax.html). Note also that the
push interface can be limited to SAX: just use the first two arguments of
xmlCreatePushParserCtxt().
Building a tree from scratch
The other way to get an XML tree in memory is by building it. Basically
there is a set of functions dedicated to building new elements. (These are
also described in <libxml/tree.h>.) For example, here is a piece of
code that produces the XML document used in the previous examples:
    #include <libxml/tree.h>

    xmlDocPtr doc;
    xmlNodePtr root, tree, subtree;

    /* BAD_CAST converts string literals to the xmlChar * the API expects */
    doc = xmlNewDoc(BAD_CAST "1.0");
    root = xmlNewDocNode(doc, NULL, BAD_CAST "EXAMPLE", NULL);
    xmlDocSetRootElement(doc, root);    /* root is now doc->children */
    xmlSetProp(root, BAD_CAST "prop1", BAD_CAST "gnome is great");
    xmlSetProp(root, BAD_CAST "prop2", BAD_CAST "& linux too");

    tree = xmlNewChild(root, NULL, BAD_CAST "head", NULL);
    subtree = xmlNewChild(tree, NULL, BAD_CAST "title",
                          BAD_CAST "Welcome to Gnome");

    tree = xmlNewChild(root, NULL, BAD_CAST "chapter", NULL);
    subtree = xmlNewChild(tree, NULL, BAD_CAST "title",
                          BAD_CAST "The Linux adventure");
    subtree = xmlNewChild(tree, NULL, BAD_CAST "p", BAD_CAST "bla bla bla ...");
    subtree = xmlNewChild(tree, NULL, BAD_CAST "image", NULL);
    xmlSetProp(subtree, BAD_CAST "href", BAD_CAST "linus.gif");
Not really rocket science ...
Traversing the tree
By including "tree.h" (html/libxml-tree.html), your code has access to the
internal structure of all the elements of the tree. The field names are meant
to be simple: parent, children, next, prev, properties, etc. For example,
still with the previous example:
    doc->children->children->children
points to the title element, and
    doc->children->children->next->children->children
points to the text node containing the chapter title "The Linux adventure".
NOTE: XML allows PIs and comments to be present before the document root, so
doc->children may point to a node which is not the document root element. The
function xmlDocGetRootElement() was added for this purpose.
Modifying the tree
Functions are provided for reading and writing the document content. Here
is an excerpt from the tree API (html/libxml-tree.html):
xmlAttrPtr xmlSetProp(xmlNodePtr node, const xmlChar *name, const xmlChar *value);
    This sets (or changes) an attribute carried by an ELEMENT node.
    The value can be NULL.
xmlChar *xmlGetProp(xmlNodePtr node, const xmlChar *name);
    This function returns a pointer to a new copy of the property
    content. Note that the user must deallocate the result with xmlFree().
Two functions are provided for reading and writing the text associated
with elements:
xmlNodePtr xmlStringGetNodeList(xmlDocPtr doc, const xmlChar *value);
    This function takes an "external" string and converts it to one
    text node or possibly to a list of entity and text nodes. All
    non-predefined entity references like &Gnome; will be stored
    internally as entity nodes, hence the result of the function may not be
    a single node.
xmlChar *xmlNodeListGetString(xmlDocPtr doc, xmlNodePtr list, int inLine);
    This function is the inverse of xmlStringGetNodeList(). It generates a
    new string containing the content of the text and entity nodes. Note
    the extra argument inLine. If this argument is set to 1, the function
    will expand entity references. For example, instead of returning the
    &Gnome; XML encoding in the string, it will substitute it with its
    value (say, "GNU Network Object Model Environment").
Saving a tree
Three options are available:
void xmlDocDumpMemory(xmlDocPtr cur, xmlChar **mem, int *size);
    Stores in *mem a newly allocated buffer into which the document has
    been saved, and its length in *size.
int xmlDocDump(FILE *f, xmlDocPtr doc);
    Dumps a document to an open FILE stream.
int xmlSaveFile(const char *filename, xmlDocPtr cur);
    Saves the document to a file. In this case, the compression
    interface is triggered if it has been turned on.
Compression
The library transparently handles compression when doing file-based
accesses. The level of compression on saves can be set either globally
or individually for one document:
int  xmlGetDocCompressMode (xmlDocPtr doc);
    Gets the document compression level (0-9).
void xmlSetDocCompressMode (xmlDocPtr doc, int mode);
    Sets the document compression level.
int  xmlGetCompressMode(void);
    Gets the default compression level.
void xmlSetCompressMode(int mode);
    Sets the default compression level.
Daniel Veillard
