http://swpat.ffii.org/Action against software patents http://www.gnome.org/Gnome2 Logo http://www.w3.org/StatusW3C Logo http://www.redhat.com/Red Hat Logo http://xmlsoft.org/Made with Libxml2 Logo The XML C parser and toolkit of Gnome
A real example
Developer Menu index.htmlMain Menu html/index.htmlReference Manual examples/index.htmlCode Examples guidelines.htmlXML Guidelines tutorial/index.htmlTutorial xmlreader.htmlThe Reader Interface ChangeLog.htmlChangeLog XSLT.htmlXSLT python.htmlPython and bindings architecture.htmllibxml2 architecture tree.htmlThe tree output interface.htmlThe SAX interface xmlmem.htmlMemory Management xmlio.htmlI/O Interfaces library.htmlThe parser interfaces entities.htmlEntities or no entities namespaces.htmlNamespaces upgrade.htmlUpgrading 1.x code threads.htmlThread safety DOM.htmlDOM Principles example.htmlA real example xml.htmlflat page , site.xslstylesheet API Indexes APIchunk0.htmlAlphabetic APIconstructors.htmlConstructors APIfunctions.htmlFunctions/Types APIfiles.htmlModules APIsymbols.htmlSymbols Related links http://mail.gnome.org/archives/xml/Mail archive http://xmlsoft.org/XSLT/XSLT libxslt http://phd.cs.unibo.it/gdome2/DOM gdome2 http://www.aleksey.com/xmlsec/XML-DSig xmlsec ftp://xmlsoft.org/FTP http://www.zlatkovic.com/projects/libxml/Windows binaries http://www.blastwave.org/packages.php/libxml2Solaris binaries http://www.explain.com.au/oss/libxml2xslt.htmlMacOsX binaries http://libxmlplusplus.sourceforge.net/C++ bindings http://www.zend.com/php5/articles/php5-xmlphp.php#Heading4PHP bindings http://sourceforge.net/projects/libxml2-pas/Pascal bindings http://rubyforge.org/projects/xml-tools/Ruby bindings http://tclxml.sourceforge.net/Tcl bindings http://bugzilla.gnome.org/buglist.cgi?product=libxml2Bug Tracker Here is a real size example, where the actual content of the application
data is not kept in the DOM tree but uses internal structures. It is based on
a proposal to keep a database of jobs related to Gnome, with an XML based
storage structure. Here is an 
gjobs.xmlXML encoded jobs
base
:
<?xml version="1.0"?>
<gjob:Helping xmlns:gjob="http://www.gnome.org/some-location">
  <gjob:Jobs>
    <gjob:Job>
      <gjob:Project ID="3"/>
      <gjob:Application>GBackup</gjob:Application>
      <gjob:Category>Development</gjob:Category>
      <gjob:Update>
        <gjob:Status>Open</gjob:Status>
        <gjob:Modified>Mon, 07 Jun 1999 20:27:45 -0400 MET DST</gjob:Modified>
        <gjob:Salary>USD 0.00</gjob:Salary>
      </gjob:Update>
      <gjob:Developers>
        <gjob:Developer>
        </gjob:Developer>
      </gjob:Developers>
      <gjob:Contact>
        <gjob:Person>Nathan Clemons</gjob:Person>
        <gjob:Email>nathan@windsofstorm.net</gjob:Email>
        <gjob:Company>
        </gjob:Company>
        <gjob:Organisation>
        </gjob:Organisation>
        <gjob:Webpage>
        </gjob:Webpage>
        <gjob:Snailmail>
        </gjob:Snailmail>
        <gjob:Phone>
        </gjob:Phone>
      </gjob:Contact>
      <gjob:Requirements>
      The program should be released as free software, under the GPL.
      </gjob:Requirements>
      <gjob:Skills>
      </gjob:Skills>
      <gjob:Details>
      A GNOME based system that will allow a superuser to configure 
      compressed and uncompressed files and/or file systems to be backed 
      up with a supported media in the system.  This should be able to 
      perform via find commands generating a list of files that are passed 
      to tar, dd, cpio, cp, gzip, etc., to be directed to the tape machine 
      or via operations performed on the filesystem itself. Email 
      notification and GUI status display very important.
      </gjob:Details>
    </gjob:Job>
  </gjob:Jobs>
</gjob:Helping>
While loading the XML file into an internal DOM tree is a matter of
calling only a couple of functions, browsing the tree to gather the data and
generate the internal structures is harder, and more error prone.
The suggested principle is to be tolerant with respect to the input
structure. For example, the ordering of the attributes is not significant,
the XML specification is clear about it. It's also usually a good idea not to
depend on the order of the children of a given node, unless it really makes
things harder. Here is some code to parse the information for a person:
/*
 * A person record
 */
typedef struct person {
    char *name;
    char *email;
    char *company;
    char *organisation;
    char *smail;
    char *webPage;
    char *phone;
} person, *personPtr;
/*
 * And the code needed to parse it
 */
personPtr parsePerson(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    personPtr ret = NULL;
DEBUG("parsePerson\n");
    /*
     * allocate the struct
     */
    ret = (personPtr) malloc(sizeof(person));
    if (ret == NULL) {
        fprintf(stderr,"out of memory\n");
        return(NULL);
    }
    memset(ret, 0, sizeof(person));
    /* We don't care what the top level element name is */
    cur = cur->xmlChildrenNode;
    while (cur != NULL) {
        if ((!strcmp(cur->name, "Person")) && (cur->ns == ns))
            ret->name = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1);
        if ((!strcmp(cur->name, "Email")) && (cur->ns == ns))
            ret->email = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1);
        cur = cur->next;
    }
    return(ret);
}
Here are a couple of things to notice:
Usually a recursive parsing style is the more convenient one: XML data
    is by nature subject to repetitive constructs and usually exhibits highly
    structured patterns.
  
The two arguments of type xmlDocPtr and xmlNsPtr,
    i.e. the pointer to the global XML document and the namespace reserved to
    the application. Document wide information are needed for example to
    decode entities and it's a good coding practice to define a namespace for
    your application set of data and test that the element and attributes
    you're analyzing actually pertains to your application space. This is
    done by a simple equality test (cur->ns == ns).
  
To retrieve text and attributes value, you can use the function
    
xmlNodeListGetString to gather all the text and entity reference
    nodes generated by the DOM output and produce an single text string.
Here is another piece of code used to parse another level of the
structure:
#include <libxml/tree.h>
/*
 * a Description for a Job
 */
typedef struct job {
    char *projectID;
    char *application;
    char *category;
    personPtr contact;
    int nbDevelopers;
    personPtr developers[100]; /* using dynamic alloc is left as an exercise */
} job, *jobPtr;
/*
 * And the code needed to parse it
 */
jobPtr parseJob(xmlDocPtr doc, xmlNsPtr ns, xmlNodePtr cur) {
    jobPtr ret = NULL;
DEBUG("parseJob\n");
    /*
     * allocate the struct
     */
    ret = (jobPtr) malloc(sizeof(job));
    if (ret == NULL) {
        fprintf(stderr,"out of memory\n");
        return(NULL);
    }
    memset(ret, 0, sizeof(job));
    /* We don't care what the top level element name is */
    cur = cur->xmlChildrenNode;
    while (cur != NULL) {
        
        if ((!strcmp(cur->name, "Project")) && (cur->ns == ns)) {
            ret->projectID = xmlGetProp(cur, "ID");
            if (ret->projectID == NULL) {
                fprintf(stderr, "Project has no ID\n");
            }
        }
        if ((!strcmp(cur->name, "Application")) && (cur->ns == ns))
            ret->application = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1);
        if ((!strcmp(cur->name, "Category")) && (cur->ns == ns))
            ret->category = xmlNodeListGetString(doc, cur->xmlChildrenNode, 1);
        if ((!strcmp(cur->name, "Contact")) && (cur->ns == ns))
            ret->contact = parsePerson(doc, ns, cur);
        cur = cur->next;
    }
    return(ret);
}
Once you are used to it, writing this kind of code is quite simple, but
boring. Ultimately, it could be possible to write stubbers taking either C
data structure definitions, a set of XML examples or an XML DTD and produce
the code needed to import and export the content between C data and XML
storage. This is left as an exercise to the reader :-)
Feel free to use example/gjobread.cthe code for the full C
parsing example
 as a template, it is also available with Makefile in the
Gnome CVS base under gnome-xml/example
bugs.htmlDaniel Veillard 
