IDMEF XML Library version 0.7.2

This library is released as Alpha code under the GNU General Public License
(COPYING), and under the BSD License (COPYING.BSD).  This library is to be
used under one license or the other, but not both.

Authors: Joe McAlerney, joey@silicondefense.com,
         Adam Migus, NAI Labs, amigus@NAI.com
       --==============================================--

Description
-----------

This IDMEF XML library, henceforth known as libidmef, is an implementation of
the IDMEF XML draft [1].  Libidmef provides the ability to create IDMEF XML
messages from raw data. The raw data would typically be collected with an IDS,
and molded together into IDMEF XML messages using the functions of libidmef.
Libidmef also provides functionality for parsing IDMEF XML messages into data
structures.

Structure
---------

Libidmef is built on Gnome's libxml [2].  Libxml is a well-supported and active
project that provides C structure implementations of XML documents and data.
Among many other attractive attributes, libxml allows for XML validation with
DTDs.

In building libidmef, we wanted to provide a simple front end to libxml.
Someone who does not necessarily understand the details of XML docs, nodes and
trees should be able to build XML messages, while those who wish to work at
lower levels keep that freedom.  Libxml is a large library, with an enormous
span of functionality.  It is constantly growing, and will continue to do so
to meet the demands of its users.

Requirements
------------

Libidmef requires libxml [2], available at http://www.xmlsoft.org/.

Installation
------------

Please read the directions in the INSTALL file.

---------------------------------------------------------
 Section 1: Building an IDMEF Message
---------------------------------------------------------

The Basics of Building
----------------------
At the core of the library resides the xmlDoc.  In short, it is a structure
that contains the tree in which IDMEF XML messages are structured, and the
DTD that is used to check message validity.  The IDMEF message functions
build on the current xmlDoc's root node.  Typically, one would construct an
IDMEF XML message, store it, print it, ship it, etc., then construct another
in the same fashion.

Libidmef also provides the ability to build multiple IDMEF messages
concurrently by setting the current, or active, document and storing documents
in progress.

Getting Started
---------------

This section is not intended to show the only way (or the best way for that
matter) to create IDMEF XML messages with libidmef.  It should, however, be
used to become familiar with the functions of libidmef, and perhaps some from
libxml.

1) Before you begin building IDMEF XML messages, you must call globalsInit() to
   (surprise!) initialize global variables. Pass globalsInit() the local path to
   the current IDMEF XML DTD.  The DTD is provided free of charge with
   this library, and is called idmef-message.dtd.

   globalsInit( DTD_PATH );

2) Next, call createCurrentDoc() to create and initialize a new xmlDoc, and
   set it as the current document.  createCurrentDoc() requires the current
   version of XML to be passed in.

   createCurrentDoc("1.0");

3) Now you want to start building your IDMEF XML message.  There are three
   approaches to this.  First, you can use a LISP-style way to build the
   entire message all at once.  Second, you can build the individual
   components, and put them all together at the end.  Third, you can create
   some unholy hybrid of the first two.  Actually, the third way is rather
   nice and fitting at times.

   Using the first approach, an IDMEF XML message can be built like so:

   ====== First recommended way to build an IDMEF XML message ======

   /**
    * This sample code is based partly on IDMEF Message generating
    * routines taken from the IDMEF XML Snort[3] plugin.  It builds IDMEF 
    * XML messages from data collected by Snort.  All variables were previously
    * declared and defined.  Additional information has been added to
    * more thoroughly illustrate functionality.
    *
    * Notice the use of NULL as the last parameter of each of the function
    * calls.  It acts as a sentinel value that ensures the function will
    * not read more arguments than it should expect.
    **/

void buildMessage(Packet *p)
{
   if(p == NULL)
     return;

  /* build a new IDMEF message */

  theMessage = newIDMEF_Message(
     newAttribute("version","1.0"),
     newAlert(
        newAttribute("ident",idmef_alertid_str),
        newAnalyzer( 
             newAttribute("analyzerid","IDS_DRMBOT"),
	     newAttribute("manufacturer","JBOT"),
	     newAttribute("model","0110"),
             newNode(
                newAddress(
                   newAttribute("category","ipv4-addr"),
                   newSimpleElement("address","192.168.123.234"),
                   NULL
	        ),
                NULL
             ),
             NULL
	),
        newCreateTime(NULL),                /* Current time values set here */
        newSource(
             newNode(
                newAddress(
                   newAttribute("category","ipv4-addr"),
                   newSimpleElement("address",inet_ntoa(p->iph->ip_src)),
                   NULL
	        ),
                NULL
             ),
             newService(
                newSimpleElement("port",sport),
                NULL
             ),
             NULL
        ),
        newTarget(
             newNode(
                newAddress(
                   newAttribute("category","ipv4-addr"),
                   newSimpleElement("address",inet_ntoa(p->iph->ip_dst)),
                   NULL
	        ),
                NULL
             ),
             newService(
                newSimpleElement("port",dport),
                NULL
             ),
             NULL
        ),
	newClassification(
	    newAttribute("origin","bugtraqid"),
	    newSimpleElement("name","TELNET bsd exploit client finishing"),
	    newSimpleElement("url","http://cve.mitre.org/cgi-bin/cvename.cgi?name=CAN-2001-0554"),
            NULL
	),
	newAssessment(
	     newImpact(
		 newAttribute("severity","high"),
		 newAttribute("completion","succeeded"),
		 newAttribute("type","admin"),
                 NULL
             ),
	     newConfidence(
		 newAttribute("rating","2"),
		 NULL
             ),
             NULL
        ),
        NULL
     ),
     NULL
  );

  /* 
     Libxml provides many ways to store and print your xmlDoc.  Use
     the one that best fits your needs.  The function below is a generic
     front end to one of those libxml functions.
   */
    
  /* The validation will shout out to stderr if there is a problem */
  if(validateCurrentDoc())
     printCurrentMessage(log_file);

}

  The above may seem a bit overwhelming, because it requires that you have all
  the data you need up front, but the structure provides an intuitive way to
  build messages, and allows future modifications to be made in a simple
  fashion.
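
  The NULL-terminated call style used above can be tried out in isolation.
  The following is a minimal sketch in plain C, not libidmef code: Elem,
  leaf(), container(), child_count(), free_tree() and build_source() are
  hypothetical stand-ins invented for this illustration.  It shows how a
  variadic function stops reading arguments at the sentinel, and how a piece
  built ahead of time can be nested inside an inline call.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>

/* Stand-in tree node; NOT libidmef's or libxml's node type. */
typedef struct Elem {
    const char *name;
    const char *text;        /* NULL for container elements */
    struct Elem *child;      /* first child */
    struct Elem *next;       /* next sibling */
} Elem;

/* Leaf element holding text, in the spirit of newSimpleElement(). */
Elem *leaf(const char *name, const char *text)
{
    Elem *e = calloc(1, sizeof *e);
    if (e != NULL) {
        e->name = name;
        e->text = text;
    }
    return e;
}

/* Container element: children are read from the argument list until the
 * sentinel is reached, so every call must end with a null pointer.  Plain
 * NULL works where NULL is a pointer constant; the cast used below is the
 * strictly portable spelling. */
Elem *container(const char *name, ...)
{
    Elem *e = calloc(1, sizeof *e);
    Elem **tail;
    Elem *kid;
    va_list ap;

    if (e == NULL)
        return NULL;
    e->name = name;
    tail = &e->child;

    va_start(ap, name);
    while ((kid = va_arg(ap, Elem *)) != NULL) {
        *tail = kid;             /* append in call order */
        tail = &kid->next;
    }
    va_end(ap);
    return e;
}

/* Number of direct children of an element. */
int child_count(const Elem *e)
{
    int n = 0;
    const Elem *c;

    assert(e != NULL);
    for (c = e->child; c != NULL; c = c->next)
        n++;
    return n;
}

/* Release an element, its siblings and all descendants. */
void free_tree(Elem *e)
{
    while (e != NULL) {
        Elem *next = e->next;
        free_tree(e->child);
        free(e);
        e = next;
    }
}

/* Hybrid use: build one piece early, nest the rest inline later. */
Elem *build_source(const char *ip, const char *port)
{
    Elem *address = container("Address", leaf("address", ip), (Elem *)NULL);

    return container("Source",
                     container("Node", address, (Elem *)NULL),
                     container("Service", leaf("port", port), (Elem *)NULL),
                     (Elem *)NULL);
}
```

  Leaving out the final sentinel makes container() read past its real
  arguments, which is undefined behavior.  That is the reason every call in
  the example above ends with NULL.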

  Alternatively, you can use the second recommended way to build an IDMEF XML
  message.  This requires that you create a number of xmlNodePtr variables
  to hold the return value of each of the function calls.

   ============ Second recommended way to build an IDMEF XML message ======

void buildMessage2(Packet *p)
{
   xmlNodePtr nMessage, nMessage_version;
   xmlNodePtr nAlert, nAlert_ident;
   xmlNodePtr nAnalyzer, nAnalyzer_analyzerid, nAnalyzer_manufacturer;
   xmlNodePtr nAnalyzer_model, nAnalyzer_Node, nAnalyzer_Node_Address;
   xmlNodePtr nAnalyzer_Node_Address_category, nAnalyzer_Node_Address_address;

   xmlNodePtr nAssessment, nImpact, nImpact_severity, nImpact_completion;
   xmlNodePtr nImpact_type, nConfidence, nConfidence_rating;

   xmlNodePtr nCreateTime;

   xmlNodePtr nSource, nSource_Node, nSource_Node_Address;
   xmlNodePtr nSource_Node_Address_category, nSource_Node_Address_address;
   xmlNodePtr nSource_Service, nSource_Service_port;

   xmlNodePtr nTarget, nTarget_Node, nTarget_Node_Address;
   xmlNodePtr nTarget_Node_Address_category, nTarget_Node_Address_address;
   xmlNodePtr nTarget_Service, nTarget_Service_port;
   
   xmlNodePtr nClassification, nClassification_origin, nClassification_name;
   xmlNodePtr nClassification_url;

   if(p == NULL)
     return;

  /* build the simple elements, then the intermediate elements */

   nMessage_version = newAttribute("version","1.0");

   nAlert_ident = newAttribute("ident",intToString(idmef_alertid));


   nAnalyzer_analyzerid = newAttribute("analyzerid","IDS_DRMBOT");
   nAnalyzer_manufacturer = newAttribute("manufacturer","JBOT");
   nAnalyzer_model = newAttribute("model","0110");
   nAnalyzer_Node_Address_category = newAttribute("category","ipv4-addr");
   nAnalyzer_Node_Address_address = newSimpleElement("address","192.168.123.234");
   nAnalyzer_Node_Address = newAddress( nAnalyzer_Node_Address_category,
                                        nAnalyzer_Node_Address_address, NULL);
   nAnalyzer_Node = newNode( nAnalyzer_Node_Address, NULL);
   nAnalyzer = newAnalyzer( nAnalyzer_analyzerid, nAnalyzer_manufacturer,
			    nAnalyzer_model, nAnalyzer_Node, NULL);


   nImpact_severity = newAttribute("severity","high");
   nImpact_completion = newAttribute("completion","succeeded");
   nImpact_type = newAttribute("type","admin");
   nImpact = newImpact( nImpact_severity, nImpact_completion,
                        nImpact_type, NULL);
   nConfidence_rating = newAttribute("rating","2");
   nConfidence = newConfidence( nConfidence_rating, NULL);
   nAssessment = newAssessment( nImpact, nConfidence, NULL);


   nSource_Node_Address_category = newAttribute("category","ipv4-addr");
   nSource_Node_Address_address = newSimpleElement("address",
                                                   inet_ntoa(p->iph->ip_src));
   nSource_Node_Address = newAddress( nSource_Node_Address_category,
                                      nSource_Node_Address_address, NULL);
   nSource_Node = newNode( nSource_Node_Address, NULL);
   nSource_Service_port = newSimpleElement("port",sport);
   nSource_Service = newService(nSource_Service_port, NULL);
   nSource = newSource( nSource_Node, nSource_Service, NULL);

   nTarget_Node_Address_category = newAttribute("category","ipv4-addr");
   nTarget_Node_Address_address = newSimpleElement("address",
                                                   inet_ntoa(p->iph->ip_dst));
   nTarget_Node_Address = newAddress( nTarget_Node_Address_category,
                                      nTarget_Node_Address_address, NULL);
   nTarget_Node = newNode( nTarget_Node_Address, NULL);
   nTarget_Service_port = newSimpleElement("port",dport);
   nTarget_Service = newService(nTarget_Service_port, NULL);
   nTarget = newTarget( nTarget_Node, nTarget_Service, NULL);

   nClassification_origin = newAttribute("origin","bugtraqid");
   nClassification_name = newSimpleElement("name","TELNET bsd exploit client finishing");
   nClassification_url = newSimpleElement("url","http://cve.mitre.org/cgi-bin/cvename.cgi?name=CAN-2001-0554");
   nClassification = newClassification( nClassification_origin,
                                        nClassification_name,
                                        nClassification_url, NULL);

   nCreateTime = newCreateTime(NULL); /* taking snapshot at last second */

   /* Construct the Alert */

   nAlert = newAlert( nAlert_ident, nAnalyzer, nCreateTime,
                      nSource, nTarget, nClassification, nAssessment, NULL);

   /* Construct the IDMEF_Message */  

   nMessage = newIDMEF_Message( nMessage_version, nAlert, NULL);

   /* The validation will shout out to stderr if there is a problem */
   if(validateCurrentDoc())
      printCurrentMessage(log_file);
}

  The above code is obviously larger than the previous example.  It
  does, however, allow for situations where an important piece of data is
  not yet available.  Other pieces of the message can be built and kept ready
  to be assembled when all the data have been collected.

  I think you can see how a combination of the above examples can be used
  to produce the third recommended way of building IDMEF XML messages, so I
  don't think I need to provide an example for that.  The following is sample
  output taken from an actual alert produced by the IDMEF XML plugin.

  NOTE: This message was produced using an older version of IDMEF.  It is not
        valid under version 0.7 of the IDMEF specification.  
 
<?xml version="1.0"?>
<!DOCTYPE IDMEF-Message PUBLIC "-//IETF//DTD RFCxxxx IDMEF v0.3//EN" "/home/bob/src/libidmef-0.6/idmef-message.dtd">
<IDMEF-Message version="0.3">
  <Alert ident="3564" impact="unknown">
    <Analyzer analyzerid="IDS_ONE">
      <Node category="dns">
        <location>SD_Network</location>
        <name>silverbulletproof</name>
        <Address category="ipv4-addr">
          <address>192.168.0.25</address>
        </Address>
      </Node>
    </Analyzer>
    <CreateTime ntpstamp="bea6c6fc.d34b0000">2001-05-11T20:15:56Z</CreateTime>
    <Source>
      <Node category="dns">
        <location>SD_Network</location>
        <Address category="ipv4-addr">
          <address>192.168.0.56</address>
        </Address>
      </Node>
      <Service>
        <port>2109</port>
      </Service>
    </Source>
    <Target>
      <Node category="dns">
        <location>SD_Network</location>
        <Address category="ipv4-addr">
          <address>192.168.0.211</address>
        </Address>
      </Node>
      <Service>
        <port>139</port>
      </Service>
    </Target>
    <Classification origin="vendor-specific">
      <name>NETBIOS NT NULL session</name>
      <url>http://www.whitehats.com/info/204</url>
    </Classification>
    <AdditionalData meaning="Packet Payload" type="string">.....SMBs...........,... ............u.........................G......W.i.n.d.o.w.s. .N.T. .1.3.8.1.....W.i.n.d.o.w.s. .N.T. .4...0..............1..\.\.S.U.P.P.O.R.T.1.-.W.I.N.2.K.\.I.P.C.$...IPC.</AdditionalData>
  </Alert>
</IDMEF-Message>

  By default, the message is output without indentation, because whitespace
  is significant in XML.  To indent IDMEF messages (as shown above),
  you can call the following libxml function before outputting the message:

         xmlKeepBlanksDefault(0); 
  
  This sets a global variable, which stays in effect until the function
  is called again with 1 passed in.  Keep in mind that if you use indentation,
  you are adding whitespace, and that whitespace will be preserved when the
  document is reloaded.  As a rule of thumb, I'd say don't call this function
  if you are storing or sending the data.  For more information, look at:

     http://www.xmlsoft.org/html/gnome-xml-parser.html#XMLKEEPBLANKSDEFAULT
  
  and the mailing list thread titled "Indentation ?" (Thomas Poindessous, 
  Thu Aug 31 2000 - 16:54:06 EDT) at:
 
    http://www.xmlsoft.org/messages/

4) Before you create another message, you need to call clearCurrentDoc() to
   free the document structure.  After that, you can call createCurrentDoc()
   again and build another message.   
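
The CreateTime element in the sample output above carries the same instant
twice: as an ISO 8601 string and as an ntpstamp attribute.  As a sketch of how
the two relate, the following plain C helpers (ntpstamp_to_unix() and
format_utc() are hypothetical names, not part of libidmef) assume that the
attribute holds hexadecimal NTP seconds and a hexadecimal fraction separated
by a dot, with NTP seconds counted from 1900-01-01.

```c
#include <stdlib.h>
#include <time.h>

/* Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01). */
#define NTP_UNIX_OFFSET 2208988800LL

/* Parse "ssssssss.ffffffff" (hex seconds, hex fraction) and return the
 * whole seconds as Unix time.  The fraction is ignored in this sketch.
 * On malformed input *ok is set to 0 and 0 is returned. */
long long ntpstamp_to_unix(const char *stamp, int *ok)
{
    char *end;
    unsigned long secs = strtoul(stamp, &end, 16);

    *ok = (end != stamp && *end == '.');
    if (!*ok)
        return 0;
    return (long long)secs - NTP_UNIX_OFFSET;
}

/* Format Unix seconds as an ISO 8601 UTC string for comparison with the
 * element text.  Returns the number of characters written, 0 on failure. */
size_t format_utc(long long unix_secs, char *buf, size_t len)
{
    time_t t = (time_t)unix_secs;
    struct tm *utc = gmtime(&t);

    if (utc == NULL)
        return 0;
    return strftime(buf, len, "%Y-%m-%dT%H:%M:%SZ", utc);
}
```

For the sample's ntpstamp "bea6c6fc.d34b0000", ntpstamp_to_unix() yields
989612156, which format_utc() renders as 2001-05-11T20:15:56Z, matching the
element text.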


----------------------------------------------------------
 Section 2: Parsing an IDMEF Message
----------------------------------------------------------

The Basics of Parsing
---------------------

Under normal operation, the parsing process hides all the details of XML.
The user calls one of two top-level functions, both of which return the
IDMEF-Message in the form of C-style structures defined in idmefxml_types.h.
The structures are nested.  For example, an IDMEFmessage structure contains
a pointer to an array of alerts and/or heartbeats.  Those structures in
turn contain C structures representing the child elements contained in the
alert/heartbeat.

Getting Started
---------------

Parsing normally involves calling one of two top-level functions.  The first
one, get_idmef_message_from_file(), takes a filename as an argument.  The
file must contain ONLY ONE IDMEF-Message.  The other function,
get_idmef_message(), takes a pointer to a char * containing ONLY ONE
IDMEF-Message.  Please note that the structures returned contain DYNAMICALLY
ALLOCATED STORAGE.  Consequently, the user must call free_message() when
done to ensure that the memory is freed.

A typical example of how to use the parsing functions follows:

IDMEFmessage *message = NULL;
char *filename = "/tmp/idmef-message.xml";

message = get_idmef_message_from_file(filename);

/* Defensive check; this assumes a failed parse is reported as NULL. */
if (message != NULL)
  {
    fprintf(stdout, "message->version='%s'\n", message->version);

    ...

    free_message(message);
  }

The remaining functions (all declared in idmefxml_parse.h) are mainly used
internally by the get_idmef_message() functions; however, the user is free
to use them if required or desired.  The parse_ functions all take
xmlNodePtrs, which are ASSUMED to contain a valid XML tree containing the
expected elements/attributes for the type being parsed.  Attempting to parse
an invalid XML tree will result in undefined behavior.  Note that each
parse_XXX function has a corresponding free_XXX function.
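
Because both top-level functions expect exactly one IDMEF-Message, it can be
useful to sanity-check a buffer before handing it to the parser.  The helper
below is a rough sketch in plain C; count_idmef_messages() is a hypothetical
name and not part of libidmef, and the scan is a simple substring search
rather than real XML parsing.

```c
#include <string.h>

/* Count opening <IDMEF-Message ...> tags in a NUL-terminated buffer.
 * This is a plain substring scan: it does not understand comments or
 * CDATA sections, so treat the result as a sanity check, not validation. */
int count_idmef_messages(const char *buf)
{
    static const char open_tag[] = "<IDMEF-Message";
    const size_t len = sizeof open_tag - 1;
    int count = 0;
    const char *p = buf;

    while ((p = strstr(p, open_tag)) != NULL) {
        char after = p[len];

        /* The tag name must end here, so "<IDMEF-MessageX" is not counted. */
        if (after == '>' || after == ' ' || after == '\t' ||
            after == '\n' || after == '\r' || after == '/')
            count++;
        p += len;
    }
    return count;
}
```

A caller would pass the buffer on to the parser only when the function
returns 1, and split or reject the input otherwise.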


----------------------------------------------------------
 Section 3: Generating an IDMEF Message from C Structures
----------------------------------------------------------

libidmef has a third interface, which allows the user to generate IDMEF
messages directly from the C structures found in idmefxml_types.h.
It is merely a wrapper around the original message creation API, and as such
serves as a good example of how to use that API.  We still recommend that
users work with the original message creation API described in Section 1.

Note that this interface has virtually no error handling of its own and
relies on the underlying message creation code.  Its main advantage is that
it produces message elements in the correct order; however, it is still
possible to create IDMEF messages that fail validation.

A typical example of how to use the gen_ interface follows:

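char *text_message;
IDMEFmessage *message = (IDMEFmessage *)calloc(1, sizeof(IDMEFmessage));

/* Allocate the array of alert pointers before filling its first slot.
   This assumes alerts is declared as IDMEFalert ** in idmefxml_types.h. */
message->alerts = (IDMEFalert **)calloc(1, sizeof(IDMEFalert *));
message->alerts[0] = (IDMEFalert *)calloc(1, sizeof(IDMEFalert));
message->nalerts = 1;

...

/* Heap copies, since free_message() is called on this structure below. */
message->version = strdup("1.0");
message->alerts[0]->ident = strdup("ALERT #1");

...

text_message = generate_idmef_message(message);

/* Never pass generated text as the format string itself. */
fprintf(stdout, "%s", text_message);

free_message(message);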

One important note when using the gen_ interface concerns memory usage.  If
you create local variables on the stack and point to them, DO NOT call
free_message().  If you use the heap for some or all of the structures,
make sure to free them with the proper free_XXX function.  DO NOT free
structures containing non-heap storage with the provided free_XXX functions;
if you mix the two, you must free each element manually.  The bottom line is
that it's better to go all or nothing: heap or stack.
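
The all-or-nothing rule can be illustrated with stand-in types.  The sketch
below does not use the real structures from idmefxml_types.h; DemoMessage,
DemoAlert, demo_new_message() and demo_free_message() are hypothetical names,
and the layout is only assumed to resemble the nesting described in Section 2.
Every pointer is heap allocated, so a single free routine can release the
whole structure.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in types; the real layouts live in idmefxml_types.h. */
typedef struct {
    char *ident;
} DemoAlert;

typedef struct {
    char *version;
    int nalerts;
    DemoAlert **alerts;
} DemoMessage;

/* Heap copy of a string, so that the matching free() is always legal. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);

    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Frees a message in which EVERY pointer came from malloc/calloc.
 * Unset members are NULL, and free(NULL) is a no-op. */
void demo_free_message(DemoMessage *m)
{
    int i;

    if (m == NULL)
        return;
    for (i = 0; i < m->nalerts; i++) {
        if (m->alerts[i] != NULL) {
            free(m->alerts[i]->ident);
            free(m->alerts[i]);
        }
    }
    free(m->alerts);
    free(m->version);
    free(m);
}

/* All-heap construction with one alert; the result is safe to hand to
 * demo_free_message().  Returns NULL if any allocation fails. */
DemoMessage *demo_new_message(const char *version, const char *ident)
{
    DemoMessage *m = calloc(1, sizeof *m);

    if (m == NULL)
        return NULL;

    m->version = dup_string(version);
    m->alerts = calloc(1, sizeof *m->alerts);
    if (m->alerts != NULL) {
        m->nalerts = 1;
        m->alerts[0] = calloc(1, sizeof *m->alerts[0]);
        if (m->alerts[0] != NULL)
            m->alerts[0]->ident = dup_string(ident);
    }

    if (m->version == NULL || m->alerts == NULL ||
        m->alerts[0] == NULL || m->alerts[0]->ident == NULL) {
        demo_free_message(m);   /* releases whatever was allocated */
        return NULL;
    }
    return m;
}
```

Had version been set to a string literal instead of a heap copy,
demo_free_message() would have called free() on memory that malloc never
returned, which is exactly the mix the note above warns against.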


----------------------------------------------------------
 Section 4: General Information
----------------------------------------------------------

libidmef was originally designed to create IDMEF messages from raw data only.
The parsing functionality and the generation from the C structure data model
were added after the library's original release.  While we tried our best to
integrate the features as much as possible, the result is still a bit
disjointed.  If you have any suggestions about how we could better integrate
the functionality or add new functionality, please contact us.

Tips
----

Read the API.  It contains documentation for all IDMEF construction functions 
in this library.

Read the FAQ.

We recommend getting familiar with libxml, because of the large span and
diversity of its functionality.  If there are functions in libxml that you
would like to be implemented more specifically for IDMEF XML, please let us
know.  Use the Libxml Library Reference [4], and visit xmlsoft.org for more
information.

Keep the IDMEF XML draft [1] nearby for reference. 

Look at the files example1.c, parsing_example1.c and gen_example1.c.

Look at the IDMEF XML plugin as a reference for using libidmef.  It is
available for download at http://www.silicondefense.com/idwg/snort-idmef/

Known Issues
------------

Should any come up, they will be dealt with accordingly.  This library has been
tested on:

 +  Redhat 6.1, 7.0, 7.1, 7.2
 +  Debian 2.2 running Linux 2.4.2
 +  FreeBSD 4.2, 4.5
 +  OpenBSD 2.6 - 2.8
 +  Solaris 8

If you notice a portability issue that can be resolved, please inform me.

+ OpenBSD (and presumably other BSDs) seems to require the libz library to be
loaded for the IO functions in libxml (unless you configure libxml without it).
Use the -lz flag when building your program.

+ When linking with libxml, be sure to use -lxml2 instead of -lxml.  Certain
subtle yet devious errors have arisen from using the latter library.

+ You may get compile and link issues with Solaris.  See FAQ #8 and #9.

+ libxml header files are located under /usr/local/include/libxml2/libxml by
  default (except for RedHat RPMs, which put them under /usr/include/libxml2).
  This means you will have to include the path when compiling any program using
  libidmef.  Add the following compiler flag to your command line:

      -I/usr/local/include/libxml2

  You can also just embed the output of xml2-config --cflags in
  your command:

      gcc `xml2-config --cflags` -o example1 example1.c -lidmef -lm -lxml2

  There are other methods (hacks?) to work around this, but we leave it as an
  exercise to the user to figure out.

Please send bug reports, comments, suggestions and flames to:

	libidmef-devel@lists.sourceforge.net

Mailing Lists
-------------

There are two mailing lists: libidmef-users and libidmef-devel.

To find out about, or to subscribe to libidmef-users, visit:

http://lists.sourceforge.net/lists/listinfo/libidmef-users

To find out about, or to subscribe to libidmef-devel, visit:

http://lists.sourceforge.net/lists/listinfo/libidmef-devel

Credits
-------

[1] IDMEF XML DTD draft, 
    http://www.silicondefense.com/idwg/draft-ietf-idwg-idmef-xml-06.txt

[2] Libxml by Daniel Veillard. http://www.xmlsoft.org/

[3] Snort by Martin Roesch. http://www.snort.org/

[4] Libxml Library Reference. http://www.xmlsoft.org/html/libxml-lib.html

The following people helped significantly with development ideas and support.

 - Stuart Staniford, Silicon Defense, stuart@silicondefense.com
 - Jason Coit, Silicon Defense, jasonc@silicondefense.com
 - James Hoagland, Silicon Defense, hoagland@silicondefense.com
