SaxMagique is a C++ library that offers an API for creating a SAX parser. It is intended to be the simplest possible way to create a complex SAX parser. It does not replace expat or xerces; it enhances them. In fact, it depends on the expat library (it could easily be adapted to xerces or others), for which it builds the [start|end]Element and characters handlers needed.
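To make concrete what handlers of this kind look like when written by hand, here is a minimal sketch. It does not link against expat: the three callbacks only imitate the shape of expat's start-element, end-element and character-data handlers (a user-data pointer, a tag name, and a null-terminated array of alternating attribute names and values). The ParseState struct, the handler names and the feed_sample driver are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical state filled in by the handlers (not part of SaxMagique or expat).
struct ParseState {
    int depth = 0;        // current element nesting depth
    int c_value = 0;      // attribute "c" found on <d>
    std::string b_value;  // attribute "b" found on <rootnode>
    std::string text;     // accumulated character data
};

// Shaped like an expat start-element handler: atts holds alternating
// attribute names and values and ends with a null pointer.
void on_start(void* user_data, const char* name, const char** atts) {
    ParseState* st = static_cast<ParseState*>(user_data);
    ++st->depth;
    for (std::size_t i = 0; atts[i] != nullptr; i += 2) {
        if (std::strcmp(name, "d") == 0 && std::strcmp(atts[i], "c") == 0) {
            st->c_value = std::stoi(atts[i + 1]);
        } else if (std::strcmp(name, "rootnode") == 0 &&
                   std::strcmp(atts[i], "b") == 0) {
            st->b_value = atts[i + 1];
        }
    }
}

// Shaped like an expat end-element handler.
void on_end(void* user_data, const char* /*name*/) {
    --static_cast<ParseState*>(user_data)->depth;
}

// Shaped like an expat character-data handler: s is not null-terminated.
void on_characters(void* user_data, const char* s, int len) {
    static_cast<ParseState*>(user_data)->text.append(
        s, static_cast<std::size_t>(len));
}

// Stand-in for the parser: replays the events for
// <rootnode b="hello"> <d c="42"/> </rootnode>
ParseState feed_sample() {
    ParseState st;
    const char* root_atts[] = {"b", "hello", nullptr};
    const char* d_atts[] = {"c", "42", nullptr};
    on_start(&st, "rootnode", root_atts);
    on_characters(&st, " ", 1);
    on_start(&st, "d", d_atts);
    on_end(&st, "d");
    on_characters(&st, " ", 1);
    on_end(&st, "rootnode");
    assert(st.depth == 0);  // every start event was matched by an end event
    return st;
}
```

Every new element or attribute adds another branch to on_start, which is the kind of boilerplate a declarative description of the schema is meant to avoid.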
The library is available as a tar.gz at http://sourceforge.net/projects/saxmagique/. You have to compile it:
]$ tar xzvf libsaxmagique-1.0.0.tar.gz
]$ cd libsaxmagique-1.0.0
]$ ./configure
]$ make
]$ sudo make install
If you are not a sudoer, you need to tell the configure script where you want to install the package:
]$ ./configure --prefix=/home/username --exec-prefix=/home/username
]$ make
]$ make install
Don't forget to add the directory holding the installed library (typically the lib subdirectory of this prefix) to your LD_LIBRARY_PATH environment variable if you want to use the library.
SaxMagique is a C++ library that needs the STL and Boost to compile, and it must be linked against libexpat and libz. libexpat must be version 1.95.8 or later for compilation to succeed, and Boost must be version 1.33.1 or later. If these libraries are not in a standard directory, use --with-boost=dir or --with-expat=dir. For example, if you built these libraries from source and installed them in /usr/local:
]$ ./configure --with-boost=/usr/local --with-expat=/usr/local
The principal class of the library is called SaxMagique. The API of SaxMagique mirrors the structure of an XSD schema. Consider the following XSD schema:
<xs:schema>
  <xs:complexType name="Type_d">
    <xs:attribute name="c" type="xs:integer"/>
  </xs:complexType>
  <xs:complexType name="Type_a">
    <xs:attribute name="b" type="xs:string"/>
    <xs:element name="d" type="Type_d"/>
  </xs:complexType>
  <xs:complexType name="Type_e">
    <xs:complexContent>
      <xs:extension base="Type_a">
        <xs:attribute name="f" type="char"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:element name="rootnode" type="Type_e"/>
</xs:schema>
And the corresponding C++ classes:
class D
{
  int m_c;
public:
  void set_c( int c ) { m_c = c; }
  int get_c() { return m_c; }
};

class A
{
  std::string m_b;
  D m_d;
public:
  void set_b( const std::string& b ) { m_b = b; }
  const std::string& get_b() { return m_b; }
  void set_d( const D& d ) { m_d = d; }
  const D& get_d() { return m_d; }
};

class E : public A
{
  char m_f;
public:
  void set_f( char f ) { m_f = f; }
  char get_f() { return m_f; }
};
A SaxMagique parser that reads an XML file conforming to this schema and stores the data in the corresponding C++ classes is created as follows:
// Instantiation of the parser
SaxMagique myparser;

// Creation of nodes for the three complexTypes.
// The nodes are owned by myparser.
saxmagique::ComplexType< D >* ptrNode_D = myparser.complexType< D >();
saxmagique::ComplexType< A >* ptrNode_A = myparser.complexType< A >();
saxmagique::ComplexType< E >* ptrNode_E = myparser.complexType< E >();

// The complexTypes are detailed.
ptrNode_D->attribute( "c", saxmagique::SimpleType< int, Patlac::Atoi >(), &D::set_c );
ptrNode_A->element< D >( "d", &A::set_d )
         ->attribute( "b", saxmagique::SimpleType< std::string >(), &A::set_b );
// E extends A, so the extension refers to the node of A.
ptrNode_E->extension( ptrNode_A )
         ->attribute( "f", saxmagique::SimpleType< char, Patlac::Atoc >(), &E::set_f );

// The root node is declared under the element name given in the schema.
// A pointer is kept on this node.
TopLevelElement* ptrTopLevelElement = myparser.topLevelElement( "rootnode", ptrNode_E );
Each line states only the minimum: what the relation is (attribute, element, extension), what the marker is, what the corresponding C++ type is, and which member function receives the value.
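The idea of tying a marker to a converter and a member function can be sketched in a few lines of standalone C++. This is only an illustration of the general technique (a pointer to member function stored behind a name), not SaxMagique's actual implementation: MiniComplexType, ToInt, apply and parse_c_attribute are hypothetical names, and ToInt merely stands in for the role a converter such as Patlac::Atoi plays in the listing.

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <map>
#include <string>

// Target class, same shape as class D in the text.
class D {
    int m_c = 0;
public:
    void set_c(int c) { m_c = c; }
    int get_c() const { return m_c; }
};

// Hypothetical converter from the raw attribute text to an int.
struct ToInt {
    int operator()(const std::string& s) const { return std::atoi(s.c_str()); }
};

// Hypothetical miniature of a complex-type node (NOT the real SaxMagique class).
template <typename T>
class MiniComplexType {
    std::map<std::string, std::function<void(T&, const std::string&)>> m_attributes;
public:
    // Bind an attribute name to a converter and a setter.
    // Returns *this so that declarations can be chained.
    template <typename Value, typename Converter>
    MiniComplexType& attribute(const std::string& name, Converter convert,
                               void (T::*setter)(Value)) {
        m_attributes[name] = [convert, setter](T& obj, const std::string& raw) {
            (obj.*setter)(convert(raw));  // convert the text, then call the setter
        };
        return *this;
    }

    // Apply one attribute to an object; returns false for an unknown name.
    bool apply(T& obj, const std::string& name, const std::string& raw) const {
        auto it = m_attributes.find(name);
        if (it == m_attributes.end()) return false;
        it->second(obj, raw);
        return true;
    }
};

// Declares attribute "c" on D, feeds it one raw value, returns what D stored.
int parse_c_attribute(const std::string& raw) {
    MiniComplexType<D> node_d;
    node_d.attribute("c", ToInt(), &D::set_c);
    D d;
    const bool known = node_d.apply(d, "c", raw);
    assert(known);
    (void)known;  // silence the unused-variable warning when NDEBUG is defined
    return d.get_c();
}
```

In this sketch the declaration carries the same three pieces of information as a line of the listing: the marker ("c"), the conversion to the C++ type (ToInt), and the member function that receives the value (&D::set_c).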
This parser can now be used in four lines:
// Set the file to be parsed.
myparser.setFileName( "test.xml" );

// Create a receptacle and give it to the parser.
E my_e;
ptrTopLevelElement->assign_object( my_e );

// Parse.
myparser.parse();
Yes it's fast, but it could be faster.
SaxMagique has been developed together with the Patlac::Xml2cpp software. This software is a code generator that translates an XSD schema into a C++ library containing the classes defined by the schema, along with a SAX parser and a write_xml function for serialization of XML documents.
Patlac::Xml2cpp is intended to be totally customizable. However, the released version uses SaxMagique to create the SAX parsers.
It is available at http://sourceforge.net/projects/patlac--xml2cpp.
A growing collection of libraries has already been generated by Patlac::Xml2cpp. These libraries include:
- uniprot
- uniref
- pepXML
- protXML
Unfortunately, the <xs:any> field is not yet implemented, but support is planned. These libraries can be downloaded from http://sourceforge.net/projects/data-libs.
SaxMagique is complicated machinery that makes heavy use of advanced C++. This first release of the library is not yet entirely contributor-friendly, but it should gradually become more so.
To ensure that every object is created only when needed and destroyed exactly once, the policies listed below have been chosen. It would be good programming practice to enforce them in a way that cannot be violated, through tactical use of private, friend and inheritance.
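As one possible illustration of such enforcement, the sketch below uses a private constructor and destructor plus a friend declaration so that only the owning parser can create or destroy a node. It assumes an ownership model like the one described earlier ("the nodes are owned by myparser") and covers only private and friend, not inheritance; MiniNode, MiniParser and live_nodes_after_scope are hypothetical names, not SaxMagique classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class MiniParser;  // forward declaration for the friend declaration below

// Hypothetical node type: with a private constructor and destructor,
// only the friend class can create or destroy it.
class MiniNode {
    friend class MiniParser;
    static int s_alive;  // number of nodes currently alive
    MiniNode() { ++s_alive; }
    ~MiniNode() { --s_alive; }
    MiniNode(const MiniNode&) = delete;
    MiniNode& operator=(const MiniNode&) = delete;
public:
    static int alive() { return s_alive; }
};
int MiniNode::s_alive = 0;

// Hypothetical owner: hands out raw pointers but keeps ownership,
// and destroys each node exactly once in its own destructor.
class MiniParser {
    std::vector<MiniNode*> m_nodes;
public:
    MiniParser() = default;
    MiniParser(const MiniParser&) = delete;  // copying would cause double deletes
    MiniParser& operator=(const MiniParser&) = delete;
    ~MiniParser() {
        for (MiniNode* node : m_nodes) delete node;
    }
    MiniNode* make_node() {
        m_nodes.push_back(nullptr);     // reserve the slot first so that a
        m_nodes.back() = new MiniNode;  // failed push_back cannot leak a node
        return m_nodes.back();
    }
    std::size_t node_count() const { return m_nodes.size(); }
};

// Creates two nodes in a scope and reports how many survive it.
int live_nodes_after_scope() {
    {
        MiniParser parser;
        parser.make_node();
        parser.make_node();
        assert(MiniNode::alive() == 2);
        // delete parser.make_node();  // would not compile: ~MiniNode is private
    }
    return MiniNode::alive();
}
```

With this arrangement, client code cannot delete a node or build one on the stack, so violating the "destroyed exactly once" policy becomes a compile-time error instead of a runtime bug.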
Copyright 2007 Patrick Lacasse