Showing posts with label XML Data Binding. Show all posts

Tuesday, 14 February 2012

XML Data Binding - Part 3: CodeSynthesis XSD example

In my previous article about XML Data Binding, I demonstrated how to use gSOAP to convert data from an XML document into in-memory C++ objects and vice versa. Today I will show how to use another tool, CodeSynthesis XSD, to perform the same task.

CodeSynthesis XSD depends on the Apache Xerces-C++ XML parser, so you need to download and set up Xerces in your development environment first. The setup of both tools is described in the README.txt file that you can find after unpacking the downloaded CodeSynthesis XSD archive.

In order to compare the gSOAP and CodeSynthesis data binding processes, let's create a project that does the same XML processing as the gSOAP one: it loads the XML, reads and displays the data, adds a new element, displays the data again and saves the changes to the XML.

We are going to use the same XML schema - library.xsd:



If XML documents use XML schema grammars, the Xerces parser requires them to specify the location of their XML schemas (by using an xsi:schemaLocation attribute if they use namespaces, and an xsi:noNamespaceSchemaLocation attribute if not).



NOTE: Make sure the XML document and its schema are in the same directory (the application's working directory).

As in the gSOAP case, we need to compile the schema into C++ classes. The CodeSynthesis schema compiler is xsd.exe and can be found in the bin directory of the package (e.g. ..\xsd-3.3.0-i686-windows\bin\). This directory should be added to the Path environment variable. We want to generate the C++/Tree mapping and the code for serialization (object to XML; this code is not generated by default), so we call the XSD compiler with the following parameters:

c:\test\XercesCodeSynthesis_Test1>xsd cxx-tree --generate-serialization library.xsd

It creates two files, library.hxx and library.cxx, and we need to include them into our project as they contain definitions of proxy classes.

Again, like we had with gSOAP, CodeSynthesis XSD generates classes that match XML document elements, by name and structure. We can see that in the header, library.hxx (some parts are omitted):



The schema compiler has also generated the Library_ functions that serialize/deserialize data to/from an XML file.
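To give a feel for the accessor style of the C++/Tree mapping, here is a hand-written stand-in. It is not the generated library.hxx: the CopiesAvailable name is a guess based on the program output, BookSequence and BooksType are helper typedefs invented for this sketch, and std::vector stands in for the sequence template that the real mapping uses.

```cpp
#include <string>
#include <vector>

// Hand-written stand-in that imitates the accessor/modifier style of the
// C++/Tree mapping. It is NOT the code produced by the xsd compiler.
class Book
{
public:
    Book(const std::string& title, const std::string& author,
         const std::string& isbn, int copies)
        : title_(title), author_(author), isbn_(isbn), copies_(copies) {}

    // Accessors carry the name of the XML element or attribute.
    const std::string& Title() const { return title_; }
    const std::string& Author() const { return author_; }
    const std::string& ISBN() const { return isbn_; }
    int CopiesAvailable() const { return copies_; }   // assumed name

    // Modifier: same name as the accessor, takes the new value.
    void Title(const std::string& title) { title_ = title; }

private:
    std::string title_;
    std::string author_;
    std::string isbn_;
    int copies_;
};

// Assumption: std::vector replaces the mapping's own sequence type.
typedef std::vector<Book> BookSequence;

class Books
{
public:
    // The child sequence is reached through an accessor named after the element.
    const BookSequence& Book() const { return book_; }
    BookSequence& Book() { return book_; }

private:
    BookSequence book_;
};

typedef Books BooksType;   // helper typedef, hypothetical

class Library
{
public:
    const BooksType& Books() const { return books_; }
    BooksType& Books() { return books_; }

private:
    BooksType books_;
};
```

With classes shaped like this, adding a book is a single call such as lib.Books().Book().push_back(Book("Effective C++", "Scott Meyers", "0321334876", 50)), with no DOM traversal involved.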

main.cpp contains code that loads XML document into Library object, traverses through its member (vector) Books and displays all Book elements; it then adds a new Book to the collection, displays it again and serializes back to XML document in file:



Output:


Displaying all books in the library:

Book:
Title:Clean Code
Author:Robert C. Martin
ISBN: 0132350882
Copies available: 2

Book:
Title:The Pragmatic Programmer
Author:Andrew Hunt
ISBN: 020161622X
Copies available: 0

Book:
Title:Design patterns
Author:Erich Gamma
ISBN: 0201633612
Copies available: 1


Adding a new book:
Title: Effective C++
Author: Scott Meyers
ISBN:0321334876
Copies available: 50


Displaying all books in the library:

Book:
Title:Clean Code
Author:Robert C. Martin
ISBN: 0132350882
Copies available: 2

Book:
Title:The Pragmatic Programmer
Author:Andrew Hunt
ISBN: 020161622X
Copies available: 0

Book:
Title:Design patterns
Author:Erich Gamma
ISBN: 0201633612
Copies available: 1

Book:
Title:Effective C++
Author:Scott Meyers
ISBN: 0321334876
Copies available: 50

library.xml is changed - a new Book element has been added:



Links and References:
Boris Kolpackov: An Introduction to XML Data Binding in C++

Thursday, 9 February 2012

XML Data Binding - Part 2: gSOAP example

In my previous post, I explained the benefits of XML Data Binding. In this article I will show how to use gSOAP for conversion of data stored in XML format into objects and vice versa.

Download and unpack the latest gSOAP release package. In the previous article I said that XML Data Binding tools compile XML schemas and create C++ classes that represent XML elements. gSOAP's (Win32) compiler is located in the ..\gsoap_2.8.6\gsoap-2.8\gsoap\bin\win32 directory, and its full path (e.g. c:\tools\gsoap_2.8.6\gsoap-2.8\gsoap\bin\win32) should be added to the Path environment variable. This directory contains two components of the gSOAP XML compiler: wsdl2h.exe, which compiles an XML schema into an intermediate header file, and soapcpp2.exe, which generates the classes (in the header and source files that we need to include in our C++ project).

We need to get XML schema from our XML file. Let's use XML file based on the one from the previous article but modified by including namespace and using proper naming convention (title case for elements and camelcase for attributes):

library.xml:



We can generate schema from this XML in Visual Studio: select XML item in the main menu and click on Create Schema in the drop down menu. Visual Studio generates schema document in Russian doll design style. It supports only this XSD design pattern because it is the most restrictive one.

library.xsd:



Save both XML and XSD files in the project directory.

Now let us compile schema. This is a two step process. Schema is passed to wsdl2h.exe which generates intermediate header. soapcpp2.exe uses that header to create proxy C++ classes for data binding (header file with their declarations and source file with their definitions).

We need to provide namespace used in our XML document ("gt") as gSOAP would otherwise use its generic namespace name ("ns1") when generating data type names and when serializing our data object back to the XML.

c:\test\gSOAP_Test1>wsdl2h.exe -t "c:\tools\gsoap_2.8.6\gsoap-2.8\gsoap\typemap.dat" -N "gt" Library.xsd

** The gSOAP WSDL/Schema processor for C and C++, wsdl2h release 2.8.6
** Copyright (C) 2000-2011 Robert van Engelen, Genivia Inc.
** All Rights Reserved. This product is provided "as is", without any warranty.

** The wsdl2h tool is released under one of the following two licenses:
** GPL or the commercial license by Genivia Inc. Use option -l for details.

Saving Library.h

Reading type definitions from type map file 'c:\tools\gsoap_2.8.6\gsoap-2.8\gsoap\typemap.dat'

Reading file 'Library.xsd'...
Done reading 'Library.xsd'

To complete the process, compile with:
> soapcpp2 Library.h
or to generate C++ proxy and object classes:
> soapcpp2 -j Library.h

We just need to follow the instruction given in the report above - call soapcpp2:

c:\test\gSOAP_Test1>soapcpp2 -I "c:\tools\gsoap_2.8.6\gsoap-2.8\gsoap\import" Library.h

** The gSOAP code generator for C and C++, soapcpp2 release 2.8.6
** Copyright (C) 2000-2011, Robert van Engelen, Genivia Inc.
** All Rights Reserved. This product is provided "as is", without any warranty.

** The soapcpp2 tool is released under one of the following two licenses:
** GPL or the commercial license by Genivia Inc.

Saving soapStub.h annotated copy of the input declarations
Saving gt.nsmap namespace mapping table
Saving soapH.h interface declarations
Saving soapC.cpp XML serializers

Compilation successful

c:\test\gSOAP_Test1>

As we can see in the report, class declarations are in the following header:

soapStub.h (irrelevant parts omitted):



Comparing the original XML document with the generated classes, we can see the parallel between them: element types are mapped to classes; single child nodes and attributes are mapped to class members; a sequence is mapped to a vector. The _gt__Library class matches the Library element. Its members, Books and Staff, match the XML elements of the same name. In XML, these nodes are of the sequence type, so their C++ implementations (_gt__Library_Books and _gt__Library_Staff) use an STL collection type (vector) to model them. The Books element contains Book elements, so _gt__Library_Books's vector member contains elements of type _gt__Library_Books_Book. In the same way, the Librarian element is mapped to the _gt__Library_Staff_Librarian class and _gt__Library_Staff's vector contains its instances.
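The mapping described above can be pictured with a hand-written approximation. The class names are the ones mentioned in the text, but the member names (Book, Librarian, Name, copiesAvailable) are assumptions, and the real soapStub.h contains additional members that are left out here. AddBook is a hypothetical helper, not something gSOAP generates.

```cpp
#include <string>
#include <vector>

// Approximation of the generated class shapes, written by hand.
// Member names are assumed; serialization-related members are omitted.
class _gt__Library_Books_Book
{
public:
    std::string Title;
    std::string Author;
    std::string ISBN;
    int copiesAvailable;   // assumed attribute name
};

class _gt__Library_Books
{
public:
    // The sequence of Book elements is modelled by a vector.
    std::vector<_gt__Library_Books_Book> Book;
};

class _gt__Library_Staff_Librarian
{
public:
    std::string Name;      // assumed member name
};

class _gt__Library_Staff
{
public:
    std::vector<_gt__Library_Staff_Librarian> Librarian;
};

class _gt__Library
{
public:
    _gt__Library_Books Books;   // single child element: plain member
    _gt__Library_Staff Staff;
};

// Hypothetical helper: adding a book is a push_back on the vector.
inline void AddBook(_gt__Library& library, const _gt__Library_Books_Book& book)
{
    library.Books.Book.push_back(book);
}
```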

Class names look a bit ugly, but that is because they are made by joining the namespace name ("gt") and the element type name. If a namespace isn't specified in the schema, gSOAP uses a generic namespace name, "ns1". If an element name contains an underscore, that character is replaced with "_USCORE", because gSOAP maps hyphens to normal underscores.
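The naming rules mentioned above can be illustrated with a small function. This is only a sketch of the two character rules from the paragraph and of the prefix pattern visible in names such as _gt__Library; MangleName and MakeClassName are hypothetical helpers and do not reproduce gSOAP's actual implementation, which handles more cases.

```cpp
#include <string>

// Illustrates only the two character rules mentioned in the text:
//   '_' in the XML name  ->  "_USCORE"
//   '-' in the XML name  ->  "_"
inline std::string MangleName(const std::string& xmlName)
{
    std::string result;
    for (std::string::size_type i = 0; i < xmlName.size(); ++i)
    {
        const char c = xmlName[i];
        if (c == '_')
            result += "_USCORE";
        else if (c == '-')
            result += '_';
        else
            result += c;
    }
    return result;
}

// Builds a name of the form _<prefix>__<element>, the pattern seen in
// _gt__Library. The leading underscore is taken from that example only.
inline std::string MakeClassName(const std::string& prefix,
                                 const std::string& element)
{
    return "_" + prefix + "__" + MangleName(element);
}
```

Because a hyphen already becomes a plain underscore, a literal underscore needs a different spelling, otherwise two different XML names could end up as the same C++ identifier.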

For a Russian doll styled schema, class members and vector elements are objects. This is not the case for schemas designed in the Salami slice or Venetian blind styles, where class members and vector elements are pointers to objects. This happens even if the minOccurs attribute is set to 1. I don't know how to force gSOAP to generate classes that use composition regardless of the design pattern of the schema provided.

I found one explanation of gSOAP's reasoning: "gSOAP generates pointers when it needs to be able to represent a NULL value. If you have defined an element with minOccurs = "0", then you will get a pointer generated in your code. You can then inspect this pointer. If it is NULL, then you know that the element is not present. Conversely, you can choose to set the pointer, or not, to indicate that the element is present or not."

Another author says: "there are many different ways to define XML Schemas and the design choice can seriously impact the generation of implementation classes in the technology of your choice. There are different schema design styles such as the Russian Doll, Venetian Blind and Garden of Eden that can be followed."
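The pointer convention from the first explanation can be shown in isolation. The class below is a hypothetical example written for this purpose (the Subtitle element does not exist in library.xsd), and HasSubtitle is an invented helper.

```cpp
#include <cstddef>
#include <string>

// Hypothetical class with one required and one optional child element.
struct Book
{
    std::string Title;       // required element: held by value
    std::string* Subtitle;   // optional element (minOccurs="0"): NULL means absent

    Book() : Subtitle(NULL) {}
};

// Returns true when the optional element is present.
inline bool HasSubtitle(const Book& book)
{
    return book.Subtitle != NULL;
}
```

Code that reads such an object has to check the pointer before dereferencing it, which is the price paid for being able to represent a missing element.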

Another generated header is soapH.h (snippet):



We can use these two generated functions to read the content of the root element (Library) into object and to write it back to the XML document. gSOAP compiler has done a great job for us!

Anyway, let's see what we can do with generated classes.

Before compiling your test project, make sure you have added the following paths to Additional Include Directories in Project Settings: c:\DEVELOPMENT\Toolkits\gsoap_2.8.6\gsoap-2.8\gsoap; c:\DEVELOPMENT\Toolkits\gsoap_2.8.6\gsoap-2.8\gsoap\import. Also, make sure you've included soapH.h, soapStub.h, soapC.cpp and stdsoap2.cpp into the project.

In order to use the gSOAP engine, we need to create an instance of the gSOAP runtime context, struct soap. There is a sequence of commands that initialize and clean up this object, so I wrapped it into an RAII-compliant class (CScopedSoap), which makes its usage exception safe.

Notice how easy it is to modify data when we are dealing with objects instead of digging through and traversing a DOM tree. Adding a new Book is nothing more than adding a new _gt__Library_Books_Book object at the end of the vector!
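The RAII idea can be sketched without gSOAP itself. The code below is not the CScopedSoap class from this project: FakeSoap, fake_soap_new, fake_soap_free and LiveContexts are stand-ins invented so that the example compiles on its own. With the real library, the constructor and destructor would call the gSOAP context functions (soap_new, then soap_destroy, soap_end and soap_free) instead.

```cpp
// Stand-in for the gSOAP runtime context (struct soap in the real library).
struct FakeSoap
{
    bool initialized;
};

// Counts live contexts so that the cleanup can be observed.
inline int& LiveContexts()
{
    static int count = 0;
    return count;
}

// Stand-ins for the allocation and cleanup calls.
inline FakeSoap* fake_soap_new()
{
    FakeSoap* soap = new FakeSoap;
    soap->initialized = true;
    ++LiveContexts();
    return soap;
}

inline void fake_soap_free(FakeSoap* soap)
{
    --LiveContexts();
    delete soap;
}

// RAII wrapper: acquires the context in the constructor and releases it in
// the destructor, so cleanup also runs when an exception unwinds the stack.
class ScopedSoap
{
public:
    ScopedSoap() : soap_(fake_soap_new()) {}
    ~ScopedSoap() { fake_soap_free(soap_); }

    FakeSoap* get() const { return soap_; }

private:
    ScopedSoap(const ScopedSoap&);              // non-copyable
    ScopedSoap& operator=(const ScopedSoap&);   // non-assignable

    FakeSoap* soap_;
};
```

Making the wrapper non-copyable avoids two objects releasing the same context.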

main.cpp:



The patch I applied in the LoadXML() function is necessary because soap_read__gt__Library(), for some reason, sets the mode of the standard input stream to BINARY although it reads data from a file stream. It never reverts stdin's mode back to TEXT. As a consequence, getline() returns a string that contains a Carriage Return character at the end, and that character appears in new elements inserted into the XML document. I've posted a question about this on Stack Overflow and will update this article as soon as I clarify gSOAP's behaviour in this case.
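A defensive alternative, independent of the stream mode, is to remove the trailing Carriage Return from each string read with getline(). This is not the patch used in LoadXML(); StripTrailingCR is a hypothetical helper shown only to illustrate the symptom and one way to neutralize it.

```cpp
#include <string>

// Removes a single trailing carriage return ('\r'), if present.
// Other characters, including embedded '\r', are left untouched.
inline std::string StripTrailingCR(std::string line)
{
    if (!line.empty() && line[line.size() - 1] == '\r')
        line.erase(line.size() - 1);
    return line;
}
```

Applied to every value read from the console, this keeps the stray character out of new elements whichever mode the input stream ends up in.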

This is the application's output:

Displaying all books in the library:

Book:
Title:Clean Code
Author:Robert C. Martin
ISBN: 0132350882
Copies available: 2

Book:
Title:The Pragmatic Programmer
Author:Andrew Hunt
ISBN: 020161622X
Copies available: 0

Book:
Title:Design patterns
Author:Erich Gamma
ISBN: 0201633612
Copies available: 1


Adding a new book:
Title: Effective C++
Author: Scott Meyers
ISBN:0321334876
Copies available: 50


Displaying all books in the library:

Book:
Title:Clean Code
Author:Robert C. Martin
ISBN: 0132350882
Copies available: 2

Book:
Title:The Pragmatic Programmer
Author:Andrew Hunt
ISBN: 020161622X
Copies available: 0

Book:
Title:Design patterns
Author:Erich Gamma
ISBN: 0201633612
Copies available: 1

Book:
Title:Effective C++
Author:Scott Meyers
ISBN: 0321334876
Copies available: 53

And XML contains a new Book element:

library.xml:



Links and References:

The gSOAP Toolkit for SOAP Web Services and XML-Based Applications
gSOAP 2.8.7 User Guide
Genivia gSOAP
Robert van Engelen: "gSOAP & Web Services"
gSOAP Yahoo group
gSOAP tagged question on Stack Overflow

Monday, 6 February 2012

XML Data Binding - Part 1: Why do we need it?

Your application uses some XML parsing tool (Xerces, libxml, TinyXML, TinyXML++, RapidXml, PugiXML,...) in order to load an XML file into a document object. You then traverse a DOM tree structure, look for nodes and their children, and search their attributes so that you can modify the content (add, modify or delete elements or attributes)... All this usually requires lots of loops and string comparisons, which creates huge, hard-to-maintain code. Wouldn't it be better if you could load the XML document into some object in memory whose attributes match the elements in your XML? You would then be dealing with (C++) objects instead of the complex DOM tree, which is much quicker, easier, type safe and less error prone.

Let's say we have some XML that keeps track of the state of the local library. To keep the model simple, we can say that the library comprises books and staff. Each book has its title, author and ISBN number. Each member of the staff, a librarian, has a name. The XML document could look like this:

library.xml


We can map each class of nodes into a C++ class where the class attributes are the node's attributes and its children; siblings are stored in a vector. We need a single instance of the class which represents the root node. Its constructor loads the XML document from a file on disk and its destructor saves the (possibly modified) XML document back to the file. Our class model and use case might look like this:

main.cpp:



The XML document was loaded into an object, modified by adding a new book and saved back into the file in just three lines of code! Awesome! But this code is unfinished and doesn't actually work in real life, as I omitted the hardest bit: loading and parsing the XML in the library's constructor and serializing/marshalling the object back to the file in the destructor. All I wanted to show was how quick and easy it is to manipulate XML documents when representing them through objects - a concept which is known as XML Data Binding.

Another problem stems from the fact that for each new XML we would need to write completely new classes. Doing this manually is not the way to go, but luckily there are tools that do this automatically. They compile an XML schema (a document which specifies the structure of the XML document itself) into a set of classes, following precise rules for mapping XML into objects (Object/XML Mapping or O/X Mapping). In the next two articles I will show how to use gSOAP and CodeSynthesis XSD for this.
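The object side of this idea fits in a few lines. The sketch below is a minimal illustration with assumed names (Book, Librarian, Library, AddBook) and it deliberately leaves out loading and saving, which is exactly the hard part mentioned above.

```cpp
#include <string>
#include <vector>

// Each kind of node becomes a small class; child values become members.
struct Book
{
    std::string title;
    std::string author;
    std::string isbn;
};

struct Librarian
{
    std::string name;
};

// Root object: sibling nodes of the same kind are stored in vectors.
// In the model described in the text, the constructor would parse the XML
// file and the destructor would write it back; both are omitted here.
class Library
{
public:
    void AddBook(const Book& book) { books_.push_back(book); }
    void AddLibrarian(const Librarian& librarian) { staff_.push_back(librarian); }

    const std::vector<Book>& Books() const { return books_; }
    const std::vector<Librarian>& Staff() const { return staff_; }

private:
    std::vector<Book> books_;
    std::vector<Librarian> staff_;
};
```

Adding a book is then a matter of filling in a Book and calling AddBook, with the compiler checking member names and types that a DOM-based version would handle as strings.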

Links and References:
XML Data Binding Tools