Thursday 9 February 2012

XML Data Binding - Part 2: gSOAP example

In my previous post, I explained the benefits of XML Data Binding. In this article I will show how to use gSOAP for conversion of data stored in XML format into objects and vice versa.

Download and unpack the latest gSOAP release package. In the previous article I said that XML Data Binding tools compile XML schemas and create C++ classes that represent XML elements. The gSOAP (Win32) compiler is located in the ..\gsoap_2.8.6\gsoap-2.8\gsoap\bin\win32 directory, and its full path (e.g. c:\tools\gsoap_2.8.6\gsoap-2.8\gsoap\bin\win32) should be added to the Path environment variable. This directory contains the two components of the gSOAP XML compiler: wsdl2h.exe, which compiles an XML schema into an intermediate header file, and soapcpp2.exe, which generates the classes (in the header and source files that we need to include in our C++ project).

We need to derive an XML schema from our XML file. Let's use an XML file based on the one from the previous article, modified to include a namespace and to follow a proper naming convention (title case for elements and camel case for attributes):

library.xml:



We can generate a schema from this XML in Visual Studio: select the XML item in the main menu and click Create Schema in the drop-down menu. Visual Studio generates the schema document in the Russian doll design style. It supports only this XSD design pattern because it is the most restrictive one.

library.xsd:



Save both the XML and XSD files in the project directory.

Now let us compile the schema. This is a two-step process. The schema is passed to wsdl2h.exe, which generates an intermediate header. soapcpp2.exe then uses that header to create proxy C++ classes for data binding (a header file with their declarations and a source file with their definitions).

We need to provide the namespace prefix used in our XML document ("gt"), as gSOAP would otherwise use its generic prefix ("ns1") when generating data type names and when serializing our data object back to XML.

c:\test\gSOAP_Test1>wsdl2h.exe -t "c:\tools\gsoap_2.8.6\gsoap-2.8\gsoap\typemap.dat" -N "gt" Library.xsd

** The gSOAP WSDL/Schema processor for C and C++, wsdl2h release 2.8.6
** Copyright (C) 2000-2011 Robert van Engelen, Genivia Inc.
** All Rights Reserved. This product is provided "as is", without any warranty.

** The wsdl2h tool is released under one of the following two licenses:
** GPL or the commercial license by Genivia Inc. Use option -l for details.

Saving Library.h

Reading type definitions from type map file 'c:\tools\gsoap_2.8.6\gsoap-2.8\gsoap\typemap.dat'

Reading file 'Library.xsd'...
Done reading 'Library.xsd'

To complete the process, compile with:
> soapcpp2 Library.h
or to generate C++ proxy and object classes:
> soapcpp2 -j Library.h

We just need to follow the instruction given in the report above - call soapcpp2:

c:\test\gSOAP_Test1>soapcpp2 -I "c:\tools\gsoap_2.8.6\gsoap-2.8\gsoap\import" Library.h

** The gSOAP code generator for C and C++, soapcpp2 release 2.8.6
** Copyright (C) 2000-2011, Robert van Engelen, Genivia Inc.
** All Rights Reserved. This product is provided "as is", without any warranty.

** The soapcpp2 tool is released under one of the following two licenses:
** GPL or the commercial license by Genivia Inc.

Saving soapStub.h annotated copy of the input declarations
Saving gt.nsmap namespace mapping table
Saving soapH.h interface declarations
Saving soapC.cpp XML serializers

Compilation successful

c:\test\gSOAP_Test1>

As we can see in the report, class declarations are in the following header:

soapStub.h (irrelevant parts omitted):



Comparing the original XML document with the generated classes, we can see the parallel between them: element types are mapped to classes; single child nodes and attributes are mapped to class members; a sequence is mapped to a vector. The _gt__Library class matches the Library element. Its members, Books and Staff, match the XML elements of the same name. In XML, these nodes are of the sequence type, so their C++ implementations (_gt__Library_Books and _gt__Library_Staff) use an STL collection type (vector) to model them. The Books element contains Book elements, so the vector member of _gt__Library_Books contains elements of type _gt__Library_Books_Book. In the same way, the Librarian element is mapped to the _gt__Library_Staff_Librarian class, and the vector in _gt__Library_Staff contains its instances.
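To make the mapping concrete, here is a minimal sketch of the shape such classes take. It is not the generated soapStub.h: the type names are simplified stand-ins for the _gt__... names, and the member names and types (Title, Author, ISBN, CopiesAvailable, Name) are assumptions based on the program output shown later in this article.

```cpp
#include <string>
#include <vector>

// Simplified stand-ins for the generated classes (assumed layout, not the
// real soapStub.h). The real names carry the prefix, e.g. _gt__Library_Books_Book.
namespace sketch_mapping {

struct LibraryBooksBook {            // one Book element
    std::string Title;
    std::string Author;
    std::string ISBN;
    unsigned int CopiesAvailable = 0;
};

struct LibraryBooks {                // the Books element: a sequence of Book
    std::vector<LibraryBooksBook> Book;
};

struct LibraryStaffLibrarian {       // one Librarian element
    std::string Name;                // hypothetical member, for illustration only
};

struct LibraryStaff {                // the Staff element: a sequence of Librarian
    std::vector<LibraryStaffLibrarian> Librarian;
};

struct Library {                     // the root element
    LibraryBooks Books;
    LibraryStaff Staff;
};

// Adding a Book element is just appending an object to the vector.
inline void addBook(Library& lib, const std::string& title,
                    const std::string& author, const std::string& isbn,
                    unsigned int copies) {
    LibraryBooksBook book;
    book.Title = title;
    book.Author = author;
    book.ISBN = isbn;
    book.CopiesAvailable = copies;
    lib.Books.Book.push_back(book);
}

}  // namespace sketch_mapping
```

Each nesting level of the XML becomes one class, and each repeated child becomes a vector member named after the child element.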

Class names look a bit ugly, but that is because they are made by joining the namespace prefix ("gt") and the element type name. If a namespace isn't specified in the schema, gSOAP uses a generic prefix, "ns1". If an element name contains an underscore, that character is replaced with "_USCORE", because gSOAP already uses a plain underscore to represent a hyphen [source].
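The naming rule can be sketched as a small function. This is only an illustration of the two character substitutions mentioned above and of the prefix joining visible in the class names in this article; both function names are hypothetical, and gSOAP's real naming scheme has more rules than this.

```cpp
#include <string>

// Illustrative only: implements just the two character rules mentioned above
// (hyphen -> "_", underscore -> "_USCORE"), not gSOAP's complete naming scheme.
std::string xmlNameToIdentifier(const std::string& xmlName) {
    std::string id;
    for (char c : xmlName) {
        if (c == '-') {
            id += '_';
        } else if (c == '_') {
            id += "_USCORE";
        } else {
            id += c;
        }
    }
    return id;
}

// Hypothetical helper: joins a prefix and an element name the way the class
// names in this article look, e.g. ("gt", "Library") gives "_gt__Library".
std::string elementClassName(const std::string& prefix,
                             const std::string& elementName) {
    return "_" + prefix + "__" + xmlNameToIdentifier(elementName);
}
```

With this sketch, a name such as copies-available becomes copies_available, while copies_available becomes copies_USCOREavailable, so the two XML names never collide in C++.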

For a Russian doll styled schema, class members and vector elements are objects. This is not the case for schemas designed in the Salami slice or Venetian blind styles: there, class members and vector elements are pointers to objects. This happens even if the minOccurs attribute is set to 1. I don't know how to force gSOAP to generate classes that use composition regardless of the design pattern of the schema provided.

I found one explanation of gSOAP's reasoning: gSOAP generates pointers when it needs to be able to represent a NULL value. If you have defined an element with minOccurs = "0", then you will get a pointer generated in your code. You can then inspect this pointer. If it is NULL, then you know that the element is not present. Conversely, you can choose to set the pointer, or not, to indicate that the element is present or not.

Another author says: there are many different ways to define XML Schemas and the design choice can seriously impact the generation of implementation classes in the technology of your choice. There are different schema design styles such as the Russian Doll, Venetian Blind and Garden of Eden that can be followed.
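The explanation about NULL pointers can be illustrated in plain C++. This is a hand-written sketch, not generated code: the Book struct and its Subtitle member are hypothetical and exist only to show a required member stored by value next to an optional one stored as a pointer.

```cpp
#include <string>

namespace sketch_optional {

struct Book {
    std::string Title;                 // required element: stored by value
    std::string* Subtitle = nullptr;   // optional element: nullptr means "not present"
};

// Reads the optional member only after checking that it is present.
inline std::string describe(const Book& book) {
    if (book.Subtitle == nullptr) {
        return book.Title;
    }
    return book.Title + ": " + *book.Subtitle;
}

}  // namespace sketch_optional
```

The cost of this convention is that every access to a pointer member needs a null check, which is why classes generated from a Russian doll schema are more convenient to work with.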

Another generated header is soapH.h (snippet):



We can use these two generated functions to read the content of the root element (Library) into an object and to write it back to the XML document. The gSOAP compiler has done a great job for us!

Anyway, let's see what we can do with generated classes.

Before compiling your test project, make sure you have added the following paths to Additional Include Directories in Project Settings: c:\DEVELOPMENT\Toolkits\gsoap_2.8.6\gsoap-2.8\gsoap; c:\DEVELOPMENT\Toolkits\gsoap_2.8.6\gsoap-2.8\gsoap\import. Also, make sure you've included soapH.h, soapStub.h, soapC.cpp and stdsoap2.cpp into the project.

In order to use the gSOAP engine, we need to create an instance of the gSOAP runtime context, struct soap. There is a sequence of calls that initialize and clean up this object, so I wrapped it in a RAII-compliant class (CScopedSoap), which makes its usage exception safe.

Notice how easy it is to modify data when we are dealing with objects instead of digging through and traversing a DOM tree. Adding a new Book is nothing more than adding a new _gt__Library_Books_Book object at the end of the vector!
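The RAII idea can be shown without the toolkit. The sketch below is not the CScopedSoap class itself: FakeContext, acquireContext() and releaseContext() are stand-ins invented for this example so that it compiles on its own. With the real toolkit, acquisition would typically be a call such as soap_new() and release a sequence such as soap_destroy(), soap_end() and soap_free().

```cpp
#include <stdexcept>

namespace sketch_raii {

// Stand-in for gSOAP's struct soap, so this sketch compiles without the toolkit.
struct FakeContext {
    bool open = false;
};

int g_liveContexts = 0;  // contexts acquired but not yet released

FakeContext* acquireContext() {
    FakeContext* ctx = new FakeContext;
    ctx->open = true;
    ++g_liveContexts;
    return ctx;
}

void releaseContext(FakeContext* ctx) {
    delete ctx;
    --g_liveContexts;
}

// Acquires the context in the constructor and releases it in the destructor,
// so cleanup also runs when an exception unwinds the stack.
class ScopedContext {
public:
    ScopedContext() : ctx_(acquireContext()) {}
    ~ScopedContext() { releaseContext(ctx_); }

    // Non-copyable: exactly one owner per context.
    ScopedContext(const ScopedContext&) = delete;
    ScopedContext& operator=(const ScopedContext&) = delete;

    FakeContext* get() const { return ctx_; }

private:
    FakeContext* ctx_;
};

// Simulates a failure while the context is in use.
void workThatThrows() {
    ScopedContext ctx;
    throw std::runtime_error("parse failed");
}

}  // namespace sketch_raii
```

Because the destructor does the cleanup, no code path that uses the wrapper has to remember it, including the path taken when workThatThrows() exits through an exception.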

main.cpp:



The patch I applied in the LoadXML() function is necessary because soap_read__gt__Library() for some reason sets the mode of the standard input stream to BINARY, although it reads data from a file stream, and never reverts stdin's mode back to TEXT. As a consequence, getline() returns a string with a Carriage Return character at the end, and that character appears in new elements inserted into the XML document. I've posted a question about this on Stack Overflow and will update this article as soon as I clarify gSOAP's behaviour in this case.
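A portable way to guard against the stray Carriage Return, independent of the stream mode, is to strip it after reading each line. This is an alternative sketch, not the patch used in LoadXML(); the function names getlineNoCR() and demoStripCR() are made up for this example.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads one line and drops a trailing '\r' if present.
// Returns false when no line could be read.
inline bool getlineNoCR(std::istream& in, std::string& line) {
    if (!std::getline(in, line)) {
        return false;
    }
    if (!line.empty() && line.back() == '\r') {
        line.pop_back();
    }
    return true;
}

// Small demonstration on an in-memory stream that simulates binary-mode input.
inline std::string demoStripCR() {
    std::istringstream input("Effective C++\r\n");
    std::string title;
    getlineNoCR(input, title);
    return title;
}
```

This treats the symptom rather than the cause, so it is useful mainly when the stream mode cannot be restored.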

This is the application's output:

Displaying all books in the library:

Book:
Title:Clean Code
Author:Robert C. Martin
ISBN: 0132350882
Copies available: 2

Book:
Title:The Pragmatic Programmer
Author:Andrew Hunt
ISBN: 020161622X
Copies available: 0

Book:
Title:Design patterns
Author:Erich Gamma
ISBN: 0201633612
Copies available: 1


Adding a new book:
Title: Effective C++
Author: Scott Meyers
ISBN:0321334876
Copies available: 50


Displaying all books in the library:

Book:
Title:Clean Code
Author:Robert C. Martin
ISBN: 0132350882
Copies available: 2

Book:
Title:The Pragmatic Programmer
Author:Andrew Hunt
ISBN: 020161622X
Copies available: 0

Book:
Title:Design patterns
Author:Erich Gamma
ISBN: 0201633612
Copies available: 1

Book:
Title:Effective C++
Author:Scott Meyers
ISBN: 0321334876
Copies available: 53

And the XML file now contains a new Book element:

library.xml:



2 comments:

softweyr said...

Thanks for this blog post. I've been searching for a simple example of how to use gSOAP to serialize to a file (or other I/O stream) for days, and finally tripped across your example. I did discover a buglet in the code, though.

The XSD generated by VS isn't quite right. It assigns the type xs:unsignedByte to CopiesAvailable tag. At least on Mac OS, this causes problems in correctly serializing the value read from cin; I get 49 for '1', 50 for '2', etc. If you correct the XSD to type xs:unsignedInt, it works correctly.

Anyhow, thanks for a great simple example. I'm going to finish my example -- using gSOAP as a serializer for ZeroMQ queues, and link a trackback to your post, since you got me started.

vhyom said...

Can you help me generate code that does not check for a namespace prefix? You used 'gt' in your example; I want no prefix to be used.