XML data binding

XML data binding refers to a means of representing information in an XML document as a business object in computer memory. This allows applications to access the data in the XML from the object, rather than using the DOM or SAX to retrieve the data from a direct representation of the XML itself.

It makes it possible to read and write XML data through a class library, written in a programming language such as C++, C# or Java, that is created specifically for a given XML data format.^[1] Whilst it is possible to write such a library by hand, XML data binding tools generate the source code that performs these tasks.

Description

An XML data binder accomplishes this by automatically creating a mapping between elements of the XML schema of the document we wish to bind and members of a class to be represented in memory.

When this process is applied to convert an XML document to an object, it is called unmarshalling (also called deserialization). The reverse process, to serialize an object as XML, is called marshalling.
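The two directions can be illustrated with a minimal hand-written sketch in Python, using only the standard library. The `Person` class, its fields, and the element names are hypothetical and chosen for illustration; a real data binding tool would generate equivalent code rather than require it to be written by hand.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical business object; field names mirror the XML element names.
    name: str
    age: int


def unmarshal(xml_text: str) -> Person:
    """XML document -> object (unmarshalling / deserialization)."""
    root = ET.fromstring(xml_text)
    return Person(name=root.findtext("name"), age=int(root.findtext("age")))


def marshal(person: Person) -> str:
    """Object -> XML document (marshalling / serialization)."""
    root = ET.Element("person")
    ET.SubElement(root, "name").text = person.name
    ET.SubElement(root, "age").text = str(person.age)
    return ET.tostring(root, encoding="unicode")


doc = "<person><name>Ada</name><age>36</age></person>"
person = unmarshal(doc)          # application code now works with person.name, person.age
round_tripped = marshal(person)  # back to an XML string
```

The application reads `person.age` as an integer and never touches the element tree directly, which is the point of the binding layer.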

Approaches to data binding can be distinguished as follows:

XML schema-based: Based on an existing XML schema, classes that correspond to the schema are generated.
Class-based: Based on a set of classes to be serialized, a corresponding XML schema is generated.
Mapping-based: A mapping description, usually itself an XML document, describes how an existing XML schema maps to a set of classes, and vice versa.
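As a rough illustration of the mapping-based approach, the following Python sketch keeps the mapping description separate from both the class and the XML. The `Book` class, the `BOOK_MAPPING` table, and the `bind`/`unbind` helpers are hypothetical; the mapping is written as a Python dictionary here for brevity, whereas real tools usually read it from an external file.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Book:
    # Existing class whose field names differ from the XML element names.
    title: str
    pages: int


# Hypothetical mapping description: XML element name -> (field name, converter).
BOOK_MAPPING = {
    "Title": ("title", str),
    "PageCount": ("pages", int),
}


def bind(element, cls, mapping):
    """Build an instance of cls from an element by following the mapping."""
    kwargs = {}
    for tag, (field, convert) in mapping.items():
        kwargs[field] = convert(element.findtext(tag))
    return cls(**kwargs)


def unbind(obj, root_tag, mapping):
    """Build an element from an object by following the same mapping in reverse."""
    root = ET.Element(root_tag)
    for tag, (field, _convert) in mapping.items():
        ET.SubElement(root, tag).text = str(getattr(obj, field))
    return root


source = "<Book><Title>XML Basics</Title><PageCount>120</PageCount></Book>"
book = bind(ET.fromstring(source), Book, BOOK_MAPPING)
xml_again = ET.tostring(unbind(book, "Book", BOOK_MAPPING), encoding="unicode")
```

Because the mapping is data rather than code, neither the class nor the schema has to be changed to fit the other.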

Difficulties

Since XML is a document-oriented format and objects are (usually) not document-oriented, simple XML data binding mappings may ignore some of the structural information embedded in an XML document. Specifically, information such as comments, XML entity references, and sibling order may not be preserved in the object representation created by the binding application. However, this is not always the case; sufficiently powerful XML data binding tools are capable of preserving 100% of the information stored in an XML document.
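The loss can be seen in a small Python sketch, assuming a deliberately simple binding that groups child elements by name. The document, the dictionary layout, and the `marshal` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A comment, and two <to> siblings separated by a <body> element.
source = "<note><!-- draft --><to>Bob</to><body>Hi</body><to>Eve</to></note>"

# Simple binding: collect values by element name.
# The default ElementTree parser already discards the comment.
root = ET.fromstring(source)
note = {
    "to": [e.text for e in root.findall("to")],
    "body": root.findtext("body"),
}


def marshal(note):
    """Write the object back out; the original interleaving is unknown here."""
    out = ET.Element("note")
    for recipient in note["to"]:
        ET.SubElement(out, "to").text = recipient
    ET.SubElement(out, "body").text = note["body"]
    return ET.tostring(out, encoding="unicode")


result = marshal(note)
```

The data values survive the round trip, but the comment is gone and the `<to>` elements are no longer interleaved with `<body>`, so `result` differs from `source`.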

Similarly, since objects residing in computer memory are not inherently sequentially stored, and may include links to other objects (including self-referential links), simple XML data binding mappings may not be capable of preserving all the information about an object when it is marshalled to XML. However, sufficiently powerful data binding tools perform graph structure analysis on objects residing in memory to marshal (cyclic) object graph structures in XML by utilizing standard XML reference attributes.
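One way to picture this is the Python sketch below, which marshals a cyclic chain of objects by giving each object an identifier the first time it is visited and emitting a reference on later visits. The `Node` class and the `id`/`ref` attribute names are assumptions made for this example and are not the format of any particular tool.

```python
import xml.etree.ElementTree as ET


class Node:
    def __init__(self, name):
        self.name = name
        self.next = None  # link to another Node, possibly forming a cycle


def marshal(node, seen=None):
    """Serialize a chain of nodes; already-visited nodes become references."""
    seen = {} if seen is None else seen
    if id(node) in seen:
        # Second visit: emit a reference instead of recursing forever.
        return ET.Element("node", {"ref": seen[id(node)]})
    xml_id = f"n{len(seen) + 1}"
    seen[id(node)] = xml_id
    elem = ET.Element("node", {"id": xml_id, "name": node.name})
    if node.next is not None:
        elem.append(marshal(node.next, seen))
    return elem


def unmarshal(elem, by_id=None):
    """Rebuild the object graph, resolving references to earlier nodes."""
    by_id = {} if by_id is None else by_id
    if "ref" in elem.attrib:
        return by_id[elem.get("ref")]
    node = Node(elem.get("name"))
    by_id[elem.get("id")] = node  # register before descending so cycles resolve
    child = elem.find("node")
    if child is not None:
        node.next = unmarshal(child, by_id)
    return node


a, b = Node("a"), Node("b")
a.next, b.next = b, a  # a -> b -> a forms a cycle
xml_text = ET.tostring(marshal(a), encoding="unicode")
restored = unmarshal(ET.fromstring(xml_text))
```

After the round trip, `restored.next.next` is the same object as `restored`, so the cycle is preserved rather than duplicated.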

Alternatives

An alternative approach to automatic data binding relies instead on hand-crafted XPath expressions that extract data from XML. This approach has both benefits and drawbacks. One benefit is that it needs only proximate knowledge (e.g., topology, tag names, etc.) of the XML tree structure, which developers can determine by looking at the XML data. Furthermore, XPath allows the application to bind the relevant data items and filter out everything else, avoiding the unnecessary processing that would be required to completely unmarshal the entire XML document. The main drawback is the lack of automation in implementing the object model and XPath expressions. Instead, the application developers have to create these artifacts manually, which is time-consuming, potentially error-prone, and hampers application maintenance when XML schemas and XML content models are updated. Another drawback is the lack of XML schema verification, which XML data bindings typically apply automatically during unmarshalling. Schema validity is typically required in secure applications.
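A minimal sketch of the XPath-based alternative in Python follows. It uses the limited XPath subset supported by the standard library's `xml.etree.ElementTree`; the `orders` document and its element and attribute names are invented for the example.

```python
import xml.etree.ElementTree as ET

xml_text = """
<orders>
  <order status="open"><id>17</id><total>9.50</total></order>
  <order status="closed"><id>18</id><total>20.00</total></order>
  <order status="open"><id>19</id><total>4.25</total></order>
</orders>
"""

root = ET.fromstring(xml_text)

# Hand-written path expression: only the totals of open orders are extracted.
# Everything else in the document is ignored and nothing is schema-validated.
open_totals = [
    float(total.text)
    for total in root.findall("./order[@status='open']/total")
]
```

The expression has to be kept in step with the XML structure by hand: renaming `total` in the documents would silently yield an empty list rather than an error.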

Data binding in general

One of XML data binding's strengths is the ability to deserialize objects across programs, languages, and platforms.^[2] You can dump a time series of structured objects from a datalogger written in C on an embedded processor, bring it across the network to process in Perl and finally visualize in Octave. The structure and the data remain consistent and coherent throughout the journey, and no custom formats or parsing are required. This is not unique to XML. YAML, for example, is emerging as a powerful data-binding alternative to XML. JSON (which can be regarded as a subset of YAML) is often suitable for lightweight or restricted applications.
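The following Python sketch binds the same small time series to both XML and JSON to show that the idea does not depend on the format. The `Sample` class and the element, attribute, and key names are hypothetical.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass


@dataclass
class Sample:
    # Hypothetical datalogger record: a timestamp and a measured value.
    t: float
    value: float


def to_xml(samples):
    root = ET.Element("series")
    for s in samples:
        ET.SubElement(root, "sample", {"t": repr(s.t), "value": repr(s.value)})
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    return [Sample(float(e.get("t")), float(e.get("value")))
            for e in ET.fromstring(text)]


def to_json(samples):
    return json.dumps([asdict(s) for s in samples])


def from_json(text):
    return [Sample(**record) for record in json.loads(text)]


samples = [Sample(0.0, 1.5), Sample(1.0, 1.7)]
via_xml = from_xml(to_xml(samples))
via_json = from_json(to_json(samples))
```

Both round trips return objects equal to the originals; only the wire format differs.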

XML data binding frameworks

Name	Programming Language	License	First release	Last stable release	Code generation from XSD	Custom mapping	Note
Apache Commons Betwixt	Java	Apache	January 28, 2003 (2003-01-28)	0.8	Unknown	Unknown	Dormant. Serializes objects to XML without requiring an XML schema definition
Apache XMLBeans	Java	Apache License 2.0		5.1.1, August 29, 2022 (2022-08-29)	Yes	Unknown
Castor	Java	Apache 2.0		1.4.1, May 15, 2016 (2016-05-15)	Unknown	Unknown	Earlier versions also supported Java-to-SQL persistence but this has since been forked into a separate project
CodeSynthesis XSD	C++	GNU GPL and proprietary		4.0.0, July 22, 2014 (2014-07-22)	Unknown	Unknown	with SAX or tree-like mapping into C++ classes
gSOAP	C and C++	GNU GPL and proprietary	December 8, 2000 (2000-12-08)	2.8.131, September 23, 2023 (2023-09-23)	Yes	Yes	Supports XML schema, WSDL, and SOAP; XML schemas are not required to serialize C/C++ data to XML; custom mapping of XML schema types to C/C++ types via a type mapping file and from C/C++ types to compatible XML schema types by source code annotation
Java Architecture for XML Binding (JAXB)	Java	?			Yes	Yes
JiBX	Java	BSD License		1.2.6, January 1, 2015 (2015-01-01)	Yes	Yes	Maps classes to XML schemas via bytecode manipulation
Liquid XML Data Binder	C++, C#, Java, Visual Basic .NET, Visual Basic 6 (COM)	Freeware and proprietary	June 1, 2001 (2001-06-01)	June 18, 2024 (2024-06-18)	Yes	Yes	Supports XML schema (XSD), DTD, XDR, WSDL. Serializes XML to JSON and JSON to XML.
Liquid XML Objects	C# and Visual Basic .NET (Supports XSD 1.1)	Freeware and proprietary	March 3, 2019 (2019-03-03)	June 18, 2024 (2024-06-18)	Yes	Yes	Direct replacement for XSD.exe. Integrated within Microsoft Visual Studio. Supports XML schema (XSD 1.0 and XSD 1.1), DTD, WSDL. Serializes XML to JSON and JSON to XML.
Simple	Java	Apache 2.0		2.7.1, February 9, 2017 (2017-02-09)	No	Yes
System.Xml.Serialization	C#	?			Yes	No	Part of the .NET framework, contains XML data binding classes; includes `xsd.exe` tool to generate classes from XSD schema
xmlbeansxx	C++	Apache 2.0		0.9.1, April 1, 2008 (2008-04-01)	Unknown	Unknown	C++ port of Apache XMLBeans
XStream	Java	BSD-style license	January 1, 2004 (2004-01-01)	1.4.10, May 23, 2017 (2017-05-23)	Unknown	Unknown	Also capable of serializing to JSON
Zeus	Java	?		3.5 beta, August 16, 2002 (2002-08-16)	Unknown	Unknown