Introducing AXIOM: The Axis Object Model

May 10, 2005

Contents
Introduction
Installing AXIOM
AXIOM Architecture
Using AXIOM
   Creating AXIOM from Scratch
   Adding Elements, Attributes, and Namespaces
   Serializing
   Building from an Existing Source
   Accessing the XML Infoset
   Getting StAX Events from an Element
Conclusion
Resources

Introduction

XML has become one of the major technologies behind today's
business integration software. Many object models are in use
for manipulating XML in various ways. AXIOM aims to
improve XML manipulation by providing a new lightweight object
model built around pull parsing, enabling efficient and easy
manipulation of XML. AXIOM is the object model for
Apache Axis 2, the next
generation of the Apache web services engine. AXIOM differs
from existing XML object models in various ways, the major one
being the way it incrementally builds the memory model of an
incoming XML source.

AXIOM itself does not contain a parser and it depends on
StAX for
input and output.

This tutorial will first show you how to obtain AXIOM and it
will then go through the fundamental features of the AXIOM
architecture. You will learn how to create XML documents from
scratch, using elements, attributes, element content ("texts"), and
namespaces. You will see how to read and write XML files from and
to disk.

Installing AXIOM

AXIOM comes bundled with the Axis2 M1 release. The lib directory
contains the axis-om-m1.jar file. However, more adventurous users can
download the latest source, via Subversion, from the Apache Axis2
project and build the sources using Maven. AXIOM is maintained
under the xml module of Apache Axis2. One can find
more information at the Axis2 Subversion site.

AXIOM Architecture

AXIOM uses StAX reader and writer interfaces to interact with
the external world, as shown in Figure 1. However, you can still
use SAX and DOM to interact with AXIOM. Use of the standard StAX
interfaces will enable AXIOM to interact with any kind of input
source, be it an input stream, file, standard data binding tool,
etc.

Figure 1. AXIOM interaction

Now let's take a deeper look at AXIOM architecture.

AXIOM uses a "builder" that will build the XML object model in
memory, according to the events pulled from the underlying StAX
parser, but will not create the entire object model at once.
Instead, it only builds when the relevant information is
absolutely required. This builder concept is the key to the most
promising feature, the deferred building support for AXIOM. The
builder comes into the picture when you are building an object
model from an existing resource. If you build the object model
programmatically, then you don't have to use builders.

This builder can optionally hand the events generated by the
StAX parser directly to the user, with or without building the
object model along the way. This feature is called caching in AXIOM. It enables one
to work at the event level, minimizing the memory requirement, or
to work with the object model, improving performance.

If one opts to set the cache on (i.e., to build the
object model while pulling events), the infoset can later be retrieved
through the AXIOM API. At any particular time, the XML object
model will be either partially built or fully built. This concept
is new to the XML processing world. The AXIOM builder builds the object
model only to the extent required by the user, rather than
building the whole model at once. For example, take the following
XML fragment.

[prettify]<Employees>
  <Employee>
    <Name>Eran Chinthaka</Name>
    <Project>Axis2</Project>
    <WorkPlace>Ambalangoda, Sri Lanka</WorkPlace>
  </Employee>
  <Employee>
    <Name>Ajith Harshana</Name>
    <Project>Axis2</Project>
    <WorkPlace>Kuliyapitiya, Sri Lanka</WorkPlace>
  </Employee>
</Employees>
[/prettify]

Say the user wants to get the project of the first employee. This
will make the builder build the object structure representing only
up to the fourth line of the XML fragment. The rest will be kept
"untouched" in the stream and the object structure contains only up
to line 4. Then, if the user wants to know the project of the
second employee, the builder builds only up to line 9. All of these things
will happen transparently to the user, simply providing better
performance.
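
The idea of reading only as far as needed can be illustrated with plain StAX, without any AXIOM classes. The following is a minimal sketch, assuming a compact copy of the fragment above; the class DeferredPullSketch and its methods are hypothetical names, and it only counts pulled events rather than building an object model.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of AXIOM: pulls StAX events only as far as needed.
public class DeferredPullSketch {
    static final String EMPLOYEES =
        "<Employees>"
        + "<Employee><Name>Eran Chinthaka</Name><Project>Axis2</Project>"
        + "<WorkPlace>Ambalangoda, Sri Lanka</WorkPlace></Employee>"
        + "<Employee><Name>Ajith Harshana</Name><Project>Axis2</Project>"
        + "<WorkPlace>Kuliyapitiya, Sri Lanka</WorkPlace></Employee>"
        + "</Employees>";

    private final XMLStreamReader reader;
    private int pulls = 0; // number of reader.next() calls made by nextProject()

    DeferredPullSketch(String xml) {
        try {
            reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Advances to the next Project element and returns its text, or null if none is left. */
    String nextProject() {
        try {
            while (reader.hasNext()) {
                pulls++;
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "Project".equals(reader.getLocalName())) {
                    return reader.getElementText(); // leaves the reader at </Project>
                }
            }
            return null;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    int pulls() {
        return pulls;
    }

    public static void main(String[] args) {
        DeferredPullSketch sketch = new DeferredPullSketch(EMPLOYEES);
        // The first request stops inside the first Employee; the rest stays in the stream.
        System.out.println(sketch.nextProject() + " after " + sketch.pulls() + " pulls");
        // The second request resumes where the first one stopped.
        System.out.println(sketch.nextProject() + " after " + sketch.pulls() + " pulls");
    }
}
```

Each call resumes from the position the previous call left behind, which is the behavior the builder gives an AXIOM user transparently.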

The relationship of the builder to the XML data and the object
model is shown in Figure 2.

Figure 2. AXIOM architecture

One of the most interesting things about AXIOM is that the model
discussed so far is not by any means dependent on a particular
programming language. Therefore, the AXIOM architecture can be
implemented using any programming language that has an
implementation of StAX. Moreover, this concept does not talk about
the memory representation of the object model. The Axis2 project
contains an implementation of the concept with a linked list object
model, which has proven to be lightweight, fast, and efficient
compared to other object models. There was another parallel effort
made to implement this concept using a table model as well, which
is now in the scratch area of the Apache Axis2 project. Even though
the current major implementation of AXIOM uses a linked list model,
one can implement the same concept using any other suitable memory
model, as well.
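
To make the phrase "linked list object model" concrete, here is a minimal sketch of such a structure. It is an illustration only, not the Axis2 implementation: the Node class and its fields are hypothetical, and real AXIOM nodes also carry namespaces, attributes, and a reference to their builder.

```java
import java.util.ArrayList;
import java.util.List;

public class LinkedNodeSketch {
    /** Hypothetical node: linked to its parent, its first and last child, and its next sibling. */
    static final class Node {
        final String name;
        Node parent;
        Node firstChild;
        Node lastChild;
        Node nextSibling;

        Node(String name) {
            this.name = name;
        }

        /** Appends a child in constant time by relinking the tail of the sibling chain. */
        Node addChild(Node child) {
            child.parent = this;
            if (firstChild == null) {
                firstChild = child;
            } else {
                lastChild.nextSibling = child;
            }
            lastChild = child;
            return this;
        }

        /** Walks the sibling chain and collects the child names in document order. */
        List<String> childNames() {
            List<String> names = new ArrayList<>();
            for (Node n = firstChild; n != null; n = n.nextSibling) {
                names.add(n.name);
            }
            return names;
        }
    }

    public static void main(String[] args) {
        Node employee = new Node("Employee")
                .addChild(new Node("Name"))
                .addChild(new Node("Project"))
                .addChild(new Node("WorkPlace"));
        System.out.println(employee.childNames()); // prints [Name, Project, WorkPlace]
    }
}
```

A chain like this can be extended one node at a time as events arrive, which is one reason a linked structure fits deferred building well.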

AXIOM comes bundled with several builders:

  • StAXOMBuilder: This will build a generic memory
    model from any XML input source, such as a file, string, stream,
    etc.
  • StAXSOAPModelBuilder: This will build an object
    structure of SOAP XML in memory, which can be accessed using an
    "SOAPish" API. For example, when using it, you get a
    SOAPEnvelope class, with which you can call methods
    like getHeaders() and getBody(). But this
    API is still an extension of the generic AXIOM API. This is the
    model mainly used within the Axis2 project.
  • MTOMBuilder: This can be regarded as the first
    implementation of MTOM (SOAP Message Transmission Optimization
    Mechanism), the new W3C specification for sending binary attachments
    in an optimized form. The latest AXIOM sources have full
    support for MTOM.

Please note that the current AXIOM implementation lacks support
for processing instructions and DTD information items of the XML
infoset. But there is an ongoing effort within the Axis2 team to
provide these features as well.

Using AXIOM

Creating AXIOM from Scratch

You can create an AXIOM using different methods. Let's try to do
it programmatically this time.

[prettify]import org.apache.axis.om.OMElement;
import org.apache.axis.om.OMFactory;

public class FirstExample {
    public static void main(String[] args) {
        OMElement documentElement =
            OMFactory.newInstance().createOMElement(
                               "MyDocumentElement",
                               "http://chinthaka.org",
                               "myPrefix");
        documentElement.setValue("Sample Text");
    }
}
[/prettify]

The first statement obtains an OMFactory (remember, the
"OM" in AXIOM stands for "object model"). The
OMFactory makes it possible to switch between different Java
implementations of AXIOM. For example, I mentioned earlier that the
current implementation is based on a linked list model. But if
someone needed to use her own implementation of the AXIOM API, she
could do that without touching a single line that uses those
classes. The OMFactory.newInstance() method is smart
enough to pick up the first implementation of AXIOM from the
classpath. For this reason, it is highly recommended that you create new
OM objects using the OMFactory.

Note that we have passed three parameters to create an
OMElement. AXIOM is very much aware of namespaces and
encourages their use, so the method signature is
createOMElement(String localName, String namespaceURI, String namespacePrefix).
If this namespace is already defined in
the scope, AXIOM will assign that to this element, without
declaring a new one.

Text is also treated as a node in AXIOM. You can either
create an OMText and add that to the
OMElement, or you can simply use the
element.setValue() method.

Adding Elements, Attributes, and Namespaces

First, let's create a namespace that can be used later, and then
we'll use it to create the documentElement. Since we are
using the factory for object creation, let's assign that to a new
variable, omFactory, as well.

[prettify]OMFactory omFactory = OMFactory.newInstance();
OMNamespace ns =
    omFactory.createOMNamespace("http://chinthaka.org",
                                "myPrefix");

OMElement documentElement =
    omFactory.createOMElement("MyDocumentElement", ns);

OMElement secondEle =
    omFactory.createOMElement("SecondElement", ns);
secondEle.setValue("Sample Text");
secondEle.insertAttribute("myAttr", "attrValue", ns);
documentElement.addChild(secondEle);

documentElement.declareNamespace("http://something.com",
                                "somePrefix");
[/prettify]

Adding an attribute to an element is as easy as calling
element.insertAttribute(String attrName, String attrValue, OMNamespace ns).
Here you have the option of passing
null as the namespace. The addChild() method
allows you to add either an OMElement or an
OMText.

You can use the declareNamespace() method to add a
new namespace declaration to the element.

Serializing

I mentioned earlier in this article that AXIOM
depends on the StAX interfaces to interact with the external world. For
serializing, AXIOM uses the StAX writer interface.

[prettify]try {
    XMLStreamWriter writer =
        XMLOutputFactory.newInstance().createXMLStreamWriter(
                                        System.out);
    documentElement.serialize(writer, false);
    writer.flush();
} catch (XMLStreamException e) {
   e.printStackTrace();
}
[/prettify]

Create a writer to any output stream and call the
serialize() method of an OMElement.
Notice the Boolean flag in the serialize method. It has no effect
in this instance, but it matters when you are building the
object model from an existing resource, like a file, using a
builder. At any given point, you may not have built the whole XML
representation, yet you may want to serialize the whole thing.
Serializing will go through the whole input stream and print
it to an output stream. Once consumed, the input stream cannot be
read again, so you need the option to build the object
model while reading the incoming stream. A true
flag asks the builder to build the object model while
serializing, and a false flag just passes the XML
from the incoming stream to the outgoing stream. This is the "caching"
concept introduced earlier.
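
The effect of the flag can be imitated with plain StAX. The sketch below is not AXIOM code: StreamThroughSketch and its cached list are hypothetical stand-ins, the list of element names plays the role of the object model, and namespaces are ignored to keep it short.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

public class StreamThroughSketch {
    /** Element names remembered when the cache is on; a stand-in for a real object model. */
    final List<String> cached = new ArrayList<>();

    /** Copies the incoming XML to a string, remembering element names only if cache is true. */
    String serialize(String xml, boolean cache) {
        try {
            XMLStreamReader in = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            StringWriter out = new StringWriter();
            XMLStreamWriter writer = XMLOutputFactory.newInstance()
                    .createXMLStreamWriter(out);
            while (in.hasNext()) {
                switch (in.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        // Simplification: prefixes and namespace declarations are not copied.
                        writer.writeStartElement(in.getLocalName());
                        for (int i = 0; i < in.getAttributeCount(); i++) {
                            writer.writeAttribute(in.getAttributeLocalName(i),
                                                  in.getAttributeValue(i));
                        }
                        if (cache) {
                            cached.add(in.getLocalName());
                        }
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        writer.writeCharacters(in.getText());
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        writer.writeEndElement();
                        break;
                    default:
                        break; // comments, processing instructions, etc. are skipped
                }
            }
            writer.flush();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        StreamThroughSketch sketch = new StreamThroughSketch();
        System.out.println(sketch.serialize("<a><b x=\"1\">hi</b></a>", true));
        System.out.println(sketch.cached);
    }
}
```

With the flag off, nothing is retained after the copy; with it on, the same pass over the stream also leaves something behind that can be queried later.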

This is what you will get as the output:

[prettify]<myPrefix:MyDocumentElement
    xmlns:myPrefix="http://chinthaka.org">
     <myPrefix:SecondElement
        myPrefix:myAttr="attrValue">Sample Text
     </myPrefix:SecondElement>
</myPrefix:MyDocumentElement>
[/prettify]

Editor's note: Line breaks and indentation have been added to
the XML to suit the java.net page layout.
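
For comparison, similar markup can be produced with the StAX writer alone, which gives a feel for the calls a serializer has to make underneath. This is a sketch using only the standard javax.xml.stream API; the class name StaxWriterSketch is hypothetical, and it does not show how AXIOM itself drives the writer.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class StaxWriterSketch {
    static final String NS = "http://chinthaka.org";

    /** Writes the two-element sample document and returns it as a string. */
    static String write() {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter writer = XMLOutputFactory.newInstance()
                    .createXMLStreamWriter(out);
            writer.writeStartElement("myPrefix", "MyDocumentElement", NS);
            // The namespace declaration has to be written explicitly on the root.
            writer.writeNamespace("myPrefix", NS);
            writer.writeStartElement("myPrefix", "SecondElement", NS);
            writer.writeAttribute("myPrefix", NS, "myAttr", "attrValue");
            writer.writeCharacters("Sample Text");
            writer.writeEndElement(); // SecondElement
            writer.writeEndElement(); // MyDocumentElement
            writer.flush();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(write());
    }
}
```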

Building from an Existing Source

You can build AXIOM from any input stream corresponding to
XML. Here, the advantage is that you can start building as soon as
you receive the first bit, without waiting to finish the whole
stream.

[prettify]FileReader soapFileReader = new FileReader(fileName); 
XMLStreamReader parser =
    XMLInputFactory.newInstance().createXMLStreamReader(
                                soapFileReader); 
StAXOMBuilder builder =
    new StAXOMBuilder(OMFactory.newInstance(), parser); 
OMElement documentElement = builder.getDocumentElement();
[/prettify]

You have to create a StAX reader from the XML file and then pass
that to the StAXOMBuilder with a reference to the
preferred OMFactory. Then you can get the document
element from that.

The best thing here in AXIOM is that you can mix elements that
are partially built with programmatically built elements. AXIOM
will take care of both types of elements.

Accessing the XML Infoset

Let's see how we can retrieve the children of an element by
specific information such as a QName. You can use an
OMElement method to retrieve its children, given a
QName. This will provide you with an iterator.

For example, let's say you want to find the children with the
local name "project" and namespace URI http://myproject.org.

[prettify]// javax.xml.namespace.QName takes the namespace URI first, then the local name
QName elementQName = new QName("http://myproject.org", "project");
Iterator infoIter =
    documentElement.getChildrenWithName(elementQName);
while (infoIter.hasNext()) {
    OMElement element = (OMElement) infoIter.next();
    System.out.println("Matching Element Name = " +
        element.getFirstElement().getText());
}
[/prettify]

Note here that AXIOM is very much concerned about namespaces, so
one has to provide a QName to retrieve a child.
getChildWithName(QName) will return the first matching
node, while getChildrenWithName(QName) will return an
iterator.

The beauty of the iterator here is that it does
not hold any information until it is asked for it. The iterator asks the
builder to build if and only if it needs information.
There are lots of optimizations like this within AXIOM, to make it
as lightweight as possible without compromising performance.
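
A lazy iterator of this kind can be sketched on top of a StAX reader. The class below is a hypothetical illustration, not AXIOM's iterator: it matches elements anywhere in the document rather than only direct children, assumes the matching elements contain text only, and returns their text instead of element objects.

```java
import java.io.StringReader;
import java.util.Iterator;
import java.util.NoSuchElementException;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Iterates over the text of elements matching a QName, pulling from the parser only on demand. */
public class LazyMatchIterator implements Iterator<String> {
    private final XMLStreamReader reader;
    private final QName wanted;
    private String pending; // a match that was found but not yet returned
    private int pulls = 0;  // number of reader.next() calls made so far

    public LazyMatchIterator(String xml, QName wanted) {
        try {
            this.reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        this.wanted = wanted;
    }

    @Override
    public boolean hasNext() {
        if (pending != null) {
            return true;
        }
        try {
            while (reader.hasNext()) {
                pulls++;
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && wanted.equals(reader.getName())) {
                    pending = reader.getElementText(); // assumes text-only content
                    return true;
                }
            }
            return false;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String result = pending;
        pending = null;
        return result;
    }

    public int pulls() {
        return pulls;
    }

    public static void main(String[] args) {
        String xml = "<p:projects xmlns:p=\"http://myproject.org\">"
                + "<p:project>Axis2</p:project><p:project>Synapse</p:project>"
                + "</p:projects>";
        LazyMatchIterator it =
                new LazyMatchIterator(xml, new QName("http://myproject.org", "project"));
        System.out.println("pulls before first use: " + it.pulls());
        while (it.hasNext()) {
            System.out.println(it.next() + " (pulls so far: " + it.pulls() + ")");
        }
    }
}
```

No event is pulled when the iterator is constructed; the parser advances only when hasNext() or next() is called.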

One more thing to note here is that we have called
element.getFirstElement() to get the first child
element. The method element.getFirstChild(), by contrast,
may return a node of type text if there is leading
whitespace before the children of the element. The
getText() method returns all of the text nodes that are
direct children of an element, irrespective of location. Those two
features were purposely introduced to preserve the full infoset, as
is required by most security implementations.
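
The same distinction is visible at the StAX level, where whitespace between tags is reported as its own event. The sketch below uses only the standard reader API; the class FirstChildSketch and the sample document are assumptions made for illustration, and the second method assumes the root has at least one child element.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class FirstChildSketch {
    static final String XML = "<contributor>\n  <name>Eran</name>\n</contributor>";

    /** Creates a reader and moves it to the root start tag. */
    private static XMLStreamReader atRoot(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        reader.nextTag();
        return reader;
    }

    /** True if the first node inside the root is whitespace-only text. */
    static boolean firstChildIsWhitespace(String xml) {
        try {
            XMLStreamReader reader = atRoot(xml);
            int event = reader.next(); // takes whatever comes first, text included
            return event == XMLStreamConstants.SPACE
                    || (event == XMLStreamConstants.CHARACTERS && reader.isWhiteSpace());
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Local name of the first child element; nextTag() skips the leading whitespace. */
    static String firstElementName(String xml) {
        try {
            XMLStreamReader reader = atRoot(xml);
            reader.nextTag();
            return reader.getLocalName();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstChildIsWhitespace(XML));
        System.out.println(firstElementName(XML));
    }
}
```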

Getting StAX Events from an Element

Let's say that you want to work on the events level and want to
get events of a particular element.

[prettify]XMLStreamReader streamReader = 
    documentElement.getPullParser(true);
[/prettify]

You will be provided with an instance of the StAX stream reader,
which is internally implemented in AXIOM. The Boolean flag is used
to set the cache on or off.

Let's look at how smart AXIOM is when handling a complex
scenario. If the documentElement is half-built,
AXIOM will generate the StAX events from the in-memory object
model, for the built parts. For the rest, it will get the events
directly from the builder and pass them to the user. In this
process, if the user wants the cache on, the builder will build the
object structure while handing over the events to the user.

If one needs to get SAX events from AXIOM, it's just a matter of
writing a converter from StAX events to SAX events, which is very
easy.
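
A minimal sketch of such a converter follows, using only the standard StAX and SAX interfaces. StaxToSaxSketch is a hypothetical name, and the converter is deliberately incomplete: it forwards elements, attributes, and text, but not prefix mappings, comments, or processing instructions.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

public class StaxToSaxSketch {
    /** Pulls every event from the reader and pushes the SAX equivalent to the handler. */
    static void replay(XMLStreamReader reader, ContentHandler handler)
            throws XMLStreamException, SAXException {
        handler.startDocument();
        while (reader.hasNext()) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT: {
                    AttributesImpl attrs = new AttributesImpl();
                    for (int i = 0; i < reader.getAttributeCount(); i++) {
                        attrs.addAttribute(nz(reader.getAttributeNamespace(i)),
                                reader.getAttributeLocalName(i),
                                qName(reader.getAttributePrefix(i), reader.getAttributeLocalName(i)),
                                reader.getAttributeType(i),
                                reader.getAttributeValue(i));
                    }
                    handler.startElement(nz(reader.getNamespaceURI()), reader.getLocalName(),
                            qName(reader.getPrefix(), reader.getLocalName()), attrs);
                    break;
                }
                case XMLStreamConstants.END_ELEMENT:
                    handler.endElement(nz(reader.getNamespaceURI()), reader.getLocalName(),
                            qName(reader.getPrefix(), reader.getLocalName()));
                    break;
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.CDATA:
                    handler.characters(reader.getTextCharacters(),
                            reader.getTextStart(), reader.getTextLength());
                    break;
                default:
                    break; // other event types are not forwarded in this sketch
            }
        }
        handler.endDocument();
    }

    /** SAX expects an empty string, not null, for "no namespace". */
    private static String nz(String s) {
        return s == null ? "" : s;
    }

    private static String qName(String prefix, String local) {
        return (prefix == null || prefix.isEmpty()) ? local : prefix + ":" + local;
    }

    /** Replays the XML through a recording handler and returns a compact trace. */
    static String trace(String xml) {
        StringBuilder log = new StringBuilder();
        DefaultHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes a) {
                log.append('+').append(qName).append(' ');
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                log.append('-').append(qName).append(' ');
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                log.append('\'').append(ch, start, length).append("' ");
            }
        };
        try {
            replay(XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml)), recorder);
        } catch (XMLStreamException | SAXException e) {
            throw new IllegalStateException(e);
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(trace("<a><b>hi</b></a>"));
    }
}
```

Because replay() accepts any XMLStreamReader, the same converter could in principle be fed a reader obtained from an AXIOM element.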

Conclusion

This article introduced you to the AXIOM concept for XML
handling and explained the implementation of it found in the Apache
Axis2 project. The AXIOM API was designed with convenience and
developer-friendliness in mind. I introduced only some of the
methods in AXIOM, and AXIOM is continuously being improved to
provide a better and better implementation. I strongly recommend
that curious users take a peek at the current sources found
under the Apache Axis2 project. That said, note that the current
AXIOM implementation does not provide full infoset support, though our
community has made progress toward making AXIOM a full
infoset-supported object model.

Resources

S. W. Eran Chinthaka is a pioneering member of the Apache Axis2, AXIOM, and Synapse projects, working full time with WSO2 Inc.