
XQuery For Java, An Enabler For SOA

April 19, 2007


Portable data is a main concern in service-oriented architecture (SOA)
but is no longer rocket science since XML has been doing the job
perfectly fine. What is more of a concern are the overall
engineering steps involved in retrieving data from some persistent
store (we do need a data store, as we cannot live with volatile
data), massaging it, and then transforming to a portable format
(XML), adhering to some schema agreed upon by both the consumer and
producer. Any number of combinations of steps can do this job,
but in some cases we need to deal with questions like:

  • What is the rationale behind the various processing steps we
    are doing?
  • Could we have a lighter approach and still enable the data for
  • Should I be writing code if it's not absolutely necessary?

In this article we are going to talk about XQuery and its derivatives
including the XQJ (XQuery API for Java) specification, which is under
development as part of JSR-225: XQuery API for Java.
The first section of this article will
introduce both XQuery and XQJ and equip the reader with some code
and tools to get their hands dirty. Then we will revisit the
questions raised above, taking a particular context as example. We
proceed by first understanding the real pain points experienced by
developers in data transformations and then we take the reader through
a simple case study, again with some working code. Throughout the
article we will use XQuery as implemented by Saxon to demonstrate the
concepts in code. In doing so, we also introduce the Saxon SQL
extensions, with the intention of setting reader expectations for a few
forthcoming implementations.

XQuery: A Primer

XQuery is a declarative query language for XML, playing much the
same role that SQL plays for relational data. XQuery 1.0 is a query
language being developed by the W3C XML Query Working Group.
At least a few of you should be familiar with the SAX and/or DOM
APIs to manipulate XML data. We are happy with these APIs and here
we will look at how XQuery-based XQJ makes life even better
for developers. XQJ will conform to the XQuery 1.0 specification and
will define a set of interfaces and classes that enable an
application to submit XQuery queries to an XML data source and
process the results of these queries. XQJ will also facilitate
submitting XPath 2.0 expressions to an XML data source. In contrast to
a general-purpose programming language like Java or C#, which relies on the
SAX or DOM model to
manipulate XML data, XQuery is specific to one domain:
querying XML. Because of this, a single line in an XML language
like XSLT or XQuery can produce
the same effect as what might take hundreds of lines of Java,
C#, or some other general-purpose language. XQuery is thus a
declarative language, designed up front to work with XML data.
Perhaps we should also look at how XQuery is different from its
counterparts, like XPath and XSLT.

XPath is optimized
for accessing sections or parts of an XML document. Thus we can
immediately use XPath if the requirement is just to select a node
from within an XML document. But XPath cannot return a part of the
selected node (like the node element tag alone, omitting content)
and it cannot create new XML. XSLT includes XPath as a subset to
address XML document parts and also includes many other features.
XSLT can contain variables and namespaces and can create new
documents. XSLT is optimized for recursively processing an XML
document or translating XML into HTML, WML, VoiceXML, etc. But
writing user-defined functions and other common operations is
tedious in XSLT, and XQuery scores here: it can express joins and
sorts, and it can manipulate sequences of values and nodes in
arbitrary order, not just in document order. It is also
easy to write user-defined and recursive functions in XQuery. An
introductory article answers the question "What Is,"
and another one explains "Generating XML and HTML using XQuery."

XQuery API for Java (XQJ)

XQJ defines a set of interfaces and classes that enables a Java
application to submit XQuery queries to an XML data source and
process the results of these queries. Queries may be executed
against individual XML documents or collections of XML documents.
The XQuery standard provides a great degree of freedom for
implementers in how they choose to implement many of its features.
This means different implementations can differ in how they handle
a temporary intermediate result as long as the query produces the
correct, final "answer." A few XQuery implementations are available
now, amongst which Qexo is worth mentioning.
Similarly, Saxon is a collection of XML processing tools by Saxonica for XSLT 2.0, XPath 2.0,
XQuery 1.0, and XML Schema 1.0. Saxon also offers two other APIs for
XQuery processing: Saxon's own native API, and an early
implementation of the XQJ. Saxon is available for both the Java and
.NET platforms as two packages: Saxon-B and Saxon-SA. Saxon-B and
all its features are available under an open source license to all
users, whereas Saxon-SA requires activation by a license key.

XQJ specifies that a data source may be obtained from a JNDI
source or through other means, but is not very clear on the allowable
"other methods." Once instantiated, an XQDataSource
can act as a factory for creating XQuery connection objects,
sequences, and items. The XQDataSource has three
overloaded getConnection() methods to get a
connection, as shown below:

public XQConnection getConnection() throws XQException;

public XQConnection getConnection (java.lang.String username,
    java.lang.String passwd) throws XQException;

public XQConnection getConnection(java.sql.Connection con)
    throws XQException;

The last one is promising because the XQJ spec recommends
attempting to create a connection to an XML data source using an
existing JDBC connection. Even though an XQJ implementation is not
required to support this method, if supported, the XQJ and JDBC
connections will operate under the same transaction context. Once
an XQConnection is retrieved, we can call
prepareExpression() to compile a query. The resulting
XQPreparedExpression object has a method called
executeQuery(), which evaluates the query
and returns an XQSequence. The
XQSequence can act as a cursor with a
next() method that allows us to change the cursor
position, and a getItem() method that allows us to
retrieve the item at the current position. The result of
getItem() is an XQItem object with
methods that allow us to determine the item type and convert the
item into a suitable Java object or value.

One issue with Saxon is that Saxon generally only recognizes its
own implementation of XQJ interfaces.
SaxonXQDataSource is Saxon's XQDataSource
and an XQJ client application has to instantiate a
SaxonXQDataSource directly. There is no factory class,
and hence an application that does not want compile-time references
to the Saxon XQJ implementation needs to instantiate this class
dynamically using the reflection API. We will look at the steps in executing an
XQuery using XQJ in the code listing below.

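String content = null;
XQDataSource ds = new SaxonXQDataSource();

/* or, to obtain the data source through JNDI:
InitialContext ctx = new InitialContext();
XQDataSource ds = (XQDataSource) ctx.lookup("java:comp/env/ddxq/ds");
*/

XQConnection conn = ds.getConnection();
// xquery is a placeholder String holding the query text (an assumption here)
XQPreparedExpression exp = conn.prepareExpression(xquery);
XQResultSequence result = exp.executeQuery();

// next() advances the cursor; it returns false once the sequence is exhausted
while (result.next()) {
        content = result.getItemAsString();
}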
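
The reflection-based instantiation mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the helper name newInstance is hypothetical, and java.util.ArrayList stands in for the data source class so that the sketch runs on a bare JDK. In a real application the class name would be that of Saxon's XQDataSource implementation, typically read from configuration.

```java
class ReflectiveFactorySketch {

    // Loads the named class and calls its public no-argument constructor,
    // so the caller needs no compile-time reference to the implementation.
    static <T> T newInstance(String className, Class<T> expectedType) {
        try {
            Class<?> implementation = Class.forName(className);
            return expectedType.cast(implementation.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for loading an XQDataSource implementation by name.
        java.util.List<?> list = newInstance("java.util.ArrayList", java.util.List.class);
        System.out.println(list.getClass().getName());
    }
}
```

The cast through expectedType keeps the calling code typed against the interface only, which is the point of avoiding the direct constructor call.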

Example Operations Using XQJ

We can't have a detailed discussion on the power of XQuery in an
article like this; nor will we attempt to solve complex problems
here. Instead, this section will introduce simple expressions and
then hook them into XQJ to get the queries evaluated. For any
detailed discussion on XQuery expressions, readers are directed
to the books XQuery from the Experts
and XQuery: Rough Cuts. For the discussions in this section, we will use
this sample XML data.
Let us now look at a few expressions and understand what they will do:

  1. /BOOKLIST/BOOKS/ITEM/TITLE: Retrieves the titles
    of all the book items in the book list.
  2. (/BOOKLIST/BOOKS/ITEM/TITLE)[2]: Retrieves the
    title of the second book item in the book list.
    : Retrieves the author of the book item with
    title "The Big Over Easy" in the book list.
  4. /BOOKLIST/BOOKS/ITEM/@CAT: Retrieves all the
    available categories of book items in the book list.
  5. /BOOKLIST/BOOKS/ITEM[2]/*: Retrieves all the
    elements of the second book item in the book list.
  6. /BOOKLIST/BOOKS/ITEM/*/@*: Retrieves any
    attributes of any elements of book items in the book list.
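
Path expressions like the ones listed above can also be tried without any XQuery engine. The following is a minimal sketch that uses the JDK's built-in javax.xml.xpath API, which supports XPath 1.0 only, so it is neither XQJ nor Saxon. The inline BOOKLIST document and the class and method names are assumptions made for illustration; only the element and attribute names follow the expressions in the list.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PathExpressionSketch {

    // Hypothetical sample data; not the article's actual sample file.
    static final String SAMPLE =
        "<BOOKLIST><BOOKS>"
      + "<ITEM CAT=\"F\"><TITLE>The Big Over Easy</TITLE><AUTHOR>Jasper Fforde</AUTHOR></ITEM>"
      + "<ITEM CAT=\"S\"><TITLE>Second Title</TITLE><AUTHOR>Some Author</AUTHOR></ITEM>"
      + "</BOOKS></BOOKLIST>";

    // Evaluates the expression against SAMPLE and returns the text of each selected node.
    static List<String> select(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(SAMPLE)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Cannot evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("/BOOKLIST/BOOKS/ITEM/TITLE"));
        System.out.println(select("(/BOOKLIST/BOOKS/ITEM/TITLE)[2]"));
        System.out.println(select("/BOOKLIST/BOOKS/ITEM/@CAT"));
    }
}
```

Note the parentheses in the second expression: they select the second TITLE across the whole list, whereas ITEM/TITLE[2] would look for a second TITLE inside each ITEM.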

To get the sample code working, download the attached file (see the Resources section for sample code), and unzip it
to some folder in your local file system. Go to the
PathExpressions directory, and type ant run,
which will print out the results of the above XQuery into the
console, as shown in Figure 1.

Figure 1. XQJ example operations

Hierarchical: Relational Impedance Mismatch

How many times in your life have you converted objects into XML
format and vice versa? We have been doing this for many years, and
continue today. Most of the time, the business tier exposes data as
XML, either in SOAP format or in some other ad hoc XML format, in
which case we don't care about the interoperability of our data
with some client that is consuming the data. Needless to say, we
have been also using relational databases for many years as our
safe, transaction-aware, and concurrently-accessible data stores.
Hmm. Now we need an object-relational (OR) mapping tool (like
Hibernate, Toplink,
etc.) to convert our relational data to Java objects, and then some
Java-XML binding tools (like Castor, XML Beans, etc.) to convert
Java objects to XML and vice versa. The full dynamic is shown in
Figure 2:

Figure 2. Data transformation dynamics
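
To make the two conversion hops concrete, here is a hand-written sketch of the chain in plain Java. It is not how Hibernate, Castor, or any other tool works internally; it only mimics what each hop has to accomplish. The Customer class, the method names, and the sample values are assumptions for illustration, and a Map stands in for a row read from the database.

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class TwoHopConversionSketch {

    // The intermediate "object" form of one customer row.
    static final class Customer {
        final String id;
        final String email;
        Customer(String id, String email) { this.id = id; this.email = email; }
    }

    // Hop 1: relational row -> object (the job of an OR mapping tool).
    static Customer toObject(Map<String, String> row) {
        return new Customer(row.get("CUSTOMERID"), row.get("CUSTOMEREMAIL"));
    }

    // Hop 2: object -> XML (the job of a Java-XML binding tool).
    static String toXml(Customer customer) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("CUSTOMER");
            doc.appendChild(root);
            appendChild(doc, root, "CUSTOMERID", customer.id);
            appendChild(doc, root, "CUSTOMEREMAIL", customer.email);
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("Cannot serialize customer", e);
        }
    }

    private static void appendChild(Document doc, Element parent, String name, String text) {
        Element child = doc.createElement(name);
        child.setTextContent(text);
        parent.appendChild(child);
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("CUSTOMERID", "452");
        row.put("CUSTOMEREMAIL", "someone@example.com");
        System.out.println(toXml(toObject(row)));
    }
}
```

Every column name appears twice here, once per hop, which is the kind of duplicated mapping the rest of this section questions.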

At least some of you should be raising your eyebrows now about
the relevance of the intermediate conversion of data to "objects."
We will list out our usual justifications here:

  1. We need to "process data" using some programming language, and
    it is easy to handle the "data in object" form using programming
    language constructs.
  2. SQL is designed for relational databases; hence it is not easy to
    work at the XML layer, even though many products and standards try
    to extend it to handle XML.

So far so good. Now, we remember at least one requirement to
build a Data Access Layer (DAL) over a relational database. The
DAL in this case study has to function as the data provider for an
Enterprise Service Bus (ESB)
through which all kinds of clients (data
consumers) will route their requests (queries). Since the
normalized message format within the ESB is XML and no major
processing needs to be done at the provider side, a feasible
architecture is to make the data access layer a thin shim layer
with minimum overhead. This layer will then retrieve data from the
database and convert it into XML format. We first looked into
ways by which SQL can be used to do this. SQL is a query language
for relational data. Relational databases usually host unordered
sets of "flat" rows, and SQL is best to operate on this data model.
On the contrary, XML data structures contain hierarchical nodes and
XQuery is best for this data structure. Thus SQL as such cannot be
used directly over XML data; nor is XQuery meant to act
directly on relational data.

Saxon SQL Extensions

Of course there is more than one way to do XML-relational
transformation, but let us look at how we can use the Saxon SQL
extensions for this. Using the Saxon SQL extensions, we can enhance
the capability of the processor to access SQL databases. The first
step in doing this is to define a namespace prefix (for example,
sql) in the extension-element-prefixes attribute of the
xsl:stylesheet element, and then to map this prefix to
a namespace URI that ends in
net.sf.saxon.sql.SQLElementFactory. Now we have seven
new stylesheet elements at our disposal to do SQL operations:

  1. sql:connect
  2. sql:query
  3. sql:insert
  4. sql:update
  5. sql:delete
  6. sql:column
  7. sql:close

The sql:connect element returns a database connection
as a value, specifically a value of the type external object, which
can be referred to using the type java:java.sql.Connection:

<xsl:param name="driver" select="'oracle.jdbc.driver.OracleDriver'"/>
<xsl:param name="database" select="'jdbc:oracle:thin:@'"/>
<xsl:param name="user">scott</xsl:param>
<xsl:param name="password">tiger</xsl:param>

<xsl:variable name="connection" as="java:java.sql.Connection">
        <sql:connect driver="{$driver}" database="{$database}"
            user="{$user}" password="{$password}"/>
</xsl:variable>

Once the connection is retrieved, we can now do CRUD (create,
read, update, delete) operations in the SQL database.

CRUD of Customer, Order, and Line Item XML on SQL Database

Our aim here is to introduce the very basics of CRUD operations
using the Saxon SQL extension so that the reader's attention
doesn't drift while reading complex code. Once we agree here that
the basics work fine, it is up to the reader to fully utilize the
power of transformation (XSLT) to execute complex SQL operations.
So our data model is very simple, as represented in Figure 3:

Figure 3. Customer order LineItem DB schema


Let us first look at how we can insert a few rows into the table
shown in Figure 3. We will use customer_insert.xml to demonstrate our
insert operations. The simple Java code to do the insert operation
is shown below:

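import net.sf.saxon.Transform;

public class CustomerOrder{
   public void insert(){
      Transform transformData = new Transform();
      // Second argument is an assumption: the stylesheet discussed in this section
      String[] args = {"customer_insert.xml",
                       "customer_insertupdate.xsl"};
      transformData.doTransform(args, null);
   }
}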

The magic lies in the sql:insert tag in customer_insertupdate.xsl.
Here, we first check
whether the customer is already present in the database; if not, we
do an insert operation:

<xsl:variable name="customerid" select="CUSTOMERID"/>
<xsl:variable name="customer-table">
  <sql:query  connection="$connection" table="customer"
    where="CUSTOMERID='{$customerid}'" column="*"
    row-tag="CUSTOMERORDER" column-tag="col"/>
</xsl:variable>

<xsl:if test="count($customer-table//CUSTOMERORDER) = 0">
  <sql:insert table="customer" connection="$connection">
    <sql:column name="CUSTOMERID" select="CUSTOMERID"/>
    <sql:column name="CUSTOMERLASTNAME" select="CUSTOMERLASTNAME"/>
    <sql:column name="CUSTOMERFIRSTNAME" select="CUSTOMERFIRSTNAME"/>
    <sql:column name="CUSTOMEREMAIL" select="CUSTOMEREMAIL"/>
  </sql:insert>
</xsl:if>

The CustomerOrderLineItem folder in the attached .zip file
contains the code for this. Make sure to create the database tables
and make any relevant changes in the .xsl files to suit your
database settings (driver, URL, username, and password). Then
execute ant insert, which will create rows in relevant
tables in the database as shown in Figure 4.

Figure 4. XQuery-inserted data


For read we make use of sql:query. We use customer_query.xml
with the following XML content to pass the required query
parameters to customer_query.xsl.

<?xml version="1.0"?>

The aim here is to retrieve all the order items for the
customer with ID 456. Obviously, when you need to use these
techniques in your own applications, you may have to dynamically
generate those XML documents with query parameters instead of using
static XML files. customer_query.xsl has the following two
template match blocks:

<xsl:template match="CUSTOMERORDERS">
  <xsl:message>customer_query.xsl :
    Connecting to <xsl:value-of select="$database"/>...</xsl:message>
  <xsl:message>customer_query.xsl : query  records....</xsl:message>
  <xsl:apply-templates select="CUSTOMERORDER" mode="Query"/>
  <sql:close connection="$connection"/>
</xsl:template>

<xsl:template match="CUSTOMERORDER" mode="Query">
  <xsl:variable name="orderid" select="CUSTORDER/ORDERID"/>
  <xsl:variable name="orderitem-table">
    <sql:query  connection="$connection" table="ORDERITEM" where="ORDERID='{$orderid}'"
      column="*" row-tag="ORDERITEM" column-tag="col"/>
  </xsl:variable>
  <xsl:message>There are now <xsl:value-of
    select="count($orderitem-table//ORDERITEM)"/> orderitems.</xsl:message>
    <xsl:copy-of select="$orderitem-table"/>
  <sql:close connection="$connection"/>
</xsl:template>

The main match block is CUSTOMERORDER. Here we query the
ORDERITEM table and select all columns matching the query
parameter. We then display them to the console. Running the
corresponding ant target will demonstrate this, as shown in Figure 5.

Figure 5. XQuery data

The ant update command will use customer_update.xml to demonstrate the
update operations. The notable change here is that the customer
email in customer_update.xml differs from the one
in customer_insert.xml. As earlier, we first check whether
the customer is already present in the database; if the customer
exists, we do an update instead of an insert, using
sql:update. Figure 6 shows the result.

<xsl:if test="count($customer-table//CUSTOMERORDER) > 0">
  <sql:update table="customer" connection="$connection"  where="CUSTOMERID='{$customerid}'">
    <sql:column name="CUSTOMERLASTNAME" select="CUSTOMERLASTNAME"/>
    <sql:column name="CUSTOMERFIRSTNAME" select="CUSTOMERFIRSTNAME"/>
    <sql:column name="CUSTOMEREMAIL" select="CUSTOMEREMAIL"/>
  </sql:update>
</xsl:if>

Figure 6. XQuery updated data


Again, since we don't want to complicate the examples, we'll do
only a simple table delete using sql:delete. For this,
we just pass an empty XML DELETE element as a command
in customer_delete.xml, as shown below:

<?xml version="1.0"?>

Now, customer_delete.xsl will contain the required
sql:delete command to empty the tables one by one,
which is shown below:


<xsl:template match="CUSTOMERORDERS">
  <xsl:apply-templates select="DELETE" />
</xsl:template>

<xsl:template match="DELETE">
  <sql:delete table="address" connection="$connection" />
  <sql:delete table="orderitem" connection="$connection" />
  <sql:delete table="custorder" connection="$connection" />
  <sql:delete table="customer" connection="$connection" />
</xsl:template>

DB Operations Using External XQuery Files

It is also possible to hook into external .xq files like
customerorder.xq. Here, we will use an XML
document (customer_insert.xml) as the data source. The XQuery
is listed below:

xquery version "1.0";

declare copy-namespaces no-preserve, inherit;
declare variable $custid as xs:integer external;
declare variable $ordid as xs:integer external;

for      $customerorder in //CUSTOMERORDERS/CUSTOMERORDER,
         $customer in $customerorder/CUSTOMER,
         $order in $customerorder/CUSTORDER,
         $orderitem in $order/ORDERITEM
where    $customer/CUSTOMERID = $custid and $order/ORDERID = $ordid
order by string-length($customer/CUSTOMERID) , string-length($order/ORDERID)
return   <customerorder>
             { $customer/CUSTOMERID }
             { $customer/CUSTOMERFIRSTNAME }
             { $customer/CUSTOMERLASTNAME }
             { $customer/CUSTOMEREMAIL }
             { $order/ORDERID }
             { $order/ORDERDATE }
               { $orderitem/ITEMID }
               { $orderitem/NUMBER }
               { $orderitem/INSTRUCTIONS }
         </customerorder>

We have to pass customer ID and order ID as parameters to query
the details. This we do from our Java code as follows:

final Configuration config = new Configuration();
final StaticQueryContext sqc = new StaticQueryContext(config);
final XQueryExpression exp = sqc.compileQuery(
   new FileReader("customerorder.xq"));
final DynamicQueryContext dynamicContext = new DynamicQueryContext(config);
Properties props = new Properties();
// Assumed reconstruction: supply customer_insert.xml as the context item
dynamicContext.setContextItem(config.buildDocument(
   new StreamSource("customer_insert.xml")));
// Bind the external variables $custid and $ordid declared in the query
dynamicContext.setParameter("custid", new Long(452));
dynamicContext.setParameter("ordid", new Long(461));
final SequenceIterator iter = exp.iterator(dynamicContext);
props.setProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
props.setProperty(OutputKeys.INDENT, "yes");
QueryResult.serializeSequence(iter, config, System.out, props);

Typing ant queryXQ will run the sample and Figure 7 shows the
query results.

Figure 7. Query using external XQ file
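
The idea of binding external variables such as $custid and $ordid from Java can also be tried with nothing but the JDK. The sketch below is not the Saxon API used in this section: it relies on javax.xml.xpath (XPath 1.0) and an XPathVariableResolver, and it returns a single string rather than constructing new XML. The inline document, the last name "Doe", and the class and method names are assumptions for illustration; the element names are borrowed from customerorder.xq.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class ExternalVariableSketch {

    // Hypothetical data; not the contents of customer_insert.xml.
    static final String SAMPLE =
        "<CUSTOMERORDERS><CUSTOMERORDER>"
      + "<CUSTOMER><CUSTOMERID>452</CUSTOMERID><CUSTOMERLASTNAME>Doe</CUSTOMERLASTNAME></CUSTOMER>"
      + "<CUSTORDER><ORDERID>461</ORDERID></CUSTORDER>"
      + "</CUSTOMERORDER></CUSTOMERORDERS>";

    // Returns the last name of the customer matching both IDs, or "" if none matches.
    static String lastName(int custid, int ordid) {
        Map<String, Object> variables = new HashMap<>();
        variables.put("custid", (double) custid); // XPath 1.0 numbers are doubles
        variables.put("ordid", (double) ordid);

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Resolve $custid and $ordid by their local names.
        xpath.setXPathVariableResolver((QName name) -> variables.get(name.getLocalPart()));
        try {
            return xpath.evaluate(
                "//CUSTOMERORDER[CUSTOMER/CUSTOMERID = $custid and CUSTORDER/ORDERID = $ordid]"
                    + "/CUSTOMER/CUSTOMERLASTNAME",
                new InputSource(new StringReader(SAMPLE)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Cannot evaluate query", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lastName(452, 461));
    }
}
```

Keeping the values out of the expression string, as the XQuery file does with its external declarations, avoids rebuilding the query text for every request.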

Generated XML, So What, and What's Next?

Going by our case-study objective, we've now realized XML
generation from a relational store and we hope you will agree that
we haven't written much Java code for this. Of course, we have XSLT
code, but it only increases the system flexibility since we are no
longer constrained by the specifics of relational schema. If the
table schema changes, it is just a matter of updating the
respective XSLT files.

The XML data generated here is arbitrary, but we can leverage
XML Schema to enable B2B participants to
express shared vocabularies and allow machines to carry out rules
made by people.
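
As a rough illustration of letting a machine enforce such rules, the sketch below validates a generated document against a schema using the JDK's javax.xml.validation API. The schema itself is an assumption made up for this example (a CUSTOMER element containing an integer CUSTOMERID), not a schema from the sample code.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class SchemaCheckSketch {

    // Hypothetical shared vocabulary agreed upon by producer and consumer.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"CUSTOMER\"><xs:complexType><xs:sequence>"
      + "<xs:element name=\"CUSTOMERID\" type=\"xs:integer\"/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Returns true only if the XML text conforms to XSD.
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document (or the schema itself) violates the rules
        } catch (IOException e) {
            throw new IllegalStateException("Cannot read input", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<CUSTOMER><CUSTOMERID>452</CUSTOMERID></CUSTOMER>"));
        System.out.println(isValid("<CUSTOMER><CUSTOMERID>abc</CUSTOMERID></CUSTOMER>"));
    }
}
```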

The next step is to expose XML data for consumption. We will not
go into further detail here since this is outside the scope of this
article. Still, there are multiple options available as below
(please note that this list is not exhaustive):

We Conclude, To Begin Again

This article introduced the concepts of XQuery and XQJ.
Accepting the fact that we're ignoring some significant issues like
benchmarking performance, we have a working data access layer. As
we have already mentioned, there are multiple products available in
both the open source and commercial XML worlds. All of them,
among other things, are trying to ease XML operations, especially
bridging the XML and relational worlds. Even though the above
case study implementation is based on direct XQuery, XQJ is
supposed to be even more powerful and brings new promise,
especially when it comes to performing operations against data
stores (look at Jonathan Bruce and Jonathan Robie's XQJ
for more information). DataDirect XQuery is also
worth mentioning here, and the quest for such frameworks is on the
rise since we can rarely find an enterprise-class application
without the need to work on XML data. The promise of some of these
new-generation frameworks is also to do transactional ACID
operations on XML-based databases. These transactions can even be a
part of a bigger, global transaction. If so, developers can cheer
up, since a lot of code would be eliminated: code that would
otherwise have been transforming relational data to objects to XML and
the reverse.


Binildas Christudas currently works as a Principal Architect for Infosys Technologies, where he heads the J2EE Architects group servicing Communications Service Provider clients.