Java Object Querying Using JXPath

JXPath is a little-known component of the Apache Commons library that simplifies querying of sets of Java objects by using an XPath-based syntax. This article demonstrates how to use JXPath to replace complex Java code with simple expression-based queries, and how to make use of this in practical scenarios such as JSPs, templates (such as Velocity), and monitoring/management applications.

What is JXPath?

JXPath is a library that makes use of simple expressions to query hierarchies of Java objects. The expression syntax is based on the XML standard XPath, and allows you to concisely express complex queries and to easily iterate across sets of objects. JXPath works on plain ol' Java objects (POJOs) and doesn't require objects to implement any JXPath-specific interfaces.

Let's look at an example. For the purposes of this article I'll work with a set of objects representing companies, their departments, and employees. A company contains many departments, those departments contain employees, those employees have names, telephone numbers, and so on.

If you want to follow the article against the full source code, you should download it before continuing here. See the Resources section for the code.

To query this set of companies and find, for example, all departments with more than 10 employees in companies located in Washington state, you could write something like this:

for (Iterator companies = 
              database.getCompanies().iterator();
              companies.hasNext();) {

  Company company = (Company)companies.next();

  if (company.getLocation().equals("WA")) {

    for (Iterator departments =
               company.getDepartments().iterator();
               departments.hasNext(); ) {

      Department department =
                    (Department)departments.next();

      if (department.getEmployees().size() > 10) {
        System.out.println(department);
      }
    }
  }
}

Using JXPath you can simplify this:

// you only have to do the below once
JXPathContext context = JXPathContext
                           .newContext(database);

Iterator departments = context.iterate(
           "/companies[location='WA']" +
           "/departments[count(employees) > 10]");
while (departments.hasNext()) {
  System.out.println(departments.next());
}

Two points should be noted from this example:

  • The code using JXPath is considerably shorter. The query is concisely expressed in one statement.
  • You can take that expression out of the code and parameterize it, or put it in a configuration file.

Writing JXPath Queries

JXPath queries use the standard XPath notation, which describes a route through the object hierarchy in much the same way as a file path describes a route through directories. If you've ever written an XSL stylesheet, you'll have used XPath (perhaps without knowing it). A complete explanation of XPath is outside the scope of this document, but I can cover the basics here.

To get all the companies, you can write:

/companies

To get all departments within all companies, you can write:

/companies/departments

You can get companies using a predicated query. To get all the companies that are located in the state of California, you can write:

/companies[location='CA']

To get all departments within all companies based in California, you can write:

/companies[location='CA']/departments

XPath supports functions, and you can use these to create more complex queries. For example, to find all development departments within California, you can write:

/companies[location='CA']/departments[contains(name, 'Development')]

The standard XPath function library covers node-set, string, boolean, and number functions. All the standard XPath functions are supported.

You can use simple comparative expressions. For example, to find all departments with more than 10 employees, you can write:

/companies/departments[count(employees) > 10]

You can get a single object from a collection. For example, to get the second department from a particular company, you can write:

/companies[name='Sun Microsystems']/departments[2]

Important: In XPath (and hence JXPath), collections are indexed from 1, not 0!

You can get the first three departments by writing:

/companies[name='Sun Microsystems']/
           departments[position() < 4]

and the last department by writing:

/companies[name='Sun Microsystems']/
           departments[last()]

XPath contains many useful functions for doing more complex queries, and it's worth looking at the XPath tutorials and the W3C specification. However, the above examples cover most scenarios and will suffice for this introduction to JXPath.
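
If you want to check the indexing rules for yourself without setting up any objects, the following is a minimal sketch that evaluates the same kinds of expressions against a small XML document using the JDK's built-in javax.xml.xpath package. Note that this is plain XPath over XML rather than JXPath over Java objects, and that the class name, the eval helper, and the sample data are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Illustration only: standard XPath 1.0 evaluated against XML by the JDK,
// not JXPath evaluated against Java objects.
class XPathIndexingDemo {

    // Hypothetical sample data: one company with four departments.
    static final String XML =
          "<company>"
        + "<department><name>Sales</name></department>"
        + "<department><name>Support</name></department>"
        + "<department><name>Development</name></department>"
        + "<department><name>Research</name></department>"
        + "</company>";

    // Evaluates an expression and returns its XPath string value.
    static String eval(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression,
                    new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(
                    "Bad XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Index 1 is the first element: prints "Sales"
        System.out.println(eval("/company/department[1]/name"));
        // last() selects the final element: prints "Research"
        System.out.println(eval("/company/department[last()]/name"));
        // Index 0 matches nothing here, so the string value is empty
        System.out.println(
                eval("/company/department[0]/name").isEmpty());
        // Counts the departments selected by position() < 4
        System.out.println(
                eval("count(/company/department[position() < 4])"));
    }
}
```

The point to take away is that an index of 0 does not fail loudly in this setting; it simply selects nothing, which is an easy mistake to miss when you are used to Java collections.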

How Does This Work With Java Objects?

What do you have to do to get a JXPath query to work with your Java objects? The answer is: very little. Provided your Java objects use conventional setters and getters (for example, getCompanies(), setName(String n)) and return standard collections (for example, arrays, lists, and sets), everything will work transparently.

In the example code, the database object contains a method:

public List getCompanies();

which translates directly to the XPath query:

/companies

and returns a list of Company objects. Each Company object has a method:

public List getDepartments();

which returns the list of departments, and will get called for each company when you query:

/companies/departments

Arrays are handled in the same way as collections by JXPath. The method getDepartments() could instead return an array of departments, for example:

public Department[] getDepartments();

Each Department object has a method:

public String getName();

and this will get called for each department in each company when you query:

/companies/departments/name

The getName() method will also be called in the following scenario, where you make a predicated query to get all server development departments:

/companies/departments[name='Server Development']
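
To see which path names a class offers under the getter/setter convention, you can ask the JDK's own JavaBeans introspector. The sketch below assumes that the names JXPath accepts match the standard JavaBeans property names, which is what the examples above suggest; the Department class here is a cut-down stand-in for the one in the article's source code, and propertyNames is a hypothetical helper.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustration only: lists JavaBeans property names using the JDK,
// as an approximation of the names usable in a path expression.
class PropertyNamesDemo {

    // Hypothetical stand-in for the article's Department class.
    public static class Department {
        private String name;
        private final List<String> employees = new ArrayList<>();

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public List<String> getEmployees() { return employees; }
    }

    // Returns the property names of a bean class in sorted order,
    // stopping at Object so that getClass() is not reported.
    static List<String> propertyNames(Class<?> beanClass) {
        try {
            BeanInfo info =
                    Introspector.getBeanInfo(beanClass, Object.class);
            List<String> names = new ArrayList<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                names.add(pd.getName());
            }
            Collections.sort(names);
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // getEmployees() -> "employees", getName() -> "name"
        System.out.println(propertyNames(Department.class));
    }
}
```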

You can interrogate maps easily. The Employee object has a method:

public Map getTelephoneNumbers();

which contains telephone numbers for home, work, fax machine, mobile phones, and so on. Using Java syntax, you can get the different numbers using the following:

Map numbers = employee.getTelephoneNumbers();
String workNumber = (String)numbers.get("work");

This translates to the following JXPath query (omitting the companies/departments bit for clarity):

.../employees[1]/telephoneNumbers/work

So in the above the map key ('work') forms part of the XPath expression. This is all well and good when the map keys are valid XPath identifiers, but what happens if your map key cannot be part of an XPath expression? For example, you may have:

Map numbers = employee.getTelephoneNumbers();
String workNumber =
          (String)numbers.get("work number");

and the XPath expression:

.../employees[1]/
          telephoneNumbers/work number

is not valid (because the path components—in this instance 'work number'—cannot have spaces). You can use an alternative syntax:

.../employees[1]/
          telephoneNumbers[@name='work number']

The @name attribute (which in standard XPath identifies an XML attribute) is used here to specify a map key. Note that currently JXPath supports only map objects with strings as keys.

Implementing JXPath Queries

Now that you've seen how to form a JXPath query and how these translate into Java method calls, you can look at how to do this for real in your code.

First you need to get a context, representing the base object you want to query. Normally you would need to do this only once. The context provides a JXPath interface to the object hierarchy you want to query and can be considered analogous to a database connection, or an initial JNDI context:

// the 'database' contains a list of companies
JXPathContext context = JXPathContext.
                            newContext(database);

Now you can query for a set of objects:

Iterator departments = context.iterate("/companies/departments");

and then you use this iterator as normal, taking care to cast the returned objects appropriately:

while (departments.hasNext()) {
  Department dept = (Department)departments.next();
}

You can get individual values by using the getValue() method call:

Department firstDepartment
      = (Department)context.getValue(
        "/companies[name='Sun Microsystems']" +
        "/departments[1]");

(remembering that XPath indices start from 1, not 0).

JXPath also allows the setting of values (provided that the property has a setter method) by using the setValue() method call:

// let's relocate Sun across the country...
context.setValue(
       "/companies[name='Sun Microsystems']" +
       "/location", "NY");

When you get or set values of a particular type (as above), you must take care to supply the correct object types and to cast correctly. The XPath expression:

/companies[name='Sun Microsystems']/location

resolves to a java.lang.String in this particular case. But nothing enforces that, and if the object hierarchy changes so that location becomes a USState object (for example), then the code above will break. Comprehensive tests and/or error handling are recommended in these cases.
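
One way to make such a failure easier to diagnose is to check the type before casting. The following is a minimal sketch of that idea; TypedValues and require are hypothetical names, and the value argument stands for whatever an untyped lookup such as context.getValue(path) returned.

```java
// Illustration only: a small helper that replaces a bare cast with a
// checked one, so a type change fails with a descriptive message.
class TypedValues {

    // Returns value as type T, or throws if it is null or another type.
    // The path is used only to build the error message.
    static <T> T require(Object value, Class<T> type, String path) {
        if (!type.isInstance(value)) {
            String actual = (value == null)
                    ? "null"
                    : value.getClass().getName();
            throw new IllegalStateException(
                    "Expected " + type.getName() + " at " + path
                    + " but found " + actual);
        }
        return type.cast(value);
    }

    public static void main(String[] args) {
        // Stand-in for the result of an untyped lookup.
        Object result = "NY";
        String location = require(result, String.class,
                "/companies[name='Sun Microsystems']/location");
        System.out.println(location);
    }
}
```

If location later became some other type, the lookup would then fail with a message naming the path and the type actually found, rather than with a bare ClassCastException.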

Error Handling

JXPath will normally throw an exception if an expression can't be resolved to a valid object hierarchy. For example, in the example code, the following would be invalid:

String location = (String)context.getValue(
"/companies[1]"+
"/departments[1]/location");

since departments don't have a location. In this example you get the following:

Exception in thread "main"
org.apache.commons.jxpath.JXPathException:
No value for xpath: /companies/departments/location
at org.apache.commons.jxpath.ri.JXPathContextRefe...
at org.apache.commons.jxpath.ri.JXPathContextRefe...
...

Note that JXPathException is a java.lang.RuntimeException, so you aren't forced to catch it. In some scenarios you may want JXPath to fail silently in the event of an invalid path. For example, the following is likely to fail since you don't have any data for the Sybase corporation:

// let's find the first department in the
// Sybase corporation...
Department dept = (Department)context.getValue(
                  "/companies[name='Sybase']/" +
                  "departments[1]");

You can force JXPath to return null in these cases by setting lenient mode on the context, for example:

context.setLenient(true);

context.iterate() will always return a valid iterator for an XPath, regardless of whether it contains values or not (the iterator may well be empty but is never null). In scenarios where it's possible that data is missing, or collections aren't populated, it may be safer to use JXPathContext.iterate() rather than JXPathContext.getValue().

Usage Scenarios

Given the above, you can use JXPath to vastly simplify sections of code that perform complex queries across nested objects. However, further advantages come from the fact that the query can be pulled out from the code and applied at run time from a configuration file, or from other inputs.
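
As a rough sketch of what pulling a query out of the code might look like, the following reads named expressions from java.util.Properties text. The QueryConfig class, the property key, and the sample content are all invented for illustration; in a real application the text would come from a file, and the returned string would be passed to context.iterate() or context.getValue().

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustration only: keeps query expressions outside the code, in
// properties-style text (hypothetical keys and content).
class QueryConfig {

    // Stand-in for the contents of a hypothetical queries.properties.
    static final String SAMPLE =
          "# named queries\n"
        + "report.bigDepartments=/companies[location='CA']"
        + "/departments[count(employees) > 10]\n";

    // Parses properties-style text into a Properties object.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Looks up a named query, failing clearly if it is not configured.
    static String query(Properties props, String name) {
        String expression = props.getProperty(name);
        if (expression == null) {
            throw new IllegalArgumentException(
                    "No query configured for: " + name);
        }
        return expression;
    }

    public static void main(String[] args) {
        Properties props = parse(SAMPLE);
        System.out.println(query(props, "report.bigDepartments"));
    }
}
```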

Here are a number of scenarios where JXPath can provide much more flexibility than a hard-coded set of object queries:

  1. JSPs are often coded to present a table of customers, trades, purchases, and so on. The appropriate query can be coded as a JXPath expression and inserted into the JSP. One JSP then can be used as a template for multiple pages, and an application can display multiple report pages simply by changing the JXPath expression.

    For example, one JSP template can make use of multiple JXPath expressions to show all purchases made on an e-commerce Web site today with a value greater than $1000.00:

    <%@ page contentType="text/html" %>
    <%@ page import="java.util.List" %>
    <jsp:useBean id="query" scope="session"
        class="com.oopsconsultancy.jxpath.servlet.Query"/>
    <!-- get new valuable purchases -->
    <%
      List purchases = query.get(
                        "/purchases[ageInDays < 1"+
                        " and value > 1000]");
    %>
    <!-- ...and display... -->

    <!-- get purchases that are out-of-stock items -->
    <%
      purchases = query.get(
                       "/purchases[ageInDays < 1" +
                       " and stocked = 'false']");
    %>
    <!-- ...and display... -->

    You can do this in a similar way by using a templating technology like Apache Velocity. The important thing to note in these scenarios is the separation between the application logic of maintaining (in this case) a set of purchases, and the business logic of determining what to find/display in the presentation layer. Providing a JXPath expression gives you a powerful means of abstracting these queries out.

  2. When developing and running long-lived services, it's very useful to be able to interrogate these for their state. A common way of doing this is to implement a JMX (Java Management Extensions) layer and present it via HTTP. This gives you a simple Web browser interface into a service. You can then pull out useful statistics, such as lists of connections, cache sizes, and states, and the most recently processed messages.

    To do this, each statistic needs to be exposed as an MBean. (An MBean is a Java object that acts as a proxy for the original object and can be managed by JMX; if you have an HTTP adaptor, then the MBean will present its set of fields in the Web browser). If you have a lot of objects to be managed, then you have a lot of work to proxy these as MBeans.

    A simpler way to provide visibility to a set of objects and permit queries would be to provide a JXPath query object and query it via a command line and standard input, or as an MBean. This allows you to type a query into the server (either via the command line or via the browser if you're using JMX), and the server will automatically display the set of objects returned. For example, you could enter queries such as:

    /messages[last()]

    to display the last message processed by our server. Or you could enter:

    /cache[name='trades']/size

    to display the size of the "trades" cache. The advantage of this mechanism is that it lets you interrogate practically everything in an object hierarchy. The downside is that you need detailed knowledge of the objects and their APIs. However, for diagnostic and development purposes, this is not necessarily a disadvantage.

  3. Unit testing may require checking object hierarchies for particular values. You can use JXPath to express the components to be tested. For example, you can write a helper method in a JUnit test:

    /**
    * tests using a JXPath to express the required
    * component to test against. Note that nulls
    * aren't catered for!
    * @param path the path to find the object to
    *        test against
    * @param testable the base object to test
    * @param required the required result
    */
    private void assertFromPath(String path,
                                Object testable,
                                Object required) {
     
      JXPathContext context =
                    JXPathContext.newContext(testable);
      assertTrue(required.equals(
                    context.getValue(path)));
    }

    and then you can use this in tests:

    public void testBankAccount() {

      // you get a bank account object from some
      // operation...
      Account account = ....

      // then test against it
      assertFromPath("/accountHolder/name",
                           account, "John Smith");
      assertFromPath("/accountHolder/opened",
                           account, DATE_OPENED);
      assertFromPath("/accountHolder"+
                     "/transactions[1]/amount",
                           account, INITIAL_PAYMENT);
    }

More Advanced Capabilities

Parsing an XPath expression takes time, and can be the most expensive part of a JXPath query. If you use the same expressions again and again, you can store a pre-compiled expression and so incur the compilation cost only once:

JXPathContext context =
          JXPathContext.newContext(database);
CompiledExpression query = JXPathContext.compile(
      "/companies/departments[name='Windows Development']" +
      "/employees[1]/telephoneNumbers/work");

// now you can use the query repeatedly
String workNumber =
                 (String)query.getValue(context);

Depending on the query, this may save a considerable amount of time. However, if evaluating the query is itself expensive (for example, finding all companies that have "Inc" in their name), then compilation is a smaller share of the total cost and the saving may be less than you'd normally expect.
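
If your expressions arrive as strings at run time (from a configuration file, say), one option is to keep the compiled forms in a map keyed by expression text. The following is a minimal sketch of such a cache. ExpressionCache is a hypothetical class, written generically so that it runs without JXPath on the classpath; with JXPath you would presumably create it with JXPathContext::compile as the compiler function, whereas the demonstration below uses java.util.regex.Pattern purely as a stand-in for something that is costly to compile.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.regex.Pattern;

// Illustration only: caches compiled expressions by their source text
// so that each distinct expression is compiled once.
class ExpressionCache<C> {

    private final Map<String, C> compiled = new ConcurrentHashMap<>();
    private final Function<String, C> compiler;

    // The compiler function turns expression text into compiled form.
    ExpressionCache(Function<String, C> compiler) {
        this.compiler = compiler;
    }

    // Returns the cached form, compiling on first use only.
    C get(String expression) {
        return compiled.computeIfAbsent(expression, compiler);
    }

    // Number of distinct expressions compiled so far.
    int size() {
        return compiled.size();
    }

    public static void main(String[] args) {
        // Pattern::compile is a stand-in for an expression compiler.
        ExpressionCache<Pattern> cache =
                new ExpressionCache<>(Pattern::compile);
        Pattern first = cache.get("[a-z]+");
        Pattern second = cache.get("[a-z]+");
        // The same compiled object is returned both times.
        System.out.println(first == second);
        System.out.println(cache.size());
    }
}
```

This only helps when the same expression text recurs; it also assumes that a compiled expression can safely be reused, which is worth confirming in the JXPath documentation before sharing one between threads.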

Summary

You've seen from the above that JXPath provides a powerful means to query sets of POJOs using concise expressions. While it takes very little effort to implement JXPath, the benefits are major, and range from giving you simple configurable queries contained within strings, to providing a dynamic query language for server processes.

Resources
