
Berkeley DB, Java Edition II: Implementing Session Management

September 24, 2004

Contents
Simple Session Management
More Complex Session Management
Using Cursors
Using a Secondary Database
Summary

In the first article of this series, we
went through the basics of using Berkeley DB. In this article, we're going to
walk through a more extended example of using it. The example I'm going to use
is session management. While this series of examples doesn't illustrate the
full power of Berkeley DB, it will give you a good feel for how to use it. And
you might be surprised at how complicated some aspects of using Berkeley DB
are.

Conceptually, session management is very simple. There are three basic requirements:

  1. There is a global map from session keys to instances of Session.
  2. Persistence of the global map to disk: if the system shuts down and restarts, it should still have the sessions from before the shutdown.
  3. Old sessions are expired. Usually, this happens if the sessions haven't been used in a while.

Simple Session Management

Suppose we only wanted to implement the first two requirements: a global map from session keys to instances of Session and persistence of the global map to disk. The code for this is pretty simple, and you've already seen most of it. Assuming we already have a class named Session, we need to do the following:

  • Create a binding for Session (e.g., write implementations of entryToObject() and objectToEntry()).
  • Define where the database will be stored on disk.
  • Create a session manager class, SleepycatBasedSessionManager, which the rest of the application will use. Internally, it will manage interactions with Berkeley DB, including a background thread for synchronizing with the disk.

We've already talked about SessionBinding. There are, however, two important points that I'd like to emphasize. The first is that the binding has to be reversible. Calling entryToObject(objectToEntry(session)) will create a second instance of Session, but that instance should be equivalent to the first session (all of the fields should have the same values). Similarly, objectToEntry(entryToObject(databaseEntry)) should yield an equivalent instance of DatabaseEntry.
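
The round-trip property is easy to check in a unit test. The following is a minimal sketch that works on plain byte arrays rather than DatabaseEntry, so it runs without Berkeley DB on the classpath. The SimpleSession class, its two fields, and SimpleSessionCodec are assumptions made for illustration; they are not the Session and binding classes used in this article.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical stand-in for Session: just a key and a last-touch timestamp.
final class SimpleSession {
    final String sessionKey;
    final long lastTouchTime;

    SimpleSession(String sessionKey, long lastTouchTime) {
        this.sessionKey = sessionKey;
        this.lastTouchTime = lastTouchTime;
    }
}

// Hypothetical codec that plays the role of the binding.
final class SimpleSessionCodec {
    // Analogous to objectToEntry(): session -> bytes.
    static byte[] toBytes(SimpleSession session) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(session.sessionKey);
            out.writeLong(session.lastTouchTime);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Analogous to entryToObject(): bytes -> a new, equivalent session.
    static SimpleSession fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return new SimpleSession(in.readUTF(), in.readLong());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True when decoding yields a distinct but equivalent session,
    // and re-encoding that copy reproduces the same bytes.
    static boolean roundTrips(SimpleSession original) {
        byte[] first = toBytes(original);
        SimpleSession copy = fromBytes(first);
        return copy != original
                && copy.sessionKey.equals(original.sessionKey)
                && copy.lastTouchTime == original.lastTouchTime
                && Arrays.equals(first, toBytes(copy));
    }
}
```

A check like roundTrips() covers both directions of the requirement: the decoded object matches the original field by field, and encoding it again produces identical bytes.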

The second important point is that changes in an instance of Session won't be automatically reflected in the database. Any change must be followed by an update to the database. This is usually a little tricky--if you're making a series of changes to session state, you might not want to persist each individual change. On the other hand, if you delay persisting changes, then you need to build some sort of mechanism for doing delayed persistence, and you need to work on exception handling (if another part of your code throws an exception, will that prevent the persistence layer from doing its job?).
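
One possible shape for such a mechanism is sketched below. It is not part of the session manager in this article: DirtySessionTracker and its methods are hypothetical names, and the persister callback stands in for whatever code rewrites a session's record. The sketch remembers which sessions changed during a unit of work and persists each of them once, in a finally block, so that an exception thrown by the work does not skip the write.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical helper: remembers which sessions changed and persists each once.
final class DirtySessionTracker {
    private final Set<String> dirtyKeys = new LinkedHashSet<>();
    private final Consumer<String> persister; // stands in for the database update

    DirtySessionTracker(Consumer<String> persister) {
        this.persister = persister;
    }

    // Record that the session with this key has unsaved changes.
    synchronized void markDirty(String sessionKey) {
        dirtyKeys.add(sessionKey);
    }

    // Run a unit of work, then persist whatever it changed, even if it threw.
    void runThenFlush(Runnable work) {
        try {
            work.run();
        } finally {
            flush();
        }
    }

    // Persist every dirty session once, clear the set, and report how many were written.
    synchronized int flush() {
        int written = 0;
        for (String key : dirtyKeys) {
            persister.accept(key);
            written++;
        }
        dirtyKeys.clear();
        return written;
    }
}
```

Whether partially completed changes should be written after a failure is an application decision; a stricter design might discard the dirty set instead of flushing it.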

When you define where the database will be stored, there are also a few things to keep in mind. Before we discuss them, let's look at the code to create a database. Here's part of SleepycatBasedSessionManager's createDatabases() method.

EnvironmentConfig environmentConfig = new EnvironmentConfig();
environmentConfig.setAllowCreate(true);
environmentConfig.setTransactional(true);
File file = new File(Preferences.getDatabaseDirectory());
_environment = new Environment(file, environmentConfig);

DatabaseConfig databaseConfig = new DatabaseConfig();
databaseConfig.setAllowCreate(true);
databaseConfig.setTransactional(true);
_mainDB = _environment.openDatabase(null,
                                    Preferences.getMainDatabaseName(),
                                    databaseConfig
                                   );

This relies on a class named Preferences, which just knows a few strings (in a real application, this would be more sophisticated).

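public class Preferences {
    private static final String DATA_DB_DIR_NAME = "C:/temp/data";
    private static final String MAIN_DB_NAME = "MainDB";

    // Directory in which Berkeley DB keeps its database and log files.
    public static String getDatabaseDirectory() {
        return DATA_DB_DIR_NAME;
    }

    public static String getMainDatabaseName() {
        return MAIN_DB_NAME;
    }
}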

This code creates an instance of File, an instance of Environment, and then an instance of Database. There are two important points to note. The first is that the instance of File corresponds to a directory--this is the place where Berkeley DB will store the database and log files and record any locks that might be held. Moreover, the directory must exist; Berkeley DB will not create it.

The second is that if you want your database to support transactions, the environment must do so as well.

The final thing we need to do to implement simple session management is cause the database to write itself out to disk occasionally, in a background thread. Rather than show you the details of a background thread, I'll just note that this consists of periodically executing the following line of code:

_environment.sync();
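
For readers who do want to see the scheduling side, here is a minimal sketch using the standard java.util.concurrent scheduler rather than a hand-written thread. PeriodicSync is a hypothetical helper, not part of the article's code, and the Runnable passed to it stands in for the call to _environment.sync().

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper that runs a sync task at a fixed interval on a daemon thread.
final class PeriodicSync {
    private final ScheduledExecutorService scheduler;

    PeriodicSync(Runnable syncTask, long period, TimeUnit unit) {
        scheduler = Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "db-sync");
            thread.setDaemon(true); // do not keep the JVM alive just for syncing
            return thread;
        });
        // Catch runtime exceptions: an uncaught one would cancel all future runs.
        Runnable guarded = () -> {
            try {
                syncTask.run();
            } catch (RuntimeException e) {
                e.printStackTrace();
            }
        };
        scheduler.scheduleWithFixedDelay(guarded, period, period, unit);
    }

    // Stop scheduling further runs.
    void stop() {
        scheduler.shutdownNow();
    }
}
```

In the session manager, the task passed in would wrap the sync call and handle any checked exception it declares.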

More Complex Session Management

The above code is pretty nice. It's not that hard to write, and it handles a lot of the session management task for us. But there's one aspect of session management it doesn't handle; namely, the expiration of old sessions. What we want is to find all of the sessions which haven't been active in some period of time, and get rid of them.

In our implementation, we're going to assume that any instance of Session that hasn't been requested from the session manager for a while is inactive. So adding the ability to expire the old sessions boils down to two things:

  • Find the old sessions.
  • Remove the old sessions once they've been found.

Using Cursors

A very simple solution to this problem is to iterate through the entire database, searching for old sessions and removing them as you find them. One way to do this is with a cursor. The following code snippet shows the basic process for iterating through the entire database: first you create a cursor, and then you iterate.

DatabaseEntry key = new DatabaseEntry();
DatabaseEntry primaryEntry = new DatabaseEntry();
Cursor cursor = _mainDB.openCursor(null, null);
try {
  // The first call to getNext() positions the cursor on the first record.
  OperationStatus statusCode = cursor.getNext(key,
                                              primaryEntry,
                                              LockMode.DEFAULT);
  while (OperationStatus.SUCCESS == statusCode) {
    // Perform whatever processing you want to do for each object, such as
    // checking its age. As written, this deletes every record it visits.
    cursor.delete();
    statusCode = cursor.getNext(key, primaryEntry, LockMode.DEFAULT);
  }
} finally {
  cursor.close(); // release the cursor even if an operation fails
}

The arguments to openCursor() are an instance of Transaction and an instance of CursorConfig, and can usually be null. Note that we're passing in result objects (the database will fill key and primaryEntry with data) and looking at a status code to see whether the operation succeeded.

Using a Cursor is similar to using an Iterator. You use it to navigate the data structure when you want to examine a wide range of entries, or find an entry that you can't completely specify. You can also use cursors to put records into the database, or remove them (as in the above example).

Using a Secondary Database

Iterating through every entry in the entire database isn't usually a good solution. For one thing, databases are often big, and often only partially held in memory. For another, most of the sessions are probably active (it's fairly common to perform expiration every five or ten minutes), and so most of the work being performed amounts to complicated no-ops.

What we'd really like is to have a second index, based on the last time the session was retrieved from the session manager. Creating the second index is easy to do: Berkeley DB has a class, SecondaryDatabase, that fills exactly this role. When you create an instance of SecondaryDatabase, you specify the main database, and Berkeley DB will automatically link the two. Whenever an object is added to the main database, it will also be added to the secondary database, and both operations will occur inside of a transaction.

Because Berkeley DB is adding the object to the secondary database, however, you have to provide additional information when creating the secondary database. You must supply an instance of SecondaryKeyCreator, which will be used to create the keys for the secondary database.

SecondaryKeyCreator is an interface with a single method:

public boolean createSecondaryKey(SecondaryDatabase secondaryDatabase,
                                  DatabaseEntry keyEntry,
                                  DatabaseEntry sessionEntry,
                                  DatabaseEntry resultEntry);

The first three arguments to this method (secondaryDatabase, keyEntry, and sessionEntry) are the actual inputs to the method. The fourth argument, resultEntry, is intended to hold the return value. And the boolean that gets returned is a success flag: returning false tells Berkeley DB that no secondary key was created for the record.

As an aside, in many ways Berkeley DB doesn't quite feel like Java. This method definition is a perfect example. I would much rather have this method throw an exception and return the created key. Passing in an object that will hold the "real return value" feels very kludgy to me. That is, I think the following feels much more natural.

public DatabaseEntry createSecondaryKey(SecondaryDatabase secondaryDatabase,
                                        DatabaseEntry keyEntry,
                                        DatabaseEntry sessionEntry)
                     throws KeyCreationException;

Or, if you don't like defining the exception, simply returning null is an equally valid option.

In any case, here's an implementation of SecondaryKeyCreator (note that it uses another class, LongHelper, that we had to provide).

public class SecondaryDatabaseKeyCreator implements SecondaryKeyCreator {
    private StandardSessionBinding _binding = new StandardSessionBinding();
   
    public boolean createSecondaryKey(SecondaryDatabase secondaryDatabase,
            DatabaseEntry keyEntry, DatabaseEntry sessionEntry,
            DatabaseEntry resultEntry) {
        Session session = (Session) _binding.entryToObject(sessionEntry);
        long time = session.getLastTouchTime();
        byte[] data = LongHelper.convertLong(time);
        resultEntry.setData(data);
        return true;
    }
}

There's one additional wrinkle in our use of a secondary database: we don't know the keys. We want to remove all instances of Session that haven't been touched in a while. That's implicitly a range-based query: we want to fetch based on a "less than" rather than on an "equals," and doing range queries is trickier than it sounds.
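
To make the idea concrete before getting into byte arrays, here is an in-memory analogy built on java.util collections. It is only a sketch of the concept, not Berkeley DB code: TimeIndexedSessions and its methods are hypothetical, the "primary" map stores just a timestamp per session key, and the "secondary" index is a sorted map from timestamp to keys. The range query becomes a single headMap() call.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// In-memory analogy only: a primary map plus a time-ordered secondary index.
final class TimeIndexedSessions {
    private final Map<String, Long> lastTouchByKey = new HashMap<>();      // "primary"
    private final TreeMap<Long, Set<String>> keysByTime = new TreeMap<>(); // "secondary"

    // Insert or update a session and keep the index in step with it.
    void touch(String sessionKey, long time) {
        Long previous = lastTouchByKey.put(sessionKey, time);
        if (previous != null) {
            removeFromIndex(previous, sessionKey);
        }
        keysByTime.computeIfAbsent(time, t -> new HashSet<>()).add(sessionKey);
    }

    // Range query: remove and return every session last touched before the cutoff.
    List<String> expireOlderThan(long cutoff) {
        List<String> expired = new ArrayList<>();
        Map<Long, Set<String>> old = keysByTime.headMap(cutoff, false);
        for (Set<String> keys : old.values()) {
            expired.addAll(keys);
        }
        old.clear(); // the view is backed by the index, so this removes the entries
        lastTouchByKey.keySet().removeAll(expired);
        return expired;
    }

    int size() {
        return lastTouchByKey.size();
    }

    private void removeFromIndex(long time, String sessionKey) {
        Set<String> keys = keysByTime.get(time);
        if (keys != null) {
            keys.remove(sessionKey);
            if (keys.isEmpty()) {
                keysByTime.remove(time);
            }
        }
    }
}
```

The sorted map works here because Long has a natural ordering. A database that stores raw bytes has no such knowledge about what the bytes mean.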

Recall that Berkeley DB is storing byte arrays. Suppose that timeOne and timeTwo are longs, and byteArrayOne and byteArrayTwo are the corresponding byte arrays. We need either to create an encoding of longs to byte arrays that guarantees byteArrayOne < byteArrayTwo (in the standard comparison for byte arrays) whenever timeOne < timeTwo, or to write an implementation of Comparator and pass it to Berkeley DB.

Since the first option, guaranteeing that "less than" is preserved by the transformation into byte arrays, is tricky, in this article we're going to use a comparator named LongComparator. The implementation of LongComparator converts the byte arrays back into longs, and then does the comparison. Here's the code for LongComparator and LongHelper:

import java.util.Comparator;

public class LongComparator implements Comparator {
    public int compare(Object firstObject, Object secondObject) {
        return LongHelper.compare(firstObject, secondObject);
    }
}

public class LongHelper {
   
    public static int compare(Object firstObject, Object secondObject) {
        if (!(firstObject instanceof byte[])) {
            return -1;
        }
        if (!(secondObject instanceof byte[])) {
            return -1;
        }
        byte[] firstByteArray = (byte[]) firstObject;
        byte[] secondByteArray = (byte[]) secondObject;
        return compare(firstByteArray, secondByteArray);
    }
   
    /*
     *  Reversed order to make iterating better. We'll just iterate to the
     *  end of the database (since things are stored in reverse-historical
     *  order).
     */

    public static int compare(byte[] firstByteArray, byte[] secondByteArray) {
        long firstLong = convertByteArray(firstByteArray);
        long secondLong = convertByteArray(secondByteArray);
        if (firstLong < secondLong) {
            return 1;
        }
        if (firstLong > secondLong) {
            return -1;
        }
        return 0;
    }
      
    public static long convertByteArray(byte[] array) {
        long returnValue = 0;
        // convertLong() writes the least significant byte first, so rebuild
        // the value starting from the most significant byte (index 7).
        for (int counter = 7; counter > -1; counter--) {
            returnValue = returnValue << 8;
            // Mask the byte so that it is treated as unsigned (0 to 255).
            returnValue |= (array[counter] & 0xFF);
        }
        return returnValue;
    }
   
    public static byte[] convertLong(long value) {
        byte[] returnValue = new byte[8];
                  
        for(int counter = 0; counter < 8; counter++) {
            returnValue[counter] = (byte)(0xFF & value);
            value = value >>> 8;
        }
        return returnValue;
    }
}
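
For comparison, here is a sketch of what the first option (an encoding that preserves order) could look like. It is not used in this article, and SortableLongCodec is a hypothetical class. The sketch assumes the database's default ordering compares keys byte by byte, left to right, treating each byte as unsigned; compareUnsigned() below models that assumed ordering so that the idea can be tested without a database. Note that this encoding sorts oldest first, the opposite of the reversed order LongHelper uses.

```java
import java.nio.ByteBuffer;

// Hypothetical alternative to LongHelper: an encoding whose byte order matches numeric order.
final class SortableLongCodec {
    // Big-endian bytes with the sign bit flipped, so negative values sort first.
    static byte[] encode(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value ^ Long.MIN_VALUE).array();
    }

    // Inverse of encode(): read the big-endian value and flip the sign bit back.
    static long decode(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong() ^ Long.MIN_VALUE;
    }

    // Unsigned, left-to-right byte comparison (the ordering assumed for the database).
    static int compareUnsigned(byte[] first, byte[] second) {
        int length = Math.min(first.length, second.length);
        for (int i = 0; i < length; i++) {
            int difference = (first[i] & 0xFF) - (second[i] & 0xFF);
            if (difference != 0) {
                return difference;
            }
        }
        return first.length - second.length;
    }
}
```

Two details matter: the most significant byte has to come first, and the sign bit has to be flipped. Without the flip, negative values would sort after positive ones under an unsigned comparison.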

With those preliminaries out of the way, we can now discuss the implementation of background expiration. Here's the entire implementation of SleepycatBasedSessionManager. The code related to background expiration is the _timeDB secondary database, its configuration in createDatabases(), and the background reaping thread.

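import java.io.File;

import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.DatabaseEntry;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;
import com.sleepycat.je.LockMode;
import com.sleepycat.je.OperationStatus;
import com.sleepycat.je.SecondaryConfig;
import com.sleepycat.je.SecondaryCursor;
import com.sleepycat.je.SecondaryDatabase;
import com.sleepycat.je.Transaction;

public class SleepycatBasedSessionManager {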
  private static final String DEFAULT_CHARSET = "UTF-8";
  private static final long ONE_MINUTE = 1000 * 60;
  private static final long DEFAULT_SLEEPTIME = 5 * ONE_MINUTE;
  private static final long DEFAULT_REAPING_TIME = 15 * ONE_MINUTE;

  private Database _mainDB;
  private SecondaryDatabase _timeDB;

  private Environment _environment;
  private StandardSessionBinding _sessionBinding;

  public SleepycatBasedSessionManager() {
    try {
      // Initialize every field before starting the background thread.
      _sessionBinding = new StandardSessionBinding();
      createDatabases();
      createReapingThread();
    } catch (Exception e) {
      System.out.println("Database failure.");
      e.printStackTrace();
      System.exit(-1);
    }
  }

  public void addSession(Session session) {
    addSession(session, null); // null transaction is "autocommit"
  }

  public void removeSession(Session session) {
      removeSession(session, null);
  }

  public Session findSession(String sessionKey) {
      Session returnValue = getSession(sessionKey);
      if (null != returnValue) {
          updateDB(returnValue);
      }
      return returnValue;
  }

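  public void updateDB(Session session) {
      Transaction transaction = null;
      try {
          transaction = _environment.beginTransaction(null, null);
          removeSession(session, transaction);
          session.touch();
          addSession(session, transaction);
          transaction.commitNoSync();
      } catch (Exception e) {
          System.out.println("Database error");
          e.printStackTrace();
          if (transaction != null) {
              try {
                  transaction.abort(); // release locks held by the failed update
              } catch (Exception abortFailure) {
                  abortFailure.printStackTrace();
              }
          }
      }
  }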

  private void addSession(Session session, Transaction transaction) {
      String sessionKey = session.getSessionKey();
      try {
          DatabaseEntry key = getDatabaseEntry(sessionKey);
          DatabaseEntry value = getDatabaseEntry(session);
          _mainDB.put(transaction, key, value);
      } catch (Exception e) {
          System.out.println("Database error");
          e.printStackTrace();
      }
  }

  private void removeSession(Session session, Transaction transaction) {
      try {
          String sessionKey = session.getSessionKey();
          DatabaseEntry key = getDatabaseEntry(sessionKey);
          _mainDB.delete(transaction, key);
          return;
      } catch (Exception e) {
          System.out.println("Database error");
          e.printStackTrace();
      }
      return;
  }

  private Session getSession(String sessionKey) {
      try {
          DatabaseEntry key = getDatabaseEntry(sessionKey);
          DatabaseEntry value = new DatabaseEntry();
          OperationStatus status = _mainDB.get(null, key, value,
                                               LockMode.DEFAULT);
          if (OperationStatus.SUCCESS != status) {
              return null; // no session is stored under this key
          }
          return convertDatabaseEntryToSession(value);
      } catch (Exception e) {
          System.out.println("Database error");
          e.printStackTrace();
      }
      return null;
  }

  private Session convertDatabaseEntryToSession(DatabaseEntry entry)
          throws Exception {
      return (Session) _sessionBinding.entryToObject(entry);
  }

  private DatabaseEntry getDatabaseEntry(String string) throws Exception {
      return new DatabaseEntry(string.getBytes(DEFAULT_CHARSET));
  }

  private DatabaseEntry getDatabaseEntry(Session session) throws Exception {
      DatabaseEntry returnValue = new DatabaseEntry();
      _sessionBinding.objectToEntry(session, returnValue);
      return returnValue;
  }

  private void createDatabases() throws Exception {
      EnvironmentConfig environmentConfig = new EnvironmentConfig();
      environmentConfig.setAllowCreate(true);
      environmentConfig.setTransactional(true);
      File file = new File(Preferences.getDatabaseDirectory());
      _environment = new Environment(file, environmentConfig);

      DatabaseConfig databaseConfig = new DatabaseConfig();
      databaseConfig.setAllowCreate(true);
      databaseConfig.setTransactional(true);
      _mainDB = _environment.openDatabase(null, Preferences
              .getMainDatabaseName(), databaseConfig);

      SecondaryConfig secondaryConfig = new SecondaryConfig();
      secondaryConfig.setAllowCreate(true);
      secondaryConfig.setTransactional(true);
      secondaryConfig.setSortedDuplicates(true);
      secondaryConfig.setBtreeComparator(LongComparator.class);
      secondaryConfig.setKeyCreator(new SecondaryDatabaseKeyCreator());
      _timeDB = _environment.openSecondaryDatabase(null,
                                              Preferences.getTimeDatabaseName(),
                                              _mainDB,
                                             secondaryConfig);

  }

  private void createReapingThread() {
      Thread backgroundThread = new Thread(new BackgroundDBOperationsRunnable(),
                                           "Session Manager Background Thread");
      backgroundThread.setDaemon(true);
      backgroundThread.start();
  }

  private class BackgroundDBOperationsRunnable implements Runnable {
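      public void run() {
          while (true) {
              try {
                  sleepUntilReapingTime();
                  reapAndPersist();
              } catch (Exception e) {
                  // Report the failure, but keep the thread alive so that
                  // the next pass can try again.
                  System.out.println("Session reaping failed");
                  e.printStackTrace();
              }
          }
      }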

      private void sleepUntilReapingTime() {
          try {
              Thread.sleep(DEFAULT_SLEEPTIME);
          } catch (Exception ignored) {

          }
      }

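      private void reapAndPersist() throws Exception {
          long deadIfUntouchedSince = System.currentTimeMillis()
                  - DEFAULT_REAPING_TIME;
          byte[] bytes = LongHelper.convertLong(deadIfUntouchedSince);
          DatabaseEntry key = new DatabaseEntry(bytes);
          DatabaseEntry primaryEntry = new DatabaseEntry();
          SecondaryCursor cursor = _timeDB.openSecondaryCursor(null, null);
          try {
              // LongComparator sorts newest first, so this positions the
              // cursor on the most recently touched expired session;
              // everything after it is older.
              OperationStatus statusCode = cursor.getSearchKeyRange(key,
                                                                    primaryEntry,
                                                                    LockMode.DEFAULT
                                                                    );
              while (OperationStatus.SUCCESS == statusCode) {
                  // Deleting through the secondary cursor also removes the
                  // session from the main database.
                  cursor.delete();
                  statusCode = cursor
                          .getNext(key, primaryEntry, LockMode.DEFAULT);
              }
          } finally {
              cursor.close(); // release the cursor even if an operation fails
          }
          _environment.sync();
      }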
  }
   

}

About one-third of the class is concerned with expiration--adding background expiration didn't add a lot of code. We needed to create a secondary database, and we needed to call the getSearchKeyRange method on a cursor to find the expired records. But once you realize those two facts, and understand why a comparator is necessary, the code is easy to write.

Summary

Berkeley DB is famous for a lot of reasons. It's a low-footprint embedded database that's distributed via both open source and commercial license models, and it's very widely used. In recent years, Sleepycat has branched out from the core Berkeley DB a little bit, and has created some new embedded database products. Among them are an XML database and the new Java Edition of Berkeley DB. In this series, we talked about the Java Edition. It's a high-quality implementation of a robust and transactional persistent store that embeds in your application and is both reasonably easy to use and efficient.

In this series, we covered the basic concepts that underlie Berkeley DB, and we walked through a fairly complete example of real-world use. At this point, you should have a good idea of when and where it's appropriate to use an embedded database, and a good understanding of what tradeoffs are involved in doing so.

William Grosso is the vice president of engineering for Echopass.