
Synchronizing a Web Client Database: LocalCalendar and Google Calendar

January 16, 2007


Rich Internet Applications (RIAs) are becoming more and more popular as the capabilities available to internet clients increase. There is a growing demand for RIAs to be able to store rich, structured, searchable data on the same machine as the web browser, and to be able to synchronize this data with a centralized data store running behind the web server.

Figure 1 shows the basic components of a web application that supports local storage.

Figure 1. Web app with local storage

Web client local storage is useful in a number of scenarios:

  • Users are temporarily disconnected from the internet but want to keep working, viewing, and modifying data locally.

  • Users are offline for long periods of time but want to be able to collaborate and share data once they are reconnected. Beyond standard web applications such as blogging or editing a wiki, this capability is also needed by the scientific and health care communities, which often gather data in very remote locations where there is no connectivity.

  • Some portion of the data that the users work with is confidential and the user does not wish to post, or is restricted from posting, this data on a public web server.

  • The application is highly CPU-intensive on a per-user basis, and the only way to scale is for user-specific data to be cached and manipulated locally on the client machine, taking advantage of the CPU cycles of the client.

This article describes a model for using an embedded relational database to accomplish these goals. I describe the flow and components of a synchronization architecture using standard APIs and mechanisms, and the possible approaches to making your application logic available offline.

Ingredients for an Offline Architecture

The main ingredients you need for an offline architecture include:

  • Local storage: You can't run offline unless you have somewhere to store your changes. There are a number of choices for local storage. Many of them provide a simple key/value API; some of them are relational but do not provide full transactional semantics. In my blog about the merits of a relational database and Java DB I discuss the value of using a relational database (and Java DB in particular) for local storage.

  • Synchronization: Although not all applications require this, in most cases you'll want to "phone home" when you're connected and synchronize with the mother ship.

  • Application logic offline: If you are going to run offline, this means your application logic has to be available offline. This may seem obvious, but accomplishing it can be challenging.

Synchronizing over the Web

The traditional way database synchronization has been done is database-to-database synchronization. However, database-level synchronization does not work for internet applications because the only interface you have to the back-end application is a web-service-style interface, be it through standard HTTP services or through a SOAP/WSDL-style interface. This means that synchronization also has to occur through the web interface.

This actually is not as bad as it sounds. Most web interfaces are "primed" for synchronization because they already have to support clients that are connectionless and potentially out of sync. For example, the Google Calendar API already has conflict detection built into the API.

Implementing Synchronization

It's important to look at the four key states or modes in which an offline web client runs. For each of these modes, there are specific tasks that your application needs to handle in order to support offline users.

  • Online: In this mode, your application generally behaves like any normal connected application. However, if you want to be able to handle unexpected disconnects, you need to continually keep the state required to work offline in sync with the server's state. In the example of LocalCalendar, all modifications to events are stored locally as well as in Google Calendar.

  • Going Offline: Depending upon your synchronization strategy, you may need to take a snapshot of your local state prior to going offline. Most databases provide support for this through some kind of backup or export mechanism. In the case of LocalCalendar, I don't do this because I am using a strategy that doesn't require a snapshot. Note that if you use the snapshot strategy, then you cannot support an unexpected loss of connection. This may be acceptable depending upon your goals; IMAP mail tools have a mode that works like this, and users often choose this mode to reduce the overhead of downloading messages to the local disk when most of the time they are online.

  • Offline: When you are running offline, you need to store your changes locally so that the application can keep running as expected. You also need to keep track of the changes you are making so that you can synchronize with the server once you go back online.

  • Going Online: This is where synchronization takes place, and involves three key steps:

    1. Send all changes made while offline up to the server.
    2. Report any conflicts that the server could not resolve.
    3. Ensure that the local state matches the server state.
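
The three "going online" steps can be sketched in a few lines. This is only a toy model, not LocalCalendar's code: the class SyncSketch, its Change type, and the use of plain maps for the server and the local store are all assumptions made for illustration, and the only conflict it knows about is deleting an event the server no longer has.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical, in-memory illustration of the "going online" steps.
// The "server" and the "local store" are maps from event id to title.
class SyncSketch {
    // One change recorded while offline; a null title means "delete".
    static final class Change {
        final String id;
        final String title;
        Change(String id, String title) { this.id = id; this.title = title; }
    }

    // Returns the ids of the changes the server could not apply.
    static List<String> goOnline(List<Change> pending,
                                 Map<String, String> server,
                                 Map<String, String> local) {
        List<String> conflicts = new ArrayList<String>();

        // Step 1: send every offline change to the server, in order.
        for (Change change : pending) {
            if (change.title == null) {
                // Step 2: in this toy model, deleting an event the server
                // does not have is reported as a conflict.
                if (server.remove(change.id) == null) {
                    conflicts.add(change.id);
                }
            } else {
                server.put(change.id, change.title);
            }
        }
        pending.clear();

        // Step 3: make the local state match the server state.
        local.clear();
        local.putAll(server);
        return conflicts;
    }
}
```

The order matters: local state is overwritten only after every pending change has been offered to the server, so no offline work is discarded before the server has seen it.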

LocalCalendar

I will use a sample application called LocalCalendar to show in detail an example of how to accomplish the tasks described above for the various modes of an offline application lifecycle. I'll also discuss alternative approaches where appropriate.

LocalCalendar allows you to manage events on a calendar for a given week. It interacts with Google Calendar while online, but also continues to work when you're offline by storing information in a Java DB database that is embedded in the browser.

The code for LocalCalendar is available as a JAR file--see the Resources section below. The basic architecture is shown in Figure 2. It is important to note that although LocalCalendar is a browser-based application, this model for local storage applies just as well to any web client framework that wants to make use of local storage.

Figure 2. LocalCalendar architecture

LocalCalendar's user interface is written in HTML and JavaScript. The JavaScript code uses LiveConnect (available in all popular browsers) to communicate with a Java applet called CalendarController. This applet acts as a local controller, providing the application-level interface to the DHTML and JavaScript view, with methods such as addEvent, updateEvent, and deleteEvent.

CalendarController communicates with Google Calendar through the Google Calendar data API.

CalendarController also stores data locally using Java DB. Java DB is implemented in a single JAR file (derby.jar), and starts up automatically when you open your first connection to the database.
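
As a rough illustration of what "starts up automatically" means in practice, the sketch below opens an embedded connection. The jdbc:derby: URL prefix and the create=true attribute are standard for embedded Java DB; the class name, the method names, and any database name you pass in are assumptions for illustration, not LocalCalendar's actual code.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper; not part of LocalCalendar.
class EmbeddedDbSketch {
    // Builds an embedded Java DB (Derby) connection URL. The ";create=true"
    // attribute asks the engine to create the database if it is missing.
    static String embeddedUrl(String dbName, boolean createIfMissing) {
        return "jdbc:derby:" + dbName + (createIfMissing ? ";create=true" : "");
    }

    // Opening the first connection is what boots the embedded engine.
    // This needs derby.jar on the classpath. With JDBC 4 and later the
    // driver is found automatically; older runtimes must first call
    // Class.forName("org.apache.derby.jdbc.EmbeddedDriver").
    static Connection open(String dbName) throws SQLException {
        return DriverManager.getConnection(embeddedUrl(dbName, true));
    }
}
```

There is no separate "start server" call anywhere in this sketch; the engine runs inside the same JVM as the applet.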

We'll look at the addEvent method of CalendarController as an example of how the application runs in the various modes. The other methods follow a very similar pattern. I will not be going into any detail about how LocalCalendar interacts with Google Calendar. This is interesting in its own right but is not the focus of this article.

public String addEvent(String id, String date, String title) 
    throws Exception {
    CalEvent event = null;
   
    if ( isOnline() ) {
        try {
            event = calendar.addEvent(date, title);
        } catch ( NetworkDownException nde ) {
            log("The network is down, going offline");
            goOffline();
        }
    }
   
    // Now do the database operations -- store the event
    // locally, and if we're offline, also store the request
    // to add the event so we can ship it to Google Calendar
    // when we come back online
    try {
        DatabaseManager.beginTransaction();
       
        if ( ! isOnline() ) {
            log("Storing request to add event");
            RequestManager.storeAddEvent(id, date, title);
            event = new CalEvent(id, date, title, null, null);
        }  
       
        log("Storing new event in the local database");           
        EventManager.addEvent(event);
       
        DatabaseManager.commitTransaction();
    } catch ( Exception e ) {
        DatabaseManager.rollbackTransaction();
        throw e;
    }
   
    return event.getJSONObject().toString();
}

If we're online, we first add the event to Google Calendar. calendar is a reference to a GCalendar object, the class that is responsible for talking to Google Calendar. Note that calendar.addEvent returns an instance of CalEvent, a data object that contains all the information for an event. We need this return value because Google Calendar supplies important information for the event that we must store in the local database, such as the URL to use to edit the event at a later time.

Next, we begin a database transaction. This is because we want to do two things as a single unit of work: insert the new event into our events table, and (if we're offline) store the request to insert the new event. This stored request is used to update Google Calendar when we go back online.

DatabaseManager provides utility database routines. In particular it provides nominal support for transactions.
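
DatabaseManager's source is not shown in this article. As a minimal sketch of what "nominal" transaction support over JDBC might look like, assuming a single connection, consider the following; the class TxSketch and its constructor are hypothetical.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical stand-in for DatabaseManager's transaction helpers.
class TxSketch {
    private final Connection conn;

    TxSketch(Connection conn) {
        this.conn = conn;
    }

    // JDBC connections commit after every statement by default, so a
    // transaction begins by switching auto-commit off.
    void beginTransaction() throws SQLException {
        conn.setAutoCommit(false);
    }

    // Makes every statement since beginTransaction() permanent, then
    // returns the connection to its default auto-commit mode.
    void commitTransaction() throws SQLException {
        conn.commit();
        conn.setAutoCommit(true);
    }

    // Undoes every statement since beginTransaction().
    void rollbackTransaction() throws SQLException {
        conn.rollback();
        conn.setAutoCommit(true);
    }
}
```

With helpers like these, inserting the event and storing the pending request either both take effect or both disappear, which is exactly the unit of work addEvent needs.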

EventManager is responsible for managing the lifecycle of events within the local database. Here is the code for EventManager.addEvent, which is responsible for inserting a new event into the database:

public static void addEvent(CalEvent event) throws Exception {
    Connection conn = DatabaseManager.getConnection();
    // try-with-resources closes the statement, even if the insert
    // fails, before the connection is released
    try (PreparedStatement pstmt = conn.prepareStatement(
            "INSERT INTO " + DatabaseManager.EVENTS_TABLE +
            " (event_id, date, title, edit_url, version_id) " +
            "VALUES (?, ?, ?, ?, ?)")) {

        pstmt.setString(1, event.getId());
        pstmt.setString(2, event.getDate());
        pstmt.setString(3, event.getTitle());

        // The edit URL and version id may be null (for example,
        // for events created while offline)
        if ( event.getEditURL() == null ) {
            pstmt.setNull(4, Types.VARCHAR);
        } else {
            pstmt.setString(4, event.getEditURL());
        }

        if ( event.getVersionId() == null ) {
            pstmt.setNull(5, Types.VARCHAR);
        } else {
            pstmt.setString(5, event.getVersionId());
        }

        pstmt.executeUpdate();
    } finally {
        DatabaseManager.releaseConnection(conn);
    }
}

Viewing this code, you may wonder how the database gets started up and initialized. Java DB automatically boots up (and runs recovery and/or upgrade if necessary) the first time you obtain a connection. So when the applet initializes, I call DatabaseManager.initDatabase. This method opens a connection, sees if the tables exist, and if they don't, creates them. What's missing is any "start database" semantics, as this happens automatically.

public static void initDatabase(String dbname, String user, String password,
        boolean dropTables)
    throws Exception {
    initDataSource(dbname, user, password);

    if ( dropTables ) {
        dropTables();
    }
   
    // Assumption: if the requests table doesn't exist, none of the
    // tables exists.  Avoids multiple queries to the database
    if ( ! tableExists("REQUESTS") ) {
        createTables();
    }
}
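
LocalCalendar's own tableExists is not shown here. One plausible way to write such a check, offered as a sketch rather than the actual implementation, is to ask the JDBC metadata API whether a table with that name is present. The class name TableCheckSketch is hypothetical; DatabaseMetaData.getTables is the standard JDBC call.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Locale;

// Hypothetical helper; not LocalCalendar's tableExists.
class TableCheckSketch {
    static boolean tableExists(Connection conn, String tableName)
            throws SQLException {
        // Java DB stores unquoted identifiers in upper case, so normalize
        // the name before looking it up. Note that getTables treats the
        // name as a LIKE pattern, so "_" and "%" act as wildcards.
        ResultSet rs = conn.getMetaData().getTables(
            null, null, tableName.toUpperCase(Locale.ROOT),
            new String[] { "TABLE" });
        try {
            // Any row at all means a matching table exists.
            return rs.next();
        } finally {
            rs.close();
        }
    }
}
```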

The RequestManager is responsible for storing requests to Google Calendar when the application is offline. You can think of it as a simple store-and-forward message queue. Here is the method storeAddEvent, which is responsible for storing a request to add a new event.

public static void storeAddEvent(String eventId, String date,
    String title) throws Exception  {
    Connection conn = DatabaseManager.getConnection();

    // Insert the request to add an event; try-with-resources closes
    // the statement before the connection is released
    try (PreparedStatement pstmt = conn.prepareStatement(
            INSERT_REQUEST_SQL)) {
        pstmt.setInt(1, ADD_EVENT);
        pstmt.setString(2, eventId);
        pstmt.setString(3, date);
        pstmt.setString(4, title);
        pstmt.setNull(5, Types.VARCHAR);
        pstmt.executeUpdate();
    } finally {
        DatabaseManager.releaseConnection(conn);
    }
}

As you can see, this is fairly basic JDBC/SQL code; there is nothing overly complex or obscure happening here.

In the CalendarController.addEvent method, there is one more thing worth noting. addEvent returns a string to the JavaScript code. This string is a JSON representation of the event that was added. This enables the JavaScript code to quickly turn the results into a JavaScript object. JSON is normally used for JavaScript communications with a web server, but it is also quite useful for JavaScript-to-Java data transfer within the browser. See the Resources section for more information on JSON.

Synchronizing

Now let's take a look at what happens when LocalCalendar goes back online. This happens when the user first starts up LocalCalendar, or when they press the Go Online button after being offline. In either case, the CalendarController.goOnline method is called, which looks like this:

public void goOnline() throws Exception {
    log("GOING ONLINE...");
    this.online = true;

    try {
        // Log in to Google Calendar 
        calendar = new GCalendar(calid, gmtOffset, user, password,
                startDay, endDay);
       
        RequestManager.submitRequests(calendar);
    } catch ( Exception e ) {
        e.printStackTrace();
        throw e;
    }       
}

The method RequestManager.submitRequests does the heavy lifting for goOnline. It reads all the pending requests from the database and submits them in sequence to Google Calendar. Each type of request (Add Event, Update Event, and Delete Event) has its own subclass of GCalendarRequest. Each subclass knows how to submit its request type to Google Calendar. This splash of polymorphism makes the submitRequests method fairly straightforward:
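
To make the idea concrete, here is a stripped-down sketch of the pattern. It is not the real GCalendarRequest hierarchy: RemoteCalendar, RequestSketch, and the other names below are hypothetical stand-ins, and only two request types are shown.

```java
import java.util.List;

// Hypothetical stand-in for the remote calendar; not the GCalendar class.
interface RemoteCalendar {
    void add(String id, String title);
    void delete(String id);
}

// Each request type knows how to replay itself against the calendar.
abstract class RequestSketch {
    final String eventId;

    RequestSketch(String eventId) {
        this.eventId = eventId;
    }

    abstract void submit(RemoteCalendar calendar);
}

class AddRequestSketch extends RequestSketch {
    final String title;

    AddRequestSketch(String eventId, String title) {
        super(eventId);
        this.title = title;
    }

    @Override
    void submit(RemoteCalendar calendar) {
        calendar.add(eventId, title);
    }
}

class DeleteRequestSketch extends RequestSketch {
    DeleteRequestSketch(String eventId) {
        super(eventId);
    }

    @Override
    void submit(RemoteCalendar calendar) {
        calendar.delete(eventId);
    }
}

class ReplaySketch {
    // The loop never inspects the request type; dynamic dispatch
    // picks the right submit() implementation for each request.
    static void replay(List<RequestSketch> pending, RemoteCalendar calendar) {
        for (RequestSketch request : pending) {
            request.submit(calendar);
        }
    }
}
```

Adding a new kind of request then means adding one subclass; the replay loop does not change.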

public static int submitRequests(GCalendar calendar) 
        throws Exception {
    List<GCalendarRequest> requests = getRequests();
   
    //
    // Go through the requests in order
    int failures = 0;
    int totalRequests = requests.size();
   
    System.out.println("");
    System.out.println("==== SUBMITTING PENDING REQUESTS " +
      "TO GOOGLE CALENDAR ====");
   
    conflicts = new ArrayList<String>();
   
    for ( GCalendarRequest request : requests ) {
        try {
            request.submit(calendar);
            System.out.println(request + " submitted successfully");
            deleteRequest(request);
        } catch ( NetworkDownException nde ) {
            throw nde;
        } catch ( SQLException sqle ) {
            // this is pretty severe, we need to bail
            throw sqle;
        } catch ( Exception e ) {
            System.out.println("ERROR submitting " + request +
               ": " + e.getMessage());
            failures++;
            deleteRequest(request);
        }
    }
   
    System.out.println("==== DONE - " + totalRequests +
        " requests submitted ==== ");
    System.out.println("");
                   
    return failures;
}

Notice how after each request is processed, it is deleted. The exception is if there is a network or local database exception. In this case the request is not deleted because it never made it to Google Calendar. It's worth noting that this code could use some work in refining exactly which exceptions can be considered "successfully submitted but rejected" versus "the request never got there."
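
One way to make that distinction explicit is to classify each failure before deciding whether to delete the request. The sketch below is only an illustration: the exception classes and the Outcome enum are hypothetical and not part of LocalCalendar. Unlike submitRequests, it errs on the side of keeping requests that fail for unrecognized reasons.

```java
// Hypothetical exception types; LocalCalendar's real hierarchy may differ.
class TransportFailureSketch extends Exception {
    TransportFailureSketch(String message) {
        super(message);
    }
}

class ServerRejectedSketch extends Exception {
    ServerRejectedSketch(String message) {
        super(message);
    }
}

class OutcomeSketch {
    enum Outcome { DELIVERED, REJECTED, NOT_DELIVERED }

    // failure is the exception thrown while submitting, or null on success.
    static Outcome classify(Exception failure) {
        if (failure == null) {
            return Outcome.DELIVERED;
        }
        if (failure instanceof ServerRejectedSketch) {
            // The server saw the request and said no (for example, a conflict).
            return Outcome.REJECTED;
        }
        // Network failures, local database errors, and anything
        // unrecognized: assume the request never arrived.
        return Outcome.NOT_DELIVERED;
    }

    // Only requests that actually reached the server may leave the queue.
    static boolean shouldDelete(Outcome outcome) {
        return outcome != Outcome.NOT_DELIVERED;
    }
}
```

Treating unknown failures as "not delivered" risks resubmitting a request, so it suits operations the server can safely see twice; the opposite default risks silently losing a change.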

submitRequests implements the first two steps of going online: submitting all changes to the server and reporting errors. The last step, making local state match the server state, is handled by the method CalendarController.refresh. This is called by the JavaScript after calling goOnline. This method slurps all the events for the week from Google Calendar, and then tells EventManager to refresh the database with the results.

/**
* Refresh our calendar from Google Calendar and return a JSON string that
* represents the array of entries for the given date range
*
* @return a JSON string that represents an array of all the entries
*      for the calendar.
*/
public String refresh() throws Exception {
    log("DerbyCalendarApplet.refresh()");
   
    Collection<CalEvent> events = null;
    if ( isOnline() ) {
        try {
            events = calendar.getEvents();

            // Refresh the database with the events we got
            // from Google Calendar
            EventManager.refresh(events);
        } catch ( NetworkDownException nde ) {
            log("The network is down, going offline");
            goOffline();
        }
    }
   
    if ( ! isOnline() ) {
        events = EventManager.getEvents();
    }

    JSONArray jarray = new JSONArray();
    for ( CalEvent event : events ) {
       
        jarray.put(event.getJSONObject());
    }
           
    return jarray.toString();
}

refresh returns a JSON array of all the events to the JavaScript, which can then quickly turn this into a JavaScript array and use that to update the display.

Here is the code for EventManager.refresh:

/**
* Refresh the database with a list of events from the Google
* Calendar.  Google Calendar is The Truth and the local store
* must submit...
*
* Note we also clear out any pending requests, as they are no
* longer valid once we've refreshed. 
*/
public static void refresh(Collection<CalEvent> events)
    throws Exception {
    System.out.println("Refreshing local store with list of " +
            "events from Google Calendar...");
   
    DatabaseManager.clearTables();
   
    for ( CalEvent event : events ) {
        addEvent(event);
    }
}

As you can see here, I use the brute-force approach to match the local state with the server state: I clear the local tables completely, and then add all the events as I got them from Google Calendar.

Another approach to making the local state match the server state is to ask the server to send all changes since the last synchronization. This can be essential if you are dealing with large data sets. However, this requires a more sophisticated synchronization algorithm. Synchronizing this way requires the following steps:

  • Prior to going offline, take a snapshot of the current state of your local store. You can do this by simply making a copy of your data directory.

  • When you go back online, after pushing all your changes up, ask the server for all changes since the last sync.

  • Revert your local store back to the snapshot you took at the last sync (e.g., replace your data directory with your snapshot directory).

  • Apply the changes from the server to the snapshot.

As you can see, this is fairly straightforward, but does require a little more work. I would recommend the simpler "blow away your local store and do a full refresh" approach that I use, if you can manage it.
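
For comparison, the revert-and-apply part of the incremental approach can be sketched with in-memory maps. Everything here is hypothetical: the class IncrementalSyncSketch, its ServerChange type, and the idea that the server reports changes as a simple ordered list are assumptions, since a real server API dictates what "changes since the last sync" looks like.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model: a store maps event id to title.
class IncrementalSyncSketch {
    // One server-side change; a null title means the event was deleted.
    static final class ServerChange {
        final String id;
        final String title;
        ServerChange(String id, String title) { this.id = id; this.title = title; }
    }

    // Rebuilds local state from the snapshot taken at the last sync plus
    // everything the server reports as changed since then. Because the
    // offline changes were pushed first, the server's list already
    // reflects the ones it accepted.
    static Map<String, String> resync(Map<String, String> snapshot,
                                      List<ServerChange> changesSinceLastSync) {
        // Revert: start from a copy of the snapshot, discarding offline edits.
        Map<String, String> local = new HashMap<String, String>(snapshot);

        // Apply the server's changes in the order the server reported them.
        for (ServerChange change : changesSinceLastSync) {
            if (change.title == null) {
                local.remove(change.id);
            } else {
                local.put(change.id, change.title);
            }
        }
        return local;
    }
}
```

The amount of data transferred is proportional to what changed rather than to the size of the calendar, which is the whole point of the extra bookkeeping.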

Getting Your Application Logic Offline

The third aspect of providing offline support for an application is that you need to get your application logic offline. Many web applications are written with HTML/JavaScript in the browser and server-side logic running on the server. This works fine when you're always connected, but breaks down when you're running offline.

One of the advantages of browser-based applications is that the user does not need to install anything. But the reality is that in an offline world something needs to be installed on the client machine. There are a few approaches to this, each of which has its strengths and weaknesses. I will mention these approaches here briefly, but this is an entire topic in itself.

  • Run a web server in the browser: You can run a Java web server like Tomcat or Jetty in the browser and deploy your application to this web server. See the Resources section for a few examples of this.
  • Install a local web server: You can install a web server to run locally, using Java Web Start or just as a simple install, and then deploy your application there for use offline.
  • Build a rich client: Depending upon the skill sets of your development team, it may be simpler to build your application as a "rich client" application deployed by Java Web Start, where all the presentation logic is contained in the application, rather than served up on the fly by the server. See the Resources for a blog I wrote talking about the potential values of going this way.

Conclusion

To build a disconnected or offline web application, there are three main requirements: local storage, synchronization, and application logic that is available offline. The concepts around synchronization are fairly straightforward once you understand them. Building and delivering this kind of application is possible and achievable, and can provide great value to your customers.

Resources

David Van Couvering has spent his engineering career crossing the bridge between databases and the middle tier world of application servers, Java and distributed systems.