Applications are no longer monolithic, especially when it comes to distributed platforms like J2EE. Non-functional requirements (NFR) compel the architecture to be distributed in the across three or more tiers, encompassing multiple nodes and connectors between them. Many times, each connector adds to the inherent weakness of the system architecture as a whole; but there are times when decreasing the number of nodes will just converge to a point where it is hard (if not impossible) to implement the system to meet non-functional requirements. Further, there's a question of, "How much of the processing can be done later?" The answer will determine which pieces of the processing need to be done synchronously and which can be asynchronous. After all, why do we need to execute the code in a sequential manner when we have multiple processors available in the hardware to do parallel computing?
This article is going to discuss the above aspects in the context of a highly scalable J2EE architecture. The article accompanies a Reference Implementation (RI) for the architecture, which can be deployed and executed. Even though the RI is designed for and tested in the open source web container Tomcat, the same concepts can be adopted and systems designed for other J2EE containers.
Getting Started
Our aim here is to list out a subset from the whole lot of concerns that arise during a software system design, and then look at how we address them in a particular context. For this, we will first try to understand the system requirements to the minimum detail required.
The existing system landscape is a typical e-commerce website with a medium to high level of traffic. The traffic pattern will vary with time in a day, but the peak transaction requirement is 300 transactions per second (TPS). Each of these transactions will end up in displaying a web page, as shown in Figure 1. These web pages consist of a single main section and another subsection. The subsection will display advertisement links that when clicked will lead to offers, discounts, etc., available in connection with the content for the main section. Currently, these links appear based on a random selection of keywords and/or phrases. When a user clicks one of the links, a separate request will go to a search engine service, which will then display all search results. The user can then click any of the links in these search results. Each click will earn revenue for our e-commerce site.
Figure 1. Existing web page sample representation
The current e-commerce site is designed for legacy technology, and it is hard to do any new development in that environment. Hence, there is a very high probability that the entire site will be redesigned for a new technology environment in the future. But the immediate requirement is to introduce intelligent logic to display ad links, which are projected to improve revenue substantially. Moreover, any new effort spent in that direction should be completely re-usable in the future when the e-commerce site is redesigned. The "as-is" state of component interactions is shown in Figure 2.
Figure 2. As-is system components
Migration Strategy
To answer the question of how to begin, we have the following constraints to take care of:
Changes to the existing system should be minimal.
Any major development should use current technology.
The new design should seamlessly integrate with the existing system.
The new system should be compatible with future changes to the system.
During migration, we should have a quick fallback strategy in case of a problem.
We should leverage open source tools as much as possible.
Taking into consideration the above requirements, we made the following decisions:
Java EE will be the development/deployment environment, since EE is a proven, distributed development platform.
HTTP will be the protocol for the new system/component interface. HTTP is a proven, popular, character-oriented open protocol.
Tomcat will be the HTTP protocol interpreter. Tomcat is mature, open source, and provides for distributed architecture.
We will also leverage HTTP processor threads in Tomcat to meet the TPS requirements.
Initially, we decided to use an open source database, but quickly we realized that we need to rethink this choice due to volume requirements. We ultimately opted for Oracle as our database.
We will limit the changes to the existing system to a soft flip-flop switch, as shown in Figure 3. Using this switch, we can continue to use the existing random logic for link generation or we can use the intelligent logic based on revenue aspects.
Figure 3. Migration strategy
The Going Gets Tough
In order to introduce intelligent logic for ad-link generation, the system needs to keep track of two kinds of data at the minimum:
Impressions: The number of times a particular key word/phrase is displayed in the browser.
Clicks: The number of times a particular key word/phrase is clicked by the user.
So the entire problem boils down to keeping track of two categories of events by the system. The constraints listed above compel us to develop new functionality as a block separate from the existing deployment. In other words, we are given the problem of creating a new system that will generate intelligent ad links. This new system is performance-critical, because it shouldn't be adding too much latency to the main flow of the transaction and also it should be serving at 300 TPS. Performance can be measured in different ways, and accesses per second or megabytes per second (MBPS) are just few amongst them.
After a proof of concept (POC), the decision was to host the deployment in a replicated farm. The volume of data collected by the system in terms of impressions and clicks is so huge that we have to aggregate around two million events in some batch, and that many events has to be inserted and/or updated to the persistent store too.
Distribute
Load balancing is one of the goals of application replication. Replication can be done in the same hardware or across multiple pieces of hardware. Which architecture to choose is a matter of what we are trying to achieve. In general, the increasing scale and scope of the use of operating system and hardware resources drives these emerging requirements. Replicated systems designed to meet these requirements are likely to be structured around the answers to a few critical architectural factors:
How much memory footprint the application requires
How many threads the application components need
How many connections each application has to handle
How many system resources like file handles are required
How much parallelism is attainable in a single process
How many CPUs can be leveraged at the same time from within the same hardware
What are the synchronization primitives available in case of multiple processes and nodes
How we coordinate shared accesses and leases
The allowable number of nodes that a unit of application job can cross, taking into consideration the performance criterion
Replication is a well-understood and very mature architectural choice, so much so that the addition of nodes is considered to be a no-risk proposition, provided we make design decisions for any negative impacts. We decided to have an application server farm with multiple nodes. In each node, we run a Tomcat web server hosting the HTTP component of our application. Most web applications can scale up in capacity and performance by adding more nodes with deployments, but the scaling factor may not be linear, due to LAN dependency and the fact that the load balancer has to distribute multiple TCP/IP connections to multiple nodes. For a 300 TPS requirement, we need to handle multiple connections at each node. The Tomcat server at each node needs to be configured to handle multiple connections. This is done in the server.xml file, which is the main configuration file for Tomcat; a portion of it is shown below:
acceptCount and connectionTimeout are the main connection-related attributes, and they are explained further below.
acceptCount: When all the HTTP processor threads in Tomcat are busy, client connections are queued up by Tomcat until a thread is available. The acceptCount attribute decides how many such connections can be queued up by Tomcat. If connections are still coming to Tomcat above and beyond this limit, clients will receive error messages.
connectionTimeout: Once a socket connection has been established between a client and Tomcat, there is a certain number of milliseconds before which the client has to send the request, which is controlled by the connectionTimeout attribute. After this time period, the connection will be closed.