Clustering with the Shoal FrameworkShoal is an open source, Java-based generic clustering framework. It can be used in your applications to add clustering functionalities like load balancing, fault tolerance, or both. Applications using Shoal can share data, communicate via messages with other cluster nodes across the network, and notify of relevant events like the joining, shutdown, and failure of a node or group of nodes. You can take appropriate measures and perform monitoring tasks when these events occur; Shoal forwards a signal to your code to track these notifications.
Shoal is the clustering framework used by the Glassfish project to implement its application server clustering. One of the benefits to your application is that Shoal abstracts away network details and the network communication API. Under the hood, the default group communication provider uses JXTA for peer-to-peer reliability and scalability.
Shoal is a lightweight component; you can embed Shoal not only in Java EE applications, but in SE applications too.
In this article, we'll cover the Shoal architecture and its basic concepts. Then we'll illustrate how to integrate your own code into the clustering infrastructure.
We will now discuss the architecture of a Shoal cluster solution and the internal design of Shoal. This section also describes main components, such as Signals and the Distributed State Cache (DSC).
Shoal's main concept is the group, a virtual union of cluster nodes by a single name. Each group is composed of members. A member is uniquely identified by its member token. A token is just a String, usually a Global Unique ID (GUID), to avoid name collision between the nodes. Also, a member can be a spectator or a core, the difference being that when a core member goes down, all other members are notified of the failure. On the contrary, an spectator's failure is not a significant event that Shoal notifies other members about. Spectators are commonly used for cluster administration and/or monitoring applications.
A network can host many groups, but be aware that each group consumes additional bandwidth and processing resources.
The second important concept is the Group Management Service (GMS). It manages the network's groups by keeping track of the members, mediating and facilitating member cooperation and communication.
Shoal divides groups into functionalities called components. This division is up to the developer; by default Shoal doesn't impose any structure on the cluster. A group, for example, can have a transaction component and a monitoring component, each processed by one or two physical machines. The API allows the sending of signals to specific nodes that implement a particular component.
The last significant piece is the signal. Signals are events in the group's lifecycle, such as a new node joining the cluster or another one leaving it. Signals are covered in greater depth below.
Figure 1 illustrates the physical distribution of a example Shoal group.

Figure 1. A Shoal system's architecture
Figure 1 shows three physical machines, with two running a JVM and executing a single instance of the application, and the third machine running two JVMs, each one executing one instance. Each application instance has loaded Shoal's GMS, which discovered the other peers and joined the group.
Note that all the machines are connected in the same physical network. The physical location is not a concern as long as multicast traffic traverses the relevant nodes.
Shoal is comprised of two main parts: the GMS Client API and the GMS Service Provider Interface (SPI). Your application interacts with the API, which in turn uses the SPI to talk to the underlying group communication protocol. The default SPI uses JXTA to provide group services.
Figure 2 illustrates the dynamics between Shoal's layers, your application, and the network:

Figure 2. Shoal's impact on your application architecture
Calling Shoal's API, you can:
The framework calls your code when signals arrive from any node on the group. For this to happen, you must register callback objects as described in the "Listening to the Group's Messages" section.
Each node in the group has the following built-in cluster lifecycle signals to both send and receive from other nodes:
Shoal can send other signals related to the domain of the application. Those are called message signals. Both signal types can arrive at any time and can encapsulate arbitrary data.
The most interesting signals are the failure family, as they enable Shoal's recovery features. When a node fails and then tries to recover, Shoal brings up something called recovery failure fencing. This protects the node from getting inconsistent states and fences its recovery process. A virtual fence is raised for a member's identity inside the group. This is done to protect cluster operations from race conditions. When the fence is lowered for the identity, any member can communicate with the previously fenced one.
Sometimes another node wants to take the failed node's place. This is achieved by a node getting the failed member's identity. Let's make it clearer with an example.
When node A goes down, the other nodes in the group get notified
via a FailureSuspectedSignal. After a timeout period a
FailureNotifySignal is fired. If any of the other members in
the group is a FailureRecoveryAction listener, then these
nodes are candidates to substitute the failed one. Let's call one
such node "node B."
At this point, the GMS master node notifies node B that it has
to become a recovery delegate for the failed node. It sends the
node a FailureRecoverySignal. Finally, if node B acknowledges
that signal, it raises the fence, does the recovery work (such as
taking over resources or tasks of node A), and lowers the fence.
This takes us back to normal operation.
If the failed node goes up again, it must check whether the fence is raised for its member identity. If it's not fenced, and it wants to recover, it must raise a fence. After the recovery is finished, it lowers the fence again.
Shoal also provides a data sharing mechanism for the group called the Distributed State Cache. Though not a full-blown distributed cache like JBossCache, it allows the nodes to share data.
The DSC is an associative data structure like a
java.util.Map. The default implementation uses a node-local
HashMap, and replicates the changes to the other nodes for
better performance. Beware that it doesn't implement a Least
Recently Used (LRU) or any other algorithm to eliminate
unused elements. You can only put a serializable object into
the map as a key, a value, or both.
In the next section, we'll cover how to integrate Shoal into your application.
Embedding Shoal into your application is straightforward; the time-consuming part is thinking about what clustering features you really need: load balancing, fault tolerance, or both. If you add something you don't need, the complexity grows and can jeopardize your project.
Begin by downloading the Shoal distribution from Shoal's java.net site. Unpack and copy both shoal.jar and jxta.jar to your application classpath. That's all you need besides a working network connection.
It's very useful to test with two instances of the application, so you can see how it responds to the other instance joining and failing, along with message passing. So let's start by creating a group and joining it.
The first thing we need to create or join a group is a group name. The first node to start the Shoal framework creates the group with this name. If no other nodes for the same group on the network have started, this node becomes the master node. From now on, each node joining the group will acknowledge this node as a master.
If there's already a group created with the group name, the new node joins as a vanilla member.
First you must start a new GMSModule by calling the
GMSFactory.startGMSModule() method, like the following
code fragment does:
GroupManagementService gms = (GroupManagementService)
GMSFactory.startGMSModule(serverName, groupName, memberType, props);
The method receives four parameters:
groupName identifies the group name to create if it
doesn't exist, or to join in the case the group is already
created.serverName uniquely identifies this instance in the
cluster. You can't create a node with a duplicate name in the
cluster.memberType parameter is an enum
characterizing the member's role: CORE or
SPECTATOR, as discussed in the previous section.In the case where a Shoal creates a new group, it will display a message like the following in your console:
7:33:14 PM com.sun.enterprise.ee.cms.impl.jxta.ViewWindow
getMemberTokens INFO: GMS View Change Received: Members in view
(before change analysis) are : 1: MemberId: node1, MemberType:
CORE, Address:
urn:jxta:uuid-4879342BEE684EC598A9B8741505B700A62767B3102B4D258A48FDBF1961AC0D03
This message tells us that a new core member, called
node1, is a member of the group, and shows its unique JXTA
identifier. If the group already exists, the message includes the
other member JXTA identifiers.
After the group is created, you must call the
join() method, as the next code snippet shows:
try {
gms.join();
}
catch (GMSException e) {
e.printStackTrace();
}
After join()ing, Shoal will output a message like the following:
Nov 29, 2007 7:33:14 PM
com.sun.enterprise.ee.cms.impl.jxta.ViewWindow newViewObserved
INFO: Analyzing new membership snapshot received as part of event :
MASTER_CHANGE_EVENT
In this case, the member becomes the master of the group, as it is the creator and only member at this moment.
Once you are part of the group, you can send and receive
messages to and from other members. Shoal exposes a
GroupHandle object to achieve several tasks, one of
which is to send message signals. Let's see how you can use the
GroupHandle to send messages in the following
fragment:
gms.getGroupHandle().sendMessage(null, message.getBytes());
This method sends a byte array to all members. The first
parameter is the component name; in this case the null means
all components. The second is the byte array of a string.
The GroupHandle can also send a message to all
members, a specific subset, or just one member by using member
tokens. See the GroupHandle Javadocs for more
details.
To begin listening to the group member messages, you must
register a callback with the framework. To do this, simply create a
class implementing the CallBack interface. You have to
implement just one method: processNotification(). The
following code fragment shows an implementation:
public void processNotification(Signal signal) {
signal.acquire();
if (signal instanceof MessageSignal) {
System.out.println(
":Message Received from:"
+ signal.getMemberToken()
+ ":[" + signal.toString() + "]");
} else {
System.out.println(
":Other Notification Received from:"
+ signal.getMemberToken() + ":["
+ signal.toString() + "]");
}
signal.release();
}
The signal must be acquire()d and
release()d to avoid race conditions. The
Signal's getMemberToken() method returns
the member token of the node that emitted the signal. It's very
useful for responding only to the member who sent the signal.
Once you have the message handling code, you must register the
CallBack with Shoal.You can register to receive custom
messages directed to a particular component of the cluster by
hooking up a MessageActionFactoryImpl and sending a
componentName parameter.
gms.addActionFactory(new MessageActionFactoryImpl(callBackClass), componentName);
This particular MessageActionFactory receives
MessageSignals, the only non-lifecycle signals built
into Shoal. The ActionFactory will call the instance
of the CallBack object, every time a
MessageSignal arrives directed to the
componentName parameter.
To share data between nodes without using messages, you can use the DSC. Remember that no automatic reaping of the unused values occurs. The default implementation doesn't have any stale value reaping mechanism or a capacity limit, so you must be very cautious using the default DSC implementation. You can write another implementation to manage DSC according to the application requirements.
The DSC is accessed via the GroupHandle by invoking
the getDistributedStateCache() method, as in the
following code:
DistributedStateCache dsc = gms.getGroupHandle().getDistributedStateCache();
This gets a reference to the DSC. To add a value, use the
add() method:
dsc.addToCache( componentName, memberTokenId, key, state);
There's an individual cache for each group component; you
specify in which cache to store the state using the
componentName parameter. The second parameter is the
calling member's token. The third is the key you'll use to recover
the state. Finally, the fourth parameter is the state to store in
the cache.
To retrieve the stored value, use the get()
method:
Object o = dsc.getFromCache( componentName, memberTokenId, key) ;
This retrieves the state, using the key to search it. Finally, to
remove an object, use the remove() method:
dsc.removeFromCache( componentName, memberTokenId, key) ;
This eliminates the state associated with the key from the DSC. .
Hopefully, after doing some useful work, the node can shut down.
The master node can shut down all members in the group. To achieve
both you must call the shutdown() method of the
gms object. Shutting down members or the whole group
sends planned shutdown signals to their members.
Hopefully this article clarifies some of the less-documented aspects of Shoal and kickstarts new users of the framework.
Clustering gives you a solution to solve two important problems: fault tolerance and load balancing, a must for enterprise mission-critical applications. Load balancing offers much-needed scalability and fault tolerance means less downtime.
Shoal makes it easy for you to achieve both load balancing and fault tolerance characteristics in your application. However, the most commonly used scenario is group communication, including message passing and state caching. You can easily add cluster facilities to your application by integrating the Shoal framework into your code.
To see detailed characteristics and advanced uses of the framework, such as cross subnets groups, take a look at the examples and design documents included with Shoal.
|
|