I operate a small company, Post Tool, with my partner Gigi
Obrecht. Our focus over the last nine years has been the design and
development of database-driven, content-managed Web sites. In 1997
the California College of the Arts (CCA, formerly the California College of Arts and Crafts)
asked us to create their Web presence. Gigi and I were both
teaching there at the time. The dotcom bubble was at its roundest point
in the Bay Area.
I decided on a project to develop a rudimentary content
management system (CMS). It was written in Perl and used flat text
files for storage. The primary user interface listed
directory-grouped "pages." The rich text editor was developed using
Shockwave Director. While the user interface was not terribly
elegant, this system was very useful to the college. Many years
later, we designed new templates for the site. CCA was still using
the same Perl-based system.
In the meantime, I had transitioned fully to server-side Java.
"Current CMS" became the focus of my work.
Almost nine years later, I have finally arrived at something that
is worth sharing. The major components of Current CMS include a
scaffolding generator, a simple persistence mechanism, a layer for
user-filtered and logged database transactions, template
management, a site map formality, as well as JSP pages and servlet
controllers for managing content. The code base was entirely
rewritten earlier this year. I will try to write clearly about it
now. Ready?
Requirements
Many Web frameworks address the validation and flow of form
pages. Current CMS does not rely on this type of framework, nor does it rely on a
template engine beyond JSP and Expression Language (EL).
It does not rely on a third-party persistence mechanism either. Web
applications are often bound by memory limitations, so the code
base has few dependencies. From Apache Commons: upload, io, lang,
logging. From Apache tag libraries: standard, datetime, string. And
of course, there is the JDBC Connector. These dependencies come to
about 1.5 MB in file size.
Java 5 is required. MySQL is used by default, but this can
easily be changed to a database of your choice. Most applications
we deploy easily handle moderate traffic with a 48 MB heap.
A Quick Note About Information Architecture
All content-managed systems require some form of Entity
Relationship Diagram (ERD) to be defined prior to development. There are
many methods of determining a client's requirements for the ERD.
Usually, I begin by asking them to describe key pages from the
site-to-be, questions like: What is on the home page? When you
click on that, what does the subsequent page look like? The
results generally look something like those shown in Figure 1.
Figure 1. A "wireframe" (not an ERD!) for a wine distributor
From this exercise with our clients, we can produce a reasonable
ERD, which will be used as the foundation for the CMS. Current CMS
requires the ERD in a particular XML format. The entities
configuration file will be used by the scaffolding generator to
produce code templates. It will also be used by the application to
access metadata about the objects, their fields, and references. I
considered disposing of the XML configuration file in favor of Java 5
annotations but came back to the XML. In a digression, I could
elaborate on the decision. Instead, let's examine the details of
the configuration files.
Figure 2. A sample ERD for a wine distributor
Configuration
Current CMS requires a particular configuration of the WEB-INF
directory for each application deployment. Mostly it conforms to
Sun's established requirements for a Web Application Archive.
Following is the basic structure.
web.xml
Tag library descriptor files: dt.tld, str.tld, pt.tld, c.tld
lib - a directory that contains the required JARs
config - a directory of Current CMS configuration files
The standard deployment descriptor, web.xml, contains
many entries for Current CMS. There is a URLFilter, a
servlet that initializes the application, as well
as many controller servlets used by the CMS framework. Of note is
the initialization servlet com.posttool.cms.CMSConfig,
which loads the configuration files that we are about to
examine, located in WEB-INF/config.
The three required configuration files are located in
WEB-INF/config: deployment.xml, entities.xml,
and templates.xml. The sample configuration files are part
of an installation for a wine distributor.
deployment.xml
In general, we find that there are at least three deployments of any
Web application in development. There may be more, depending on the
number of developers hosting a local version of the site. To track
deployment variations, we place all descriptors in a single file.
When the framework is initialized, a deployment is chosen by the
CMS. com.posttool.cms.CMSConfig compares the location of the configuration file and the servletPath
parameter of each deployment. If it cannot find a matching path, it
will log an error displaying the path it expects to see. If the
error is produced, copy the path and make sure it is listed exactly
in one deployment as servletPath. Note: CMSConfig should
also attempt a database login to fully verify its
choice, because two deployments may have the same base
path to the webapp; it doesn't at the moment.
In the following example, the first deployment will be used on
my development machine. The second deployment is chosen by the
initializing servlet on the live server. The last deployment, in
this case, will be selected when running the Ant task to generate
the scaffolding, which is why it does not require attributes
related to HTTP:
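A hypothetical version of such a file, together with the selection step, might look like the following Java sketch. Only the servletPath attribute name comes from the text above; the deployments/deployment element names, the name and host attributes, the sample paths, and the DeploymentChooser class are assumptions made for illustration, and the comparison is assumed to be a plain string equality on the webapp path.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical sketch; not the actual com.posttool.cms.CMSConfig code. */
class DeploymentChooser {

    /** Invented sample file: development, live, and scaffolding deployments. */
    static final String SAMPLE_XML =
          "<deployments>"
        + "<deployment name='development' servletPath='/Users/dev/wine/' host='localhost:8080'/>"
        + "<deployment name='live' servletPath='/var/www/wine/' host='wine.example.com'/>"
        + "<deployment name='scaffolding' servletPath='/Users/dev/build/wine/'/>"
        + "</deployments>";

    /** Returns the name of the deployment whose servletPath equals webappPath. */
    static String choose(String xml, String webappPath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
            NodeList deployments = doc.getElementsByTagName("deployment");
            for (int i = 0; i < deployments.getLength(); i++) {
                Element deployment = (Element) deployments.item(i);
                if (webappPath.equals(deployment.getAttribute("servletPath"))) {
                    return deployment.getAttribute("name");
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read deployment XML", e);
        }
        // Mirror the behavior described above: report the path that was expected.
        throw new IllegalStateException("No deployment lists servletPath=" + webappPath);
    }
}
```

Calling DeploymentChooser.choose(DeploymentChooser.SAMPLE_XML, "/var/www/wine/") selects the "live" entry in this sketch, while an unlisted path fails with a message that names the path to copy into the file.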
entities.xml
The entities XML configuration file will represent the ERD. Our
case study is interested in wines, vineyards, and
their related presses. The entities root
element can contain any number of object elements.
Each object may contain any number of
references and fields. All of these
elements require a name attribute and may have a
description.
The reference element has two additional attributes. The
object attribute must make reference to the name of
another object in the schema. Note: It cannot
reference itself (yet). The cardinality attribute can only
be "one" or "many." The name attribute should be
synchronized with a corresponding reference in the related
object. In the case of the vineyard, its press is
known as "PressVineyard," which means that in this bidirectional
relationship, a press instance knows its vineyards by the same
name.
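The reference rules above can be pictured as a small consistency check. The following Java sketch is an illustration only: the EntitySchemaCheck and Ref classes are hypothetical and are not part of Current CMS, and the schema is built in memory rather than read from the XML file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical checker for the reference rules described in the text. */
class EntitySchemaCheck {

    /** One reference element: its name, target object, and cardinality. */
    static final class Ref {
        final String name, object, cardinality;
        Ref(String name, String object, String cardinality) {
            this.name = name;
            this.object = object;
            this.cardinality = cardinality;
        }
    }

    /** Returns a list of problems; an empty list means the schema passed. */
    static List<String> validate(Map<String, List<Ref>> schema) {
        List<String> errors = new ArrayList<String>();
        for (Map.Entry<String, List<Ref>> entry : schema.entrySet()) {
            String owner = entry.getKey();
            for (Ref ref : entry.getValue()) {
                String where = owner + "." + ref.name;
                if (!"one".equals(ref.cardinality) && !"many".equals(ref.cardinality)) {
                    errors.add(where + ": cardinality must be one or many");
                }
                if (owner.equals(ref.object)) {
                    errors.add(where + ": an object cannot reference itself");
                } else if (!schema.containsKey(ref.object)) {
                    errors.add(where + ": unknown object " + ref.object);
                } else if (!hasBackReference(schema.get(ref.object), ref.name, owner)) {
                    errors.add(where + ": no reference of the same name in " + ref.object);
                }
            }
        }
        return errors;
    }

    /** True if refs holds a reference with the same name pointing back at owner. */
    private static boolean hasBackReference(List<Ref> refs, String name, String owner) {
        for (Ref r : refs) {
            if (r.name.equals(name) && r.object.equals(owner)) {
                return true;
            }
        }
        return false;
    }

    /** The vineyard/press pairing from the case study. */
    static Map<String, List<Ref>> sample() {
        Map<String, List<Ref>> schema = new LinkedHashMap<String, List<Ref>>();
        schema.put("Vineyard", Arrays.asList(new Ref("PressVineyard", "Press", "many")));
        schema.put("Press", Arrays.asList(new Ref("PressVineyard", "Vineyard", "many")));
        return schema;
    }
}
```

In this sketch the vineyard and press each carry a "PressVineyard" reference to the other, so validate(sample()) reports no problems.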
Reference cardinality is meant to be very simple. It is not
intended to model much more than bidirectional, many-to-many
relationships. This model is appropriate 90 percent of the time. Earlier versions of the Current CMS code modeled other
more complex constraints, but in this latest code base, I have not
found the time or need to add the required detail to the code. If
you are interested in making this area of the persistence mechanism
more robust, please contact me.
The field element is straightforward. The
type attribute can be one of the following values:
integer, decimal, file, date, string, html, password. By examining
the generated scaffolding, you can see how these types map to Java
and SQL.
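As a minimal sketch of how the type attribute could be checked against this list, the hypothetical Java enum below accepts only the seven values named above. The FieldType name and the case-insensitive parsing are assumptions; the sketch deliberately says nothing about the Java or SQL types, which are best read from the generated scaffolding.

```java
import java.util.Locale;

/** Hypothetical enum of the field types listed in the text. */
enum FieldType {
    INTEGER, DECIMAL, FILE, DATE, STRING, HTML, PASSWORD;

    /** Parses a type attribute value, rejecting anything outside the list. */
    static FieldType fromAttribute(String value) {
        try {
            return valueOf(value.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("Unsupported field type: " + value);
        }
    }
}
```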
Finally, there is the definition of a site contributor. A class
and table will be made for the people who will be maintaining the
content of the site. The fields listed are required at a minimum by
the class this object extends,
com.posttool.dbouser.AbstractUser. Of course, you can
add more fields if you want to store more information about your
contributors:
templates.xml
Templates are used to present published content. If you don't
plan on publishing any information from the content management
system, you can skip to "Generating the Scaffolding." If you do,
the Current CMS template formalities will help you.
Generally, publicly viewed page templates will be created by a
graphic designer and HTML/JavaScript developer. Our workflow allows
us to develop these pages prior to the completion of that work and
minimize integration time. The templates.xml
configuration file is used by the scaffolding generator, the CMS,
and the URLFilter. The scaffolding provides the basics of a
model/view pairing pattern. One part is the Java selection and
representation of page-relevant data. The other is JSP templates
with expression language stubs dereferencing content produced by
the Java view.
The configuration file contains a list of template elements. Each has three attributes: name,
file, and class. The "name" is how the
template is known. Using factory methods, a programmer can request
a template by this name. More significantly, the name signifies a
URL pattern for this type of template. The "file" is a path for a
JSP template. The "class" is a Java program that implements the
interface com.posttool.site.TemplateView and is
responsible for providing the appropriate data for its template as
name/value pairs.
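To make the "class" half of the pairing concrete, here is a rough Java sketch of a view that supplies name/value pairs. The signature of com.posttool.site.TemplateView is not shown in this article, so TemplateViewSketch is a stand-in interface; its getData method, the WineDetailView class, and the "wine" parameter name are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-in for com.posttool.site.TemplateView; the real signature may differ. */
interface TemplateViewSketch {
    Map<String, Object> getData(Map<String, String> parameters);
}

/** Hypothetical view for a wine detail template that takes a record id. */
class WineDetailView implements TemplateViewSketch {
    public Map<String, Object> getData(Map<String, String> parameters) {
        Map<String, Object> data = new LinkedHashMap<String, Object>();
        String id = parameters.get("wine");
        // A real view would load the record through the persistence layer;
        // these values are hard-coded placeholders.
        data.put("wineId", id);
        data.put("title", "Wine #" + id);
        return data;
    }
}
```

A JSP template paired with such a view could then dereference the values with EL expressions such as ${title}.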
The parameter elements are not required unless a
template needs some kind of input. If this required input is a
record id, then the parameter type will refer to one of the entity
definitions. The only other valid type of parameter is a string
(for now). This list of templates is a direct representation of the
wireframe documentation shown as Figure 1:
pages.xml
The pages configuration file contains a hierarchical arrangement
of page elements. Each page in our CMS-driven site is
described by a name, a URL, a template, and possibly some parameters
(like which record to show by default). The template element must name one of the templates defined in the previous
configuration file. The URLFilter will look to this
configuration file first to see if the request URL matches one of
the page entries.
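The lookup step can be sketched in a few lines of Java. This is not the URLFilter itself: PageLookup, Page, the sample URLs, and the template names are hypothetical, the page hierarchy is flattened into a map, and the normalization rule (drop the query string and any trailing slash) is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical URL-to-page table; a flat stand-in for the page hierarchy. */
class PageLookup {

    /** One page entry: display name, URL, and the template it uses. */
    static final class Page {
        final String name, url, template;
        Page(String name, String url, String template) {
            this.name = name;
            this.url = url;
            this.template = template;
        }
    }

    private final Map<String, Page> pagesByUrl = new LinkedHashMap<String, Page>();

    void add(Page page) {
        pagesByUrl.put(normalize(page.url), page);
    }

    /** Returns the matching page, or null so the caller can try something else. */
    Page find(String requestUrl) {
        return pagesByUrl.get(normalize(requestUrl));
    }

    /** Drops the query string and any trailing slash (except for the root). */
    static String normalize(String url) {
        int query = url.indexOf('?');
        String path = query >= 0 ? url.substring(0, query) : url;
        if (path.length() > 1 && path.endsWith("/")) {
            path = path.substring(0, path.length() - 1);
        }
        return path;
    }

    /** Invented entries for the wine distributor site. */
    static PageLookup sample() {
        PageLookup pages = new PageLookup();
        pages.add(new Page("Home", "/", "home"));
        pages.add(new Page("Our Wines", "/wines", "wineList"));
        return pages;
    }
}
```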
This file can be managed entirely by using an interface provided
in the CMS Web application shown in Figure 3. There is no need to edit the
XML directly, although the interface doesn't accommodate all
requirements. Using a text editor to perform global find and
replace operations is helpful, for example, if a template name
changes. Interfaces like this one should be created for the other configuration files as well, but I would need to enlist some help. Anyone?
Figure 3. CMS interface for manipulating pages.xml
events.xml
Although it is not necessary in our case study, in certain cases a programmer
may want to be notified of database operations. Entries into the
events XML file make this possible. The listener must implement
com.posttool.cms.record.RecordChangeListener.
Generally the purpose of this kind of listener is to spawn tasks
when data is changed or removed. A
RecordChangeListener may create and delete
thumbnails of uploaded images in a background thread, for example.
Another may email a notification if a certain data constraint
caused an exception:
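As a rough illustration of the thumbnail case, the Java sketch below hands work to a background thread whenever a record is saved or removed. The methods of com.posttool.cms.record.RecordChangeListener are not shown in this article, so RecordChangeListenerSketch, its two callbacks, and ThumbnailListener are assumptions, and a counter stands in for the actual image work.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Stand-in for RecordChangeListener; the real callbacks may differ. */
interface RecordChangeListenerSketch {
    void recordSaved(String objectName, long id);
    void recordRemoved(String objectName, long id);
}

/** Hypothetical listener that keeps thumbnails in step with records. */
class ThumbnailListener implements RecordChangeListenerSketch {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Counts existing thumbnails; a placeholder for real image files. */
    final AtomicInteger thumbnails = new AtomicInteger();

    public void recordSaved(String objectName, long id) {
        worker.submit(new Runnable() {
            public void run() {
                thumbnails.incrementAndGet(); // would create the thumbnail here
            }
        });
    }

    public void recordRemoved(String objectName, long id) {
        worker.submit(new Runnable() {
            public void run() {
                thumbnails.decrementAndGet(); // would delete the thumbnail here
            }
        });
    }

    /** Stops accepting work and waits up to five seconds for queued tasks. */
    boolean shutdownAndWait() {
        worker.shutdown();
        try {
            return worker.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The single-threaded executor keeps the tasks in order, so a remove that follows a save for the same record is processed after it.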