Using RSS in JSP pages

August 8, 2003


RSS, or Really Simple Syndication, is a specification for XML files to provide
syndicated data. It is typically used by news sites and blogs to provide information
concerning the latest news stories, posts, etc., in such a way that links to the
stories can be included on other web sites or even downloaded by news aggregator
programs. Many thousands of RSS feeds are currently available -- take a look
at a site such as syndic8
to get an idea.

Informa is a relatively new open source Java API for parsing RSS files, available from http://informa.sourceforge.net/. The Informa project was the result of merging two Java-based aggregator services: HotSheet and Risotto. This article shows how you can use the Informa API to quickly access RSS feeds and add some dynamic news and information content to your web sites.

RSS: An Overview

To begin with, we'll take a quick look at an RSS example. It's a very simple
format (hence the name Really Simple Syndication), but for those who would like
a more in-depth introduction to RSS, you could do far worse than checking
out O'Reilly's RSS site or reading Mark Pilgrim's very good overview. Here's the example:

<?xml version="1.0"?>
<!-- The version of RSS we are using -->
<rss version="0.91">
<!-- Information about our channel -->
<channel>
    <title>Random News</title>
    <link>http://www.randomnews.com/</link>
    <description>
        Random news from the random news website!
    </description>
    <language>en-us</language>
    <copyright>Copyright: (C) 2003 Random News.com</copyright>

    <image>
        <title>Random News Logo</title>
        <url>http://randomnews.org/images/logo88x33.gif</url>
        <link>http://randomnews.org/</link>
    </image>

    <item>
        <title>News piece one</title>
        <link>http://randomnews.org/getnews.pl?article=1</link>
    </item>

    <item>
        <title>News piece two</title>
        <link>http://randomnews.org/getnews.pl?article=2</link>
    </item>
</channel>
</rss>

This is using version 0.91 of the RSS specification. You need a channel that describes the source for the information we are getting. There will be one channel per XML document. Without going into too much detail of this format, this is how you describe a channel:

<channel>
<title>Random News</title>
<link>http://www.randomnews.com/</link>
<description>
    Random news from the random news website!
</description>
<language>en-us</language>
<copyright>Copyright: (C) 2003 Random News.com</copyright>
</channel>

The following defines an image provided by the site.

<image>
<title>Random News Logo</title>
<url>http://randomnews.org/images/logo88x33.gif</url>
<link>http://randomnews.org/</link>
</image>

This is the real meat of the file. The <item> block gives us the title of a piece
of information, a link to the original post, and optionally, a description of the post.
This is by no means all of the data that an RSS file may provide, but it is enough for our purposes.

<item>
<title>News piece one</title>
<link>http://randomnews.org/getnews.pl?article=1</link>
<description>It's an article</description>
</item>
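
To make the structure concrete, here is a minimal sketch that pulls the title and link out of each <item> element. It deliberately uses the JDK's built-in DOM parser rather than Informa, and the class and method names (RssItemSketch, readItems, childText) are hypothetical names chosen for this illustration, so treat it as a picture of the file layout and not as the approach used in the rest of the article.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Illustrative only: reads <item> titles and links with the JDK's DOM parser.
class RssItemSketch {

    // Returns one "title -> link" string per <item> element in the feed.
    static List<String> readItems(String rssXml) {
        Document doc;
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new InputSource(new StringReader(rssXml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse feed", e);
        }

        List<String> result = new ArrayList<>();
        NodeList items = doc.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            result.add(childText(item, "title") + " -> " + childText(item, "link"));
        }
        return result;
    }

    // Text of the first nested element with the given name, or "" if absent.
    static String childText(Element parent, String name) {
        NodeList nodes = parent.getElementsByTagName(name);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        String xml =
            "<?xml version=\"1.0\"?>"
          + "<rss version=\"0.91\"><channel>"
          + "<title>Random News</title>"
          + "<item><title>News piece one</title>"
          + "<link>http://randomnews.org/getnews.pl?article=1</link></item>"
          + "<item><title>News piece two</title>"
          + "<link>http://randomnews.org/getnews.pl?article=2</link></item>"
          + "</channel></rss>";
        for (String line : readItems(xml)) {
            System.out.println(line);
        }
    }
}
```

Walking the DOM by hand like this works for one known layout, but it has to be adapted for every RSS version, which is the chore a dedicated parsing library takes off your hands.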

There are thousands of news sites and blogs out there with feeds available in this format. Just think -- instead of doing the normal morning check on Slashdot, Freshmeat, or wherever,
what if their content was delivered straight to your own personal portal, or RSS aggregate service?
Implementing such a solution is very simple -- in the rest of the article we'll look at how we can
process this data and display it on our JSP pages.

Reading RSS: The Informa API

Currently at version 0.3.0, Informa works perfectly well at reading RSS versions 0.91, 0.92, 1.0, and 2.0.
Let's have a quick look at its usage:

try {
  URL feed = new URL("file:/C:/samplefeed.rss");
  ChannelFormat format = FormatDetector.getFormat(feed);
  ChannelParserCollection parsers =
    ChannelParserCollection.getInstance();

  ChannelParserIF parser =
    parsers.getParser(format, feed);

  parser.setBuilder(new ChannelBuilder());
  ChannelIF channel = parser.parse();

  for (Iterator iter = channel.getItems().iterator();
       iter.hasNext();) {
    ItemIF item = (ItemIF) iter.next();
    System.out.println(item.getTitle());
  }
} catch (MalformedURLException mue) {
  mue.printStackTrace();
} catch (UnsupportedFormatException ufe) {
  ufe.printStackTrace();
} catch (ParseException pe) {
  pe.printStackTrace();
}

This simple example gets the RSS feed and prints out the news items.
This small piece of code will form the basis for much of what follows,
so it's worth going over in detail.
Begin by creating a URL object that points to the feed to be loaded.
We then use the handy FormatDetector class to determine which version
of RSS the feed uses, and obtain the ChannelParserCollection that will supply the relevant parser.

URL feed = new URL("file:/C:/samplefeed.rss");
ChannelFormat format = FormatDetector.getFormat(feed);
ChannelParserCollection parsers =
  ChannelParserCollection.getInstance();
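
For intuition about what format detection involves, the sketch below guesses a feed's version from its root element using only the JDK. This is not how Informa's FormatDetector is implemented (the article does not show that); FormatSniffSketch and detect are hypothetical names, and the check for a literal rdf:RDF root is a simplifying assumption, since a real RSS 1.0 feed could bind a different namespace prefix.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Illustrative only: guesses the RSS flavour from the document's root element.
class FormatSniffSketch {

    static String detect(String feedXml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(feedXml)))
                .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse feed", e);
        }

        String name = root.getTagName();
        if (name.equals("rss")) {
            // RSS 0.9x and 2.0 carry their version in an attribute on <rss>.
            String version = root.getAttribute("version");
            return version.isEmpty() ? "RSS (unknown version)" : "RSS " + version;
        }
        if (name.equals("rdf:RDF")) {
            // Assumption: the conventional "rdf" prefix is used for RSS 1.0.
            return "RSS 1.0";
        }
        throw new IllegalArgumentException("Unsupported root element: " + name);
    }

    public static void main(String[] args) {
        System.out.println(detect("<rss version=\"0.91\"><channel/></rss>"));
    }
}
```

Once the version is known, picking a matching parser is a lookup, which is the role ChannelParserCollection plays in the Informa code above.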

Next, we get the correct parser for our feed type (there is one per supported version
of the RSS specification) and create a default builder object for the parser. In
Informa, a builder object is responsible for the creation and storage of a feed.
Currently in development is a Hibernate Builder, which will allow database persistence of a feed.
Here, the default ChannelBuilder is used, which simply creates an in-memory feed.

ChannelParserIF parser = parsers.getParser(format, feed);
parser.setBuilder(new ChannelBuilder());

Finally, we parse the document to create a bean representing an RSS channel.

ChannelIF channel = parser.parse();

Now, we could embed this code directly into our JSP code as a scriptlet, but
this is not best practice. Instead, we are going to produce a reusable custom
tag that will allow us to display any named feed.

RSS Custom Tags

Let's start by looking at how our tag will look in our JSP page when requesting a feed from the BBC:

<%@ taglib prefix="rss" uri="/WEB-INF/rsstaglib.tld" %>
<rss:simpleRssFeed uri="http://www.bbc.co.uk/syndication/feeds/news/ukfs_news/world/rss091.xml" />

Pretty simple -- we have a tag with one required attribute that names the feed. Now let's look at the code:

public class SimpleRssFeedTag extends TagSupport {
  private String uri;

  public String getUri() {
    return uri;
  }

  public void setUri(String uri) {
    this.uri = uri;
  }

  public int doEndTag() throws JspException {
    JspWriter out = pageContext.getOut();

    try {
      URL feed = new URL(getUri());
      ChannelParserCollection parsers =
        ChannelParserCollection.getInstance();

      ChannelFormat format =
        FormatDetector.getFormat(feed);
      ChannelParserIF parser =
        parsers.getParser(format, feed);
      parser.setBuilder(new ChannelBuilder());

      ChannelIF channel = parser.parse();
      out.print("<b>" + channel.getTitle() + "</b><br />");

      for (Iterator iter = channel.getItems().iterator();
           iter.hasNext();) {
        ItemIF item = (ItemIF) iter.next();
        out.print("<a href=\"" + item.getLink() + "\">");
        out.println(item.getTitle() + "</a><br />");
      }
    } catch (MalformedURLException mue) {
      throw new JspException(mue);
    } catch (UnsupportedFormatException ufe) {
      throw new JspException(ufe);
    } catch (ParseException pe) {
      throw new JspException(pe);
    } catch (IOException e) {
      throw new JspException(e);
    }

    return EVAL_PAGE;
  }
}

This time, rather than printing the titles and links to the command
prompt, we are formatting our links and titles as HTML. Let's look at an example page where
we are requesting a couple of feeds, say, BBC News and java.sun.com's technology highlights. Use this tag in a JSP page as follows:

<%@ taglib prefix="rss" uri="/WEB-INF/rsstaglib.tld" %>
  <table>
    <tr>
      <td>
        <rss:simpleRssFeed uri="http://www.bbc.co.uk/syndication/feeds/news/ukfs_news/world/rss091.xml" />
      </td>
      <td>
        <rss:simpleRssFeed uri="http://servlet.java.sun.com/syndication/rss_java_highlights-PARTNER-20.xml" />
      </td>
    </tr>
  </table>

And the result:


Figure 1. Our first RSS tag in action

We have managed to display the news items, and clicking on the links will take you to
the articles. Whenever the syndicates update their RSS files, your page will change too! As an exercise, consider limiting the number of posts
or adding the display of a syndicate's logo to this basic tag.
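
One way to approach the first part of that exercise is sketched below: a small rendering helper that stops after a maximum number of items and escapes text before writing it into HTML. Everything here is an assumption made for illustration. FeedHtmlSketch, Entry, render, and escape are hypothetical names, and Entry is a plain stand-in for a feed item, not an Informa type; in the tag itself the same loop would run over channel.getItems() and write to the JspWriter.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: renders at most maxItems links, escaping HTML on the way.
class FeedHtmlSketch {

    // Hypothetical stand-in for a feed item; not an Informa type.
    static final class Entry {
        final String title;
        final String link;

        Entry(String title, String link) {
            this.title = title;
            this.link = link;
        }
    }

    // Escapes the characters that are unsafe in HTML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    static String render(String channelTitle, List<Entry> entries, int maxItems) {
        StringBuilder html = new StringBuilder();
        html.append("<b>").append(escape(channelTitle)).append("</b><br />\n");

        // Never read past the end of the list, whatever limit was requested.
        int count = Math.min(maxItems, entries.size());
        for (int i = 0; i < count; i++) {
            Entry e = entries.get(i);
            html.append("<a href=\"").append(escape(e.link)).append("\">")
                .append(escape(e.title)).append("</a><br />\n");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry("News piece one", "http://randomnews.org/getnews.pl?article=1"));
        entries.add(new Entry("News piece two", "http://randomnews.org/getnews.pl?article=2"));
        entries.add(new Entry("News piece three", "http://randomnews.org/getnews.pl?article=3"));
        System.out.print(render("Random News", entries, 2));
    }
}
```

In a tag, the limit would most naturally arrive as an optional attribute with its own setter, alongside uri.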

A More Refined Tag

Currently, we are doing too much of the actual formatting of the display in the tag itself. This is inconvenient, as it means that in order to change the formatting of the tag's results, we need to change the code. It would be much
better if we could leave the mechanics of reading the feeds up to the tag, and have all of the
formatting in the JSP. In order to achieve this, we need to allow the web designer to decide
what parts of an RSS channel are required, and embed them in standard HTML.

The JSP Standard Tag Library (JSTL) introduced a simple Expression Language (EL), which allows us to quickly and easily access JavaBean properties at runtime. We are
using the JSTL EL for accessing beans and displaying properties, which is fairly straightforward.
For example, to print out the name property of a bean, we would do the following:

<c:out value="${bean.name}"/>

Here, the bean is a JavaBean available in the page. The <c:out> tag is used to retrieve
the returned value of the expression ${bean.name} and print it to the output stream.
Our new custom tag is going to expose the RSS feed as a series of beans, and then use the JSTL EL
to access and display its data. Let's look at an example use of our new tag:

<rss:readFeed uri="http://today.java.net/pub/q/weblogs_rss?x-ver=1.0" var="channel">
<strong><c:out value="${channel.title}"/></strong>

<ol>
  <c:forEach var="item" items="${channel.items}">
  <li>
    <a href="<c:out value="${item.link}"/>">
    <c:out value="${item.title}"/></a>
  </li>
  </c:forEach>
</ol>
</rss:readFeed>

The first tag, <rss:readFeed>, parses the feed and stores the resulting channel in
page scope as a bean named channel. The ${channel.title} expression reads the bean's
title property so that <c:out> can display it. Next, we use a standard <c:forEach>
tag to iterate over the items property of the channel bean, using the JSTL EL to display each
item's title and link. As you can see, all of the formatting is done by the JSP code itself -- here
we render the channel's items as an HTML ordered list, but this could as easily be a series
of <div>s, table rows, or whatever.
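
To see why an expression such as ${channel.title} works on any bean with a getTitle() method, the sketch below resolves a property name to its getter with the JDK's java.beans introspection. It is a simplified picture of the idea and not the JSTL's actual EL evaluator; BeanPropertySketch, DemoChannel, and readProperty are hypothetical names used only here.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.InvocationTargetException;

// Illustrative only: maps a property name such as "title" to getTitle().
class BeanPropertySketch {

    // Hypothetical bean standing in for a parsed channel.
    public static class DemoChannel {
        private final String title;

        public DemoChannel(String title) {
            this.title = title;
        }

        public String getTitle() {
            return title;
        }
    }

    // Finds the getter for the named property and invokes it on the bean.
    static Object readProperty(Object bean, String property) {
        try {
            BeanInfo info = Introspector.getBeanInfo(bean.getClass());
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getName().equals(property) && pd.getReadMethod() != null) {
                    return pd.getReadMethod().invoke(bean);
                }
            }
        } catch (IntrospectionException
                 | IllegalAccessException
                 | InvocationTargetException e) {
            throw new IllegalStateException("Could not read " + property, e);
        }
        throw new IllegalArgumentException("No readable property: " + property);
    }

    public static void main(String[] args) {
        DemoChannel channel = new DemoChannel("Random News");
        System.out.println(readProperty(channel, "title"));
    }
}
```

This is why the tag only has to place an object with conventional getters into page scope; the JSP author can then reach any of its properties by name.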

Surprisingly, the code for this tag isn't much more complicated than the original example. Let's take a look:

public class RefinedRssFeedTag extends TagSupport {
  private static final ChannelBuilder DEFAULT_BUILDER = new ChannelBuilder();
  private static final ChannelParserCollection PARSERS =
    ChannelParserCollection.getInstance();

  private String uri;
  private String var;
  private ChannelIF channel;
 

  public String getVar() {
    return var;
  }

  public void setVar(String var) {
    this.var = var;
  }

  public String getUri() {
    return uri;
  }

  public void setUri(String uri) {
    this.uri = uri;
  }

  public int doStartTag() throws JspException {
    JspWriter out = pageContext.getOut();

    try {
      URL feed = new URL(getUri());

      ChannelFormat format =
        FormatDetector.getFormat(feed);
      ChannelParserIF parser =
        PARSERS.getParser(format, feed);
           
      parser.setBuilder(DEFAULT_BUILDER);
      channel = parser.parse();
 
      //store the channel in the page...
      pageContext.setAttribute(getVar(), channel);
    } catch (MalformedURLException mue) {
      throw new JspException(mue);
    } catch (UnsupportedFormatException ufe) {
      throw new JspException(ufe);
    } catch (ParseException pe) {
      throw new JspException(pe);
    }

    return EVAL_BODY_INCLUDE;
  }
}

The main work is done in the doStartTag method. We parse the RSS file specified
in the uri attribute, and then we store it in the pageContext under
the name specified by the var attribute (this is standard practice throughout the JSTL).
This allows ${channel} to be used in the tag body. And that's pretty much it!
Now let's use it to view a couple of feeds -- two of a computer programmer's best friends, Slashdot and Freshmeat.

<rss:readFeed uri="http://slashdot.org/slashdot.rss" var="channel">
<IMG src="<c:out value="${channel.image.location}"/>">
<a href="<c:out value="${channel.image.location}"/>">
<strong><c:out value="${channel.title}"/></strong></a>
<ol>
<c:forEach var="item" items="${channel.items}">
<li><a href="<c:out value="${item.link}"/>">
<c:out value="${item.title}"/></a></li>
</c:forEach>
</ol>
</rss:readFeed>

<rss:readFeed
uri="http://freshmeat.net/backend/fm-releases-software.rdf" var="channel">
<strong><c:out value="${channel.title}"/></strong><br />
<a href="<c:out value="${channel.location}"/>">[Feed]</a><br />
<c:forEach var="item" items="${channel.items}">
<a href="<c:out value="${item.link}"/>">
<c:out value="${item.title}"/></a><br />
</c:forEach>
</rss:readFeed>

When reading the Slashdot feed, we format the title as a link right back to
Slashdot itself, with the rest of the items formatted as a standard HTML ordered list.
With Freshmeat, we know they don't provide an image, so we ignore that, but we also
provide a link back to the source of the RSS itself, with the items shown as simple links
separated by <br /> tags. The result can be seen below.


Figure 2. Example use of a more complex RSS Tag

Conclusion

In this article I have shown how you can quickly and simply create an RSS tag
that lets you insert RSS feeds while keeping with your site's current
design. In no way should this be considered the final word -- the RefinedRssFeedTag
as presented here is far from perfect. Most importantly, no caching of the requested
feeds is done, resulting in feeds being loaded and parsed every time the tag is run.
Over the course of the next several articles, we will look at approaches that improve upon the
solutions provided here, and will also look at other ways in which we can use RSS to
enrich our software.
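
As a taste of the caching idea, here is a minimal sketch of a time-based cache keyed by feed URI. It is only one possible approach under stated assumptions and is not taken from Informa: FeedCacheSketch, its constructor parameters, and the injected clock are all hypothetical, and the loader function stands in for whatever code fetches and parses a feed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Illustrative only: reloads a feed once its cached copy is older than ttlMillis.
class FeedCacheSketch<V> {

    private static final class Entry<T> {
        final T value;
        final long loadedAt;

        Entry(T value, long loadedAt) {
            this.value = value;
            this.loadedAt = loadedAt;
        }
    }

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final Function<String, V> loader;
    private final LongSupplier clock;

    // The clock is injected so that expiry can be exercised without sleeping.
    FeedCacheSketch(long ttlMillis, Function<String, V> loader, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.loader = loader;
        this.clock = clock;
    }

    V get(String uri) {
        long now = clock.getAsLong();
        // Note: the loader runs while the map holds a lock for this key's bin,
        // so a slow fetch can delay other lookups that hash to the same bin.
        Entry<V> entry = entries.compute(uri, (key, old) -> {
            if (old == null || now - old.loadedAt >= ttlMillis) {
                return new Entry<V>(loader.apply(key), now);
            }
            return old;
        });
        return entry.value;
    }

    public static void main(String[] args) {
        FeedCacheSketch<String> cache = new FeedCacheSketch<>(
            60_000L,
            uri -> "parsed channel for " + uri,   // stand-in for real parsing
            System::currentTimeMillis);
        System.out.println(cache.get("http://www.randomnews.com/feed.rss"));
    }
}
```

A tag could hold one such cache in application scope and ask it for the channel, so that repeated page views within the time-to-live reuse the parsed result.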

Sam Newman is a Java programmer. Check out his blog at magpiebrain.com.