Learn about JavaFX's APIs for Reading RSS and Atom Newsfeeds
JavaFX 1.2 introduced many interesting APIs, including APIs for reading RSS and Atom newsfeeds. If you haven't worked with
these APIs, you'll discover that they greatly simplify the task of integrating a newsfeed reader into a JavaFX application.
This article introduces you to the RSS and Atom APIs. You first explore their common foundation, and then tour each API's key
classes. Finally, you gain insight into how these APIs work by exploring the FeedTask class's newsfeed-polling
implementation.
Common Foundation
The RSS and Atom APIs are offshoots of a common foundation that's rooted in the abstract javafx.async.Task
class. This class makes it possible to start, stop, and track an activity (task) that runs on a background thread.
Task provides onStart and onDone variables that identify functions to be invoked at the start/end of the task, and other
variables that report task progress and disposition (success or failure). This class also provides abstract
start(): Void and stop(): Void functions to initiate and terminate task execution.
The abstract javafx.data.feed.FeedTask class extends Task. In addition to inheriting
Task's variables, and overriding its start() and stop() functions,
FeedTask provides the following functions and variables:
- poll(): Void: Poll the newsfeed location for updated content, which is fetched, parsed, and delivered to the
  application.
- update(): Void: Poll the newsfeed location. All content is fetched, parsed, and delivered to the application.
- headers (of type javafx.io.http.HttpHeader[]) identifies a sequence of HTTP request headers that are to be sent to
  location each time this newsfeed is polled. This variable defaults to null.
- interval (of type javafx.lang.Duration) specifies the amount of time that must elapse before the newsfeed is once
  more polled for updates. You must specify a positive value for this variable, which defaults to 0.0. (I wonder if it
  wouldn't be better to choose a positive value, such as 60s, to be the polling default, and perhaps allow 0.0 to
  indicate that polling isn't desired.)
- location (of type String) specifies the newsfeed's address. This variable defaults to the empty string ("").
- onException (of type function(:Exception):Void) identifies a function that's invoked when an exception occurs during
  the current poll. This variable defaults to null.
- onForeignEvent (of type function(:javafx.data.pull.Event):Void) identifies a function that's invoked to handle
  extension elements, which are newsfeed elements whose namespace URI is not Atom or RSS. For example, given an Atom
  newsfeed whose feed element's start tag is specified as
  <feed xmlns="http://www.w3.org/2005/Atom" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">, parsing a
  subsequent <opensearch:totalResults>1911</opensearch:totalResults> element results in three foreign events (for the
  start tag, text, and end tag) because the namespace for totalResults is http://a9.com/-/spec/opensearch/1.1/ (as
  specified by the opensearch: prefix) instead of http://www.w3.org/2005/Atom. This variable defaults to null.
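The foreign-event count in the onForeignEvent example can be reproduced outside JavaFX. The following Java sketch uses the JDK's StAX parser rather than javafx.data.pull.PullParser, so it only approximates the idea. ForeignEventCounter and countForeignEvents() are hypothetical names, and treating everything nested inside a non-Atom element as foreign is an assumption of this sketch.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: counts parser events that fall outside the Atom namespace.
final class ForeignEventCounter {
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    static int countForeignEvents(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Report each run of text as a single CHARACTERS event.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            int foreignEvents = 0;
            int foreignDepth = 0; // greater than zero while inside a non-Atom element
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        // Assumption: anything nested in a foreign element is also foreign.
                        if (foreignDepth > 0 || !ATOM_NS.equals(reader.getNamespaceURI())) {
                            foreignDepth++;
                            foreignEvents++;
                        }
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        if (foreignDepth > 0) {
                            foreignEvents++;
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        if (foreignDepth > 0) {
                            foreignDepth--;
                            foreignEvents++;
                        }
                        break;
                    default:
                        break;
                }
            }
            reader.close();
            return foreignEvents;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Malformed XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<feed xmlns=\"http://www.w3.org/2005/Atom\" "
                + "xmlns:opensearch=\"http://a9.com/-/spec/opensearch/1.1/\">"
                + "<opensearch:totalResults>1911</opensearch:totalResults>"
                + "</feed>";
        // One event each for the start tag, the text, and the end tag.
        System.out.println(countForeignEvents(xml));
    }
}
```

The feed element itself is in the Atom namespace and contributes nothing to the count; only the opensearch:totalResults start tag, its text, and its end tag are counted.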
The common foundation is also rooted in the abstract javafx.data.feed.Base class, which is the base class for
RSS and Atom classes that describe various newsfeed elements. RSS's RSS and Atom's Feed top-level
element classes are examples of Base subclasses.
Base provides a namespaces variable (of type javafx.data.Pair[]) that contains the namespace definitions in effect for
the element. The name member of each Pair specifies the namespace prefix; the value member specifies the namespace URI.
Base also provides a parent variable (of type Base) that identifies the parent (enclosing) element. For example, the
parent variable of Atom's Entry element class refers to its containing Feed instance. If there's no parent (as is the
case with Feed), this variable contains null.
Finally, Base provides several functions that are useful when you need to create a custom feed parser. Because
this task is beyond the scope of this article, I refer you to Rakesh Menon's Custom Feed Parsers
blog post for more information and an example.
RSS API Overview
The RSS (Resource Description Framework Site Summary, Really Simple Syndication, Rich Site Summary)
API consists of 10 classes that are located in the javafx.data.feed.rss package. Central to this package is the
RssTask class.
| RSS versions supported by the API |
|---|
| The RSS API handles newsfeeds that conform to versions 0.91 (with non-optional item elements) through 2.0.11 (the most recent version at time of writing) of the RSS specification. |
The RssTask entry-point class extends FeedTask, and provides the following variables for installing
a custom factory, for reporting the newsfeed's channel element's non-item content, and for
reporting the content of each of the channel element's item elements:
- factory (of type Factory) identifies the factory that's used to create objects that represent newsfeed elements. You
  only need to install your own factory when creating a custom feed parser.
- onChannel (of type function(:Channel):Void) identifies a function that's invoked to report the channel element's
  non-item elements -- the RSS channel element contains item and non-item elements, and is itself contained within the
  top-level rss element. This variable defaults to null.
- onItem (of type function(:Item):Void) identifies a function that's invoked to report the current item element. This
  variable defaults to null.
The Channel class extends the abstract RSS class, which represents the top-level
rss element, and which provides members for accessing the factory that's creating objects, for accessing the
task that's parsing the newsfeed, and more. In turn, RSS extends Base.
Channel also provides the following variables for accessing channel-oriented (non-item-specific) content:
- categories (of type Category[]) identifies the categories (in terms of domains and text values) to which this
  channel belongs.
- copyright (of type String) specifies a copyright notice for channel content.
- description (of type String) presents a phrase or sentence that describes this channel.
- docs (of type String) specifies a URL that points to documentation for the format used in the RSS file. This might
  simply be a pointer to a Web page, and is useful for letting people who encounter this RSS file in the future
  understand the file's purpose (much like code comments).
- generator (of type String) identifies the program that was used to generate this channel.
- image (of type Image) identifies an image (in terms of description, height, link, title, URL, and width) that can be
  displayed with the channel content.
- language (of type String) identifies the language in which the channel was written.
- lastBuildDate (of type javafx.date.DateTime) specifies when this channel's content was last changed.
- link (of type String) provides the URL to the Website that corresponds to this channel.
- pubDate (of type DateTime) identifies the date when this channel was published.
- title (of type String) provides this channel's title.
- ttl (of type Duration) provides the number of minutes in which the news-reader can cache this channel before it must
  poll the newsfeed to refresh channel content.
| Unsupported channel elements |
|---|
| For whatever reason, the RSS API doesn't support the channel element's cloud, textInput, skipHours, and skipDays elements. These elements are not represented by javafx.data.xml.QName constants in the RSS class, and they are not represented by variables in the Channel class. |
As with Channel, the Item class, which describes one of the channel's
item elements, extends RSS. It provides the following variables:
- author (of type String) provides the email address of this item's author.
- categories (of type Category[]) identifies the categories to which this item belongs.
- comments (of type String) specifies the URL of a Web page containing comments about this item.
- description (of type String) provides a description of this item.
- enclosure (of type Enclosure) describes a media object (in terms of length, MIME type, and URL) that's attached to
  this item.
- guid (of type Guid) specifies, for this item, a globally unique identifier (in terms of text and an indicator of
  whether or not this text permanently points to the full item described by this item).
- link (of type String) provides this item's URL.
- pubDate (of type DateTime) identifies the date when this item was published.
- source (of type Source) identifies the originating channel (in terms of the name of the channel and an XMLization of
  that channel) for this item.
- title (of type String) provides this item's title.
I've created a NetBeans RSSDemo project whose Main.fx source code demonstrates RssTask
in terms of its interval, location, onStart, onChannel,
onItem, onException, onForeignEvent, and onDone variables.
    /*
     * Main.fx
     */

    package rssdemo;

    import java.lang.Exception;

    import javafx.data.feed.rss.Channel;
    import javafx.data.feed.rss.Item;
    import javafx.data.feed.rss.RssTask;

    import javafx.data.pull.Event;

    def MAX_POLLS = 3;
    var counter = 0;

    def task:RssTask = RssTask
    {
        interval: 15s

        // The following location demonstrates a basic RSS newsfeed.
        location: "http://javajeff.mb.ca/rss/javajeff.xml"

        // The following location demonstrates onException().
        // location: "http://developers.sun.com/rss/sdn_features.xml"

        // The following location demonstrates onForeignEvent().
        // location: "http://feeds.dzone.com/javalobby/frontpage?format=xml"

        // The following location demonstrates IllegalArgumentException (must use
        // AtomTask for Atom feeds).
        // location: "http://feeds.sophos.com/en/atom1_0-sophos-company-news.xml"

        onStart: function (): Void
        {
            println ("Task is starting");
            if (++counter > MAX_POLLS)
            {
                task.stop ();
                FX.exit ()
            }
        }
        onChannel: function (c: Channel): Void
        {
            println ("Channel: {c}")
        }
        onItem: function (i: Item): Void
        {
            println ("Item: {i}")
        }
        onException: function (e: Exception): Void
        {
            println ("Exception: {e}");
            task.stop ();
            FX.exit ()
        }
        onForeignEvent: function (e: Event): Void
        {
            println ("Event: {e}")
        }
        onDone: function (): Void
        {
            println ("Completed poll #{counter}")
        }
    }

    task.start ()
The source code introduces a constant that specifies the maximum number of times to poll the newsfeed, and a variable that
counts the number of polls that have been made so far. The idea is to limit the number of times the newsfeed is polled so
that the application won't run indefinitely.
After invoking the RssTask instance's start() function, which starts the newsfeed-polling
operation, the newsfeed located at the address assigned to location is polled every 15 seconds. The
onStart() callback is invoked at the start of each poll.
This callback tests to see if the counter has exceeded the maximum number of polls. If so, stop() is invoked to
stop the polling, and FX.exit() is invoked to kill the background thread that's associated with the
RssTask instance, allowing the application to exit.
Perhaps you're wondering why I placed if (++counter > MAX_POLLS) in onStart(), as opposed to
onDone's callback. I did this because onDone() isn't always called at the end of each poll. (You'll
discover why this happens later in the article.)
It's possible that an exception might be thrown as a result of the newsfeed being read or parsed. If this happens, the
onException() callback invokes stop() to stop the polling task, and then invokes
FX.exit() to kill the background thread and terminate the application.
This simple framework serves as a starting point for exploring the RSS API. As an exercise, expand onChannel()
and onItem() to output the values of their Channel and Item arguments' various
variables.
Atom API Overview
In contrast to RSS, the Atom API consists of 12 classes that are located in the
javafx.data.feed.atom package. Central to this package is the AtomTask class.
| Atom versions supported by the API |
|---|
| The Atom API handles newsfeeds that conform to version 1.0 (the most recent version at time of writing) of the Atom specification. |
The AtomTask entry-point class extends FeedTask, and provides the following variables for
installing a custom factory, for reporting the newsfeed's feed element's non-entry content,
and for reporting the content of each of the feed element's entry elements:
- factory (of type Factory) identifies the factory that's used to create objects that represent newsfeed elements. You
  only need to install your own factory when creating a custom feed parser.
- onFeed (of type function(:Feed):Void) identifies a function that's invoked to report the feed element's non-entry
  elements -- the Atom feed element contains entry and non-entry elements, and is itself the top-level element. This
  variable defaults to null.
- onEntry (of type function(:Entry):Void) identifies a function that's invoked to report the current entry element.
  This variable defaults to null.
The Feed class extends the abstract Atom class (inheriting members for accessing the newsfeed's
base URI, for accessing the factory that's creating objects, and more), which extends Base.
Feed also provides the following variables for accessing feed-oriented (non-entry-specific) content:
- authors (of type Person[]) identifies the authors (in terms of email address, name, additional person-specific text,
  and the Internationalized Resource Identifier (IRI) associated with the person) of this feed.
- categories (of type Category[]) identifies the categories (in terms of a human-readable label, category name, and
  categorization scheme IRI) to which this feed belongs.
- contributors (of type Person[]) identifies the persons who have contributed to this feed.
- generator (of type Generator) identifies the program (in terms of human-readable name, program URI, and program
  version number) that was used to generate this feed. This information can be used to debug an Atom newsfeed.
- icon (of type Id) identifies this feed's iconic image (in terms of a URI to the image).
- id (of type Id) specifies a universally unique and permanent identifier (in terms of a URI) for this feed.
- links (of type Link[]) specifies links (in terms of href, hreflang, length, rel, title, and type XML attributes, and
  text associated with the link) from this feed to Web resources.
- logo (of type Id) identifies this feed's non-iconic image.
- rights (of type Content) specifies the rights (in terms of src, text, and type XML attributes) held in and over this
  feed.
- subtitle (of type Content) provides this feed's subtitle.
- title (of type Content) provides this feed's title.
- updated (of type Date) specifies when this feed's content was last changed.
As with Feed, the Entry class, which describes one of the feed's
entry elements, extends Atom. In addition to sharing most of the same variables as
Feed, Entry provides the following unique variables:
- content (of type Content) specifies this entry's content.
- published (of type Date) specifies when this entry was published.
- source (of type Feed) identifies this entry's feed source.
- summary (of type Content) specifies a short summary, abstract, or excerpt for this entry.
I've created an AtomDemo NetBeans project for demonstrating AtomTask. This project's
Main.fx source code is very similar to RSSDemo's Main.fx source code.
    /*
     * Main.fx
     */

    package atomdemo;

    import java.lang.Exception;

    import javafx.data.feed.atom.AtomTask;
    import javafx.data.feed.atom.Entry;
    import javafx.data.feed.atom.Feed;

    import javafx.data.pull.Event;

    def MAX_POLLS = 3;
    var counter = 0;

    def task:AtomTask = AtomTask
    {
        interval: 15s

        // The following location demonstrates a basic Atom newsfeed.
        location: "http://photos.dailycamera.com/hack/feed.mg?Type=gallery&Data=9573834_9ysrR&format=atom10"

        // The following location demonstrates onForeignEvent().
        // location: "http://blogsearch.google.com/blogsearch/feeds?bc_lang=en&hl=en&output=atom"

        // The following location demonstrates IllegalArgumentException (must use
        // RssTask for RSS feeds).
        // location: "http://javajeff.mb.ca/rss/javajeff.xml"

        onStart: function (): Void
        {
            println ("Task is starting");
            if (++counter > MAX_POLLS)
            {
                task.stop ();
                FX.exit ()
            }
        }
        onFeed: function (f: Feed): Void
        {
            println ("Feed: {f}")
        }
        onEntry: function (e: Entry): Void
        {
            println ("Entry: {e}")
        }
        onException: function (e: Exception): Void
        {
            println ("Exception: {e}");
            task.stop ();
            FX.exit ()
        }
        onForeignEvent: function (e: Event): Void
        {
            println ("Event: {e}")
        }
        onDone: function (): Void
        {
            println ("Completed poll #{counter}")
        }
    }

    task.start ()
This simple framework serves as a starting point for exploring the Atom API. Consider expanding onFeed() and
onEntry() to output the values of their Feed and Entry arguments' various variables.
Behind the Scenes with FeedTask
The important task of polling an RSS or Atom newsfeed occurs in FeedTask and a related class. I recently
decompiled these classes to explore how newsfeeds are polled, and share my findings in this section to deepen your
understanding of RssTask and AtomTask.
FeedTask creates an instance of the java.util.Timer class in its static initializer. This instance starts a background
thread and works with an instance of FeedTask's nested SubscriptionTask class (a java.util.TimerTask subclass) to
support newsfeed-polling.
FeedTask's overridden start() function schedules the SubscriptionTask instance for execution by invoking Timer's
public void schedule(TimerTask task, long delay, long period) method with the following arguments:
- The SubscriptionTask instance is passed to task.
- The long integer 0L is passed to delay.
- The value of FeedTask's interval variable is passed to period.
Approximately every period milliseconds, the SubscriptionTask instance's public void run() method is invoked. This
method invokes the SubscriptionTask-specific doPoll() method with a true argument.
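The scheduling call described above can be sketched in plain Java. This is an illustration, not the decompiled FeedTask code: PollingSketch, runPolls(), and the poll counter are hypothetical names, and the task merely counts its invocations where the real implementation would fetch and parse a newsfeed.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the Timer/SubscriptionTask pairing.
final class PollingSketch {
    static int runPolls(int maxPolls, long periodMillis) {
        Timer timer = new Timer(true); // daemon background thread
        AtomicInteger polls = new AtomicInteger();
        CountDownLatch finished = new CountDownLatch(1);

        TimerTask subscription = new TimerTask() {
            @Override
            public void run() {
                // A real task would poll the newsfeed here.
                if (polls.incrementAndGet() >= maxPolls) {
                    cancel();             // no further executions of this task
                    finished.countDown();
                }
            }
        };

        try {
            // Same argument pattern as the article: delay 0L, then a fixed period.
            timer.schedule(subscription, 0L, periodMillis);
            finished.await(maxPolls * periodMillis + 5000L, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            timer.cancel(); // release the timer's background thread
        }
        return polls.get();
    }

    public static void main(String[] args) {
        System.out.println("Polls made: " + runPolls(3, 100L));
    }
}
```

Timer's schedule() method throws IllegalArgumentException when period is zero or negative, which may be one reason why interval has to be given a positive value.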
The doPoll() method first clears FeedTask's inherited started, stopped,
failed, and done Boolean variables to false. It also nulls out the inherited
causeOfFailure variable, and assigns -1 to the inherited progress and
maxProgress variables.
doPoll() next instantiates the javafx.io.http.HttpRequest class, which is the vehicle used to obtain newsfeed content,
and initializes the following HttpRequest variables prior to executing this task:
- location: The value of FeedTask's location variable is assigned to this variable.
- onStarted: A function is assigned to this variable, and is invoked when the request starts to execute. The function
  is responsible for invoking onStart().
- onResponseHeaders: A function is assigned to this variable to retrieve and save the values of the HTTP ETag and
  Last-Modified response headers. These values are needed to ensure that only changed newsfeed content will be returned
  in the next poll request.
- onToRead: A function is assigned to this variable to obtain the total number of bytes to read, which is assigned to
  maxProgress.
- onRead: A function is assigned to this variable to obtain the number of bytes read so far, which is assigned to
  progress.
- onInput: A function is assigned to this variable to parse the request content via an internal parse(is) method call
  (where is is onInput()'s java.io.InputStream argument). If parsing results in a thrown exception, the exception
  object is assigned to causeOfFailure, true is assigned to failed, and onException() is invoked. Finally, true is
  assigned to done, and onDone() is invoked. (The onInput() function isn't invoked, and hence onDone() isn't invoked,
  when only changed content is requested but that content isn't available.)
- onException: A function is assigned to this variable to report a problem with the request itself (and not parsing).
  If the request fails, the exception object is assigned to causeOfFailure, true is assigned to failed, and
  onException() is invoked.
Continuing, doPoll() ensures that only updated newsfeed content is returned by setting the request's
If-Modified-Since and If-None-Match headers to the previously saved
Last-Modified and ETag values, respectively.
| Obtaining a newsfeed's updated versus entire content |
|---|
| When true is passed to doPoll(), which happens when this method is called from SubscriptionTask's run() method or FeedTask's poll() function, doPoll() sets If-Modified-Since and If-None-Match so that only updated content is returned. In contrast, when you invoke FeedTask's update() function, which invokes doPoll() with a false argument, those request headers will not be set, and the entire content will be returned. |
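The difference between poll() and update() comes down to whether the two conditional request headers are set. The following Java sketch shows that decision in isolation. ConditionalHeaders and build() are hypothetical names, the header values in main() are made up, and skipping a header whose saved value is missing is an assumption of this sketch rather than confirmed FeedTask behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that mirrors the header choice described for doPoll().
final class ConditionalHeaders {
    static Map<String, String> build(boolean onlyUpdated, String lastModified, String etag) {
        Map<String, String> headers = new LinkedHashMap<>();
        if (onlyUpdated) {
            // Values saved from the previous response's Last-Modified and ETag headers.
            if (lastModified != null) {
                headers.put("If-Modified-Since", lastModified);
            }
            if (etag != null) {
                headers.put("If-None-Match", etag);
            }
        }
        // When onlyUpdated is false, no conditional headers are sent,
        // so the server returns the entire newsfeed.
        return headers;
    }

    public static void main(String[] args) {
        String lastModified = "Sat, 01 Aug 2009 12:00:00 GMT"; // made-up value
        String etag = "\"abc123\"";                            // made-up value
        System.out.println("poll():   " + build(true, lastModified, etag));
        System.out.println("update(): " + build(false, lastModified, etag));
    }
}
```

A server that honors these headers can answer a conditional request with no body when nothing has changed, which fits the earlier note that onInput() and onDone() aren't invoked in that case.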
doPoll() now iterates over FeedTask's headers variable, assigning each stored HttpHeader instance to the HttpRequest
instance by invoking the latter instance's setHeader() function.
Finally, doPoll() invokes the HttpRequest instance's start() function to execute this
task, resulting in retrieved and parsed content. doPoll() then returns to the run() method. If it
throws an exception, run() invokes onException().
| A parsing tidbit |
|---|
| For brevity, I don't discuss parsing beyond the parse(is) method call. However, if you decide to explore the parsing implementation, here's a tidbit to save you some head-scratching: The parse(InputStream) method initializes the javafx.data.pull.PullParser instance's impl_skippedElements variable to the qualified names of Atom's summary, content, rights, title, and subtitle elements, and RSS's description, title, and copyright elements, to ensure that the parser treats any HTML or other markup that's embedded in these elements as literal text. |
At some point, you'll probably invoke FeedTask's overridden stop() function. This function invokes
the SubscriptionTask instance's inherited public boolean cancel() method to cancel the
newsfeed-polling task (but not kill the Timer instance's background thread).
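The distinction between cancelling a TimerTask and stopping its Timer can be seen in a few lines of Java. This sketch uses only java.util.Timer and java.util.TimerTask. CancelSketch and timerUsableAfterTaskCancel() are hypothetical names, and the empty polling task stands in for SubscriptionTask.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demonstration: cancelling a task leaves the Timer's thread running.
final class CancelSketch {
    static boolean timerUsableAfterTaskCancel() {
        Timer timer = new Timer(true); // daemon background thread
        try {
            TimerTask polling = new TimerTask() {
                @Override
                public void run() {
                    // Stand-in for a newsfeed poll.
                }
            };
            timer.schedule(polling, 0L, 50L);

            // cancel() returns true for a task scheduled for repeated execution.
            boolean cancelled = polling.cancel();

            // The Timer itself is still alive and accepts new work.
            CountDownLatch ran = new CountDownLatch(1);
            timer.schedule(new TimerTask() {
                @Override
                public void run() {
                    ran.countDown();
                }
            }, 0L);
            return cancelled && ran.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            timer.cancel(); // only this call discards the timer and lets its thread end
        }
    }

    public static void main(String[] args) {
        System.out.println("Timer usable after task cancel: " + timerUsableAfterTaskCancel());
    }
}
```

Because the Timer's thread outlives the cancelled task, the demos above call FX.exit() after task.stop() to end the application.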
Conclusion
Enough theory! Now that you've gained knowledge of JavaFX's RSS and Atom APIs, you might want to create your own newsfeed
reader. To help you with this task, I present a practical example that handles RSS and Atom newsfeeds in my forthcoming
companion to this article.
Resources
- Sample code for this article
- Atom specification
- JavaFX 1.2.1 API Documentation
- JavaFX's official Website
- Rakesh Menon's Custom Feed Parsers blog post
- RSS specification
- Wikipedia's Atom (standard) entry
- Wikipedia's RSS entry