The most popular scripting languages are available in some form
for the Java platform. We have Jacl for Tcl,
Jython for Python, and
JRuby for Ruby. One thing is
missing from this bunch: what exists on the Java platform for the Perl
hackers of the world?
In this article, I would like to introduce Sleep, a
Java-based scripting language heavily inspired by Perl. I wrote
Sleep when I needed to build a scriptable Java IRC client and
nothing like Perl was available. I will present Sleep in
terms of its similarities to Perl and what it brings to the Java
platform. Sleep can be downloaded at the Sleep Scripting Project home page.
A simple "Hello World" script in Sleep looks like this:
println("Hello World");
To run the above script, copy and paste it into a file called
hello.sl and type:
java -jar sleep.jar hello.sl
Why Bother with Sleep?
There are a zillion scripting languages on the Java platform,
each with its own strengths and weaknesses. With all of these
scripting language choices, why does Java need a Perl offering?
Perl is an incredibly powerful language for text and data
processing. Perl excels at taking input, extracting stuff from it,
chewing it up several hundred times, and finally, outputting the
mess however the programmer would like. Perl is often referred to
as the duct tape of the internet because of its many uses as a
"glue" language.
Sleep is primarily a glue language and was designed from the
ground up to be embedded in Java applications. It pursues this goal
with a two-pronged approach. The first prong is a set of Sleep APIs
that allow extension as well as embedding of the language. By
extending Sleep, developers can practically design a domain-specific
language for their applications. The second prong, and the primary
focus of this article, is Sleep's similarity to Perl. Sleep steals,
borrows, and begs features from Perl. One goal of Sleep is to bring
Perl's incredible text/data processing and ease of use to the Java
platform.
Some of Perl's best features include built-in regular expression
support, powerful data structures, and an easy-to-use I/O API.
Not coincidentally, Sleep provides built-in regular expressions,
handy built-in data structures, and an easy-to-use unified I/O API.
Sleep also has a few extra tricks up its sleeve. For example, Sleep
can instantiate and talk to Java objects.
Built-In Regular Expressions
Regular expressions are a mini-language for describing patterns.
Strings can be compared against regex pattern strings to
check for a match. If there is a match, certain parts of the
matching string can be extracted as described in the pattern
string. Perl provides a bunch of operators for dealing with regular
expressions. While Sleep does not support everything that Perl
does, you'll find that the basics are there, and the operators are
a little less esoteric.
I will use a phone number pattern as an example. This example
will be simple, since a full-on regex tutorial is beyond the scope
of this article. A phone number in the U.S. might consist of a three-digit area code wrapped in parentheses, followed by a space,
followed by three digits, followed by a dash, followed by four
digits. Or in short:
(ddd) ddd-dddd
Assume each d above means "digit." A few changes are
needed to build a regular expression string that represents the
above pattern. The string \d represents a digit in
regular expression speak. Parentheses are special characters, so to
specify them literally they have to be escaped with a
\ character, as in \( and
\). A phone number regex pattern as described above
is:
$pattern = '\(\d\d\d\) \d\d\d-\d\d\d\d';
The above pattern is great for matching "legal" phone numbers, as
described. However, it does no good for extracting information from
any matching text.
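For readers who know Java's own regex library, the matching step can be sketched with java.util.regex. This is only an illustration of the pattern, not Sleep code; the class and method names are hypothetical, and the sketch assumes the whole input must be a phone number.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate the pattern.
class PhonePatternCheck {
    // Same pattern as the Sleep string above. Java string literals
    // require each backslash to be doubled.
    static final Pattern PHONE =
        Pattern.compile("\\(\\d\\d\\d\\) \\d\\d\\d-\\d\\d\\d\\d");

    // Returns true only when the entire input has the form (ddd) ddd-dddd.
    static boolean isPhoneNumber(String text) {
        return PHONE.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(isPhoneNumber("(654) 555-1212")); // true
        System.out.println(isPhoneNumber("654-555-1212"));   // false
    }
}
```

Note that the pattern only answers yes or no; it contains no groups, so there is nothing to extract yet.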
Remember that I mentioned parentheses are special? They are used
in the pattern to identify which matching text to extract. To
designate substrings of the pattern for extraction later, simply
surround these substrings with parentheses. The extracted
substrings will then be available later through the
matched() function.
The following snippet compares a string to the phone number
pattern above. Upon finding a match, it extracts the area code and
local phone number pieces.
In the example above, the scalar $areaCode will have
the value 654 and the scalar $phoneNumber will be equal
to 555-1212. Pretty cool, eh?
The function matched() is tied to the last use of
the ismatch predicate. The matched()
function returns an array of substrings extracted from the matching
text. Just like Perl, Sleep allows individual elements of an array
to be assigned to scalars using the syntax above.
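For comparison, here is a rough Java sketch of the same extraction using java.util.regex. It is not Sleep code. The class and method names are hypothetical, and the placement of the two capture groups (one around the area code digits, one around the local number) is an assumption based on the values described above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate capture groups.
class PhoneExtract {
    // Escaped parentheses \( and \) match literal characters; the
    // unescaped pairs mark the two substrings to extract (assumed placement).
    static final Pattern PHONE =
        Pattern.compile("\\((\\d\\d\\d)\\) (\\d\\d\\d-\\d\\d\\d\\d)");

    // Returns {areaCode, phoneNumber}, or null when the text does not match.
    static String[] extract(String text) {
        Matcher m = PHONE.matcher(text);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = extract("(654) 555-1212");
        System.out.println(parts[0]); // 654
        System.out.println(parts[1]); // 555-1212
    }
}
```

The Java version has to create a Matcher, test it, and pull each group out by index, which is the kind of ceremony a glue language tries to hide.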
Regex Differences
For the sake of comparison, I present the phone number
extraction example written in Perl.
When I was a Perl beginner, I tripped over the =~
operator. To me, it looked like an assignment being used as a
predicate. Really, the above is saying "bind the pattern on the
right to the text on the left, and make the extracted substrings
available as $1, $2, etc." I used my artistic license to make the
Sleep syntax a little simpler. Hopefully Perl hackers can forgive
me.
In Perl, regular expression patterns are enclosed in forward
slashes. This is more of a convention than a requirement; however,
it is a convention nearly everyone follows. In Sleep, regular
expression patterns are specified as double- or single-quoted
strings.
Many of the Perl regular-expression-related goodies are
available in Sleep. The Perl functions split and
join are both available. The ever-popular
s/pattern/string/g regex operation is available in
Sleep as the &replace function.
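To make these three operations concrete for Java programmers, the sketch below shows their closest standard-library counterparts: String.split, String.join, and String.replaceAll. It is plain Java, not Sleep, and says nothing about the argument order or exact behavior of the Sleep functions; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class showing Java analogues of split, join,
// and s/pattern/string/g.
class RegexHelpers {
    // Split on a comma followed by optional whitespace.
    static List<String> splitOnCommas(String text) {
        return Arrays.asList(text.split(",\\s*"));
    }

    // Join the parts back together with the given separator.
    static String joinWith(String separator, List<String> parts) {
        return String.join(separator, parts);
    }

    // Global replace: collapse every run of whitespace to one space.
    static String collapseSpaces(String text) {
        return text.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        List<String> parts = splitOnCommas("a, b,c");
        System.out.println(parts);                       // [a, b, c]
        System.out.println(joinWith("-", parts));        // a-b-c
        System.out.println(collapseSpaces("a   b\t c")); // a b c
    }
}
```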
Regexes are a good place to talk about the differences between
Java philosophy, Perl philosophy, and the Sleep philosophy.