Perl on Java? An Introduction to the Sleep Language

July 14, 2005

Contents
Why Bother with Sleep?
Built-In Regular Expressions
   Regex Differences
The Sleep Philosophy
The Unified I/O API
   Working with Binary Data
Built-In Data Structures
Instantiating and Talking to Objects
Closures
What Sleep Brings to the Java Platform
Sleep Resources
   Real Perl on Java Resources

The most popular scripting languages are available in some form
for the Java platform. We have Jacl (http://tcljava.sourceforge.net/docs/website/) for Tcl,
Jython for Python, and
JRuby for Ruby. One offering is
missing from this bunch: what Java offering exists for the Perl
hackers of the world?

In this article, I would like to introduce Sleep. Sleep is a
Java-based scripting language heavily inspired by Perl. Sleep is
what I wrote when nothing like Perl was available to build a
scriptable Java IRC client. I would like to introduce Sleep in
terms of its similarities to Perl and what it brings to the Java
platform. Sleep can be downloaded from the Sleep Scripting Project home page (http://sleep.hick.org/).
A simple "Hello World" script in Sleep looks like this:

println("Hello World");

To run the above script, copy and paste it into a file called
hello.sl and type:

java -jar sleep.jar hello.sl

Why Bother with Sleep?

There are a zillion scripting languages on the Java platform,
each with its own strengths and weaknesses. With all of these
scripting language choices, why does Java need a Perl offering?
Perl is an incredibly powerful language for text and data
processing. Perl excels at taking input, extracting stuff from it,
chewing it up several hundred times, and finally, outputting the
mess however the programmer would like. Perl is often referred to
as the "duct tape of the internet." This is due to its many uses as a
"glue"-type language.

Sleep is primarily a glue language and was designed from the
ground up to be embedded in Java applications. This is accomplished
via a two-pronged approach. Several Sleep APIs allow extension as
well as embedding of the language. By extending Sleep, developers
can practically design a domain-specific language for their
applications. The second prong and the primary focus of this article
is Sleep's similarity to Perl. Sleep steals, borrows, and begs
features from Perl. One goal of Sleep is to bring Perl's incredible
text/data processing and ease of use to the Java platform.

Some of Perl's best features include built-in regular expression
support, powerful data structures, and an easy-to-use I/O API.
Ironically enough, Sleep provides built-in regular expressions,
handy built-in data structures, and an easy-to-use unified I/O API.
Sleep also has a few extra tricks up its sleeve. For example, Sleep
can instantiate and talk to Java objects.

Built-In Regular Expressions

Regular expressions are a mini-language for describing patterns.
Strings can be compared against regex pattern strings to
check for a match. If there is a match, certain parts of the
matching string can be extracted as described in the pattern
string. Perl provides a bunch of operators for dealing with regular
expressions. While Sleep does not support everything that Perl
does, you'll find that the basics are there, and the operators are
a little less esoteric.

I will use a phone number pattern as an example. This example
will be simple, since a full-on regex tutorial is beyond the scope
of this article. A phone number in the U.S. might consist of a three-digit area code wrapped in parentheses, followed by a space,
followed by three digits, followed by a dash, followed by four
digits. Or in short:

[prettify](ddd) ddd-dddd
[/prettify]

Assume the ds above mean "digit." A few changes are
needed to build a regular expression string that represents the
above pattern. The string \d represents a digit in
regular expression speak. Parentheses are special characters, so to
specify them literally they have to be escaped with a
\ character, as in \( and
\). A phone number regex pattern as described above
is:

[prettify]$pattern = '\(\d\d\d\) \d\d\d-\d\d\d\d';
[/prettify]

The above pattern is great for matching "legal" phone numbers, as
described. However it does no good for extracting information from
any matching text.

Remember that I mentioned parentheses are special? They are used
in the pattern to identify which matching text to extract. To
designate substrings of the pattern for extraction later, simply
surround these substrings with parentheses. The extracted
substrings will then be available later through the
matched() function.

The following snippet compares a string to the phone number
pattern above. Upon finding a match, it extracts the area code and
local phone number pieces.

[prettify]if ("(654) 555-1212" ismatch '\((\d\d\d)\) (\d\d\d-\d\d\d\d)')
{
   ($areaCode, $phoneNumber) = matched();
}
[/prettify]

In the example above, the scalar $areaCode will have
the value 654 and the scalar $phoneNumber will be equal
to 555-1212. Pretty cool, eh?

The function matched() is tied to the last use of
the ismatch predicate. The matched()
function returns an array of substrings extracted from the matching
text. Just like Perl, Sleep allows individual elements of an array
to be assigned to scalars using the syntax above.
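
For readers coming from the Java side, the same extraction can be sketched with the standard java.util.regex classes. This is only a comparison, not a description of how Sleep implements ismatch; the class name PhoneMatch and the method extract are hypothetical, and the sketch assumes the pattern must match the whole string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Sleep or its API.
final class PhoneMatch {
    // Same pattern as the Sleep example, with two capture groups.
    private static final Pattern PHONE =
        Pattern.compile("\\((\\d\\d\\d)\\) (\\d\\d\\d-\\d\\d\\d\\d)");

    // Returns {areaCode, phoneNumber}, or null when the text does not match.
    // Assumption: the whole string has to match, hence matches() and not find().
    static String[] extract(String text) {
        Matcher m = PHONE.matcher(text);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = extract("(654) 555-1212");
        System.out.println(parts[0] + " / " + parts[1]);
    }
}
```

The Sleep version gets the same result in three lines, which is the point of building the predicate into the language.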

Regex Differences

For the sake of comparison, I present the phone number
extraction example written in Perl.

[prettify]if ('(654) 555-1212' =~ /\((\d\d\d)\) (\d\d\d-\d\d\d\d)/)
{
   ($areaCode, $phoneNumber) = ($1, $2);
}
[/prettify]

When I was a Perl beginner, I tripped over the =~
operator. To me, it looked like an assignment being used as a
predicate. Really the above is saying "bind the pattern on the
right to the text on the left, and make the extracted substrings
available as $1, $2, etc." I used my artistic license to make the
Sleep syntax a little simpler. Hopefully Perl hackers can forgive
me.

In Perl, regular expression patterns are enclosed in forward
slashes. This is more of a convention than a requirement; however,
it is a convention nearly everyone follows. In Sleep, regular
expression patterns are specified as double- or single-quoted
strings.

Many of the Perl regular-expression-related goodies are
available in Sleep. The Perl functions split and
join are both available. The ever popular
s/pattern/string/g regex operation is available in
Sleep as the &replace function.
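
As a point of reference, here is roughly what these three operations look like in plain Java. The class RegexGoodies and its method names and argument orders are hypothetical and chosen for this sketch; they are not the signatures of the Sleep functions.

```java
// Hypothetical wrapper class for illustration; only the String methods are real.
final class RegexGoodies {
    // Rough counterpart of Perl's s/pattern/string/g: replace every match.
    static String replaceAll(String text, String pattern, String replacement) {
        return text.replaceAll(pattern, replacement);
    }

    // Break text into pieces wherever the pattern matches.
    static String[] split(String pattern, String text) {
        return text.split(pattern);
    }

    // Glue the pieces back together with a separator.
    static String join(String separator, String[] parts) {
        return String.join(separator, parts);
    }

    public static void main(String[] args) {
        String cleaned = replaceAll("a1b22c", "\\d+", "-");
        String[] fields = split(",\\s*", "x, y,z");
        System.out.println(cleaned);
        System.out.println(join("|", fields));
    }
}
```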

Regexes are a good place to talk about the differences between
Java philosophy, Perl philosophy, and the Sleep philosophy.







The Sleep Philosophy

The designers of Java designed and built a fairly simple core
language. They decided that most of the features would be included
in the Java class libraries. Hence, most of the complexity of Java
lies within the API and not in the language itself. Perl is kind of
the opposite: Perl hackers believe that many of the most commonly
used features should be built directly into the language. This way,
a lot of built-in syntax (AKA "sugar") can follow, to make common
stuff easier to do. While this does result in a lot of power, it has
also yielded a language that can be complicated.

Sleep tries to find a middle ground between these two
philosophies. Many things are built into the Sleep language.
However, oftentimes an API is relied upon to provide functionality.
The decision of where to place functionality is based on the "least
complexity" rule. If a function is easier to understand and use as a
built-in construct, then Sleep will include it as a built-in
construct. Regex matching functionality is built into the Sleep
language, as seen in the ismatch operator illustrated
above. Other regex functionality is provided with built-in functions.
Sleep aims to provide built-in power and flexibility while
maintaining a core language that is accessible to novice
scripters.

The Unified I/O API

Sleep has an API for providing uniform access to sockets,
processes, and files. These things are all data sources, as far as
Sleep is concerned. Sleep also provides functionality similar to
Perl's for manipulating byte data.

The following Perl example opens a file and reads the file
contents into an array:

open(HANDLE, "myfile.txt");
@data = <HANDLE>;
close(HANDLE);

Perl provides a special syntax for dealing with file handles.
This special syntax is called the diamond operator. The diamond
operator reads either a single line or the entire contents of a
HANDLE, depending on the context to which the data is assigned. If the
data is assigned to a $scalar, then a single line is
read. The entire HANDLE is read when the data is assigned to, or
used as, an @array.

Sleep does not use assignment context to define how functions
behave. Come to think of it, Sleep does not have special syntax for
dealing with I/O handles, either. All Sleep I/O handles are object
scalars that reference an I/O stream.

The following Sleep example opens a file, reads all of the text
from the file into an array, and closes the file. This example is
the Sleep equivalent of the Perl example above.

$handle = openf("myfile.txt");
@data = readAll($handle);
closef($handle);

Sleep uses &openf to open a file stream. Other
functions exist for opening sockets, creating a listening socket,
and executing processes. These functions all return a scalar
variable that references an I/O stream. Any of these I/O sources
can be read from, and written to, using the same built-in
functions.

A very cool I/O concept in Sleep is callback reading.
Sleep can invoke a specified function or closure whenever data is
read from a source.

The following is an example of a simple echo server written in
Sleep:

[prettify]sub handleData  
{
   println($1, "Right back at ya: $2");
}

$server = listen(3000);
read($server, &handleData);
[/prettify]

To connect to the echo server, do this:

[raffi@beardsley ~]$ telnet 127.0.0.1 3000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Hello World
Right back at ya: Hello World

A quick explanation is in order. The echo server is created on
port 3000 using the &listen function. The
&read function is used to tie the function
&handleData to the socket stream
$server. Internally, Sleep creates a thread that
references &handleData and $server.
When data is read from $server, the function
&handleData is invoked with $server and
the read data as arguments. This process continues until
$server closes.
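
The callback mechanism described above can be sketched in plain Java as a thread that reads lines and hands each one to a handler. This is an illustration of the idea under stated assumptions, not Sleep's actual implementation: the class CallbackReader and the methods readAsync and collect are hypothetical, and a StringReader stands in for the socket so the sketch runs without a network.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical illustration of callback reading; not Sleep's internals.
final class CallbackReader {
    // Starts a thread that calls handler(source, line) for every line read.
    // The thread ends when the source is exhausted or a read fails.
    static Thread readAsync(final BufferedReader source,
                            final BiConsumer<BufferedReader, String> handler) {
        Thread worker = new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    String line;
                    while ((line = source.readLine()) != null) {
                        handler.accept(source, line);
                    }
                } catch (IOException e) {
                    // Treat a read failure like a closed stream.
                }
            }
        });
        worker.start();
        return worker;
    }

    // Demo helper: feeds a string through readAsync and collects the replies
    // an echo handler would produce.
    static List<String> collect(String input) {
        final List<String> replies =
            Collections.synchronizedList(new ArrayList<String>());
        Thread worker = readAsync(
            new BufferedReader(new StringReader(input)),
            (src, line) -> replies.add("Right back at ya: " + line));
        try {
            worker.join(); // wait until the source is exhausted
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return replies;
    }

    public static void main(String[] args) {
        System.out.println(collect("Hello World\n"));
    }
}
```

The handler receives both the source and the data, mirroring the $1 and $2 arguments passed to &handleData.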

Working with Binary Data

Sleep provides functionality for dealing with binary data. This
functionality is similar in many ways to what Perl offers. In
Sleep, an array of byte data is stored as a string. Each character in
the string maps to one byte. Sleep provides readb($handle, size) and
writeb($handle, "data") to read and write byte strings from an I/O
source. For example, to copy a file in Sleep, you'd write:

[prettify]# copy.sl [original file] [new file]

$in = openf(@ARGV[0]);
$data = readb($in, lof(@ARGV[0]));

$out = openf(">" . @ARGV[1]);
writeb($out, $data);

closef($in);
closef($out);
[/prettify]

Sleep also provides Perl-like pack('format', ...)
and unpack('format', "data") for storing and
retrieving Sleep data to or from a byte data string.

The following example illustrates Sleep's binary data extraction
abilities:

The wtmp file is used to record information when a user
logs in or out on a UNIX system. The information in wtmp is stored
as binary data. The Mac OS X wtmp manpage specifies the following C
structure for a wtmp record:

#define _PATH_WTMP      "/var/log/wtmp"

#define UT_NAMESIZE     8
#define UT_LINESIZE     8
#define UT_HOSTSIZE     16

struct utmp {
       char    ut_line[UT_LINESIZE];
       char    ut_name[UT_NAMESIZE];
       char    ut_host[UT_HOSTSIZE];
       time_t  ut_time;
};

Each wtmp entry consists of 36 bytes of data. These entries
contain three strings and one integer packed together. The following
example extracts the contents of the wtmp file on Mac OS X using
Sleep's &bread function, which reads a record from the handle and
unpacks it according to a format string:

[prettify]$handle = openf("/var/log/wtmp");

while (1)
{
   ($tty, $uid, $host, $ctime) = bread($handle, 
                                   'Z8 Z8 Z16 I');
   
   if (-eof $handle) { break; }
   
   $date = formatDate($ctime * 1000, 
                    "EEE, d MMM yyyy HH:mm:ss Z");
   
   println("$[10]tty $[10]uid $[20]host $date");
}
[/prettify]

A shortened snapshot of my wtmp file is below.

[prettify]ttyp3  raffi                 Sun, 5 Jun 2005 09:58:57 +0200
ttype  raffi  192.168.1.26   Sun, 5 Jun 2005 09:59:13 +0200
[/prettify]

Editor's note: the output has been reformatted to better suit
this article's web page layout.
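
To make the record layout concrete, here is a minimal Java sketch that decodes one 36-byte entry by hand with ByteBuffer. The class WtmpRecord is hypothetical. The sketch assumes a 4-byte time_t (consistent with the 36-byte total above), NUL-padded ASCII strings, and big-endian byte order; the file is written in the host's native layout, so the byte order in particular may need to change on other machines.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical decoder for the struct utmp layout shown above.
final class WtmpRecord {
    static final int SIZE = 8 + 8 + 16 + 4; // line + name + host + time

    final String line;
    final String name;
    final String host;
    final long time; // seconds since the epoch

    private WtmpRecord(String line, String name, String host, long time) {
        this.line = line;
        this.name = name;
        this.host = host;
        this.time = time;
    }

    // Reads one fixed-width field and cuts it at the first NUL byte.
    private static String field(ByteBuffer buf, int width) {
        byte[] raw = new byte[width];
        buf.get(raw);
        int end = 0;
        while (end < width && raw[end] != 0) {
            end++;
        }
        return new String(raw, 0, end, StandardCharsets.US_ASCII);
    }

    static WtmpRecord parse(byte[] record) {
        if (record.length != SIZE) {
            throw new IllegalArgumentException("expected " + SIZE + " bytes");
        }
        // Assumption: big-endian host byte order.
        ByteBuffer buf = ByteBuffer.wrap(record).order(ByteOrder.BIG_ENDIAN);
        String line = field(buf, 8);
        String name = field(buf, 8);
        String host = field(buf, 16);
        long time = buf.getInt() & 0xFFFFFFFFL; // treat as unsigned 32-bit
        return new WtmpRecord(line, name, host, time);
    }

    public static void main(String[] args) {
        byte[] sample = new byte[SIZE]; // synthetic, all-zero record
        byte[] tty = "ttyp3".getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(tty, 0, sample, 0, tty.length);
        WtmpRecord r = parse(sample);
        System.out.println(r.line + " " + r.name + " " + r.host + " " + r.time);
    }
}
```

The format string 'Z8 Z8 Z16 I' in the Sleep script expresses the same layout in a single line.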







Built-In Data Structures

Sleep provides two data structures built into the language: the
hashtable (usually referred to as just a "hash") and the
ever-versatile array. Sleep arrays are versatile in that they can
act as lists, stacks, or plain arrays.

To use an array and start playing with it:

[prettify]@array = array("Raphael", 
               "Serge", 
               "Andreas", 
               "Fuzzy Puppy"); 

println("The last element is: " . pop(@array));
push(@array, "Mr. Anderson");

foreach $element (@array)
{
   println("An element: $element");
}
[/prettify]

In Sleep, multi-dimensional arrays are easy to create. Just start
indexing new dimensions:

[prettify]for ($x = 1; $x <= 10; $x++)
{
   for ($y = 1; $y <= 10; $y++)
   {
      @multiplication[$x - 1][$y - 1] = $x * $y;
   }
}

# print out our multiplication table

foreach $row (@multiplication)
{
   foreach $column ($row)
   {
      print(" $[3]column |");
   }
    
   println();
}
[/prettify]

As a side note, the string " $[3]column |" is called
a parsed literal in Sleep. A parsed literal is a double-quoted string. Scalar variable names are evaluated inside of parsed
literals. Within parsed literals, some formatting is available. For
example, $[n]var means "append spaces to $var
until the string length is n characters". A negative value
indicates that spaces should be prepended instead. Single-quoted
strings in Sleep are simple no-frills string literals.
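
The padding rule can be written out as a small Java function. This is a sketch of the rule as described here, not Sleep's code: the class ParsedLiteralPad and the method pad are hypothetical, and leaving strings that are already longer than n unchanged is an assumption.

```java
// Hypothetical illustration of the $[n]var padding rule.
final class ParsedLiteralPad {
    // width > 0: append spaces until the string is width characters long.
    // width < 0: prepend spaces until the string is |width| characters long.
    // Assumption: values already at or beyond the target length are left as-is.
    static String pad(String value, int width) {
        int target = Math.abs(width);
        StringBuilder sb = new StringBuilder(value);
        while (sb.length() < target) {
            if (width > 0) {
                sb.append(' ');
            } else {
                sb.insert(0, ' ');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Brackets make the added spaces visible.
        System.out.println("[" + pad("7", 3) + "]");
        System.out.println("[" + pad("7", -3) + "]");
    }
}
```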

Hashes are another Sleep data type. The Hash interface in Sleep
is backed by nothing more than a java.util.HashMap.
All scalar keys are converted to strings prior to storage in a
hash:

[prettify]%dictionary[1] = "The number one";
%dictionary["1"] = "The string one";
[/prettify]
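
Because keys are converted to strings, both assignments above land on the same entry, and the second value replaces the first. The Java sketch below shows the effect with a plain HashMap; the class StringKeyedHash is hypothetical, and String.valueOf is an assumed stand-in for whatever conversion Sleep applies to scalar keys.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper showing string-converted keys on top of a HashMap.
final class StringKeyedHash {
    private final Map<String, Object> data = new HashMap<String, Object>();

    // Every key is turned into its string form before storage.
    void put(Object key, Object value) {
        data.put(String.valueOf(key), value);
    }

    Object get(Object key) {
        return data.get(String.valueOf(key));
    }

    int size() {
        return data.size();
    }

    // Replays the two assignments from the Sleep example.
    static StringKeyedHash demo() {
        StringKeyedHash dictionary = new StringKeyedHash();
        dictionary.put(1, "The number one");
        dictionary.put("1", "The string one");
        return dictionary;
    }

    public static void main(String[] args) {
        StringKeyedHash dictionary = demo();
        System.out.println(dictionary.size() + " entry: " + dictionary.get(1));
    }
}
```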

In terms of multi-dimensional data structures, hashes and arrays
can be mixed and matched. This is because [] is a
special operator in Sleep. It attempts to index data from whatever
expression it is applied to. If it is applied to an expression
returning an array, it will index array data. If it is applied to a
hash, it will index hash data. This means that technically, any
expression that returns array or hash data can be indexed. For
example:

[prettify]$temp = array("a", "b", "c");
println("Second element is: " . $temp[1]);
[/prettify]

or:

[prettify]println("Second element is: " . array("a", "b", "c")[1]);
[/prettify]

Arrays in Sleep are always prefixed with an @, hashes with a %, and
scalars with a $. Sleep uses the symbol at the
beginning of the variable name to determine which type of data
structure to create when referencing a variable that does not
exist:

[prettify]# do we want a hash or an array in this case?
$data[0] = "Hello World";
[/prettify]

In the example above, Sleep will silently ignore the attempt to
assign a value to a $scalar that doesn't reference a
hash or an array.

The symbols also apply in multi-dimensional data structures. If
the symbol at the beginning of the variable name is a %, then any
time an index is applied to a nonexistent dimension, a new hash
will be created.

The nice thing about this system is that hashes and arrays are
just like any other variables, with no need for special treatment.
For example, to pass an array to a subroutine:

[prettify]sub multiplyAll
{
   foreach $temp ($1)
   {
      # assigning to $temp is the same as
      # assigning to the individual element

      $temp = $temp * $2; 
   }

   return $1;
}

@data = array(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

printAll(multiplyAll(@data, 3));
[/prettify]

Above, @data was passed to
&multiplyAll with no special handling. Perl would
normally "flatten" @data and pass each element as a
separate parameter unless \ were used to turn
@data into a reference. In Sleep, @data is
passed as a reference automatically.

Instantiating and Talking to Objects

Another fun thing in Sleep is the ability to instantiate and
talk to Java objects. This ability was added to the language to
allow access to APIs that I was too lazy to create.

The following is a simple web browser created in Sleep:

[prettify]#
# Simple Sleep Based Graphical Web Browser
# Java's HTML renderer isn't very good, therefore 
# this browser isn't either
#

import java.awt.*;
import javax.swing.*;
import java.net.*;

$window = [new JFrame:"Sleep Based Web Browser"];

[$window setDefaultCloseOperation: 
                          [JFrame EXIT_ON_CLOSE]];

[$window setSize:480, 320];

sub go_to_site
{
   [$display setPage:  [$address getText] ];

   if (checkError($check))
   {
      println("Error: $check");
   }
}

sub link_clicked
{
   if ([$1 getEventType] eq "ACTIVATED")
   {
      [$display setPage: [$1 getURL]];
      [$address setText: [$1 getURL]];
   }
}

$address = [new JTextField:20];
[$address addActionListener:&go_to_site];

$button  = [new JButton:"Go!"];
[$button addActionListener:&go_to_site];

$panel   = [new JPanel];
[$panel add: $address, [FlowLayout CENTER]];
[$panel add: $button,  [FlowLayout RIGHT]];

[[$window getContentPane] setLayout:
                              [new BorderLayout]];

[[$window getContentPane] add: $panel, 
                            [BorderLayout NORTH]];

$display = [new JEditorPane: "text/html", ""];
[$display addHyperlinkListener:&link_clicked];
[$display setEditable:0];

[[$window getContentPane] add:
                    [new JScrollPane:  $display], 
                    [BorderLayout CENTER]];

[$window show];
[/prettify]

The calls to the Java API are easy to spot: they are all
surrounded by square brackets. Sleep's syntax for
using Java objects is similar to that of Objective-C:

[reference message: argument, argument, ...]

Each call has a reference, a message, and then a colon, followed
by a comma-separated list of arguments.

The web browser example demonstrates that the Sleep object
syntax allows one to get stuff done with the Java API. However,
working with Swing this way is a little cumbersome. A Sleep/Swing
module is currently in the works to help make UI scripting in Sleep
more practical.

In Sleep, you can't create new Java classes. However, interfaces
can be faked by passing a subroutine or a closure to an argument
expecting a specific interface. Closures and subroutines are
actually one and the same. The topic of closures is covered
next.







Closures

In Sleep, functions are considered "first class" types. This
means that a scripter can define a new function, assign it to a
variable, pass it as a value, invoke a function referenced by a
variable, and so on.

To define a named closure:

sub foo
{
   println("bar");
}

The named closure can be invoked as follows:

foo();

OK, that wasn't too exciting. To get technical, a named closure
can also be invoked with:

[&foo];

That was confusing. It will make sense in just a minute. To
assign a named closure to a variable and invoke the closure from
the variable:

$var = &foo;
[$var];

Equivalently, this could have been written as:

$var = { println("bar"); };
[$var];

or:

[{ println("bar"); }];

Sleep closures are called with the same syntax used with
objects. Arguments in closures are available starting at $1
on up to $n for the nth argument. The message
parameter (defined before the colon) is passed to closures as
$0. This allows you to create some cool interfaces in
Sleep. For example:

[prettify]sub BuildStack 
{
   return {
             this('@stack');

             if ($0 eq "push") 
             { 
                push(@stack, $1); 
             }

             if ($0 eq "pop")
             { 
                return pop(@stack); 
             }
          };
}

# construct a new stack closure...
$mystack = BuildStack();  

# push the string "test" onto the stack
[$mystack push: "test"];  

# pop the top value off of the stack and print it
println("Top value is: " . [$mystack pop]);
[/prettify]

The example above defines a new subroutine called
&BuildStack. The subroutine returns a new closure.
Inside of the closure, the variable @stack is put into
the this scope. Inside of the this scope,
@stack is visible only inside of the owning closure
instance. A second call to &BuildStack() would
return a new closure instance with its own @stack
variable.
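
The same message-dispatch idea can be approximated in Java with a lambda that captures a private stack. This is only an analogy to the Sleep closure above: the class StackClosure and the method buildStack are hypothetical, and returning null when popping an empty stack is an assumption made for this sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.BiFunction;

// Hypothetical Java analogy to the BuildStack closure.
final class StackClosure {
    // Each call returns a new function with its own private stack,
    // much like each Sleep closure instance has its own @stack.
    static BiFunction<String, Object, Object> buildStack() {
        // Note: ArrayDeque does not accept null elements.
        final Deque<Object> stack = new ArrayDeque<Object>();
        return (message, argument) -> {
            if ("push".equals(message)) {
                stack.push(argument);
                return null;
            }
            if ("pop".equals(message)) {
                return stack.poll(); // null when the stack is empty (assumption)
            }
            return null; // unknown messages are ignored
        };
    }

    public static void main(String[] args) {
        BiFunction<String, Object, Object> mystack = buildStack();
        mystack.apply("push", "test");
        System.out.println("Top value is: " + mystack.apply("pop", null));
    }
}
```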

Closures can also be passed to Java objects expecting an
interface. Any Java method call against the closure interface will
result in the entire closure being executed. The message parameter
($0) will contain the name of the method Java is trying to
invoke. Closures are the closest thing to objects Sleep has.
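
One way to get this behavior on the JVM is a dynamic proxy, where every interface method is routed to a single handler along with the method's name. The sketch below illustrates that technique with java.lang.reflect.Proxy; it is not taken from the Sleep source, and the names ClosureProxy, Closure, and asInterface are hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: one function standing in for any interface.
final class ClosureProxy {
    // Stand-in for a closure: receives the message (method name) and arguments.
    interface Closure {
        Object call(String message, Object[] args);
    }

    // Wraps the closure in a proxy object that implements the given interface.
    // Every method call on the proxy runs the whole closure.
    static <T> T asInterface(Class<T> type, final Closure closure) {
        InvocationHandler handler = new InvocationHandler() {
            @Override
            public Object invoke(Object proxy, Method method, Object[] args) {
                // args is null for methods without parameters.
                Object[] safeArgs = (args == null) ? new Object[0] : args;
                return closure.call(method.getName(), safeArgs);
            }
        };
        Object proxy = Proxy.newProxyInstance(
            ClosureProxy.class.getClassLoader(),
            new Class<?>[] { type },
            handler);
        return type.cast(proxy);
    }

    // Passes a closure where a Runnable is expected and records the message.
    static List<String> demo() {
        final List<String> messages = new ArrayList<String>();
        Runnable task = asInterface(Runnable.class, (message, args) -> {
            messages.add(message);
            return null;
        });
        task.run();
        return messages;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the method name plays the role of $0, so a single closure can branch on it to serve an interface with several methods.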

What Sleep Brings to the Java Platform

Sleep is a language for Perl hackers who also live in the world
of Java. Sleep brings the power of Perl to the Java platform. Not
only can Sleep extract data, parse it, rework it, and spit it back
out, but Sleep can extract data from, and send it back to, Java objects.
Sleep is also highly extensible, allowing new functions, operators,
and constructs to be added to the language. Sleep's extensibility
allows it to fit into new problem domains or be embedded into Java
applications. Combine the extensibility to fit into new problem
areas with powerful language features, and the possibilities are
endless.

Sleep Resources

Real Perl on Java Resources

Raphael Mudge is the developer behind the scripting language Sleep and the IRC client jIRCii.