Java Tech: Language Lessons

by Jeff Friesen
05/17/2005

Contents
Lesson 1: The poor performance of the string concatenation operator
Lesson 2: Superclass constructors invoking overridable methods
Lesson 3: When assertions should not be used
Lesson 4: Interfaces versus abstract classes
Lesson 5: Those useful covariant return types
Lesson 6: Don't forget the superclass
Lesson 7: The compound assignment operator surprise
Conclusion
Resources
Answers to Previous Homework

I first encountered Java in 1995. That language soon became my favorite because of its elegance, its interesting language features (e.g., interfaces, garbage collection, and threads), and its similarity to C and C++ (I come from a C/C++ background).

This article shares with you some of the lessons I've learned while working with Java. Several lessons point out problem areas to avoid; all lessons offer advice on writing better code. I hope you'll come away from this article with a greater understanding of the Java language, an awareness of these problem areas, and stronger Java coding skills.

Lesson 1: The poor performance of the string concatenation operator

Java programs should perform as well as possible. But achieving that goal is not always easy, especially if your source code makes frequent use of Java's string concatenation operator. Concatenating strings via that operator, especially within loops, can significantly degrade the performance of your programs. Check out the following code fragment:

String s = "";

for (int i = 32; i < 128; i++)
     s = s + (char) i;

The code fragment employs a loop to initialize a string to those characters with Unicode values ranging from 32 to 127 inclusive. Each loop iteration uses the string concatenation and cast operators to append the next character to the string. Using the string concatenation operator in a loop to build a string is often seen in C++. As with C++, the single loop looks OK from a performance perspective: it seems to ensure that the runtime performance is linear.

Or is that performance linear? Before that question can be answered, we must examine the equivalent sequence of JVM bytecodes. That sequence appears below:

 0 ldc ""
 2 astore_1
 3 bipush 32
 5 istore_2
 6 iload_2
 7 sipush 128
10 if_icmpge 39
13 new java/lang/StringBuilder
16 dup
17 invokespecial java/lang/StringBuilder/<init>()V
20 aload_1
21 invokevirtual
   java/lang/StringBuilder/append(Ljava/lang/String;)
   Ljava/lang/StringBuilder;
24 iload_2
25 i2c
26 invokevirtual
   java/lang/StringBuilder/append(C)
   Ljava/lang/StringBuilder;
29 invokevirtual
   java/lang/StringBuilder/toString()
   Ljava/lang/String;
32 astore_1
33 iinc 2 1
36 goto 6
39 ...

The instructions at offsets 0 and 2 correspond to String s = "";. The for loop begins at offset 3 and continues through offset 36. Each for loop iteration creates a StringBuilder object and invokes two of that object's append() methods, to first append the characters in the String to the StringBuilder and then to append the next character to the StringBuilder. Finally, the toString() method is called to convert the StringBuilder back to a string. The following code fragment shows what these instructions look like from a source code perspective:

String s = "";

for (int i = 32; i < 128; i++)
     s = new StringBuilder ().append (s)
         .append ((char) i).toString ();

The code fragment above is what is really being generated when you employ the string concatenation operator. Each loop iteration creates a throwaway StringBuilder object and explicitly invokes three of that object's methods. The toString() method call creates a new String object to hold the result. The original String object cannot be extended in place because String is immutable.

We end up creating two objects and making three explicit method calls for each string concatenation. Furthermore, additional method calls happen behind the scenes. One of those method calls, which occurs when we append the string to the StringBuilder, results in a call to System.arraycopy(), to copy the String's characters into the StringBuilder's memory. This is tantamount to embedding another for loop within the outer for loop. Hence our performance changes from linear to quadratic.
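To put rough numbers on the quadratic claim, the sketch below counts the characters that append(s) has to copy when a 96-character string is built one character at a time, and compares that with a single reused StringBuilder. The ConcatCost class and its method names are hypothetical (they are not part of this article's sample code), and the count ignores the extra copy made by toString() and any internal buffer growth.

```java
// Hypothetical helper, not part of the article's sample code.
class ConcatCost
{
   // Characters copied from the old string into the throwaway
   // StringBuilder across n single-character concatenations.
   static long charsCopiedByConcat (int n)
   {
      long copied = 0;
      String s = "";

      for (int i = 0; i < n; i++)
      {
           copied += s.length ();   // append (s) copies the whole string
           s = s + 'x';             // the string grows by one character
      }

      return copied;
   }

   // A single reused StringBuilder receives only the new character
   // on each iteration.
   static long charsAppendedByBuilder (int n)
   {
      long appended = 0;
      StringBuilder sb = new StringBuilder ();

      for (int i = 0; i < n; i++)
      {
           sb.append ('x');
           appended++;
      }

      return appended;
   }

   public static void main (String [] args)
   {
      System.out.println (charsCopiedByConcat (96));     // 4560 (0 + 1 + ... + 95)
      System.out.println (charsAppendedByBuilder (96));  // 96
   }
}
```

For 96 characters the difference is harmless, but the first figure grows with the square of the string length while the second grows linearly.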

We can restore the performance to linear by eliminating the method call that first appends the String to the StringBuilder. We can also avoid creating the String and StringBuilder objects during each loop iteration (which reduces the probability of intermittent garbage collections). The code fragment below reveals these changes:

StringBuilder sb = new StringBuilder ();

for (int i = 32; i < 128; i++)
     sb.append ((char) i);

String s = sb.toString ();

Lesson: Do not use the string concatenation operator in lengthy loops or other places where performance could suffer.

Lesson 2: Superclass constructors invoking overridable methods

Java's class inheritance mechanism is a powerful tool for developing reusable code. If used incorrectly, however, this tool leads to fragile software. An example of incorrect use: superclass constructors invoking overridable methods.

The problem with a superclass constructor invoking an overridable method, either directly or indirectly, is that the superclass constructor runs before the subclass constructor. The subclass's version of the overridable method will be invoked before the subclass's constructor has been invoked. If the subclass's overridable method depends upon the proper initialization of the subclass (via the subclass constructor), the method will most likely fail.

I've created a file-parsing example that illustrates this problem. The example is based on two classes organized into a superclass/subclass relationship: an abstract Parser superclass and a non-abstract HTMLParser subclass. Parser appears below:

import java.util.*;

public abstract class Parser
{
   private String filename;

   private Vector results;

   public Parser (String filename)
   {
      this.filename = filename;

      results = parse ();
   }

   public String getFilename ()
   {
      return filename;
   }

   public Vector getParsedResults ()
   {
      return results;
   }

   protected abstract Vector parse ();
}

Parser provides a standard interface to various kinds of parsers. After saving the name of the file to be parsed, Parser's constructor invokes the protected parse() method to carry out the parse. That method returns a Vector that holds the parsed results. The returned Vector is then saved for later access via getParsedResults().

Now that we've seen Parser, let's examine HTMLParser:

import java.util.*;

public final class HTMLParser extends Parser
{
   private Vector results;

   public HTMLParser (String filename)
   {
      super (filename);

      results = new Vector ();
   }

   protected Vector parse ()
   {
      System.out.println ("Parsing HTML file " + getFilename ());

      // Open file.

      // Perform the parse.

      // Close file.

      return results;
   }
}

HTMLParser describes a parser dedicated to parsing HTML files. To keep the example brief, I've omitted the actual parsing logic. The constructor passes the file name to the superclass and creates a Vector for holding parsed results. The parse() method takes care of parsing.

Now that we've examined Parser and HTMLParser, let's look at a main() method for parsing an HTML file and outputting the parsed results:

public static void main (String [] args)
{
   Parser p = new HTMLParser ("test.dat");

   Vector results = p.getParsedResults ();

   Enumeration e = results.elements ();

   while (e.hasMoreElements ())
      System.out.println (e.nextElement ());
}

After creating a Parser and performing the parse, main() extracts the parsed results Vector and accesses that Vector's Enumeration to iterate over all result elements and output each element.

If you run main(), you observe a line of output that identifies the file being parsed. This is as expected. But you also observe a NullPointerException. This exception is generated by results.elements() because results is null. Why? HTMLParser's parse() method is invoked (by Parser's constructor) before the body of HTMLParser's constructor runs. When parse() runs, the results Vector has not yet been created, so parse() returns null.
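The ordering problem can be reproduced in isolation. The following is a minimal sketch that uses hypothetical EarlyCallBase and EarlyCallDerived classes (they are not part of the parser example); the subclass simply records whether its own field was still null when the superclass constructor called the overridden method.

```java
// Hypothetical classes, used only to make the call order visible.
class EarlyCallDemo
{
   public static void main (String [] args)
   {
      EarlyCallDerived d = new EarlyCallDerived ();

      System.out.println (d.bufferWasNullInHook);   // true
      System.out.println (d.bufferIsSetNow ());     // true
   }
}

class EarlyCallBase
{
   EarlyCallBase ()
   {
      // The subclass override runs here, before the subclass
      // constructor body has executed.
      hook ();
   }

   void hook ()
   {
   }
}

class EarlyCallDerived extends EarlyCallBase
{
   private StringBuilder buffer;

   boolean bufferWasNullInHook;

   EarlyCallDerived ()
   {
      super ();

      buffer = new StringBuilder ();   // too late for hook ()
   }

   void hook ()
   {
      bufferWasNullInHook = (buffer == null);
   }

   boolean bufferIsSetNow ()
   {
      return buffer != null;
   }
}
```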

The solution to the problem above is to refactor Parser's constructor so that it doesn't invoke an overridable method. The result of the refactoring appears below:

import java.util.*;

public abstract class Parser
{
   private String filename;

   private Vector results;

   private boolean cache = true;   // true until parse() has been invoked

   public Parser (String filename)
   {
      this.filename = filename;
   }

   public String getFilename ()
   {
      return filename;
   }

   public Vector getParsedResults ()
   {
      if (cache)
      {
          results = parse ();
          cache = false;
      }
      return results;
   }

   protected abstract Vector parse ();
}

The new Parser class invokes parse() from within the getParsedResults() method, but only the first time that method is called. Because the HTMLParser constructor completes long before getParsedResults() is called, parse() returns the Vector created in HTMLParser's constructor.
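The same deferred-parse idea can be tried end to end with a compact sketch. LazyParser and WordListParser are hypothetical stand-ins for Parser and HTMLParser, the "parsed" words are made up, and a null check replaces the boolean flag (which assumes parse() never returns null).

```java
// Hypothetical stand-ins for Parser and HTMLParser.
class LazyParserDemo
{
   public static void main (String [] args)
   {
      WordListParser p = new WordListParser ();

      System.out.println (p.getParsedResults ());   // [alpha, beta]
      p.getParsedResults ();                        // second call reuses the list
      System.out.println (p.parseCalls);            // 1
   }
}

abstract class LazyParser
{
   private java.util.List<String> results;   // stays null until first requested

   public java.util.List<String> getParsedResults ()
   {
      if (results == null)
          results = parse ();   // runs after every constructor has finished

      return results;
   }

   protected abstract java.util.List<String> parse ();
}

class WordListParser extends LazyParser
{
   private final java.util.List<String> words;

   int parseCalls;

   WordListParser ()
   {
      words = new java.util.ArrayList<String> ();
   }

   protected java.util.List<String> parse ()
   {
      parseCalls++;

      // Placeholder for real parsing logic.
      words.add ("alpha");
      words.add ("beta");

      return words;
   }
}
```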

Lesson: Do not call overridable methods from superclass constructors.

Lesson 3: When assertions should not be used

Java 1.4 added an assertions capability to the language. The assertions capability helps developers prove the correctness of their code. For example, assertions are often used to verify that local variables have been initialized correctly and that the default case in a switch statement is never executed.

Some assertions are never appropriate. For example, it is not a good idea to use assertions to validate a method's arguments. Consider the sort() method below:

static void sort (int [] x)
{
   assert x != null;

   for (int pass = 0; pass < x.length-1; pass++)
        for (int i = x.length-1; i > pass; i--)
             if (x [i] < x [pass])
             {
                 int temp = x [i];
                 x [i] = x [pass];
                 x [pass] = temp;
             }
}

The sort() method uses a simple exchange sort (a close relative of the bubble sort) to sort the contents of the integer array referenced by array reference argument x. Prior to the sort, assert x != null; verifies that x is not a null reference.

The assertion above is problematic. Assertions must be enabled; otherwise they are ignored at runtime. Equally important: a thrown AssertionError hides the real problem (a null reference passed as an argument value), which violates any contract stating that sort() throws a NullPointerException if an invalid argument value is detected.

It is better to either not validate an argument (in the sort() method above, drop the assertion and just document that a NullPointerException occurs if a null argument is passed), or include one or more if statements that validate arguments and throw exceptions if those arguments are not valid. Example:

// The following method generates a non-negative
// random integer between 0 and limit-1 inclusive.
//
// @param limit one more than the largest integer
// that can be returned
//
// @throws IllegalArgumentException (when limit is
// less than or equal to 0)

static int random (int limit)
{
   if (limit <= 0)
       throw new IllegalArgumentException
                 ("limit <= 0");

   return (int) (Math.random () * limit);
}

Lesson: Do not use assertions to validate method arguments. Use if statements that explicitly throw exceptions if those arguments aren't valid.
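Applied to the earlier sort() method, the advice might look like the sketch below. The CheckedSort class name and the exception message are assumptions made for illustration; the sorting loops are unchanged.

```java
// Hypothetical wrapper class for the article's sort() method.
class CheckedSort
{
   // Sorts x into ascending order.
   //
   // @throws NullPointerException (when x is null)

   static void sort (int [] x)
   {
      // Explicit check: runs whether or not assertions are enabled.
      if (x == null)
          throw new NullPointerException ("x is null");

      for (int pass = 0; pass < x.length-1; pass++)
           for (int i = x.length-1; i > pass; i--)
                if (x [i] < x [pass])
                {
                    int temp = x [i];
                    x [i] = x [pass];
                    x [pass] = temp;
                }
   }

   public static void main (String [] args)
   {
      int [] data = { 3, 1, 2 };
      sort (data);
      System.out.println (java.util.Arrays.toString (data));   // [1, 2, 3]

      try
      {
          sort (null);
      }
      catch (NullPointerException e)
      {
          System.out.println (e.getMessage ());   // x is null
      }
   }
}
```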

Lesson 4: Interfaces versus abstract classes

I'm often asked all kinds of questions about the Java language. Many of those questions have to do with Java's interface and abstract class language features. One commonly asked question: "Because interfaces and abstract classes each make it possible to define a type that permits multiple implementations, when should an interface be used and when should an abstract class be used?" The answer to this question requires an understanding of each language feature's advantages.

Interfaces are more flexible than abstract classes. An interface-defined type can be implemented by any class in a class hierarchy, by implementing the interface. In contrast, an abstract-class-defined type can be implemented only by classes that subclass the abstract class. This flexibility of interfaces is beneficial in many ways. One example is the mixin.

Interfaces promote mixins (types that a class can implement in addition to its primary type). Java's class libraries contain many examples of mixin interfaces. One of those examples: java.awt.image.ImageObserver. That interface makes it possible for objects (such as objects whose classes subclass java.awt.Component) to receive image update notifications as images load. Other examples of mixin interfaces include java.lang.Comparable and java.io.Serializable. Those mixin interfaces introduce natural ordering and serialization to implementing classes. Abstract classes cannot be used to promote mixins because a class cannot have more than one superclass. Where in the class hierarchy would the mixin be placed?

Abstract classes evolve more easily than interfaces. To add a new method to an abstract class (in some future release of that class), simply provide a concrete method with a reasonable default implementation. Contrast this with interfaces. If you add a new method to an interface, classes that implement the interface will fail to compile when recompiled, because they are missing the new method (unless that method happens to coincide with a method that already exists in the class).

Along with the ease of their evolution, abstract classes capture the essence of rigid class hierarchies through partial implementation while preventing the creation of meaningless objects. For example, placing an abstract Account class at the top of a hierarchy of bank account classes captures the essence of what it means to be an account while preventing the creation of meaningless Account objects. In contrast, SavingsAccount and CheckingAccount subclass objects are meaningful.
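The two roles can appear side by side in one small hierarchy. The sketch below is only an illustration: the fields, the kind() method, and the decision to make SavingsAccount comparable by balance are assumptions, not something prescribed here. The abstract Account class supplies the partial implementation, while java.lang.Comparable is mixed in by just the subclass that wants a natural ordering.

```java
// Hypothetical bank account hierarchy, used only for illustration.
class AccountDemo
{
   public static void main (String [] args)
   {
      SavingsAccount [] accounts =
      {
         new SavingsAccount (5000),
         new SavingsAccount (1200)
      };

      java.util.Arrays.sort (accounts);   // relies on the Comparable mixin

      System.out.println (accounts [0]);                // Savings: 1200
      System.out.println (new CheckingAccount (300));   // Checking: 300
   }
}

// Partial implementation shared by all accounts; cannot be instantiated.
abstract class Account
{
   private long balanceInCents;

   protected Account (long balanceInCents)
   {
      this.balanceInCents = balanceInCents;
   }

   public long getBalanceInCents ()
   {
      return balanceInCents;
   }

   public abstract String kind ();

   public String toString ()
   {
      return kind () + ": " + balanceInCents;
   }
}

// Primary type: Account. Mixin type: Comparable.
class SavingsAccount extends Account implements Comparable<SavingsAccount>
{
   SavingsAccount (long balanceInCents)
   {
      super (balanceInCents);
   }

   public String kind ()
   {
      return "Savings";
   }

   public int compareTo (SavingsAccount other)
   {
      long a = getBalanceInCents ();
      long b = other.getBalanceInCents ();

      return (a < b) ? -1 : ((a > b) ? 1 : 0);
   }
}

class CheckingAccount extends Account
{
   CheckingAccount (long balanceInCents)
   {
      super (balanceInCents);
   }

   public String kind ()
   {
      return "Checking";
   }
}
```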

Lesson: Use interfaces for flexibility. Use abstract classes for ease of evolution or to capture the essence of rigid class hierarchies while avoiding the creation of objects that mean nothing.

Lesson 5: Those useful covariant return types

The J2SE 5.0 documentation presents seven new language features: generics, static imports, annotations, typesafe enums, an enhanced for loop, autoboxing/unboxing, and varargs. It might surprise you to learn that an eighth language feature was added to J2SE 5.0 but is not mentioned in the documentation (at least I could not find a mention of this feature): covariant return types.

Covariant return types let you override a superclass method with a return type that subtypes the superclass method's return type. To understand this language feature, let's look at an example:

class Parent
{
   Parent foo ()
   {
      System.out.println ("Parent foo() called");
      return this;
   }
}

class Child extends Parent
{
   Parent foo ()
   {
      System.out.println ("Child foo() called");
      return this;
   }
}

The code fragment above illustrates the pre-J2SE 5.0 way to override a method: each class's foo() method has the same return type. If we want to invoke Child's foo() method and assign the return value to a Child variable, we must employ a cast, as the following code fragment reveals:

Child c = (Child) new Child ().foo ();

The code fragment above must downcast foo()'s Parent return type to Child prior to the assignment. This seems so unnecessary because the type of this in Child's foo() method is Child. But since we are forced to upcast that type to Parent upon return from the method, we must downcast from Parent back to Child before assignment.

Not only does upcasting/downcasting return types muddy source code, there exists the possibility of type safety violations that result in a ClassCastException if we employ the wrong cast when downcasting.

J2SE 5.0 does away with this nonsense via covariant return types. Change the return type of Child's foo() method from Parent to Child and the upcasting and downcasting go away. The new Child class appears below:

class Child extends Parent
{
   Child foo ()
   {
      System.out.println ("Child foo() called");
      return this;
   }
}

We can now invoke Child c = new Child ().foo (); without the need to first upcast foo()'s this reference to Parent (via the Parent return type) and then downcast that reference back to Child (via the (Child) cast). The covariant return types language feature has enabled source code clarity and type safety.

Covariant return types are useful in the context of cloning. For example, the code fragment below illustrates how pre-J2SE 5.0 code overrode Object's clone() method.

public class SomeClass
{
   public Object clone ()
   {
      //  ...
   }

   public static void main (String [] args)
   {
      SomeClass sc = new SomeClass ();
      SomeClass clone = (SomeClass) sc.clone ();
   }
}

In the bad old days, it was necessary to keep clone()'s return type set to Object, and implement an appropriate downcast after invoking clone() and before assigning the cloned object reference to the appropriate variable. Once again, source code clarity and type safety were compromised.

Now take a look at the code fragment below:

public class SomeClass
{
   public SomeClass clone ()
   {
      //  ...
   }

   public static void main (String [] args)
   {
      SomeClass sc = new SomeClass ();
      SomeClass clone = sc.clone ();
   }
}

With J2SE 5.0, we can assign the appropriate SomeClass return type to clone() and eliminate the downcast when invoking that method.
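Because the clone() bodies above are elided, here is one way a complete covariant clone() might look. It is a sketch under assumptions: the Coordinates class and its fields are hypothetical, and the class relies on Object's clone() for a field-by-field copy. The downcast still exists, but it now lives in one place inside clone() instead of at every call site.

```java
// Hypothetical class, used only to show a complete covariant clone().
class Coordinates implements Cloneable
{
   private int x, y;

   Coordinates (int x, int y)
   {
      this.x = x;
      this.y = y;
   }

   int getX ()
   {
      return x;
   }

   int getY ()
   {
      return y;
   }

   // Covariant return type: Coordinates instead of Object.
   public Coordinates clone ()
   {
      try
      {
          return (Coordinates) super.clone ();
      }
      catch (CloneNotSupportedException e)
      {
          // Cannot happen: this class implements Cloneable.
          throw new AssertionError (e);
      }
   }

   public static void main (String [] args)
   {
      Coordinates original = new Coordinates (3, 4);
      Coordinates copy = original.clone ();   // no cast needed

      System.out.println (copy != original);                    // true
      System.out.println (copy.getX () + ", " + copy.getY ());  // 3, 4
   }
}
```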

Lesson: Use covariant return types to minimize upcasting and downcasting.

Lesson 6: Don't forget the superclass

Forgetting about the superclass while writing a subclass is a mistake commonly made by those new to Java's inheritance mechanism. This forgetfulness can lead to incorrectly initialized objects or objects whose methods don't work as intended. For example, let's consider a simple hierarchy that consists of Point and Circle classes, where Point is Circle's superclass. Below is the Point superclass:

public class Point
{
   private int x, y;

   public Point ()
   {
   }

   public Point (int x, int y)
   {
      this.x = x;
      this.y = y;
   }

   public int getX ()
   {
      return x;
   }

   public int getY ()
   {
      return y;
   }

   public String toString ()
   {
      return "Point: (" + x + ", " + y + ")";
   }
}

Point is a simple class that describes a single point. This class provides two constructors for creating an origin point (0,0) or a point at any other location (x,y). Let's extend Point to describe a circle. After all, a circle is just a fat point (a point with a radius), isn't it? The Circle class appears below:

public class Circle extends Point
{
   private int radius;

   public Circle (int x, int y, int radius)
   {
      this.radius = radius;
   }

   public int getRadius ()
   {
      return radius;
   }

   public String toString ()
   {
      return ": Circle (" + radius + ")";
   }
}

Circle is problematic. Can you spot what's wrong with that class? For starters, Circle's constructor doesn't explicitly invoke Point(int x, int y) to save its x and y arguments. Instead, Circle's constructor implicitly invokes Point's no-argument constructor, which does nothing. As a result, Circle objects have no center other than (0,0). Fix this problem by explicitly invoking the Point(int x, int y) constructor at the beginning of Circle's constructor: super (x, y);.

The second problem occurs in Circle's toString() method. That method is supposed to return a string describing the circle's center coordinates and radius. However, it returns a string containing only the value of the radius. It does not include the center information returned from Point's toString() method. Fix this problem by prepending a super.toString () method call to the returned string, as follows: return super.toString () + ": Circle (" + radius + ")";. Don't forget to include the super keyword, to invoke the superclass toString() method; otherwise you'll end up with a recursive loop that ultimately overflows the stack.
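With both fixes applied, Circle might look like the sketch below. Point is repeated in shortened form (without its no-argument constructor) only so that the fragment compiles on its own, and the CircleDemo class is a hypothetical driver.

```java
// Hypothetical driver for the corrected Circle class.
class CircleDemo
{
   public static void main (String [] args)
   {
      Circle c = new Circle (1, 2, 3);

      System.out.println (c.getX () + ", " + c.getY ());   // 1, 2
      System.out.println (c);   // Point: (1, 2): Circle (3)
   }
}

// Shortened copy of the Point class shown earlier.
class Point
{
   private int x, y;

   public Point (int x, int y)
   {
      this.x = x;
      this.y = y;
   }

   public int getX ()
   {
      return x;
   }

   public int getY ()
   {
      return y;
   }

   public String toString ()
   {
      return "Point: (" + x + ", " + y + ")";
   }
}

class Circle extends Point
{
   private int radius;

   public Circle (int x, int y, int radius)
   {
      super (x, y);   // fix 1: let Point save the center

      this.radius = radius;
   }

   public int getRadius ()
   {
      return radius;
   }

   public String toString ()
   {
      // fix 2: include the center reported by Point's toString()
      return super.toString () + ": Circle (" + radius + ")";
   }
}
```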

Although the example above is trivial, it nicely illustrates problems that can happen when you forget about the superclass. Let's examine a more practical example.

Several years ago, I was asked to build an AWT component to bounce text off of the sides of a rectangle. I began this task by subclassing the AWT's Canvas class and by implementing a constructor that, among other things, invoked one of Component's createImage() methods to return an Image, and invoked getGraphics() on that Image to return a graphics context. The idea behind those method calls: prepare for double buffering, where all drawing operations take place in the context of an offscreen buffer and the contents of the buffer are drawn on the screen in one operation, to eliminate screen flicker.

My constructor failed. A NullPointerException was thrown when the constructor attempted to invoke getGraphics() on the returned Image. Why? The createImage() method returns null if the component is not displayable (i.e., not connected to a native screen resource).

An examination of the Component class revealed an addNotify() method that makes a component displayable. Thinking that this method would solve my problem, I overrode addNotify() in my subclass, as follows:

public void addNotify ()
{
   buffer = createImage (width, height);
   context = buffer.getGraphics ();

   // Other code.
}

Guess what happened? The createImage() method still returned null, and the subsequent line of code threw a NullPointerException. It was not until I studied Component's addNotify() source code that I discovered the solution: that method takes care of making the component displayable and must be invoked prior to calling createImage(). To solve the problem, I only had to insert the super.addNotify() method call at the beginning of the overridden addNotify() method.

Lesson: Don't forget the superclass while writing a subclass.

Lesson 7: The compound assignment operator surprise

Our final lesson might surprise you: x += y; is not equivalent to x = x + y;. This is also true for the other compound assignment operators and their more lengthy kin. Consider the following code fragment:

int i = 0;
float f = 1.0f;

i = i + f;
i += f;

When the compiler encounters i = i + f;, it spits out a "possible loss of precision" error message. Adding a floating-point value to an integer value results in a floating-point sum. Converting that sum back to an integer causes the fractional part to disappear. To eliminate the "possible loss of precision" error message, you must cast the floating-point sum to an integer, as follows: i = (int) (i + f);. This tells the compiler that you know what you are doing.

But no cast is necessary with i += f;. The compiler doesn't issue an error message when it encounters that line. Why? Section 15.26.2 of the Java Language Specification (consult the Resources section for a link to the JLS) provides an answer:

   "A compound assignment expression of the form E1 op= E2 is equivalent to E1 = (T)((E1) op (E2)), where T is the type of E1, except that E1 is evaluated only once."

Thus the compiler transforms i += f; into i = (int) (i + f);.
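The hidden cast is easy to observe. In the sketch below, the CompoundCast class and its method names are hypothetical; compound() uses += while explicit() spells out the cast, and wrappedByte() shows the same rule silently narrowing an int sum back to a byte.

```java
// Hypothetical class that makes the implicit cast visible.
class CompoundCast
{
   static int compound (int i, float f)
   {
      i += f;   // compiles: the compiler inserts the (int) cast
      return i;
   }

   static int explicit (int i, float f)
   {
      i = (int) (i + f);   // the cast must be written by hand
      return i;
   }

   static byte wrappedByte ()
   {
      byte b = 120;
      b += 10;   // same as b = (byte) (b + 10); 130 does not fit in a byte
      return b;
   }

   public static void main (String [] args)
   {
      System.out.println (compound (0, 1.9f));   // 1
      System.out.println (explicit (0, 1.9f));   // 1
      System.out.println (wrappedByte ());       // -126
   }
}
```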

I believe op= and its longer E1 = E1 op E2 counterpart are not equivalent, because you never supply a cast with op= (the cast is internally generated by the compiler) and you must supply a cast with the longer form whenever there is a potential loss of precision. If you must supply a cast, you may be less likely to introduce certain coding errors because you are forced to think about what you are doing. But if you aren't required to supply a cast, there is a greater potential for introducing bugs. Consider the following code fragment:

int accum = 0;
for (int i = 0; i < 10000; i++)
     accum += Math.random ();

System.out.println (accum / 10000);

The code fragment above tests the quality of the random number generator used by Math.random(), which returns random numbers between 0.0 (inclusive) and 1.0 (exclusive). The idea is to average 10000 random numbers and see if the average lies around 0.5, which is what a well-behaved uniform generator should produce.

The code fragment outputs 0, which indicates a pathologically bad generator; a tired developer might come to this conclusion. Obviously, this is not the case. A close inspection of the fragment coupled with an understanding of += reveals the true problem--accum += Math.random (); never produces a value other than 0:

  • accum's value (0) converts from an integer to a floating-point equivalent.
  • Math.random()'s return value (between 0.0 and 1.0) adds to the floating-point equivalent.
  • The sum (that lies between 0.0 and 1.0) casts to integer 0 before storage.

These actions happen for every iteration. To overcome this problem, change accum's type to double.
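The two versions can be compared directly. In this sketch, the RandomAverage class and its method names are hypothetical; truncatedAverage() keeps the int accumulator and average() switches it to double.

```java
// Hypothetical class comparing the int and double accumulators.
class RandomAverage
{
   // Buggy version: each sum is cast back to int, so accum stays at 0.
   static int truncatedAverage (int samples)
   {
      int accum = 0;

      for (int i = 0; i < samples; i++)
           accum += Math.random ();

      return accum / samples;
   }

   // Fixed version: a double accumulator keeps the fractional parts.
   static double average (int samples)
   {
      double accum = 0.0;

      for (int i = 0; i < samples; i++)
           accum += Math.random ();

      return accum / samples;
   }

   public static void main (String [] args)
   {
      System.out.println (truncatedAverage (10000));   // 0
      System.out.println (average (10000));            // a value close to 0.5
   }
}
```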

Lesson: Remember that compound assignment operators automatically include cast operations in their behaviors.

Conclusion

I have learned many lessons while working with Java and shared some of them with you in this article. What lessons have you learned? I invite you to share them with me and your fellow readers in the talkbacks below.

I have some homework for you to accomplish:

  • Is it okay for the clone() method to invoke an overridable method?
  • Should interfaces be used to only export constants (i.e., no methods are part of such interfaces, only constants)? Although I didn't discuss constant interfaces in this article, think of it as another lesson that you should know.

Next time, Java Tech introduces you to the interactive Java environment known as BlueJ.

Resources

Answers to Previous Homework

The previous Java Tech article presented you with some challenging homework on JTwain and JTwainDemo. Let's revisit that homework and investigate solutions.

  1. The first part of this series presented a JTwainDemo application. Convert that application's source code to equivalent applet source code. Test your JTwainDemo applet using the appletviewer tool found in the J2SE 5.0 SDK. Don't worry about making this applet work within a real web browser.

    The homework requires a reorganization of JTwainDemo.java, to reflect an applet instead of an application. ImageArea.java requires no changes. The homework also requires a policy file that grants permissions to the underlying Java interpreter so that the applet can invoke the Java Native Interface from appletviewer. Consult the Resources section for these files.

    Complete the steps below to build and run the JTwainDemo applet:

    1. It is best to start with a directory that holds all files and subdirectories. Create a c:\JTwainDemo directory for this purpose.
    2. Copy the net directory and its subdirectories and files to c:\JTwainDemo. You will find a copy of these directories/files in this article's sample code file.
    3. Copy the ImageArea.java, JTwainDemo.html, JTwainDemo.java, and my.policy files to c:\JTwainDemo. You will find copies of these files in this article's sample code file.
    4. Copy the pre-built jtwain.dll file from the sample code file to your c:\windows or equivalent directory. This pre-built file is the DLL that you would have built in the first article of the TWAIN/SANE series.
    5. Assuming c:\JTwainDemo is the current directory, invoke javac JTwainDemo.java to compile the applet and the associated JTwain library source files.
    6. Invoke appletviewer -J-Djava.security.policy=my.policy JTwainDemo.html to run the applet. The -J command-line option is used to pass arguments to the underlying Java interpreter. The -Djava.security.policy=my.policy argument tells the interpreter to replace the default security policy with the contents of my.policy. If you look at these contents, you'll see that they grant all permissions to the applet. For this example, it is OK to grant all permissions. But be careful about doing this with other applets--you do not want to expose your machine to malicious code that someone else might have written.
    7. If all goes well, appletviewer will display the same GUI as revealed in the first installment of the TWAIN/SANE series.

Jeff Friesen is a freelance software developer and educator specializing in Java technology. Check out his site at javajeff.mb.ca.
