
Return-Type-Based Method Overloading in Java

July 31, 2008

It is an accepted fact that Java does not support return-type-based
method overloading. This means that a class cannot have two methods
that differ only by return type: you can't have int doXyz(int x) and
double doXyz(int x) in the same class. And indeed, the Java compiler
duly rejects any such attempt. But recently I discovered a way to do
this, which I wish to share with all. Along the way, we will also
explore some rudiments of Java bytecode programming.
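
To see the compiler's refusal without leaving Java, here is a minimal sketch that feeds such a class to javac through the standard javax.tools API. The class names ReturnTypeClashDemo and Clash are made up for this illustration, and the sketch assumes it runs on a full JDK rather than a bare JRE.

```java
import java.net.URI;
import java.util.Collections;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

class ReturnTypeClashDemo {
    // Hypothetical source: two methods that differ only in return type.
    static final String SOURCE =
          "class Clash {\n"
        + "    int doXyz(int x) { return x; }\n"
        + "    double doXyz(int x) { return x; }\n"
        + "}\n";

    // Compiles SOURCE from memory; returns true only if javac accepts it.
    static boolean compiles() {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            // Assumption of this sketch: a JDK is available at run time.
            throw new IllegalStateException("No system Java compiler; run on a JDK");
        }
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///Clash.java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return SOURCE;
            }
        };
        // Collect diagnostics instead of printing them to stderr.
        DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
        return compiler.getTask(null, null, diagnostics, null, null,
                Collections.singletonList(file)).call();
    }

    public static void main(String[] args) {
        System.out.println("Accepted by javac: " + compiles());
    }
}
```

The check only reports whether compilation succeeded; the exact wording of the compiler's error message varies between JDK versions, so the sketch does not depend on it.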

Required Tools

The tools that we need for this exploration are rather simple:
the JDK, a Java assembler, a text editor, and a bytecode
engineering library. We will use the Jasmin Java assembler
(http://jasmin.sourceforge.net/) and the ASM bytecode manipulation
framework.

Basics of Method Invocation

Let us review the prominent features of method invocation in
the JVM. Whenever a method is invoked, a new frame is created
on the execution stack. Each frame has a local variable array and
an operand stack. When the frame is created, the operand stack is
empty and the local variable array is populated with the target
object this (in case of instance methods) and the
method's arguments. All the processing occurs on the operand stack.
The maximum number of local variables and stack slots that will be
used during the method invocation at any given moment must be known
at compile time.

To invoke a method on an object, the object reference (in the
case of instance methods) and then the method arguments in the
proper sequence must be loaded on the operand stack. The
method should then be invoked using an appropriate invoke instruction.
There are four invoke instructions: invokevirtual,
invokestatic, invokeinterface, and
invokespecial (Java 7 later added a fifth, invokedynamic, which we
do not need here). The different instructions correspond
to different method types. The invokevirtual instruction is used to
invoke instance methods; invokestatic for static
methods; invokeinterface for interface methods; and
invokespecial for constructors, private methods of the
present class, and instance methods of the superclass.

To return a value, be it a primitive value or a reference, the
value must be loaded on the operand stack and the appropriate
return instruction should then be executed. The instruction return returns
void; areturn returns a reference;
dreturn, freturn, and lreturn
return a double, float, and
long respectively; and finally, ireturn is
used to return an int, a short,
a char, a byte, or a boolean.
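
As a rough illustration of the instructions just described, the following sketch annotates ordinary Java calls with the instruction a typical javac emits for each. The types Greeter, PoliteGreeter, and InvokeKindsDemo are invented for this example; the comments can be checked by compiling the file and disassembling it with javap -c.

```java
interface Greeter {
    String greet(String name);
}

class PoliteGreeter implements Greeter {
    public String greet(String name) {
        return "Hello, " + name;          // returns a reference: areturn
    }

    int length(String s) {
        return s.length();                // returns an int: ireturn
    }

    static long twice(long v) {
        return 2 * v;                     // returns a long: lreturn
    }
}

class InvokeKindsDemo {
    static String run() {
        PoliteGreeter concrete = new PoliteGreeter(); // constructor: invokespecial <init>
        Greeter viaInterface = concrete;
        String a = viaInterface.greet("JVM");         // interface-typed receiver: invokeinterface
        int b = concrete.length(a);                   // class-typed receiver: invokevirtual
        long c = PoliteGreeter.twice(b);              // static method: invokestatic
        return a + " " + b + " " + c;
    }

    public static void main(String[] args) {
        System.out.println(run());                    // println itself: invokevirtual
    }
}
```

Note that the instruction is chosen from the static type of the receiver: the same greet method is reached through invokeinterface when called on a Greeter reference and through invokevirtual when called on a PoliteGreeter reference.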

Basics of Bytecode Programming

Let us now start with programming. We will create a Hello World
program in Java and its equivalent Java assembly code in Jasmin.
Then, by comparing the two, we can pick up the rudiments of Java
bytecode programming. Here goes the code:

[prettify]
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello World");
    }
}
[/prettify]

Let us write the Java assembly code equivalent to the above Java
program in Jasmin:

[prettify]
; Filename : HelloWorld.j
; The semicolon comments out this line.
.class public HelloWorld2
.super java/lang/Object

.method public <init>()V
    .limit stack 1
    aload_0
    invokespecial java/lang/Object/<init>()V
    return
.end method

.method public static main([Ljava/lang/String;)V
    .limit stack 2
    getstatic java/lang/System/out Ljava/io/PrintStream;
    ldc "Hello Bytecode World"
    ; invokevirtual and its argument must stay on one line to assemble.
    invokevirtual java/io/PrintStream/println(Ljava/lang/String;)V
    return
.end method
[/prettify]

To compile the HelloWorld.j file, run

java -jar jasmin.jar HelloWorld.j

to get a HelloWorld2.class file. This class file can be run as usual.

Now let us go through the assembly code and compare it with the
Java code where necessary. In bytecode, we do not have the luxury
of import statements. We must specify fully qualified class names,
and in a different notation: the forward slash (/) is the package
delimiter instead of the usual period (.). Hence, instead of
Object, use java/lang/Object. This representation is called the
internal form of the class name. The .class, .super, and
.end method directives are pretty self-explanatory. Let us take a
look at the .method directive. The public and static keywords are
the modifiers of the method. The last token of the .method
directive is the method name and the method descriptor concatenated
into one token. A constructor is always named <init>, and in
bytecode we must always specify the constructor explicitly.

The method descriptors deserve a special elaboration here, since
our attempt to do return-type-based overloading hinges on method
descriptors. Method descriptors are composed of type descriptors of
parameters and return value. The type descriptors of various Java
data types are listed in Table 1.1.

Table 1.1: Type Descriptors
Type                    Type Descriptor
byte                    B
char                    C
double                  D
float                   F
int                     I
long                    J
short                   S
boolean                 Z
void                    V
One array dimension     [
An instance of class    L<classname>;

The descriptors of primitive types as well as void are pretty
simple. However, descriptors of reference types may need further
elaboration. Hence, Table 1.2 lists descriptors for some sample
reference types.

Table 1.2: Reference Type Descriptors
Type        Type Descriptor
String      Ljava/lang/String;
byte[]      [B
Object[]    [Ljava/lang/Object;
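
The rules behind these two tables are mechanical enough to write down in a few lines. The following is a hand-rolled sketch, not a JDK API; the class and method names (TypeDescriptors, descriptorOf) are chosen only for this illustration.

```java
class TypeDescriptors {
    // Maps a Class object to its type descriptor, following Table 1.1.
    static String descriptorOf(Class<?> type) {
        if (type.isArray()) {
            // One '[' per array dimension, then the element type's descriptor.
            return "[" + descriptorOf(type.getComponentType());
        }
        if (type.isPrimitive()) {
            if (type == byte.class)    return "B";
            if (type == char.class)    return "C";
            if (type == double.class)  return "D";
            if (type == float.class)   return "F";
            if (type == int.class)     return "I";
            if (type == long.class)    return "J";
            if (type == short.class)   return "S";
            if (type == boolean.class) return "Z";
            return "V"; // void.class is the only primitive Class left
        }
        // Reference type: L + class name in internal form + ;
        return "L" + type.getName().replace('.', '/') + ";";
    }

    public static void main(String[] args) {
        System.out.println(descriptorOf(String.class));
        System.out.println(descriptorOf(byte[].class));
        System.out.println(descriptorOf(Object[].class));
    }
}
```

Run as is, the three calls in main reproduce the three rows of Table 1.2.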

To form a method descriptor, the type descriptors of the parameters
are concatenated, without any separators, inside a pair of
parentheses, followed by the type descriptor of the return type. In a
class file, the combination of method name and method descriptor must
be unique for every method. Table 1.3 lists some sample method
descriptors.

Table 1.3: Method Descriptors
Method                                    Method Descriptor
void method()                             ()V
byte[][] method()                         ()[[B
String method(double x)                   (D)Ljava/lang/String;
void method(int a, byte b, String[] s)    (IB[Ljava/lang/String;)V
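
On Java 7 and later, the JDK can produce these descriptor strings for us through java.lang.invoke.MethodType. The short sketch below regenerates the rows of Table 1.3; the wrapper class MethodDescriptors and its descriptor helper are names invented for this example.

```java
import java.lang.invoke.MethodType;

class MethodDescriptors {
    // Builds the descriptor for a method with the given return and parameter types.
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes)
                         .toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(descriptor(void.class));                 // void method()
        System.out.println(descriptor(byte[][].class));             // byte[][] method()
        System.out.println(descriptor(String.class, double.class)); // String method(double x)
        System.out.println(descriptor(void.class,
                int.class, byte.class, String[].class));            // void method(int, byte, String[])
    }
}
```

Notice that the return type is part of the string, which is exactly what the rest of this article relies on.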

Coming back to the assembly code, inside the <init>
method, the .limit stack 1
directive declares that the maximum number of stack slots used in
this method at any given time is 1. aload_0 loads the
value at index 0 in the local variable array onto the operand stack.
This value is nothing but the reference to the target object
this. On this reference, the
invokespecial instruction invokes the no-argument
constructor of the Object class, the superclass of the
present class. And finally, the return instruction
returns from the constructor.

Now let us see what's new inside the main method.
The getstatic instruction is used to load a static
field of a class onto the operand stack. For this, the field name
in internal form and the type descriptor of the field must be
specified. Here, it loads the out static variable of
the java.lang.System class, the type descriptor of
out being Ljava/io/PrintStream;. The
ldc instruction is used to load constants onto the
operand stack. Here, it loads the reference of the
"Hello Bytecode World" string object. And finally, the
invokevirtual instruction invokes the
println method on the out object. Note
that while invoking a method, a full method descriptor must be
specified. This sequence of three instructions stands for the
System.out.println("Hello Bytecode World"); statement
in Java.

Implementing Return-Type-Based Method Overloading

As noted in the last section, a full method descriptor must be
specified when invoking a method, and a class file only requires
each method's name and descriptor, taken together, to be unique. So
why can't we overload a method based on return type? In Java, we call
a method by its name and arguments, not by its return type or method
descriptor. While calling a method, the return type does not play
any part in deciding which overloaded method should be called; in
fact, there's no syntactic need to do anything with the return
value at all. So there would be no way to distinguish which method
we mean to call if return-type-based method overloading were
allowed. But there is no such limitation for bytecode. The method
descriptor is capable of distinguishing two methods on the basis of
their return types, even if their parameters are the same. To
achieve our objective, we must bypass the Java compiler and use an
assembler instead. Let us see how to do this.

Following is the assembly code for a class named
Overloaded, containing two instance methods: void returnDifferent()
and String returnDifferent(). The void returnDifferent() method
prints Returning Void and returns nothing, whereas
String returnDifferent() prints nothing and returns a
String, the hardcoded value Returning String.

 [prettify]
;Overloaded.j 
.class public Overloaded
.super java/lang/Object

.method public <init>()V
    .limit stack 1
    aload_0
    invokespecial java/lang/Object/<init>()V
    return
.end method

.method public returnDifferent()V
    .limit stack 2
    getstatic java/lang/System/out Ljava/io/PrintStream;
    ldc "Returning Void"
    invokevirtual java/io/PrintStream/println(Ljava/lang/String;)V
    return
.end method

.method public returnDifferent()Ljava/lang/String;
    .limit stack 1
    ldc "Returning String"
    areturn         ; returns a reference
.end method
[/prettify]

How to Invoke the Overloaded Methods

We have a class file that supposedly contains two
methods overloaded on the basis of return type. But how do we verify
it? How do we call those methods?

The Java class file format contains a methods table. Each entry
in this table is a structure containing a complete description of a
method in the class or interface. In the case of
Overloaded.class, there will be two methods named
returnDifferent. So, if we were to use a statement like
returnDifferent(); in Java code, the Java compiler
would look up the table and encode a call to the first method
having the required name and parameters. We would always end up
with a call to one specific method. In my experience, it is always
the first method in the assembly code that gets called. Are we
stuck with methods that we cannot use? Fortunately, reflection
comes to our rescue here. The following code invokes these methods
using reflection.

[prettify]
import java.lang.reflect.Method;

public class CallOverloadedMethods {
  public static void main(String[] args) throws Exception {
    Overloaded oc = new Overloaded();
    // Both overloads appear among the declared methods;
    // we tell them apart by their return type.
    for (Method m : Overloaded.class.getDeclaredMethods()) {
      if (!m.getName().equals("returnDifferent")) {
        continue;
      }
      if (m.getReturnType() == void.class) {
        m.invoke(oc);                      // prints "Returning Void"
      } else if (m.getReturnType() == String.class) {
        System.out.println(m.invoke(oc));  // prints "Returning String"
      }
    }
  }
}
[/prettify]

This code iterates over all the declared methods of the
Overloaded class and looks for methods named
returnDifferent. It assumes that all the
returnDifferent methods have empty parameter lists. It
checks only the return type of each returnDifferent method
and then uses the method in an appropriate way.
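
The same return-type check can be packaged as a small lookup helper. Since Overloaded.class has to be assembled with Jasmin first, the sketch below exercises the helper on a made-up Shape/Circle pair instead: when a method is overridden with a more specific (covariant) return type, javac itself emits a synthetic bridge method, so the Circle class file also carries two describe() methods that differ only in return type. All names here (Shape, Circle, ReturnTypeLookup, find) are hypothetical.

```java
import java.lang.reflect.Method;

class Shape {
    Object describe() { return "shape"; }
}

class Circle extends Shape {
    // Covariant return type: javac also generates a bridge "Object describe()".
    @Override
    String describe() { return "circle"; }
}

class ReturnTypeLookup {
    // Returns the declared no-argument method with the given name and
    // exact return type, or null if the class declares no such method.
    static Method find(Class<?> owner, String name, Class<?> returnType) {
        for (Method m : owner.getDeclaredMethods()) {
            if (m.getName().equals(name)
                    && m.getParameterCount() == 0
                    && m.getReturnType() == returnType) {
                return m;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(find(Circle.class, "describe", String.class));
        System.out.println(find(Circle.class, "describe", Object.class));
    }
}
```

Iterating is deliberate: Class.getDeclaredMethod(name) takes no return type, and its documentation says that when several candidates match and none has a more specific return type than the others, one of them is chosen arbitrarily. With Overloaded.class on the classpath, find(Overloaded.class, "returnDifferent", void.class) should pick out the void variant in the same way.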

Compile this class and run. Voila. It runs perfectly, giving the
expected output. We have implemented return-type-based method
overloading in Java.

Is this Practically Useful?

Although we have been able to pull this off, you may be
wondering if it is of any practical use. After all, we cannot call
those overloaded methods without resorting to reflection. So, what
is the value?

It turns out that there is a useful application. Suppose that a
class is required to implement two interfaces that have methods
with identical names and argument lists, differing only in return
type. Using normal Java code, we cannot have a class that
implements both interfaces. But using the technique described
above, we can have such a class. Moreover, we do not need
reflection to use those methods. This is because when we call a
method on an interface reference, the relevant method's descriptor,
as specified in the interface class file, is automatically used to
make the call. Let us see a concrete example. Consider two
interfaces as follows:

[prettify]
interface Interface1 {
    void doSomething();
}

interface Interface2 {
    String doSomething();
}
[/prettify]

To implement these two interfaces, we can write assembly code
as follows:

[prettify]
;ImplementBoth.j
.class public ImplementBoth
.super java/lang/Object
.implements Interface1
.implements Interface2

.method public <init>()V
    .limit stack 1
    aload_0
    invokespecial java/lang/Object/<init>()V
    return
.end method

.method public doSomething()Ljava/lang/String;
    .limit stack 1
    ldc "Hello from STRING"
    areturn
.end method

.method public doSomething()V
    .limit stack 2
    getstatic java/lang/System/out Ljava/io/PrintStream;
    ldc "Hello from VOID"
    invokevirtual java/io/PrintStream/println(Ljava/lang/String;)V
    return
.end method 
[/prettify]

Now we can access both of these methods using normal Java code,
as demonstrated below:

 [prettify]
public class UsingImplementBoth {
    public static void main(String[] args) {
        ImplementBoth ib = new ImplementBoth();
        ((Interface1)ib).doSomething(); 
        System.out.println(((Interface2)ib).doSomething());
    }
}
[/prettify]

Further Enhancements to this Technique

This technique has now proved to be useful. But we still need to
code in Java assembly to implement it, which is quite
troublesome, especially with complex logic. Can we find some way
to use this technique and still be able to code in Java rather than
assembly? Certainly. Bytecode engineering tools allow us to
add or remove a method or field, or to change the class attributes
of a compiled Java class. So we can have our class implement one of
the interfaces and we can write the code for the other interface's
method in some other method. Our Java code would then look like
this:

[prettify]
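public class BetterTechnique implements Interface1 {
    // Interface methods are implicitly public, so the
    // implementation must be declared public as well.
    public void doSomething() {
        // -- Complex code --
    }

    String delegatedDoSomething() {
        // -- Complex code that computes the result --
        String someStringValue = "";  // placeholder for the computed value
        return someStringValue;
    }
}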
[/prettify]

Now we compile this code to get the BetterTechnique.class
file. Then, using bytecode engineering tools, we mark this class as
implementing the Interface2 interface and add the
method String doSomething(). This method will need to
be coded in assembly, but we just need to call
delegatedDoSomething() from that method and return the
result. So it's not a big deal. See the Resources section for the
sample code, which includes all the necessary files for this
example, in the EnhancedTechnique folder. For step-by-step
instructions about compilation and modification, please go through
the ReadMe.txt file.

To simplify the use of this technique, it is possible to develop
an annotation for a class to indicate which method is to be
overloaded, which is a delegate method, and which additional
interface should be implemented by the class. Then a tool can
inspect the class file reflectively, and if it encounters the said
annotation, it can automatically transform the class file
accordingly. An interested reader may explore these
possibilities.
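
As a starting point for such an exploration, here is one possible shape for the annotation and for the reflective inspection step. Everything in it is hypothetical: the annotation ReturnTypeOverload, its elements, and the OverloadPlanner class are invented for this sketch, and the planner only describes the transformation a bytecode tool would have to perform; it does not rewrite any class file.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical annotation; not part of any existing library.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ReturnTypeOverload {
    String method();                 // name of the method to overload
    String delegate();               // method that holds the real logic
    Class<?> additionalInterface();  // interface to add to the class
}

interface Interface1 { void doSomething(); }
interface Interface2 { String doSomething(); }

@ReturnTypeOverload(method = "doSomething",
                    delegate = "delegatedDoSomething",
                    additionalInterface = Interface2.class)
class AnnotatedTechnique implements Interface1 {
    public void doSomething() { }
    String delegatedDoSomething() { return "delegated result"; }
}

class OverloadPlanner {
    // Reads the annotation reflectively and reports what a
    // class-file transformer would need to do for this class.
    static String plan(Class<?> target) {
        ReturnTypeOverload spec = target.getAnnotation(ReturnTypeOverload.class);
        if (spec == null) {
            return target.getSimpleName() + ": nothing to do";
        }
        return target.getSimpleName()
                + ": implement " + spec.additionalInterface().getSimpleName()
                + ", add " + spec.method() + "() delegating to "
                + spec.delegate() + "()";
    }

    public static void main(String[] args) {
        System.out.println(plan(AnnotatedTechnique.class));
    }
}
```

A real tool would replace the string returned by plan with calls into a bytecode library that add the interface and generate the delegating method.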

Conclusion

We have demonstrated that it is possible to overload Java
methods based solely on their return types. However, whether this
undocumented feature of Java is a deliberate choice or an accident
can only be clarified by the more knowledgeable people here. I
guess there may be some internal use of this feature within the JVM,
or else why would it be put in? Anyway, let us look forward to
interesting and enlightening discussions.

Resources


Vinit Joglekar is a System Engineer at Nihilent Technologies, Pune.