Erasure and Accidental Compile-Time Conflicts
One consequence of erasure is that code sometimes fails to compile because two declarations that look distinct in the source become identical once they are erased. You probably suspect that the following two classes will have an issue:
class ShoppingCart<T extends DVD>{
// ...
}
class ShoppingCart<T extends VideoTape>{
// ...
}
And the reason is pretty easy to discern: under erasure, these classes have the same name. (In fact, you'd probably have run into problems before you got as far as compiling, because both of these classes want to be stored in ShoppingCart.java.)
But there are more subtle name clashes, as well. If two methods erase to the same method, then you'll get a compile-time error. For example, the following code doesn't compile, either:
class TwoForOneSpecial<T extends Rentable, W extends Rentable> {
public void add(T newRentable) {
//...
}
public void add(W newRentable) {
//...
}
}
It fails for a pretty obvious reason: erasure maps both T and W to Rentable, so the two add methods end up with identical signatures, and the compiler rejects the class (the compiler currently states "add(T) and add(W) have the same erasure").
No such problem exists for the following class:
class GetAFreeVideoTape<T extends Rentable, W extends VideoTape> {
public void add(T anything) {
//...
}
public void add(W videotape) {
//...
}
}
GetAFreeVideoTape will compile because under erasure, it becomes:
class GetAFreeVideoTape {
public void add(Rentable anything) {
//...
}
public void add(VideoTape videotape) {
//...
}
}
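If you want to check this for yourself, reflection will show you the erased signatures. The following is a minimal sketch, not part of the original example: Rentable and VideoTape are empty stand-ins for the article's types, and ErasedSignatures is a hypothetical helper class that lists the runtime parameter type of each add method.

```java
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: reports what the add methods look like at runtime.
class ErasedSignatures {
    // Collects the erased parameter type of every declared add method.
    static Set<String> erasedAddParameters() {
        Set<String> names = new TreeSet<String>();
        for (Method m : GetAFreeVideoTape.class.getDeclaredMethods()) {
            if (m.getName().equals("add")) {
                names.add(m.getParameterTypes()[0].getSimpleName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // prints [Rentable, VideoTape]
        System.out.println(erasedAddParameters());
    }
}

// Empty stand-ins for the article's types (assumptions for this sketch).
interface Rentable {}
class VideoTape implements Rentable {}

class GetAFreeVideoTape<T extends Rentable, W extends VideoTape> {
    public void add(T anything) { /* ... */ }
    public void add(W videotape) { /* ... */ }
}
```

Each type parameter is replaced by its bound, so the two methods stay distinct. Change the bound of W to Rentable and the class stops compiling, just as TwoForOneSpecial does.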
Erasure and Static Variables
Page three of the generics specification (the June 23, 2003 draft) contains an interesting sentence:
The scope of a type parameter is all of the declared class, except any static members or initializers, but including the type parameter section itself.
This is intriguing -- it says that you can't use type parameters in static fields or methods. That is, code like the following isn't allowed:
public class Store<T> {
private static T STATIC_VARIABLE;
}
At first glance, this seems like a strange restriction to have. But it does make sense, and it's a consequence of erasure.
The heart of the problem is the fact that when you use generics, you're not defining new classes. Erasure maps parameterized types to raw types. So, for example, Vector<String> and Vector<Integer> both get erased to the same class (namely, Vector) and the code that accesses these instances casts the values that it fetches from them.
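A quick way to see that parameterized types don't define new classes is to look at a static field that doesn't mention the type parameter at all. The sketch below uses a hypothetical Counter<T> class (it is not from the specification or from the shopping cart example); every parameterization shares the one static counter because there is only one Counter class at runtime.

```java
// Hypothetical class, used only to illustrate the point.
class Counter<T> {
    // One field for the whole class, no matter how T is bound.
    static int created = 0;

    Counter() {
        created++;
    }
}

class SharedStaticDemo {
    // Creates one Counter<String> and one Counter<Integer> and
    // returns the value of the single shared counter.
    static int instancesCounted() {
        Counter.created = 0;
        new Counter<String>();
        new Counter<Integer>();
        return Counter.created;
    }

    // Both parameterizations are instances of the same runtime class.
    static boolean sameRuntimeClass() {
        Counter<String> strings = new Counter<String>();
        Counter<Integer> integers = new Counter<Integer>();
        return strings.getClass() == integers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(instancesCounted());  // prints 2
        System.out.println(sameRuntimeClass());  // prints true
    }
}
```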
Why is this problematic? Consider the following class (it won't compile, but pretend for a moment that it could):
public class GlobalEventQueue<T extends Event> {
// stores all events in a global linked list so that
// all events are serialized.
private static LinkedList<T> QUEUE = new LinkedList<T> ();
// ...
public static synchronized T removeEvent() {
return QUEUE.removeFirst();
}
public static synchronized void addEvent(T event) {
QUEUE.add(event);
}
}
GlobalEventQueue<T extends Event> uses a static LinkedList<T> to store events. But you can have many different parameterizations of GlobalEventQueue<T extends Event>, and other classes can use them. Suppose you did so, by creating the classes ScreenEventQueue and DiskEventQueue, each of which wraps its own parameterization.
public class ScreenEventQueue {
private GlobalEventQueue<ScreenEvent> _myQueue;
public ScreenEventQueue() {
_myQueue = new GlobalEventQueue<ScreenEvent>();
}
public ScreenEvent removeEvent() {
return _myQueue.removeEvent();
}
public void addEvent(ScreenEvent event) {
_myQueue.addEvent(event);
}
}
public class DiskEventQueue {
private GlobalEventQueue<DiskEvent> _myQueue;
public DiskEventQueue() {
_myQueue = new GlobalEventQueue<DiskEvent>();
}
public DiskEvent removeEvent() {
return _myQueue.removeEvent();
}
public void addEvent(DiskEvent event) {
_myQueue.addEvent(event);
}
}
What happens if you create an instance of ScreenEventQueue and DiskEventQueue? Well, at runtime, both of these insert and remove objects from the same instance of LinkedList. And they do so by casting the values that they remove. In fact, under erasure, ScreenEventQueue and DiskEventQueue become:
public class ScreenEventQueue {
private GlobalEventQueue _myQueue;
public ScreenEventQueue() {
_myQueue = new GlobalEventQueue();
}
public ScreenEvent removeEvent() {
return (ScreenEvent) _myQueue.removeEvent();
}
public void addEvent(ScreenEvent event) {
_myQueue.addEvent(event);
}
}
public class DiskEventQueue {
private GlobalEventQueue _myQueue;
public DiskEventQueue() {
_myQueue = new GlobalEventQueue();
}
public DiskEvent removeEvent() {
return (DiskEvent) _myQueue.removeEvent();
}
public void addEvent(DiskEvent event) {
_myQueue.addEvent(event);
}
}
Which means that we can insert an instance of DiskEvent into the static LinkedList and then try to cast it to ScreenEvent when we remove it. Unless some coordinating logic elsewhere keeps the two kinds of events apart, this is eventually going to happen, and at runtime the transformed code will throw a ClassCastException.
More generally, if we let static members be defined using type parameters, it's impossible to avoid the possibility of a ClassCastException being thrown at runtime. And so the specification carefully limits the scope of type parameters to rule out code like the example above.
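You can reproduce the failure with code that does compile, by writing out roughly what the forbidden class would amount to. The following is a sketch under stated assumptions: ErasedGlobalEventQueue is a hypothetical approximation in which the static list is declared as LinkedList<Event> and an unchecked cast stands in for the missing type information, and Event, ScreenEvent, and DiskEvent are empty stand-ins for the article's classes.

```java
import java.util.LinkedList;

// Empty stand-ins for the article's event types (assumptions).
class Event {}
class ScreenEvent extends Event {}
class DiskEvent extends Event {}

// Hypothetical approximation of GlobalEventQueue after erasure:
// a single list shared by every parameterization.
class ErasedGlobalEventQueue<T extends Event> {
    private static final LinkedList<Event> QUEUE = new LinkedList<Event>();

    public void addEvent(T event) {
        synchronized (QUEUE) {
            QUEUE.add(event);
        }
    }

    @SuppressWarnings("unchecked")
    public T removeEvent() {
        synchronized (QUEUE) {
            // Unchecked: at runtime this is only a cast to Event.
            return (T) QUEUE.removeFirst();
        }
    }
}

class SharedQueueDemo {
    // Adds a DiskEvent through one parameterization and removes it
    // through another; returns true if the removal fails with a
    // ClassCastException.
    static boolean mixingQueuesThrows() {
        ErasedGlobalEventQueue<DiskEvent> disk = new ErasedGlobalEventQueue<DiskEvent>();
        ErasedGlobalEventQueue<ScreenEvent> screen = new ErasedGlobalEventQueue<ScreenEvent>();
        disk.addEvent(new DiskEvent());
        try {
            // The compiler inserts a cast to ScreenEvent here.
            ScreenEvent event = screen.removeEvent();
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(mixingQueuesThrows());  // prints true
    }
}
```

Note where the exception is thrown: not inside the queue, where the unchecked cast is, but in the calling code, where the compiler inserted the real cast.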
Bridging
So far, we've only talked about erasure. However, the current implementation of generics uses another form of code transformation as well. This process, which is referred to as bridging, consists of inserting extra methods into classes. And, like erasure, bridging is motivated by backwards compatibility.
To understand bridging, let's extend our example and have our shopping cart sort the tapes before returning them. To do this, we need to define a Comparator. In the generics specification, Comparator has become the parameterized type Comparator<T>. But it's still an interface, and it has the same methods (they've just become more strongly typed). The interface has become:
public interface Comparator<T> {
int compare(T o1, T o2);
boolean equals(Object obj);
}
Here's an implementation of RentableComparator.
public class RentableComparator implements Comparator<Rentable>{
public int compare(Rentable rentable1, Rentable rentable2) {
if (null==rentable1) {
if (null==rentable2) {
return 0;
}
return -1;
}
if (null==rentable2) {
return 1;
}
return rentable1.getDisplayName().compareTo(rentable2.getDisplayName());
}
}
This looks pretty similar to the code you write today, and it's exactly what you want: it's a strongly typed comparator. At compile time, the compiler can check that you're using things correctly. Incorporating RentableComparator into ShoppingCart is easy. We just use our comparator to sort the list before returning an Iterator in the getContents method.
public Iterator<T> getContents() {
RentableComparator comparator = new RentableComparator();
Collections.sort(_contents, comparator);
return _contents.iterator();
}
Again, this looks exactly like the code you write today. And, if you're under deadline pressure, you might just write this code, check that it works, and move on.
But if you stop and think about backwards compatibility, this can get pretty confusing. Suppose, for example, you're also using a legacy library that puts objects into instances of Vector and then sorts them, as in the following code.
public void printSortedCollection(Collection collection, Comparator comparator) {
Vector vector = new Vector(collection);
Collections.sort(vector, comparator);
Iterator i = vector.iterator();
while (i.hasNext()) {
System.out.println(i.next());
}
}
The legacy library has to work, as well. When you pass in an instance of RentableComparator, the right thing will happen (the legacy library will sort the vector correctly). This happens in spite of the fact that your comparator implemented public int compare(Rentable rentable1, Rentable rentable2) and the legacy library was expecting public int compare(Object object1, Object object2).
If you're at all familiar with the way inner classes are implemented, you've already guessed the solution to this problem: the generics compiler actually inserts extra methods, called bridge methods, into the parametrically typed classes (or subclasses) to make sure that the legacy code works. In this case, the compiler will insert code that looks like the following into RentableComparator:
public int compare(Object obj, Object obj1) {
return compare((Rentable) obj, (Rentable) obj1);
}
This is very nice. With bridging, you get the benefits of static typing in all of your code. And you get backwards compatibility with all of the old libraries you are currently using (or might use).
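Bridge methods are visible through reflection, so you can confirm that the compiler really does add one. The following is a minimal sketch: NameComparator is a cut-down, hypothetical version of RentableComparator (no null handling), Rentable and Tape are stand-ins for the article's types, and BridgeDemo is a helper invented for this example. Method.isBridge() is the standard reflection call for spotting compiler-generated bridges.

```java
import java.lang.reflect.Method;
import java.util.Comparator;

// Stand-ins for the article's types (assumptions for this sketch).
interface Rentable {
    String getDisplayName();
}

class Tape implements Rentable {
    private final String name;
    Tape(String name) { this.name = name; }
    public String getDisplayName() { return name; }
}

// Cut-down version of RentableComparator, without the null checks.
class NameComparator implements Comparator<Rentable> {
    public int compare(Rentable a, Rentable b) {
        return a.getDisplayName().compareTo(b.getDisplayName());
    }
}

class BridgeDemo {
    // Counts the compiler-generated compare methods.
    static int bridgeCount() {
        int bridges = 0;
        for (Method m : NameComparator.class.getDeclaredMethods()) {
            if (m.getName().equals("compare") && m.isBridge()) {
                bridges++;
            }
        }
        return bridges;
    }

    // Calls the comparator the way legacy code would, through the raw
    // interface, so the call goes through compare(Object, Object).
    @SuppressWarnings({"unchecked", "rawtypes"})
    static int legacyCompare(Object first, Object second) {
        Comparator legacy = new NameComparator();
        return legacy.compare(first, second);
    }

    public static void main(String[] args) {
        System.out.println(bridgeCount());  // prints 1
        System.out.println(legacyCompare(new Tape("Alien"), new Tape("Alien")));  // prints 0
    }
}
```

The cast inside the bridge is a real one. If legacy code hands the comparator something that isn't a Rentable, such as a pair of strings, the bridge throws a ClassCastException.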
Final Thoughts
At this point, you're probably a little tired of learning about what the compiler is doing to your code behind the scenes. So in the final section of this column, I'm going to switch gears and talk about what the compiler can't do to your code (at least, as far as I've been able to puzzle it out). In particular, there are two things I wish it did, that it doesn't do.
The first thing on my wish list this Christmas is a typesafe equals method. I really hate code like the following (from the ItemCode class):
public boolean equals(Object object) {
if (!(object instanceof ItemCode)) {
return false;
}
int otherCode = ((ItemCode) object)._code;
return (_code == otherCode);
}
This is correct, concise, and perfectly reasonable. But the class check at the beginning smells bad to me. In most cases, class checks are in there for logical completeness, not because the developer expects the check to fail. I'd wager that in many cases the code should really be:
public boolean equals(Object object) {
if (!(object instanceof ItemCode)) {
// huh? It's not? Wow. I didn't expect that to happen at all.
// Oh well. I guess if it's a different class entirely,
// it's not equal. So returning false is the safe thing to do.
return false;
}
int otherCode = ((ItemCode) object)._code;
return (_code == otherCode);
}
and that, therefore, all the instanceof really does is obfuscate a logical error. I'll grant you that it might be better to return false than throw a ClassCastException on a production server, but it's not a good thing to do.
Another problem with equals arises from the fact that type parameters are erased. The issue is that, at runtime, we can only check the raw type and not the parameterized type. Put another way, the problem is that the return value in the following code snippet evaluates to true.
Vector<String> vector1 = new Vector<String>();
Vector<Vector<String>> vector2 = new Vector<Vector<String>>();
return vector1.getClass() == vector2.getClass();
The fact that two very different types (Vector<String> and Vector<Vector<String>>) have the same class makes relying on instanceof (or getClass) problematic.
Ideally, I'd like equals to somehow be a generic method, so that I could get rid of the class casts and take advantage of static typing. But I don't see any way to do it (and still take advantage of all of the code out there that calls equals). So I'm leaving this one open as a challenge -- can anyone see a way to have static typing on the arguments to equals and still preserve backwards compatibility?
Another place you still need to cast, and cast correctly, is inside of serialization code. If you use serialization to persist objects, you're either going to use default serialization (which is often unwise, for the reasons outlined in Java Enterprise Best Practices) or you're going to wind up writing code like:
FileOutputStream fos = new FileOutputStream(_persistenceFileName);
ObjectOutputStream oos = new ObjectOutputStream(fos);
oos.writeObject(settings);
oos.close();
// .....
FileInputStream fis = new FileInputStream(_persistenceFileName);
ObjectInputStream ois = new ObjectInputStream(fis);
Hashtable settings = (Hashtable)ois.readObject();
ois.close();
Wouldn't it be nice if the compiler could check this code too?
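To see how little the compiler can check here, consider what happens when the cast names a parameterized type. The sketch below is my own illustration, not code from any library: SerializationCastDemo is a hypothetical class that writes a Hashtable<String, Integer> to an in-memory stream and reads it back as a Hashtable<String, String>. The compiler can only issue an unchecked warning, the cast itself succeeds (at runtime it's just a cast to Hashtable), and the error shows up later, when a value is fetched.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Hashtable;

class SerializationCastDemo {
    // Serializes one object into a byte array.
    static byte[] write(Object object) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bytes);
        oos.writeObject(object);
        oos.close();
        return bytes.toByteArray();
    }

    // Returns true if reading the table back with the wrong type
    // arguments only fails when a value is actually used.
    @SuppressWarnings("unchecked")
    static boolean wrongParameterizationFailsLate() {
        Hashtable<String, Integer> settings = new Hashtable<String, Integer>();
        settings.put("timeout", Integer.valueOf(30));
        Hashtable<String, String> restored;
        try {
            ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(write(settings)));
            // Only the raw class Hashtable is checked by this cast.
            restored = (Hashtable<String, String>) ois.readObject();
            ois.close();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        try {
            // The compiler inserts a cast to String here.
            String timeout = restored.get("timeout");
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(wrongParameterizationFailsLate());  // prints true
    }
}
```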
And with that thought, I'll end this month's column. In next month's column, I'll talk about how inheritance interacts with generics and how wildcards work.