I'll try to simplify it.
There's an object model. In it there are cyclic references (one object references a second one, the second one - a third one, the third one - the first one).
Some of the cyclic references are through aggregations - one object has a map of other objects.
Some of the objects have a meaningful hashCode() and equals() overridden. These two depend on some properties in the object itself.
Some of the objects get serizalized/deserialized (travel through a stream).
Now here comes the problem - the deserialization first sees the cyclic reference, makes instances of all the objects, initializes all the primitive fileds, does not initialize the other fields, then links the objects.
Here comes the problem, linking two objects (one of which has a map of the other) requires hashCode(). This requires some specific properties in that object that are not initialized - this causes NullPointerException (or in my case an AssertionError).
If the hashCode returns a default value if the properties are not there - another serious problem si caused - there are objects in the map in the wrong buckets - they entered the map with the default hash, but when they got completely initialized - they now have a different hash. I think that is really bad - the map has to be rehashed.
Here's a bug detail:
Here's what some of the guys say on the subject:
The problem is that HashMap's readObject() implementation , in order to re-hash the map, invokes the hashCode() method of some of its keys, regardless of whether those keys have been fully deserialized.
The fix for this is actually quite easy: Modify the readObject() and writeObject() of HashMap so that it also saves the original hash code. (I am currently using this fix in production code for a large web site.) That way, when the map is reconstructed, you don't have to recompute the hashcode----the problem is caused by recomputing the hashcode at a moment when it is not computable.
What you *give up* with this fix is that HashMaps containing Objects that don't override hashCode() and equals() will not be deserialized properly.
So basically, you have a choice: either it will be robust for classes that implement hashCode(), or it will work for bare Objects(). One or the other. I prefer the former, because people are supposed to implement hashCode().
But, not all my object have a rewritten equals (of course I can check with reflection which ones do and which ones don't, but...). This would also mean that I'm using a customized collection.
There's another proposition - to hash the hashcode.
The fix for this is actually quite easy: Modify the readObject() and writeObject() of HashMap so that it also saves the original hash code. (I am currently using this fix in production code for a large web site.)
The hashcode is a primitive type, so it would get initialized first and the problem would be solved. This would mean to have an hashCode() and equals() which check which one is available - the cached hash or the properties - isn't that UGLY.
I'll investigate more.