Mihail Stoynov's blog!
Surrender your ego
Thursday, July 17, 2008
Serialization, cyclic references (via hashmaps) and overriding hashCode()
I'll try to simplify it.
There's an object model with cyclic references in it (one object references a second one, the second references a third, and the third references the first).
Some of the cyclic references go through aggregations: one object holds a map of other objects.
Some of the objects override hashCode() and equals() in a meaningful way. These two depend on some properties of the object itself.
Some of the objects get serialized/deserialized (they travel through a stream).
Now here comes the trouble: when deserialization sees the cyclic reference, it creates instances of all the objects and initializes their primitive fields, but it leaves the other (object-typed) fields unset until it links the objects together.
Linking two objects (one of which has a map of the other) requires hashCode(). That in turn requires some specific properties of the object that are not initialized yet, which causes a NullPointerException (or, in my case, an AssertionError).
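A minimal sketch of the failure, assuming a made-up two-class model: Owner, Item and CyclicHashDemo are hypothetical names used only for illustration, not the real object model. Whether it blows up depends on which object is the root of the stream. If the Item is the root, the nested HashMap is read back while the Item's name is still null.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: an Item points at its Owner, and the Owner keeps
// a map keyed by its Items, which closes the cycle.
class Owner implements Serializable {
    private static final long serialVersionUID = 1L;
    Map<Item, String> items = new HashMap<>();
}

class Item implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;   // object-typed field: still null while the nested map is read
    Owner owner;

    Item(String name, Owner owner) {
        this.name = name;
        this.owner = owner;
        owner.items.put(this, "payload");
    }

    @Override
    public int hashCode() {
        return name.hashCode(); // throws NullPointerException while name is null
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Item && ((Item) o).name.equals(name);
    }
}

class CyclicHashDemo {
    // Serializes root to a byte array and reads it straight back.
    static Object roundTrip(Object root) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    // Returns "ok", or the simple name of the exception thrown by the round trip.
    static String outcome(Object root) {
        try {
            roundTrip(root);
            return "ok";
        } catch (Exception e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        Owner owner = new Owner();
        Item item = new Item("first", owner);
        System.out.println("root = owner: " + outcome(owner)); // ok
        System.out.println("root = item:  " + outcome(item));  // NullPointerException
    }
}
```

With the Owner as the root, the Item is fully read before it is put back into the map, so nothing goes wrong. That is what makes this kind of bug easy to miss.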
If hashCode() returns a default value when the properties are not there, another serious problem is caused: there are objects in the map in the wrong buckets. They entered the map with the default hash, but once they are completely initialized they have a different hash. I think that is really bad, because the map has to be rehashed.
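The wrong-bucket effect can be shown without any serialization at all. This is a small sketch under assumptions: MutableKey and WrongBucketDemo are hypothetical names, and the fallback hash of 0 is just one possible "default value".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key whose hashCode() falls back to 0 while name is unset.
class MutableKey {
    String name;

    @Override
    public int hashCode() {
        return name == null ? 0 : name.hashCode();
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableKey && Objects.equals(((MutableKey) o).name, name);
    }
}

class WrongBucketDemo {
    // Inserts a half-initialized key, completes it, then looks it up again.
    static boolean foundAfterInit() {
        Map<MutableKey, String> map = new HashMap<>();
        MutableKey key = new MutableKey();
        map.put(key, "value");   // stored under the fallback hash 0
        key.name = "alice";      // key is now complete and its hash has changed
        return map.containsKey(key);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterInit()); // false
    }
}
```

The entry is still in the map (its size is 1 and iteration finds it), but a lookup by key misses it because the map remembers the hash the key had when it was inserted.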
Here's a bug detail:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4957674
Here's what some of the guys say on the subject:
The problem is that HashMap's readObject() implementation, in order to re-hash the map, invokes the hashCode() method of some of its keys, regardless of whether those keys have been fully deserialized.
And:
The fix for this is actually quite easy: Modify the readObject() and writeObject() of HashMap so that it also saves the original hash code. (I am currently using this fix in production code for a large web site.) That way, when the map is reconstructed, you don't have to recompute the hashcode----the problem is caused by recomputing the hashcode at a moment when it is not computable.
What you *give up* with this fix is that HashMaps containing Objects that don't override hashCode() and equals() will not be deserialized properly.
So basically, you have a choice: either it will be robust for classes that implement hashCode(), or it will work for bare Objects(). One or the other. I prefer the former, because people are supposed to implement hashCode().
But not all my objects override equals() (of course I could check with reflection which ones do and which ones don't, but...). This would also mean that I'm using a customized collection.
There's another proposition: to cache the hash code inside the object.
The hash code is a primitive type, so it would get initialized first and the problem would be solved. This would mean having a hashCode() and equals() that check which one is available, the cached hash or the properties. Isn't that UGLY?
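For reference, here is a rough sketch of the cached-hash idea in its simplest form. It assumes the identifying property never changes after construction, so hashCode() can always return the cached int instead of switching between the cache and the properties. CachedOwner, CachedItem and CachedHashDemo are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class CachedOwner implements Serializable {
    private static final long serialVersionUID = 1L;
    Map<CachedItem, String> items = new HashMap<>();
}

class CachedItem implements Serializable {
    private static final long serialVersionUID = 1L;
    int cachedHash;     // primitive: restored before any object-typed field is read
    String name;        // assumed immutable after construction
    CachedOwner owner;

    CachedItem(String name, CachedOwner owner) {
        this.name = name;
        this.cachedHash = name.hashCode();
        this.owner = owner;
        owner.items.put(this, "payload");
    }

    @Override
    public int hashCode() {
        return cachedHash; // never touches name, so it is safe mid-deserialization
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CachedItem)) return false;
        CachedItem other = (CachedItem) o;
        // Null-safe, because name may still be unset on a half-read object.
        return cachedHash == other.cachedHash && Objects.equals(name, other.name);
    }
}

class CachedHashDemo {
    // Round-trips the cycle with the item as the stream root and checks that
    // the copy can still be found in its owner's map.
    static boolean survivesRoundTrip() {
        try {
            CachedItem item = new CachedItem("first", new CachedOwner());
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(item);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                CachedItem copy = (CachedItem) in.readObject();
                return "first".equals(copy.name) && copy.owner.items.containsKey(copy);
            }
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(survivesRoundTrip()); // true
    }
}
```

The price is that the cached value has to be kept in sync by hand if the property can ever change, and equals() still has to tolerate half-initialized objects.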
I'll investigate more.
Copyright policy
No rights reserved. (You are going to copy stuff anyway :) If you mention my name, thank you.
2008, Mihail Stoynov