Tinder

Have JID implement Serializable instead of Externalizable

Details

  • Type: Sub-task Sub-task
  • Status: Resolved Resolved
  • Priority: Major Major
  • Resolution: Fixed
  • Affects Version/s: None
  • Fix Version/s: 1.1 - Concurrency
  • Component/s: None
  • Labels:
    None
  • Acceptance Test - Add?:
    No

Description

Currently, JID implements Externalizable. This interface defines two methods that are used for serialization of objects. The implementation of those methods assign a deserialized values to instance fields. This prevents those field from being defined 'final', which in turn prevents the class from being defined immutable.

We should consider dropping the Externalizable interface.

Activity

Hide
Guus der Kinderen added a comment -

The Externalizable interface implementation could be moved into a structure as described in Josh Bloch's Effective Java, Second Edition as the Serialization Proxy Pattern. This would allow us to retain the way serialization is performed (through the pluggable ExternalizableUtil implementation, which introduces a performance boost in comparison with regular serialization through reflection), without the need for the methods defined by the Externalizable interface in JID.

http://lingpipe-blog.com/2009/08/10/serializing-immutable-singletons-serialization-proxy/

Show
Guus der Kinderen added a comment - The Externalizable interface implementation could be moved into a structure as described in Josh Bloch's Effective Java, Second Edition as the Serialization Proxy Pattern. This would allow us to retain the way serialization is performed (through the pluggable ExternalizableUtil implementation, which introduces a performance boost in comparison with regular serialization through reflection), without the need for the methods defined by the Externalizable interface in JID. http://lingpipe-blog.com/2009/08/10/serializing-immutable-singletons-serialization-proxy/
Hide
Guenther Niess added a comment -

I'm not an expert of this topic, but in my opinion it looks good. I miss a serialVersionUID and had some whitespace problems on applying your patch, but this should be simply fixed.

Show
Guenther Niess added a comment - I'm not an expert of this topic, but in my opinion it looks good. I miss a serialVersionUID and had some whitespace problems on applying your patch, but this should be simply fixed.
Hide
Guus der Kinderen added a comment -

Sadly, de-serialization of instances that were serialized using the 'old' implementation would fail.

After discussing this with Gato, I'm abandoning my effort to make the fields of JID final:

Gaston: yo
Guus: eyo
Gaston: I had no idea about Serialization Proxy Pattern
Gaston: weird workaround from Java
Gaston: the only way to be sure that Coherence will be happy with this code is to test it
Gaston: specially considering that proxy is private
Gaston: so the goal of that patch is ....
Gaston: more performance?
Guus: no - making the class immutable
Guus: therefor concurrency-safe
Guus: but, I've found a potential problem
Gaston: if we only allow to set things from the constructor
Gaston: and there are no init or setters
Gaston: wouldn't that produce an immutable class for people using it?
Guus: generally, I would prefer not to have to worry about that and declare the members final
Guus: that's always safe
Guus: the problem that I found:
Guus: if you serialize an object using the old implementation, you cannot restore it using the new implementation
Guus: which I think is a bit of a show-stopper, right?
Gaston: hmm, don't think so
Gaston: 1) not many people are using coherence code
Gaston: 2) if you upgrade I would say taht all nodes will go down
Gaston: so new JIDs will be serialized in the new format
Gaston: so deserialization should work
Gaston: or am I missing your point?
Guus: that's true
Guus: no
Guus: I was wondering about people that would have persisted serialized JIDs
Gaston: we are not storing serialized jids in the db
Gaston: so all things are temporary
Guus: but I'm not sure if anyone does that
Gaston: hmm, maybe fastpath has something
Gaston: not sure
Gaston: I think it is in xml format
Gaston: not serialized format
Gaston: yes, xml format almost sure
Guus: hmm
Guus: that'd be good...
Gaston: but potentially you are right
Gaston: someone may have stored jids in the db
Gaston: but that has to be done with their own extension
Gaston: I would say that JID in practice is already immutable
Gaston: so not sure about the need for the patch
Gaston: there are not setters
Gaston: and init is private
Gaston: so the API says that you have to create a jid to set things
Gaston: I vote to leave it the way it is now
Guus: yeah, all things considered, I'm wondering about the cost/benefit
Gaston: agreed

Show
Guus der Kinderen added a comment - Sadly, de-serialization of instances that were serialized using the 'old' implementation would fail. After discussing this with Gato, I'm abandoning my effort to make the fields of JID final:
Gaston: yo Guus: eyo Gaston: I had no idea about Serialization Proxy Pattern Gaston: weird workaround from Java Gaston: the only way to be sure that Coherence will be happy with this code is to test it Gaston: specially considering that proxy is private Gaston: so the goal of that patch is .... Gaston: more performance? Guus: no - making the class immutable Guus: therefor concurrency-safe Guus: but, I've found a potential problem Gaston: if we only allow to set things from the constructor Gaston: and there are no init or setters Gaston: wouldn't that produce an immutable class for people using it? Guus: generally, I would prefer not to have to worry about that and declare the members final Guus: that's always safe Guus: the problem that I found: Guus: if you serialize an object using the old implementation, you cannot restore it using the new implementation Guus: which I think is a bit of a show-stopper, right? Gaston: hmm, don't think so Gaston: 1) not many people are using coherence code Gaston: 2) if you upgrade I would say taht all nodes will go down Gaston: so new JIDs will be serialized in the new format Gaston: so deserialization should work Gaston: or am I missing your point? Guus: that's true Guus: no Guus: I was wondering about people that would have persisted serialized JIDs Gaston: we are not storing serialized jids in the db Gaston: so all things are temporary Guus: but I'm not sure if anyone does that Gaston: hmm, maybe fastpath has something Gaston: not sure Gaston: I think it is in xml format Gaston: not serialized format Gaston: yes, xml format almost sure Guus: hmm Guus: that'd be good... Gaston: but potentially you are right Gaston: someone may have stored jids in the db Gaston: but that has to be done with their own extension Gaston: I would say that JID in practice is already immutable Gaston: so not sure about the need for the patch Gaston: there are not setters Gaston: and init is private Gaston: so the API says that you have to create a jid to set things Gaston: I vote to leave it the way it is now Guus: yeah, all things considered, I'm wondering about the cost/benefit Gaston: agreed
Hide
Guus der Kinderen added a comment -

This morning I realized that it is very unlikely that serialized instances of JID have been persisted in systems anywhere. Out of the box, JID is not serializable at all (TINDER-14).

Only users that bought Jive's Coherence-based clustering support, or users that implemented a custom ExternalizableUtilStrategy would have the opportunity to persist JID instances. The likelyhood that this has indeed been done, is slim.

Coherence clusters already have an upgrading strategy, requiring that all cluster nodes are shut down before Openfire instances are updated to another version. This will prevent any serialization problem from popping up during cluster upgrades.

Show
Guus der Kinderen added a comment - This morning I realized that it is very unlikely that serialized instances of JID have been persisted in systems anywhere. Out of the box, JID is not serializable at all (TINDER-14). Only users that bought Jive's Coherence-based clustering support, or users that implemented a custom ExternalizableUtilStrategy would have the opportunity to persist JID instances. The likelyhood that this has indeed been done, is slim. Coherence clusters already have an upgrading strategy, requiring that all cluster nodes are shut down before Openfire instances are updated to another version. This will prevent any serialization problem from popping up during cluster upgrades.
Hide
Kees Jan Koster added a comment -

Weighing down the open source JID implementation for the sake of a small performance gain in a commercial product is fairly hard to understand for me. If Coherence can gain performance by using an alternative serialization strategy, it can do so without that class even being serializable in the first place. This places the burden of extra work not on the entire JID-using community (who don't benefit from that) but on the one product that does gain performance from it.

Show
Kees Jan Koster added a comment - Weighing down the open source JID implementation for the sake of a small performance gain in a commercial product is fairly hard to understand for me. If Coherence can gain performance by using an alternative serialization strategy, it can do so without that class even being serializable in the first place. This places the burden of extra work not on the entire JID-using community (who don't benefit from that) but on the one product that does gain performance from it.
Hide
Guus der Kinderen added a comment -

I've just talked to some guys that run an Openfire cluster. At first glance, roughly 1% of the available CPU time appears to be spent on performing serializing/deserializing data. Note that this is a very heavy used system, 4 node cluster, lots of custom plugins. I'm beginning to wonder if spending time on improving serialization (or keeping existing work-arounds) in place, is hardly worth the effort. I'm wondering if we wouldn't benefit more from code that's easier to maintain. I was thinking of it this way: the pluggable serialization strategy is quite complex. The complexer the code, the more people are "scared away" from maintaining it. ("eek, let's not touch that, maybe I break something"). additionally, complexer code tends to limit extensibility/adaptability. Finally: hardly anyone uses it.

I think we should seriously consider dropping the pluggable serialization stuff, and any hacks like that. If it doesn't offer you more than a 10-20% boost, we probably shouldn't even consider adding complexity like that.

Show
Guus der Kinderen added a comment - I've just talked to some guys that run an Openfire cluster. At first glance, roughly 1% of the available CPU time appears to be spent on performing serializing/deserializing data. Note that this is a very heavy used system, 4 node cluster, lots of custom plugins. I'm beginning to wonder if spending time on improving serialization (or keeping existing work-arounds) in place, is hardly worth the effort. I'm wondering if we wouldn't benefit more from code that's easier to maintain. I was thinking of it this way: the pluggable serialization strategy is quite complex. The complexer the code, the more people are "scared away" from maintaining it. ("eek, let's not touch that, maybe I break something"). additionally, complexer code tends to limit extensibility/adaptability. Finally: hardly anyone uses it. I think we should seriously consider dropping the pluggable serialization stuff, and any hacks like that. If it doesn't offer you more than a 10-20% boost, we probably shouldn't even consider adding complexity like that.
Hide
Gaston Dombiak added a comment -

+1. Booh complexity. Welcome simplicity specially when there is no big advantage. For this particular case, I vote for leaving the original code as it was. The API is already immutable so not sure about what we are gaining with this issue.

Show
Gaston Dombiak added a comment - +1. Booh complexity. Welcome simplicity specially when there is no big advantage. For this particular case, I vote for leaving the original code as it was. The API is already immutable so not sure about what we are gaining with this issue.
Hide
Guus der Kinderen added a comment -

Although making JID immutable was the reason why I started this endeavor, it is not the reason why I want to remove the complexity from our code. By trying to make JID immutable I came across something that I now feel we should avoid, regardless of whether or not that change will allow us to make JID immutable.

Replacing the implementation of JID with one that implements Serializable instead of Externalizable allows us to remove the ExternalizableUtil (+-Strategy, +-Dummy implementation) from Tinder completely.

Note that this would fix an issue that's a side-effect of the DummyStrategyImplementation: The JID class is marked Serializable, but if it's not being used in a Openfire cluster, it actually does not serialize correctly (because the dummy strategy is used by default, which doesn't do anything) (TINDER-14).

I suspect that TINDER-14 is a nice example of what I argued earlier; another reason not to hack around serialization this way.

Currently, trunk has been modified to use Serializable instead of Externalizable for the JID class only. I would appreciate your feedback on this setup.

I've modified Openfire slightly to accommodate this change (OF-54).

Note that I have limited the change to the JID class only. Any serialization penalty that I have introduced here is limited to that class only (other classes will be just as fast as they have been before).

Similar argument could be made for changing other classes like I did for the JID class, but as they're not part of the Tinder project but of the Openfire project, I would like to exclude them from the current discussion (I feel that that is out of scope here).

Show
Guus der Kinderen added a comment - Although making JID immutable was the reason why I started this endeavor, it is not the reason why I want to remove the complexity from our code. By trying to make JID immutable I came across something that I now feel we should avoid, regardless of whether or not that change will allow us to make JID immutable. Replacing the implementation of JID with one that implements Serializable instead of Externalizable allows us to remove the ExternalizableUtil (+-Strategy, +-Dummy implementation) from Tinder completely. Note that this would fix an issue that's a side-effect of the DummyStrategyImplementation: The JID class is marked Serializable, but if it's not being used in a Openfire cluster, it actually does not serialize correctly (because the dummy strategy is used by default, which doesn't do anything) (TINDER-14). I suspect that TINDER-14 is a nice example of what I argued earlier; another reason not to hack around serialization this way. Currently, trunk has been modified to use Serializable instead of Externalizable for the JID class only. I would appreciate your feedback on this setup. I've modified Openfire slightly to accommodate this change (OF-54). Note that I have limited the change to the JID class only. Any serialization penalty that I have introduced here is limited to that class only (other classes will be just as fast as they have been before). Similar argument could be made for changing other classes like I did for the JID class, but as they're not part of the Tinder project but of the Openfire project, I would like to exclude them from the current discussion (I feel that that is out of scope here).
Hide
Guenther Niess added a comment -

+1 I also like simplicity and I think the ExternalizableUtils don't belongs to Tinder, they should belong to Openfire. So if the current revision 11203 works fine with Openfire and don't break anything, I would prefer this one. (I think all seen JID implementations are immutable.)

Show
Guenther Niess added a comment - +1 I also like simplicity and I think the ExternalizableUtils don't belongs to Tinder, they should belong to Openfire. So if the current revision 11203 works fine with Openfire and don't break anything, I would prefer this one. (I think all seen JID implementations are immutable.)
Hide
Guus der Kinderen added a comment -

I've had a chat with Gato. He still does not feel comfortable with the change, but doesn't block the proposed implementation here. I'm going ahead and exclude ExternalizationUtil from the next release of Tinder, choosing reduced complexity over performance in the cluster for the reasons mentioned above.

Show
Guus der Kinderen added a comment - I've had a chat with Gato. He still does not feel comfortable with the change, but doesn't block the proposed implementation here. I'm going ahead and exclude ExternalizationUtil from the next release of Tinder, choosing reduced complexity over performance in the cluster for the reasons mentioned above.

People

Vote (0)
Watch (1)

Dates

  • Created:
    Updated:
    Resolved: