A deep graph leads to stack overflow: non-recursive serialization options?

Question

A deep graph leads to stack overflow: non-recursive serialization options?

We get StackOverflowErrors from the Java serialization library. The problem is that the default serialization implementation is recursive, the depth of which is limited only by the longest path through a network of links.

We understand that we can override the default methods, but we have hundreds of rich classes in our project, so we are not happy with the overriding approach. We are more interested if there is a generalized solution that is not recursive (or at least moves recursion from stack to heap).

I searched this topic and found that many people bitterly complain about the same thing, but most of these complaints were many years ago. Has the situation improved? If not, and we are writing a generalized implementation, do you have any tips? We assume that there is some reason (not yet obvious to us) why no one cracked this nut. Theoretically, “right” sounds as if it should be feasible.

+6

java serialization recursion non-recursive

donlibes Sep 16 '11 at 1:42

source share

4 answers

Proof that JDK 6 serialization can handle recursive object graphs:

 public static void main(String[] args) throws Exception { Foo foo = new Foo("bob"); foo.setBar(new Bar("fred", foo)); ByteArrayOutputStream baos = new ByteArrayOutputStream(); ObjectOutputStream out = new ObjectOutputStream(baos); out.writeObject(foo); out.close(); ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(baos.toByteArray())); Object o = in.readObject(); System.out.println(o); } static class Foo implements Serializable { String name; Bar bar; Foo(String name) { this.name = name; } void setBar(Bar bar) { this.bar = bar; } @Override public String toString() { return "Foo{" + "name='" + name + '\'' + ", bar=" + bar + '}'; } } static class Bar implements Serializable { String name; Foo foo; Bar(String name, Foo foo) { this.name = name; this.foo = foo; } @Override public String toString() { return "Bar{" + "name='" + name + '\'' + '}'; } }

Conclusion:

 Foo{name='bob', bar=Bar{name='fred'}}

+1

Ryan stewart Sep 16 '11 at 2:53

source share

I don't seem to read the question very well. It seems you are interested in serializing properties that may contain circular references. If this assumption is wrong, and it would be nice to NOT serialize these objects that contain circular references, please refer to my original answer below.

New answer

I think you will need to keep track of which objects were serialized, and I don’t see this unless you do it yourself. It should not be too complicated.

On these objects that contain circular references, you can save the transient boolean by representing whether the object has already been serialized. You should then override the default serialization behavior, but this can be done with just a few lines.

 public void writeExternal(ObjectOutput out) { if(!out.serialized) { out.serializeMethod(); } out.serialized = true; }

Original answer

See transient keyword

I would suggest that most serialization libraries will abide by the transient keyword. If the member is transient , it should be excluded from serialization.

 class Something { private Dog dog; // I will be serialized upon serialization. private transient SomethingElse somethingElse; // I will not be serialized upon serialization. } class SomethingElse { private Cat cat; // I will be serialized upon serialization. private transient Something something; // I will not be serialized upon serialization. }

If you have recursive members like the scenario described above, you would like to mark one or the other (or both) as transient so that this overflow does not exist.

0

nicholas.hauschild Sep 16 '11 at 1:58

source share

GWT RPC serialization is basically equivalent to JVM serialization, and both use the stack / recursive method. Unfortunately, this will not work with notching into pieces (what you need to do if you are working in a browser, that is, with GWT), therefore a non-recursive approach is used here: https://github.com/nevella/alcina/blob/d3e37df57709620f7ad54d3d59b997e9c4c7d88d3d59b997e9c4c7d883 /extras/rpc/client/src/com/google/gwt/user/client/rpc/impl/ClientSerializationStreamReader.java

Essentially, convert serialization in three passes: * create objects * set properties (via a link) * fill collections

Two tricks: some objects need properties to create an instance (for example, Date), and you need to fill the collections with the latter, as they may need hashes of their members.

This allows non-recursive de-serialization, but actually non-recursive serialization is even simpler (there is no custom writeReplace / readResolve yet), only inside writeObject it supports two object queues, not serialized, not-serialized-for-current-object and have the serialize- property push object-property onto the stack with a marker, rather than making a recursive call.

Here is a very simple example: http://www.w3.org/2006/02/Sierra10022006.pdf

0

nickle May 20, '15 at 6:11

source share

dlawrence · Accepted Answer · 2011-09-16T05:57:28+0000

I had this problem a while ago. For richly related classes, even if you can complete serialization without, serialization is slow. When we solved this problem, we had several classes, so we just created our own serialization format, which packed the data into a set of integer identifiers, with integer field identifiers for each field and described their connections using a series of object identifiers, field identifier, others Mapping object identifiers. This user approach was very fast and extremely easy to remember, but actually only works if you have a small set of classes that you want to serialize.

The general case is much more complicated, and the demand for serializing rich classes is not so strong, so I would have guessed why no one solved it.

Mostly you are up to the problem, but you always need the stack depth equal to the maximum height of the depth tree of the first search, and therefore at any time when your graph is deeper, you will get a stack overflow. This is a fundamentally recursive problem, so you have to either use recursion or fake recursion by moving the stack distributions to the Stack object that you are putting in the heap. I would look at the OpenJDK implementation:

http://hg.openjdk.java.net/jdk6/jdk6-gate/jdk/file/tip/src/share/classes/java/io/ObjectOutputStream.java

You already have DebugTraceInfoStack, I would create a second Stack field for the current object you are writing, and change the writeObject0 method to push the object onto the stack, something like this:

stack.push(obj); while(!stack.empty()) { obj = stack.pop(); ...

Then you just change all the calls to writeObject0 (x); to stack.push (x) ;. The simple standard conversion between recursion and iteration, apart from the class, is nearly 2500 lines, and there are probably tons of gotchas.

If you still build it, I would defend the possibility of sending it as a patch to the next version of java, since it would be useful, for example, IterativeObjectOutputStream for use on plots of deep objects.

A deep graph leads to stack overflow: non-recursive serialization options?

More articles: