My ZooKeeper instance manages several queues for different jobs, holding the corresponding job data in each node until the computer is ready to process it. If I stop the general service so that no jobs can be started, ZooKeeper starts fine after a reboot. However, some of these jobs seem to cause ZooKeeper to crash, with the following messages in the ZooKeeper log:
WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@349] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x15677f740ad002a, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /127.0.0.1:46998 which had sessionid 0x15677f740ad002a
My knowledge of ZooKeeper is very limited, as I took over from the person who set it up.
I tried deleting a lot of nodes with rmr [path] in the ZooKeeper shell, which seemed to have some effect (I deleted 50k+ nodes that were left over and useless), but it still crashes daily, and last night I could not keep it running for more than a couple of minutes before the same error and crash occurred.
How can I find out what is causing this?
I am sure this is a common problem related to the received data or the stored data/nodes. The disk is only 92% full. I also found this post: Zookeeper continues to receive WARN: "caught end of stream exception", but the solution does not make much sense to me. I am also fairly sure that none of the messages stored in my znodes is larger than 1 MB, but I am not sure how to confirm this.
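One possible way to confirm the znode sizes is to walk the tree and read each node's stat metadata, which includes the payload length and child count. The sketch below is only an illustration: it assumes the third-party kazoo Python client is installed, that the server listens on 127.0.0.1:2181 (the address seen in the log), and that the tree is not changing during the walk, so it is best run while the job service is stopped. The names znode_sizes, largest, main and LIMIT_BYTES are made up for this example.

```python
"""Walk a ZooKeeper tree and list the znodes with the largest payloads."""

# ZooKeeper's documented default for jute.maxbuffer (just under 1 MB).
LIMIT_BYTES = 0xFFFFF


def znode_sizes(client, root="/"):
    """Yield (path, data_length, num_children) for every znode under root.

    `client` is any object with kazoo-style exists() and get_children().
    """
    stack = [root]
    while stack:
        path = stack.pop()
        stat = client.exists(path)  # metadata only, the payload is not fetched
        if stat is None:  # node disappeared since its parent was listed
            continue
        yield path, stat.dataLength, stat.numChildren
        for child in client.get_children(path):
            stack.append(path.rstrip("/") + "/" + child)


def largest(client, root="/", top=10):
    """Return the `top` znodes with the largest payloads, biggest first."""
    rows = sorted(znode_sizes(client, root), key=lambda r: r[1], reverse=True)
    return rows[:top]


def main(hosts="127.0.0.1:2181"):  # assumed address, adjust as needed
    from kazoo.client import KazooClient  # third-party: pip install kazoo

    zk = KazooClient(hosts=hosts)
    zk.start()
    try:
        for path, size, kids in largest(zk):
            flag = "  <-- at or over limit" if size >= LIMIT_BYTES else ""
            print(f"{size:>10}  {kids:>7}  {path}{flag}")
    finally:
        zk.stop()
```

Calling main() prints the ten largest znodes with their byte size and child count. The child count may be worth a look too, since a node with a very large number of children can also produce oversized responses.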