We have a java process running as a daemon (under jsvc). Every few days, he simply stops doing any work; the output to the log file stops (it is rather verbose, with 5-minute intervals), and it does not consume CPU or IO.
There are no exceptions in the log file except syserr or sysout. The last log statement is immediately before executing db-commit, but there is no open connection on the db (MySQL) server and there is no code review, after that there should always be an additional log output, even if it encounters an exception that was about to bubble.
The most curious thing I find is that in the stream dump (see below) there is no code in our code at all, and the main stream seems to have no context:
"main" prio=10 tid=0x0000000000614000 nid=0x445d runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
As noted earlier, this is a daemon process running using jsvc, but I don't know if this has anything to do with it (I can restructure the code so that it can also run it directly to check).
Any suggestions on what could happen here?
Thanks ... dwh
Full stream dump:
Full thread dump Java HotSpot(TM) 64-Bit Server VM (14.2-b01 mixed mode):
"MySQL Statement Cancellation Timer" daemon prio=10 tid=0x00002aaaf81b8800 nid=0x447b in Object.wait() [0x00002aaaf6a22000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aaab5556d50> (a java.util.TaskQueue)
at java.lang.Object.wait(Object.java:485)
at java.util.TimerThread.mainLoop(Timer.java:483)
- locked <0x00002aaab5556d50> (a java.util.TaskQueue)
at java.util.TimerThread.run(Timer.java:462)
"Low Memory Detector" daemon prio=10 tid=0x00000000006a4000 nid=0x4479 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"CompilerThread1" daemon prio=10 tid=0x00000000006a1000 nid=0x4477 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"CompilerThread0" daemon prio=10 tid=0x000000000069d000 nid=0x4476 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" daemon prio=10 tid=0x000000000069b000 nid=0x4465 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" daemon prio=10 tid=0x0000000000678800 nid=0x4464 in Object.wait() [0x00002aaaf61d6000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aaab54a1cb8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
- locked <0x00002aaab54a1cb8> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
"Reference Handler" daemon prio=10 tid=0x0000000000676800 nid=0x4463 in Object.wait() [0x00002aaaf60d5000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aaab54a1cf0> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:485)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
- locked <0x00002aaab54a1cf0> (a java.lang.ref.Reference$Lock)
"main" prio=10 tid=0x0000000000614000 nid=0x445d runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"VM Thread" prio=10 tid=0x0000000000670000 nid=0x4462 runnable
"GC task thread#0 (ParallelGC)" prio=10 tid=0x000000000061e000 nid=0x445e runnable
"GC task thread#1 (ParallelGC)" prio=10 tid=0x0000000000620000 nid=0x445f runnable
"GC task thread#2 (ParallelGC)" prio=10 tid=0x0000000000622000 nid=0x4460 runnable
"GC task thread#3 (ParallelGC)" prio=10 tid=0x0000000000623800 nid=0x4461 runnable
"VM Periodic Task Thread" prio=10 tid=0x00000000006a6800 nid=0x447a waiting on condition
JNI global references: 797
Heap
PSYoungGen total 162944K, used 48388K [0x00002aaadff40000, 0x00002aaaf2ab0000, 0x00002aaaf5490000)
eden space 102784K, 47% used [0x00002aaadff40000,0x00002aaae2e81170,0x00002aaae63a0000)
from space 60160K, 0% used [0x00002aaaeb850000,0x00002aaaeb850000,0x00002aaaef310000)
to space 86720K, 0% used [0x00002aaae63a0000,0x00002aaae63a0000,0x00002aaaeb850000)
PSOldGen total 699072K, used 699072K [0x00002aaab5490000, 0x00002aaadff40000, 0x00002aaadff40000)
object space 699072K, 100% used [0x00002aaab5490000,0x00002aaadff40000,0x00002aaadff40000)
PSPermGen total 21248K, used 9252K [0x00002aaab0090000, 0x00002aaab1550000, 0x00002aaab5490000)
object space 21248K, 43% used [0x00002aaab0090000,0x00002aaab09993e8,0x00002aaab1550000)