We have noticed random full GCs with the G1 garbage collector, together with concurrent mark overflow. Once a concurrent-mark-reset-for-overflow occurs, the overflow recurs in the following concurrent marking phases. Eventually this leads to a full GC, since concurrent marking no longer works.
We have four machines running the same Apache Storm application with the same data traffic. Only one of the machines experiences this, about once a week.
Is this due to the bug "G1 does not expand the marking stack when the marking stack is full during concurrent marking"? https://bugs.openjdk.java.net/browse/JDK-8065402
Following the suggestion on that page, we doubled the concurrent marking threads from 4 to 8 and our heap size from 8 GB to 16 GB. However, the full GC still happens; the only difference is that it occurs later.
Any other suggestions?
Here is the GC log:
Java HotSpot(TM) 64-Bit Server VM (25.65-b01) for linux-amd64 JRE(1.8.0_65b17), built on Oct 6 2015 17:16:12 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 529167668k(69283408k free), swap 33554424k(33552380k free)
CommandLine flags: -XX:ConcGCThreads=8 -XX:G1ReservePercent=20 -XX:GCLogFileSize=104857600 -XX:InitialHeapSize=17179869184 -XX:InitiatingHeapOccupancyPercent=45 -XX:MaxGCPauseMillis=100 -XX:MaxHeapSize=17179869184 -XX:NumberOfGCLogFiles=10 -XX:ParallelGCThreads=30 -XX:+PrintAdaptiveSizePolicy -XX:PrintFLSStatistics=2 -XX:+PrintGC -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC -XX:+UseGCLogFileRotation
...
...
2016-04-13T22:06:37.254-0400: 19839.175: [GC concurrent-root-region-scan-start]
2016-04-13T22:06:37.313-0400: 19839.234: [GC concurrent-root-region-scan-end, 0.0592966 secs]
2016-04-13T22:06:37.313-0400: 19839.234: [GC concurrent-mark-start]
2016-04-13T22:06:38.569-0400: 19840.490: [GC concurrent-mark-reset-for-overflow]
...
2016-04-13T22:06:42.810-0400: 19844.731: [GC concurrent-mark-reset-for-overflow]
...
2016-04-13T22:11:19.253-0400: 20121.175: [GC concurrent-mark-reset-for-overflow]
...
...
...
2016-04-14T01:58:17.254-0400: 33739.176: [GC concurrent-mark-reset-for-overflow]
...
2016-04-14T01:58:36.957-0400: 33758.878: [Full GC (Allocation Failure)
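To compare the affected machine with the three healthy ones, it may help to count the overflow resets and measure how long it takes from the first reset to the full GC. The following is only a rough sketch: the class name GcOverflowScan and its method names are made up for illustration, and it assumes the log line layout shown above (date stamp, JVM uptime in seconds, then the bracketed event).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcOverflowScan {
    // Matches the uptime field, e.g. ": 19840.490: [" (assumed layout, see above).
    private static final Pattern UPTIME = Pattern.compile(": (\\d+\\.\\d+): \\[");
    private static final String OVERFLOW = "[GC concurrent-mark-reset-for-overflow]";
    private static final String FULL_GC = "[Full GC";

    /** Counts the lines that report a concurrent mark reset caused by overflow. */
    static long countOverflowResets(List<String> lines) {
        return lines.stream().filter(l -> l.contains(OVERFLOW)).count();
    }

    /** Returns the JVM uptime in seconds found on the line, or -1 if there is none. */
    static double uptimeSeconds(String line) {
        Matcher m = UPTIME.matcher(line);
        return m.find() ? Double.parseDouble(m.group(1)) : -1;
    }

    /**
     * Returns the seconds between the first overflow reset and the first
     * full GC that follows it, or -1 if either event is missing.
     */
    static double secondsFromFirstOverflowToFullGc(List<String> lines) {
        double firstOverflow = -1;
        for (String line : lines) {
            if (firstOverflow < 0 && line.contains(OVERFLOW)) {
                firstOverflow = uptimeSeconds(line);
            } else if (firstOverflow >= 0 && line.contains(FULL_GC)) {
                double fullGc = uptimeSeconds(line);
                return fullGc < 0 ? -1 : fullGc - firstOverflow;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java GcOverflowScan <gc-log-file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println("overflow resets: " + countOverflowResets(lines));
        System.out.println("seconds from first reset to full GC: "
                + secondsFromFirstOverflowToFullGc(lines));
    }
}
```

Running it against the rotated log files from each of the four machines would show whether the healthy machines never hit the reset at all, or hit it but recover before a full GC.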