What could cause a large discrepancy between minor GC time and total pause time?

Question

We have a latency-sensitive application and are seeing some GC-related pauses that we do not fully understand. Occasionally a minor GC results in an application pause that is much longer than the GC time the collector itself reports. Here is an example log fragment:

485377.257: [GC 485378.857: [ParNew: 105845K->621K(118016K), 0.0028070 secs] 136492K->31374K(1035520K), 0.0028720 secs] [Times: user=0.01 sys=0.00, real=1.61 secs]
Total time for which application threads stopped: 1.6032830 seconds

The total pause time here (about 1.6 seconds) is several orders of magnitude longer than the reported GC time (about 2.9 milliseconds). These events are isolated and appear random: the minor GC events immediately before and after do not show this discrepancy.
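For anyone who wants to scan a longer log for the same pattern, here is a minimal sketch that compares the two figures for one pair of lines. The class name `PauseGap`, its method names, and the factor of 10 are hypothetical choices made for illustration. The regular expressions assume the log layout shown above and may need adjusting for other collectors or JVM versions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: compares the GC time reported by the collector with
// the "application threads stopped" time printed on the following line.
final class PauseGap {
    // The last "<n> secs]" before the [Times: ...] block is the collector's own figure.
    private static final Pattern GC_SECS =
        Pattern.compile("([0-9.]+)\\s*secs\\]\\s*\\[Time");
    private static final Pattern STOPPED = Pattern.compile(
        "Total time for which application threads stopped:\\s*([0-9.]+)\\s*seconds");

    static double gcSeconds(String gcLine) {
        return firstGroup(GC_SECS, gcLine);
    }

    static double stoppedSeconds(String stoppedLine) {
        return firstGroup(STOPPED, stoppedLine);
    }

    // True when the stopped time exceeds the reported GC time by more than `factor`.
    static boolean suspicious(String gcLine, String stoppedLine, double factor) {
        return stoppedSeconds(stoppedLine) > factor * gcSeconds(gcLine);
    }

    private static double firstGroup(Pattern p, String line) {
        Matcher m = p.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("Unrecognised log line: " + line);
        }
        return Double.parseDouble(m.group(1));
    }

    public static void main(String[] args) {
        String gc = "485377.257: [GC 485378.857: [ParNew: 105845K->621K(118016K), 0.0028070 secs] "
            + "136492K->31374K(1035520K), 0.0028720 secs] "
            + "[Times: user=0.01 sys=0.00, real=1.61 secs]";
        String stopped = "Total time for which application threads stopped: 1.6032830 seconds";
        System.out.println("ratio = " + stoppedSeconds(stopped) / gcSeconds(gc));
        System.out.println("suspicious = " + suspicious(gc, stopped, 10.0));
    }
}
```

Running the same check over every GC line and its following "stopped" line would show how often the gap occurs and whether it correlates with time of day or other activity on the machine.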

The process runs on a dedicated machine with plenty of free memory and 8 cores, running Red Hat Enterprise Linux ES Release 4 Update 8 with kernel 2.6.9-89.0.1EL-smp. We have observed this with the 32-bit JVM versions 1.6.0_13 and 1.6.0_18.

We run with these flags:

-server -ea -Xms512m -Xmx512m -XX:+UseConcMarkSweepGC -XX:NewSize=128m -XX:MaxNewSize=128m -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime

Can someone explain what may be happening here, and/or suggest ways to investigate further?

java performance garbage-collection jvm

cxcg 15 . '10 15:38

Trent Gray-Donald · Answer 1 · 2010-04-16T08:22:14+0000

, ? :

Times: user=0.01 sys=0.00, real=1.61 secs

( )

, - , , ... -. ...

Java? (, DirectByteBuffer, nio ..), " " ( ). 'top' vmstat .

Gil Tene · Answer 2 · 2013-03-12T03:54:07+0000

"--" - . , GC ( , safepoint), ( ). -XX:+PrintGCApplicationStoppedTime ( ) safepoint , .

, , , , , , safepoint , , , . - . . - safepoint JVM (, 1 , GC ). .

[Zing --safepoint, ].

Skrud · Answer 3 · 2010-04-15T17:13:18+0000

You say there is "a lot of free memory", but your heap is limited to 512 MB. You may be running short of memory more often, or sooner, than you think.
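One way to check this from inside the application is to ask the JVM how full the heap is. This is only a rough sketch: the class name `HeapHeadroom` and the `usedFraction` helper are hypothetical, while `MemoryMXBean.getHeapMemoryUsage()` is the standard `java.lang.management` call.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: reports how much of the configured maximum heap is in use.
final class HeapHeadroom {
    // Fraction of the maximum heap currently used; NaN when the maximum is undefined.
    static double usedFraction(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return Double.NaN; // MemoryUsage.getMax() returns -1 when no maximum is defined
        }
        return (double) usedBytes / maxBytes;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long mb = 1024L * 1024L;
        System.out.printf("heap used: %d MB, committed: %d MB, max: %d MB%n",
            heap.getUsed() / mb, heap.getCommitted() / mb, heap.getMax() / mb);
        System.out.printf("used fraction of max: %.2f%n",
            usedFraction(heap.getUsed(), heap.getMax()));
    }
}
```

Logging this periodically, or watching the process from outside with a tool such as `jstat -gcutil`, would show whether the heap really is close to its 512 MB limit around the time of the long pauses.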
