Why does Java G1 gc spend so much time scanning RS?

I am currently evaluating the G1 garbage collector and how it works for our application. Looking at gc-log, I noticed that many collections have very long Scan RS phases:

7968.869: [GC pause (mixed), 10.27831700 secs] [Parallel Time: 10080.8 ms] (...) [Scan RS (ms): 4030.4 4034.1 4032.0 4032.0 Avg: 4032.1, Min: 4030.4, Max: 4034.1, Diff: 3.7] [Object Copy (ms): 6038.5 6033.3 6036.7 6037.1 Avg: 6036.4, Min: 6033.3, Max: 6038.5, Diff: 5.2] (...) [Eden: 19680M(19680M)->0B(20512M) Survivors: 2688M->2624M Heap: 75331M(111904M)->51633M(115744M)] [Times: user=40.49 sys=0.02, real=10.28 secs] 

All deleted log line entries show runtime in a single-bit ms value.

I think that most of the time you need to spend on copying, right? What could be the reason the Scan RS takes so long? Any ideas on setting up the G1 settings? JVM was launched using

 -Xms40960M -Xmx128G -XX:+UseG1GC -verbose:gc -XX:+PrintGCDetails -Xloggc:gc.log 

Edit: Oh, I forgot ... I'm using Java 7u25

Update: I noticed two more strange things:

 16187.740: [GC concurrent-mark-start] 16203.934: [GC pause (young), 2.89871800 secs] (...) 16218.455: [GC pause (young), 4.61375100 secs] (...) 16237.441: [GC pause (young), 4.46131800 secs] (...) 16257.785: [GC pause (young), 4.73922600 secs] (...) 16275.417: [GC pause (young), 3.87863400 secs] (...) 16291.505: [GC pause (young), 3.72626400 secs] (...) 16307.824: [GC pause (young), 3.72921700 secs] (...) 16325.851: [GC pause (young), 3.91060700 secs] (...) 16354.600: [GC pause (young), 5.61306000 secs] (...) 16393.069: [GC pause (young), 17.50453200 secs] (...) 16414.590: [GC concurrent-mark-end, 226.8497670 sec] 

GC runs in parallel, and runs in parallel. I'm not sure if this is intended, but it seems to me wrong. Admittedly, this is an extreme example, but I see this behavior in my entire journal.

Another thing is that my JVM process grew to 160. Given a heap size of 128 g, this is a pretty big overhead. Is this expected, or is it a G1 memory leak? Any ideas on how to figure this out?

PS: I'm not sure that I should have asked new questions for updates ... if any of you think it would be useful, tell me;)

Update 2: I think G1 can really leak memory: http://printfdebugger.tumblr.com/post/19142660766/how-i-learned-to-love-cms-and-had-my-heart-broken-by -g1 Since this is a robber now, I am not going to waste any more time on this. Things I haven't tried yet are setting the size of the region (-XX: G1HeapRegionSize) and reducing the heap load (-XX: InitatingHeapOccupancyPercent).

+6
source share
1 answer

We will see.

1 - First Tips

It looks like your GC was configured to use 4 threads (or you have 4 vCPUs, but this is unlikely given the heap size). It's pretty low for a heap of 128 GB, I expected more.

GC events seem to occur at intervals of 25 seconds. However, in the extract from the journal you do not indicate the number of processed regions.

=> Have you accidentally set pause time goals for G1GC ( -XX:MaxGCPauseMillis=N )?

2 - Long RSET scan time

β€œScan RSet” means the time taken by the GC to scan memorized sets. The Remembered Set of the region contains maps matching links pointing to that region. This phase scans those cards that look for links pointing to all areas of the collection set.

So, we have one more question:

=> How many regions were processed during this particular collection (i.e. how big a CSet is)

3 - Long time copying an object

Copying time, as the name implies, is the time taken by each workflow to copy live objects from regions in a collection set to other regions.

Such a long copy time may suggest that many regions have been processed, and you can reduce this number. It may also offer an exchange, but this is very unlikely, given your user / real values ​​at the end of the log.

4 - Now what to do

You should check in the GC log the number of areas that have been processed. Compare this number with the size of your region and determine the amount of memory that was scanned.

You can then set a smaller pause time goal (for example, up to 500 ms using -XX:MaxGCPauseMillis=500 ). It will be

  • increase the number of GC events,
  • reduce the amount of freed memory per GC cycle
  • reduce STW pause during YGC

Hope this helps!

Sources:

+6
source

Source: https://habr.com/ru/post/949312/


All Articles