JMH Java Lock

I have a multithreaded JMH test:

@State(Scope.Benchmark)
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Fork(value = 1, jvmArgsAppend = { "-Xmx512m", "-server", "-XX:+AggressiveOpts","-XX:+UnlockDiagnosticVMOptions",
        "-XX:+UnlockExperimentalVMOptions", "-XX:+PrintAssembly", "-XX:PrintAssemblyOptions=intel",
        "-XX:+PrintSignatureHandlers"})
@Measurement(iterations = 5, time = 5, timeUnit = TimeUnit.SECONDS)
@Warmup(iterations = 3, time = 2, timeUnit = TimeUnit.SECONDS)
public class LinkedQueueBenchmark {
private static final Unsafe unsafe = UnsafeProvider.getUnsafe();
private static final long offsetObject;
private static final long offsetNext;

private static final int THREADS = 5;
private static class Node {
    private volatile Node next;
    public Node() {}
}

static {
    try {
        offsetObject = unsafe.objectFieldOffset(LinkedQueueBenchmark.class.getDeclaredField("object"));
        offsetNext = unsafe.objectFieldOffset(Node.class.getDeclaredField("next"));
    } catch (Exception ex) { throw new Error(ex); }
}

protected long t0,t1,t2,t3,t4,t5,t6,t7;
private volatile Node object = new Node(null);


@Threads(THREADS)
@Benchmark
public Node doTestCasSmart() {
    Node current, o = new Node();
    for(;;) {
        current = this.object;
        if (unsafe.compareAndSwapObject(this, offsetObject, current, o)) {
            //current.next = o; //Special line:
            break;
        } else {
            LockSupport.parkNanos(1);
        }
    }
    return current;
}
}
  • In the current version, I have a performance of ~ 55 ops / us
  • But if I uncomment the "Special line" or replace it with unsafe.putOrderedObject (in either direction - current.next = o or o.next = current), the performance is ~ 2 ops / us.

As I understand it, this happens with CPU caches and maybe it clears the storage buffer. If I replaced it with a blocking method, without CAS, the performance would be 11-20 ops / us.
I try to use LinuxPerfAsmProfiler and PrintAssembly, in the second case I see:

....[Hottest Regions]...............................................................................
 25.92%   17.93%  [0x7f1d5105fe60:0x7f1d5105fe69] in SpinPause (libjvm.so)
 17.53%   20.62%  [0x7f1d5119dd88:0x7f1d5119de57] in ParMarkBitMap::live_words_in_range(HeapWord*, oopDesc*) const (libjvm.so)
 10.81%    6.30%  [0x7f1d5129cff5:0x7f1d5129d0ed] in ParallelTaskTerminator::offer_termination(TerminatorTerminator*) (libjvm.so)
  7.99%    9.86%  [0x7f1d3c51d280:0x7f1d3c51d3a2] in com.jad.generated.LinkedQueueBenchmark_doTestCasSmart::doTestCasSmart_thrpt_jmhStub 

Can someone explain to me what is really going on? Why is it so slow? Where is the storage barrier here? Why does putOrdered not work? And how to fix it?

+4
1

: "" .

SpinPause, ParMarkBitMap::live_words_in_range(HeapWord*, oopDesc*) ParallelTaskTerminator::offer_termination(TerminatorTerminator*) - GC. , - GC. , " ", -prof gc, :

# Run complete. Total time: 00:00:43

Benchmark                      Mode  Cnt      Score    Error   Units
LQB.doTestCasSmart            thrpt    5      5.930 ±  3.867  ops/us
LQB.doTestCasSmart:·gc.time   thrpt    5  29970.000               ms

, 43 30 GC. -verbose:gc :

Iteration   3: [Full GC (Ergonomics)  408188K->1542K(454656K), 0.0043022 secs]
[GC (Allocation Failure)  60422K->60174K(454656K), 0.2061024 secs]
[GC (Allocation Failure)  119054K->118830K(454656K), 0.2314572 secs]
[GC (Allocation Failure)  177710K->177430K(454656K), 0.2268396 secs]
[GC (Allocation Failure)  236310K->236054K(454656K), 0.1718049 secs]
[GC (Allocation Failure)  294934K->294566K(454656K), 0.2265855 secs]
[Full GC (Ergonomics)  294566K->147408K(466432K), 0.7139546 secs]
[GC (Allocation Failure)  206288K->205880K(466432K), 0.2065388 secs]
[GC (Allocation Failure)  264760K->264312K(466432K), 0.2314117 secs]
[GC (Allocation Failure)  323192K->323016K(466432K), 0.2183271 secs]
[Full GC (Ergonomics)  323016K->322663K(466432K), 2.8058725 secs]

2.8s GC, . 5 , GC, , 5 . .

? , . , , , object, . . , , GC, . , . , OOME. intial object head OOME.

putOrdered, , , . , , . , @Benchmark, .

+9

Source: https://habr.com/ru/post/1611171/


All Articles