Discrepancy between Cassandra trace and client-side latency

Question

We are on Cassandra 2.0.15 and see huge read latencies (> 60 s) that appear at regular intervals (approximately every 3 minutes) from all application hosts. We measure this latency around calls to session.execute(stmt). At the same time, Cassandra traces report durations of < 1 s. We also ran a query through cqlsh from the same hosts during these peak-latency periods, and cqlsh always returned within 1 second. What can explain this discrepancy at the Java driver level?

- edit: in response to comments -

Cassandra server JVM settings: -XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003 -Xms32G -Xmx32G -XX:+UseG1GC -Djava.net.preferIPv4Stack=true -Dcassandra.jmx.local.port=7199 -XX:+DisableExplicitGC.

Client-side GC is negligible. Client settings: -Xss256k -Xms4G -Xmx4G. Cassandra driver version: 2.1.7.1.

Client-side measurement code:

val selectServiceNames = session.prepare(QueryBuilder.select("service_name").from("service_names"))

override def run(): Unit = {
  val start = System.currentTimeMillis()
  try {
    val resultSet = session.execute(selectServiceNames.bind())
    val serviceNames = resultSet.all()
    val elapsed = System.currentTimeMillis() - start
    latency.add(elapsed) // emits metric to statsd
    if (elapsed > 10000) {
      log.info("Canary2 sensed high Cassandra latency: " + elapsed + "ms")
    }
  } catch {
    case e: Throwable =>
      log.error(e, "Canary2 select failed")
  } finally {
    Thread.sleep(100)
    schedule()
  }
}
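
One way to narrow down where the time goes is to time session.execute and resultSet.all() separately (all() may fetch further pages from the server), using the monotonic System.nanoTime() rather than System.currentTimeMillis(). A minimal sketch, assuming hypothetical stand-ins (fakeExecute, fakeFetchAll) in place of the real driver calls; PhaseTimer and timed are illustrative names, not part of the driver:

```scala
object PhaseTimer {
  /** Runs `body` once and returns its result together with the elapsed time in milliseconds. */
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime() // monotonic clock, unaffected by wall-clock adjustments
    val result = body
    val elapsedMs = (System.nanoTime() - start) / 1000000L
    (result, elapsedMs)
  }
}

object PhaseTimerDemo extends App {
  import PhaseTimer.timed

  // Hypothetical stand-ins for session.execute(...) and resultSet.all();
  // in the real canary these would be the driver calls from the snippet above.
  def fakeExecute(): Seq[String] = { Thread.sleep(20); Seq("svc-a", "svc-b") }
  def fakeFetchAll(rows: Seq[String]): List[String] = { Thread.sleep(5); rows.toList }

  // Time each phase on its own so a slow phase can be told apart from the other.
  val (rows, executeMs) = timed(fakeExecute())
  val (names, fetchMs) = timed(fakeFetchAll(rows))
  println(s"execute=${executeMs}ms fetchAll=${fetchMs}ms rows=${names.size}")
}
```

Reporting the two durations as separate metrics would show whether the spikes come from the initial request or from materializing the rows.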

Cluster construction code:

def createClusterBuilder(): Cluster.Builder = {
  val builder = Cluster.builder()
  val contactPoints = parseContactPoints()
  val defaultPort = findConnectPort(contactPoints)
  builder.addContactPointsWithPorts(contactPoints)
  builder.withPort(defaultPort) // This ends up config.protocolOptions.port
  if (cassandraUsername.isDefined && cassandraPassword.isDefined)
    builder.withCredentials(cassandraUsername(), cassandraPassword())
  builder.withRetryPolicy(ZipkinRetryPolicy.INSTANCE)
  builder.withLoadBalancingPolicy(new TokenAwarePolicy(new LatencyAwarePolicy.Builder(new RoundRobinPolicy()).build()))
}

One more observation I cannot explain: I ran two threads that execute the same query in the same way (as shown above) in a loop. The only difference is that the yellow thread sleeps 100 milliseconds between requests, while the green thread sleeps 60 seconds between requests. The green thread sees low latency (under 1 s) much more often than the yellow one.

+4

java cassandra cql driver cqlsh

Yuri Shkuro, Sep 18 '15 at 16:22


2 answers

, compoent .

, .
, .
JVM , , .

- . , 100 , , 1 . , , 1 , 100 , 0 , 99 , , 1 , 100 , 99 .

, , , , , . .. , .

+3

Peter Lawrey 18 . '15 19:35

Yuri Shkuro · Accepted Answer · 2015-09-29T05:43:45+0000

- . , , , .

ONE LOCAL_ONE
DC- ( ).

- Java, , . , - , , , DC .
