Reaching the announced Cloud Bigtable writes QPS

We created a Bigtable cluster with 5 nodes, and the GCP console states that it should support 50K QPS at 6 ms latency for reads and writes.

We are trying to load a large dataset (~800M records) with ~50 fields containing mostly numeric data and a few short strings. Keys are 11-digit numeric strings.

When loading this dataset via the HBase API from a single client VM in GCE, we observe up to 4K QPS when each field is placed in a separate column. We use one HBase connection, and several threads (5-30) perform batch puts of 10K records each.

When combining all the fields into a single column (Avro-encoded, ~250 bytes per record), write performance with batched puts improves to 10K QPS. The number of concurrent threads does not seem to affect QPS. When using a separate HBase connection per thread, write performance increases to 20K QPS with 5 connections.
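For reference, packing all fields into a single column value can be sketched as follows (field layout and class name are hypothetical; a plain fixed-width `ByteBuffer` layout is used here instead of Avro so the sketch has no external dependencies):

```java
import java.nio.ByteBuffer;

public class RecordPacker {
    // Pack ~50 numeric fields into one byte[] so each row carries a single
    // column instead of 50 separate ones (hypothetical fixed-width layout;
    // the original post used Avro encoding instead).
    public static byte[] packRecord(long[] fields) {
        ByteBuffer buf = ByteBuffer.allocate(fields.length * Long.BYTES);
        for (long f : fields) {
            buf.putLong(f);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        long[] fields = new long[50];
        byte[] value = packRecord(fields);
        System.out.println(value.length); // 50 fields * 8 bytes = 400
    }
}
```

The resulting byte array is then written as the value of one column, which reduces the per-cell overhead that grows with the number of columns.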

The client VM is in the same zone as the Bigtable cluster, and it remains almost idle during the load, so the client side does not look like the bottleneck.

Questions:

  • Our tests show that write QPS decreases as the number of inserted columns grows. Is this expected, and can this relationship be quantified? (By the way, it would be great if this were mentioned in the Bigtable performance documentation.)
  • What might we be missing to achieve the advertised write QPS? I understand that each cluster node should support 10K write QPS; however, it seems that we are pushing against a single node with one HBase connection, and against only two nodes with several HBase connections.
2 answers

To get maximum performance with Cloud Bigtable, you want to use OpenSSL instead of ALPN-boot.

BufferedMutator in 0.2.3-SNAPSHOT with OpenSSL and Java 8 delivers 22-23K QPS for small (1 KB) mutations on 4-CPU machines, and up to 90K QPS on a 32-CPU machine. 0.2.2 gave 10K-12K QPS. For best performance, use a single HBase connection.
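A minimal BufferedMutator sketch (table, column family, and value helper are hypothetical, and the `bigtable-hbase` client must be on the classpath and configured, so this is illustrative rather than a drop-in benchmark):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.BufferedMutator;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class MutatorExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             // BufferedMutator batches mutations and flushes them in the
             // background, unlike Table.put(List<Put>) which blocks per batch.
             BufferedMutator mutator =
                 connection.getBufferedMutator(TableName.valueOf("my_table"))) {
            for (long i = 0; i < 10_000; i++) {
                byte[] rowKey = Bytes.toBytes(String.format("%011d", i));
                Put put = new Put(rowKey);
                put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("data"),
                              encodedValue(i)); // single packed column
                mutator.mutate(put); // buffered; sent asynchronously
            }
        } // close() flushes any remaining buffered mutations
    }

    // Placeholder for the Avro-encoded record described in the question.
    private static byte[] encodedValue(long i) {
        return new byte[250];
    }
}
```

Because the mutator overlaps buffering and network I/O, a single connection can keep many Bigtable nodes busy, which is why the answer recommends one connection rather than one per thread.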


Answering the second question: we managed to get more than 50K QPS by switching from batched HBase Puts to BufferedMutator. We still use multiple HBase connections; a single connection seems to be limited by single-node throughput.


Source: https://habr.com/ru/post/1241142/
