We set up a Bigtable cluster with 5 nodes, and the GCP console reports that it should support 50K QPS @ 6ms for reads and writes.
We are trying to load a large dataset (~800M records) with ~50 fields each, containing mostly numeric data and a few short strings. Keys are 11-digit numeric strings.
When loading this dataset via the HBase API from a single GCE client VM, we observe up to 4K QPS when each field is placed in a separate column. We use one HBase connection, and several threads (5-30) each perform batch puts of 10K records.
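For reference, a minimal sketch of this setup, assuming the Cloud Bigtable HBase client (`BigtableConfiguration.connect`); the project, instance, table, and column-family names are placeholders, and `Record`/`nextRecords` stand in for our real input reader:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

import com.google.cloud.bigtable.hbase.BigtableConfiguration;

public class WideRowLoader {

    // Hypothetical record abstraction standing in for our real input format.
    interface Record {
        String key();               // 11-digit numeric string
        byte[] fieldBytes(int i);   // one of the ~50 field values
    }

    private static final byte[] FAMILY = Bytes.toBytes("cf"); // placeholder family
    private static final int BATCH_SIZE = 10_000;
    private static final int NUM_FIELDS = 50;

    public static void main(String[] args) throws Exception {
        // One connection shared by all worker threads.
        Connection connection = BigtableConfiguration.connect("my-project", "my-instance");
        for (int t = 0; t < 30; t++) {
            new Thread(() -> load(connection)).start();
        }
    }

    static void load(Connection connection) {
        try (Table table = connection.getTable(TableName.valueOf("mytable"))) {
            List<Put> batch = new ArrayList<>(BATCH_SIZE);
            for (Record r : nextRecords()) {
                Put put = new Put(Bytes.toBytes(r.key()));
                // One column per field -> the ~4K QPS case.
                for (int f = 0; f < NUM_FIELDS; f++) {
                    put.addColumn(FAMILY, Bytes.toBytes("f" + f), r.fieldBytes(f));
                }
                batch.add(put);
                if (batch.size() == BATCH_SIZE) {
                    table.put(batch); // synchronous 10K-record batch put
                    batch.clear();
                }
            }
            if (!batch.isEmpty()) {
                table.put(batch);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Stub; in reality this reads this thread's shard of the 800M-record dataset.
    static Iterable<Record> nextRecords() {
        return Collections.emptyList();
    }
}
```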
When we combine all the fields into one column (Avro-encoded, ~250 bytes per record), write performance with batched puts improves to 10K QPS. The number of concurrent threads does not seem to affect QPS. When using a separate HBase connection per thread, write performance increases to 20K QPS with 5 threads.
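The 20K QPS variant differs in two ways: each thread opens its own connection, and each record is written as a single pre-serialized cell. A sketch under the same assumptions as above (`PackedRecord`/`nextRecords` are again hypothetical stand-ins):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

import com.google.cloud.bigtable.hbase.BigtableConfiguration;

public class PackedRowLoader implements Runnable {

    private static final byte[] FAMILY = Bytes.toBytes("cf");   // placeholder family
    private static final byte[] QUALIFIER = Bytes.toBytes("v"); // single packed column
    private static final int BATCH_SIZE = 10_000;

    public static void main(String[] args) {
        for (int t = 0; t < 5; t++) {
            new Thread(new PackedRowLoader()).start();
        }
    }

    @Override
    public void run() {
        // A separate connection per thread, instead of one shared connection.
        try (Connection conn = BigtableConfiguration.connect("my-project", "my-instance");
             Table table = conn.getTable(TableName.valueOf("mytable"))) {
            List<Put> batch = new ArrayList<>(BATCH_SIZE);
            for (PackedRecord r : nextRecords()) {
                Put put = new Put(Bytes.toBytes(r.key()));
                // All ~50 fields serialized into one ~250-byte Avro blob,
                // written as a single cell.
                put.addColumn(FAMILY, QUALIFIER, r.avroBytes());
                batch.add(put);
                if (batch.size() == BATCH_SIZE) {
                    table.put(batch);
                    batch.clear();
                }
            }
            if (!batch.isEmpty()) {
                table.put(batch);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Hypothetical pre-packed record: key plus an Avro-encoded payload.
    interface PackedRecord {
        String key();
        byte[] avroBytes();
    }

    // Stub; in reality this reads this thread's shard of the dataset.
    static Iterable<PackedRecord> nextRecords() {
        return Collections.emptyList();
    }
}
```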
The client VM is in the same zone as the Bigtable cluster, and it stays almost idle during the load, so the bottleneck does not appear to be on the client side.
Questions:
- Our tests show that write QPS decreases as the number of inserted columns grows. Is this expected, and can this relationship be quantified? (By the way, it would be great if this were mentioned in the Bigtable performance documentation.)
- What might we be missing to achieve the advertised write QPS? My understanding is that each cluster node should support 10K write QPS, yet we seem to be driving the equivalent of only a single node with one HBase connection, and only two nodes with several HBase connections.