Calculating the size of a table in Cassandra

In Cassandra: The Definitive Guide (2nd Edition), Jeff Carpenter and Eben Hewitt use the following formula to calculate the size of a table on disk:

S_t = \sum_i \text{sizeOf}(c_{k_i}) + \sum_j \text{sizeOf}(c_{s_j}) + N_r \times \sum_k \left( \text{sizeOf}(c_{r_k}) + \sum_l \text{sizeOf}(c_{c_l}) \right) + N_v \times \text{sizeOf}(t_{avg})

  • ck: primary key columns
  • cs: static columns
  • cr: regular columns
  • cc: clustering columns
  • Nr: number of rows
  • Nv: used to calculate the total size of the timestamps (I do not completely understand this part, but for now I will ignore it).
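To make my reading concrete, here is a rough Python sketch of how I understand the formula (the function and variable names are mine, not from the book, and I leave out the Nv/timestamp term since I am ignoring it):

    # Rough sketch (mine, not from the book) of how I read the formula; sizes in bytes.
    # The Nv / timestamp term is left out, since I am ignoring it for now.
    def partition_size(primary_key_sizes, static_sizes, regular_sizes,
                       clustering_sizes, n_rows):
        key_part = sum(primary_key_sizes)      # counted once per partition
        static_part = sum(static_sizes)        # counted once per partition
        # clustering column sizes are added for every regular column in every row
        rows_part = n_rows * sum(r + sum(clustering_sizes) for r in regular_sizes)
        return key_part + static_part + rows_part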

There are two things in this equation that I do not understand.

First: why are the clustering column sizes counted once for each regular column? Shouldn't we just multiply them by the number of rows? It seems to me that by calculating it this way we are saying that the data in each clustering column is replicated for each regular column, which, I believe, is not the case.

Second: why are the primary key columns not multiplied by the number of partitions? In my opinion, if we have a node with two partitions, we should multiply the size of the primary key columns by two, because we will have two different primary keys on that node.

+5
3 answers

This is because of the internal storage structure of Cassandra versions prior to 3.0.

  • There is only one entry for each partition key value.
  • For each distinct partition key value, there is only one entry per static column.
  • For each clustering key value there is an empty entry (the row marker).
  • For each regular column in a row, there is one entry whose name also includes the clustering key values.

Take an example:

CREATE TABLE my_table ( pk1 int, pk2 int, ck1 int, ck2 int, d1 int, d2 int, s int static, PRIMARY KEY ((pk1, pk2), ck1, ck2) ); 

Insert some dummy data:

 pk1 | pk2 | ck1 | ck2  | s     | d1     | d2
-----+-----+-----+------+-------+--------+---------
   1 |  10 | 100 | 1000 | 10000 | 100000 | 1000000
   1 |  10 | 100 | 1001 | 10000 | 100001 | 1000001
   2 |  20 | 200 | 2000 | 20000 | 200000 | 2000001

The internal structure will be:

     |       |100:1000:  |100:1000:d1|100:1000:d2|100:1001:  |100:1001:d1|100:1001:d2|
-----+-------+-----------+-----------+-----------+-----------+-----------+-----------+
1:10 | 10000 |           | 100000    | 1000000   |           | 100001    | 1000001   |

     |       |200:2000:  |200:2000:d1|200:2000:d2|
-----+-------+-----------+-----------+-----------+
2:20 | 20000 |           | 200000    | 2000000   |

Thus, the size of the table will be:

 Single Partition Size = (4 + 4 + 4 + 4) + 4 + 2 * ((4 + (4 + 4)) + (4 + (4 + 4))) bytes = 68 bytes
 Estimated Table Size = Single Partition Size * Number Of Partitions = 68 * 2 bytes = 136 bytes
  • Here every field is of type int (4 bytes)
  • There are 4 primary key columns, 1 static column, 2 clustering key columns, and 2 regular columns
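If it helps, here is a minimal Python sketch (the variable names are mine) that just re-checks the arithmetic above:

    # Re-checks the estimate above; every column is an int (4 bytes) and the
    # timestamp/metadata term is left out, as in the calculation above.
    INT = 4
    primary_key = 4 * INT        # pk1, pk2, ck1, ck2 - counted once per partition
    static_col = 1 * INT         # s - counted once per partition
    rows = 2                     # rows in partition (1, 10)
    clustering = 2 * INT         # ck1, ck2 repeated in every cell name (pre-3.0)
    per_row = 2 * (INT + clustering)        # d1 and d2: (4 + 8) + (4 + 8) = 24
    single_partition = primary_key + static_col + rows * per_row
    print(single_partition)      # 68 bytes
    print(single_partition * 2)  # 136 bytes, same estimate applied to both partitions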

More details: http://opensourceconnections.com/blog/2013/07/24/understanding-how-cql3-maps-to-cassandras-internal-data-structure/

+8

As the author, I really appreciate the question and your engagement with the material!

Regarding the original questions - remember that this is not a formula for calculating the size of a table; it is a formula for calculating the size of a single partition. The intent is to use this formula with the worst-case number of rows in order to identify partitions that would be too large. You would need to multiply the result of this equation by the number of partitions to get an estimate of the total data size for the table. And of course, this does not account for replication.

Also, thanks to those who answered the original question. Based on your feedback, I spent some time looking at the new (3.0) storage format to see whether it affects the formula. I agree that Aaron Morton's article is a useful resource (link above).

The basic approach of the formula remains sound for the 3.0 storage format. The way the formula works, you basically add:

  • partition key and static column sizes
  • clustering and regular column sizes for each row, multiplied by the number of rows
  • 8 bytes of metadata for each cell
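As a rough illustration of that summary (a sketch only; the function and parameter names are mine, not from the book):

    # Sketch of the per-partition estimate as summarized above (names are mine).
    def estimate_partition_size(partition_key_bytes, static_bytes,
                                clustering_bytes_per_row, regular_bytes_per_row,
                                n_rows, n_regular_columns, metadata_per_cell=8):
        fixed = partition_key_bytes + static_bytes           # once per partition
        per_row = clustering_bytes_per_row + regular_bytes_per_row
        cells = n_rows * n_regular_columns                   # one cell per regular value
        return fixed + n_rows * per_row + cells * metadata_per_cell

    # With the example table from the first answer (all ints, 2 rows):
    # estimate_partition_size(8, 4, 8, 8, n_rows=2, n_regular_columns=2) -> 76
    # This differs from the 68-byte estimate above because clustering values are
    # counted once per row here and 8 bytes of metadata per cell are included.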

Updating the formula for the 3.0 storage format requires revising the constants. For example, the original equation assumes 8 bytes of metadata per cell to store a timestamp. The new format treats the timestamp on a cell as optional, since it can be applied at the row level instead. Because of this, there is now a variable amount of metadata per cell, which can be as low as 1-2 bytes, depending on the data type.

After reading this feedback and re-reading that section of the chapter, I plan to update the text to add some clarifications, as well as stronger caveats that this formula is useful as an approximation, not an exact value. There are factors it does not take into account at all, such as records distributed over several SSTables, as well as tombstones. We are planning a reprint this spring (2017) to fix a few errata, so look for these changes soon.

+6

Here is the updated formula from Artem Chebotko:

S_t = \sum_i \text{sizeOf}(c_{k_i}) + \sum_j \text{sizeOf}(c_{s_j}) + N_r \times \left( \sum_k \text{sizeOf}(c_{r_k}) + \sum_l \text{sizeOf}(c_{c_l}) \right) + N_v \times \text{sizeOf}(t_{avg})

t_avg is the average amount of metadata per cell; it can vary depending on the complexity of the data, but 8 bytes is a good worst-case estimate.
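To make that concrete, here is a small Python sketch (my own illustration; the assumptions about which columns count once per partition and how N_v is computed are mine) that plugs the example table from the first answer into the updated formula, once with the worst-case 8 bytes and once with a leaner metadata estimate:

    # Plugging the example table from the first answer into the updated formula.
    # Assumptions (mine): only the partition key columns count once per partition,
    # and N_v = N_r * number of regular columns.
    partition_key = 4 + 4        # pk1, pk2
    static_cols = 4              # s
    regular_row = 4 + 4          # d1, d2 in one row
    clustering_row = 4 + 4       # ck1, ck2 in one row
    n_rows = 2                   # rows in partition (1, 10)
    n_values = n_rows * 2        # two regular values per row

    for t_avg in (8, 2):         # worst case vs. a leaner 3.0-style estimate
        size = (partition_key + static_cols
                + n_rows * (regular_row + clustering_row)
                + n_values * t_avg)
        print(t_avg, size)       # 8 -> 76 bytes, 2 -> 52 bytes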

+6

Source: https://habr.com/ru/post/1265325/

