Why does HBase need to store a column family for each value?

Because HBase tables are sparse, HBase stores for each cell not only the value but also all the information needed to identify the cell (often described as the key, not to be confused with the RowKey). The key has the following structure:

RowKey-ColumnFamily-ColumnQualifier-Timestamp

All of this information is stored for every record. That is why short names are recommended for column families and column qualifiers: they reduce the per-cell overhead.
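To make the overhead concrete, here is a rough sketch that estimates the serialized size of a single cell. It assumes the classic KeyValue layout without tags (4-byte key length, 4-byte value length, 2-byte row length, 1-byte family length, 8-byte timestamp, 1-byte key type) and ignores data block encoding and compression, which can shrink the on-disk size. The function name `cell_size` and the sample row, family, and qualifier names are made up for illustration.

```python
# Field sizes (in bytes) of the classic, tag-less KeyValue layout.
# Assumption: no tags, no data block encoding, no compression.
KEY_LEN_FIELD = 4      # int: length of the key part
VALUE_LEN_FIELD = 4    # int: length of the value part
ROW_LEN_FIELD = 2      # short: length of the row key
FAMILY_LEN_FIELD = 1   # byte: length of the column family name
TIMESTAMP_FIELD = 8    # long: cell timestamp
KEY_TYPE_FIELD = 1     # byte: Put, Delete, ...


def cell_size(row: bytes, family: bytes, qualifier: bytes, value: bytes) -> int:
    """Estimate the serialized size of one cell (hypothetical helper)."""
    key_size = (
        ROW_LEN_FIELD + len(row)
        + FAMILY_LEN_FIELD + len(family)
        + len(qualifier)
        + TIMESTAMP_FIELD
        + KEY_TYPE_FIELD
    )
    return KEY_LEN_FIELD + VALUE_LEN_FIELD + key_size + len(value)


# Same row and value, once with long names and once with short names.
long_names = cell_size(b"user#0001", b"profile_data", b"email_address", b"a@b.c")
short_names = cell_size(b"user#0001", b"p", b"e", b"a@b.c")

print(long_names)                # 59 bytes for a 5-byte value
print(short_names)               # 36 bytes for the same value
print(long_names - short_names)  # 23 bytes saved per cell
```

In this example the value is only 5 bytes, so the key dominates the cell size either way, and the family and qualifier names are repeated in every single cell.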

My question is: why does the ColumnFamily have to be stored for each record? As far as I understand, each store file belongs to exactly one column family. Wouldn't it be enough to store the column family name once per store file? That would reduce the overhead, column family names of any length could be used, and the column family could still be identified for each record. What am I missing here?

+4
2 answers

, , , , RPC. , , . , , , , HBase . , , , . Kiji, , , , , .

0

, HBase . HBase . . . HFiles . HFiles , . , HF .

+1

Source: https://habr.com/ru/post/1546678/

