To answer the question about physical storage: the key feature that makes Cassandra writes fast is that they are append-only. That is, Cassandra only writes sequential blocks to disk; it never has to do slow seeks to random disk locations while writing.
When a column is updated, two things happen: the write is appended to the commit log (for crash recovery) and the in-memory Memtable is updated. When the Memtable is full, it is flushed to disk as a new SSTable. Thus, the length of the data does not matter, since you are not trying to fit it into a fixed-length on-disk structure.
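The write path described above can be sketched in a few lines of Python. This is a toy model, not Cassandra's actual code: the class name `ToyStore`, the `memtable_limit` parameter, and the use of in-memory lists in place of real files are all assumptions made for illustration.

```python
class ToyStore:
    """Toy model of the write path: commit log + Memtable + flushed SSTables."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of every write (stands in for a file)
        self.memtable = {}     # in-memory key -> latest value
        self.sstables = []     # flushed tables, oldest first; never modified afterwards
        self.memtable_limit = memtable_limit  # hypothetical "full" threshold

    def write(self, key, value):
        # 1. Append to the commit log so the write can be replayed after a crash.
        self.commit_log.append((key, value))
        # 2. Update the Memtable in memory.
        self.memtable[key] = value
        # 3. When the Memtable is full, flush it as a new SSTable.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # An SSTable is written once, sorted by key, and then left alone.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}


store = ToyStore(memtable_limit=2)
store.write("a", 1)
store.write("b", 2)   # Memtable is now full, so it is flushed
store.write("a", 3)   # the new value goes to a fresh Memtable; the old SSTable is untouched
```

Note that the update to `"a"` never touches the already flushed table, which is why the size of the new value is irrelevant.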
SSTables are immutable: you never go back and overwrite an old value when updating, you just write a new one. On a read, Cassandra first looks in the Memtable for the key. If it does not find it there, Cassandra scans the SSTables in order from newest to oldest and stops when it finds the key. This gives you the latest value.
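As a rough sketch of that lookup order, assuming each SSTable is modelled as a plain dict and the list is kept oldest first (the function name `read` and this representation are illustrative, not Cassandra's API):

```python
def read(key, memtable, sstables):
    """Return the newest value for key, or None if it was never written."""
    # The Memtable holds the most recent writes, so check it first.
    if key in memtable:
        return memtable[key]
    # sstables is ordered oldest first, so walk it backwards (newest first)
    # and stop at the first table that has the key.
    for table in reversed(sstables):
        if key in table:
            return table[key]
    return None


memtable = {"c": 9}
sstables = [{"a": 1, "b": 2}, {"a": 3}]   # the second table is newer
latest_a = read("a", memtable, sstables)  # found in the newer SSTable
```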
There are several optimizations. Each SSTable has an associated Bloom filter for its keys, which is a compact probabilistic index that can produce false positives but never false negatives. If the Bloom filter says the key is not present, you can safely skip that SSTable because it cannot contain the key; because of false positives, though, you will occasionally read an SSTable that you did not need to.
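To make the "false positives but never false negatives" property concrete, here is a minimal Bloom filter sketch. The bit-array size, the number of hashes, and the SHA-256 based hashing are assumptions chosen for simplicity; they are not what Cassandra uses internally.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: add() sets a few bits, might_contain() checks them."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


bloom = BloomFilter()
for k in ("alice", "bob", "carol"):
    bloom.add(k)
```

A key that was added always reports `True`. A key that was never added usually reports `False`, but may report `True` if other keys happened to set the same bits, in which case the SSTable is read for nothing.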
When you get too many SSTables, they are merged together into a larger one by a process called compaction. Essentially, this is one big merge sort over the SSTables. It lets Cassandra reclaim the space used by values that have been overwritten or deleted, and defragment rows that have been spread across multiple SSTables.
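A simplified sketch of compaction, under the same assumptions as before (SSTables as dicts, oldest first). The `TOMBSTONE` marker and the `compact` function are hypothetical names, and the sketch drops deleted keys immediately, which glosses over how long a real system has to retain deletion markers.

```python
TOMBSTONE = object()  # hypothetical marker meaning "this key was deleted"


def compact(sstables):
    """Merge SSTables (oldest first) into one, keeping only the newest live values."""
    merged = {}
    for table in sstables:
        # Later (newer) tables overwrite values from earlier (older) ones.
        merged.update(table)
    # Drop deleted keys and emit the result sorted by key, like any SSTable.
    return {k: v for k, v in sorted(merged.items()) if v is not TOMBSTONE}


old = {"a": 1, "b": 2}
new = {"a": 3, "b": TOMBSTONE, "c": 4}
compacted = compact([old, new])
```

Here the stale value of `"a"` and the deleted `"b"` both disappear, which is where the reclaimed space comes from.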
See http://www.mikeperham.com/2010/03/13/cassandra-internals-writing/ and http://wiki.apache.org/cassandra/MemtableSSTable for more information.