Cassandra replication or RAID

With traditional DBMSs we usually use RAID 10, but if we run Cassandra with RF = 2 we already have one extra copy as a backup. In that case, is there any reason to use RAID 10, and if so, why?

I think this would reduce the replication overhead on Cassandra.

In addition, with RAID 10 the node keeps working if a hard drive fails. If only replication is used, will a single hard drive failure take the whole node down?

Although I think RAID 10 adds overhead on every write, flushing to disk happens only when the SSTable is full, so the overhead will not be felt all the time.

+6
1 answer

I would say that RAID 10 is a waste of money. Two reasons:

1) One of the important attributes of BigTable-style stores (Cassandra or HBase) is the ability to expand a cluster or add redundancy quickly and cheaply by adding new servers. Based on recent prices, RAID 10 (striping AND mirroring) is so expensive that it costs almost as much as adding another whole server with JBOD storage.

2) Cassandra replication protects you from machine failures, not just disk failures. RAID 10 will not protect you if your processor dies, but Cassandra replication will. Replication also protects you from disk failure and lets multiple clients read from multiple nodes, which prevents hot spots.
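
To put rough numbers on both points, here is a minimal Python sketch. The node count, disk count and disk size are made-up values for illustration, and all function names are hypothetical. It ignores filesystem overhead and the free space Cassandra needs for compaction. The only rules it relies on are that RAID 10 leaves half of the raw capacity usable and that QUORUM needs floor(RF / 2) + 1 replicas to respond.

```python
def raid10_usable_tb(disks: int, disk_tb: float) -> float:
    """RAID 10 mirrors every disk, so half of the raw capacity is usable."""
    if disks < 4 or disks % 2 != 0:
        raise ValueError("RAID 10 needs an even number of disks, at least 4")
    return disks * disk_tb / 2


def jbod_usable_tb(disks: int, disk_tb: float) -> float:
    """JBOD exposes every disk as-is, so all raw capacity is usable."""
    return disks * disk_tb


def cluster_unique_tb(nodes: int, usable_per_node_tb: float, rf: int) -> float:
    """Unique data the cluster can hold when each row is stored rf times."""
    return nodes * usable_per_node_tb / rf


def replicas_required(rf: int, level: str) -> int:
    """Replicas that must respond for a given consistency level."""
    required = {"ONE": 1, "QUORUM": rf // 2 + 1, "ALL": rf}
    return required[level]


def tolerated_replica_failures(rf: int, level: str) -> int:
    """Replicas that may be down while requests at this level still succeed."""
    return rf - replicas_required(rf, level)


# Hypothetical cluster: 4 nodes, each with 4 disks of 2 TB, RF = 2.
nodes, disks, disk_tb, rf = 4, 4, 2.0, 2

raid10_total = cluster_unique_tb(nodes, raid10_usable_tb(disks, disk_tb), rf)
jbod_total = cluster_unique_tb(nodes, jbod_usable_tb(disks, disk_tb), rf)

print("unique data with RAID 10 per node (TB):", raid10_total)
print("unique data with JBOD per node (TB):", jbod_total)
print("RF=2, ONE tolerates:", tolerated_replica_failures(2, "ONE"))
print("RF=2, QUORUM tolerates:", tolerated_replica_failures(2, "QUORUM"))
print("RF=3, QUORUM tolerates:", tolerated_replica_failures(3, "QUORUM"))
```

With these assumed numbers the same hardware holds 8 TB of unique data under RAID 10 and 16 TB under JBOD. Note also that with RF = 2 a QUORUM request needs both replicas, so one lost replica is only tolerated at consistency level ONE.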

+5

Source: https://habr.com/ru/post/896148/

