I am currently running a 12-node Cassandra cluster storing 4 TB of data, with a replication factor of 3. To support an application update, we need to reconfigure our keyspace, and we would like to avoid any downtime if possible.
I read on the mailing list that the best way to do this is:
- Kill the Cassandra process on one server in the cluster
- Start it again, wait until the commit log is written to disk, and kill it again
- Make the changes to the storage.xml file
- Rename or delete the files in the data directories to match those changes
- Start Cassandra
- Repeat from the first step with the next server in the list
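To check that I have the ordering right, the steps above can be sketched as a dry-run plan in Python. This is only an illustration of the sequence, assuming the second step ends with the process stopped so that the configuration can be edited. The step labels, host names, and the `execute` callback are hypothetical placeholders, not real Cassandra commands.

```python
from typing import Callable, List, Tuple

# Hypothetical labels mirroring the list above; these are not real commands.
STEPS: List[str] = [
    "stop cassandra",
    "start cassandra, wait for commit log to reach disk, stop again",
    "edit storage.xml",
    "rename or delete affected data files",
    "start cassandra",
]


def rolling_plan(hosts: List[str]) -> List[Tuple[str, str]]:
    """Return (host, step) pairs, finishing one host before starting the next."""
    return [(host, step) for host in hosts for step in STEPS]


def run_rolling(hosts: List[str], execute: Callable[[str, str], bool]) -> List[str]:
    """Run every step on one host at a time.

    `execute` is a caller-supplied (hypothetical) function that performs a
    step on a host and returns True on success. The loop stops at the first
    failure so that at most one node is left in a half-changed state.
    Returns the hosts that were fully processed.
    """
    completed: List[str] = []
    for host in hosts:
        for step in STEPS:
            if not execute(host, step):
                return completed
        completed.append(host)
    return completed


if __name__ == "__main__":
    # Dry run: print the plan instead of touching any server.
    for host, step in rolling_plan(["node01", "node02"]):
        print(f"{host}: {step}")
```

Handling one node at a time means only one replica of any range is down at once, which is the point of doing this as a rolling change with a replication factor of 3.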
My questions:
- Have I understood this process correctly?
- Is there a risk of data corruption?
- During the process, servers in the same cluster, serving the same keyspace, will have different versions of the storage.xml file. Is this a problem?
- Does the answer to the previous question change if, besides adding, renaming, and deleting ColumnFamilies, we also change the CompareWith parameter or turn an existing column family into a super column family? Or do we need to give it a new name in that case?
Thank you for your responses. This is the first time I will be doing this, and I'm a little nervous.