Cassandra 1.2: how to get the real load on each virtual node

I have a Cassandra 1.2 cluster that uses virtual nodes and ByteOrderedPartitioner. I know this is not recommended, because it means I have to make sure the data keys are evenly distributed across the key space so that the load is spread evenly over the physical nodes. The problem I am facing is that I cannot find a way to see the actual load on each virtual node. If I run nodetool as follows:

nodetool status 

I get output like this:

 Datacenter: datacenter1
 =======================
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address          Load      Tokens  Owns   Host ID                               Rack
 UN  XXX.XXX.XXX.XXX  14.73 GB  256     11.3%  a4d365ca-f21b-4418-ab0e-656520d931b5  rack1
 UN  XXX.XXX.XXX.XXX  8.51 GB   256     10.6%  f587fe0b-e765-4c02-bd50-cef9758e9a6b  rack1
 UN  XXX.XXX.XXX.XXX  10.92 GB  256     10.3%  6160ca91-1e07-47ec-8fa9-ef886c140e91  rack1
 UN  XXX.XXX.XXX.XXX  9.62 GB   256     10.0%  9c4a8476-1de2-455b-956a-c4cea31675bf  rack1
 UN  XXX.XXX.XXX.XXX  11.11 GB  256     11.2%  61639d9c-ad49-4f38-86b3-cd48e0c90c49  rack1
 UN  XXX.XXX.XXX.XXX  7.86 GB   256     35.1%  195b6f79-7d68-4a98-8a9b-55bd0dd699e2  rack1
 UN  XXX.XXX.XXX.XXX  11.29 GB  256     11.4%  0ac03b6a-0a0e-4f83-8b9e-2f16d4db47ab  rack1

This shows that the distribution is not very good, but I want to see the actual distribution across the virtual nodes. The problem is that running:

 nodetool ring 

gives me a lot of rows, one for each virtual node (256 in total) on the node where I run the command, but the information is practically useless: the load looks the same for every virtual node, and the size shown is unrealistic for a single virtual node when compared with the totals reported for the physical node.

 XXX.XXX.XXX.XXX  rack1  Up  Normal  11.29 GB  11.45%  Token(bytes[2daad5a3e325e152d7be5bc2d5f87fef])
 XXX.XXX.XXX.XXX  rack1  Up  Normal  11.29 GB  11.45%  Token(bytes[2ffef9060e59c1c922a1ecf8e2643794])
 XXX.XXX.XXX.XXX  rack1  Up  Normal  11.29 GB  11.45%  Token(bytes[31041cc591d63d91a67a21ecf44a57c2])
 XXX.XXX.XXX.XXX  rack1  Up  Normal  11.29 GB  11.45%  Token(bytes[31bbcaafcdcb2ecc3a4ef3fb3af4b82b])
 XXX.XXX.XXX.XXX  rack1  Up  Normal  11.29 GB  11.45%  Token(bytes[324e972b43b63d63df4255e459fed524])
 XXX.XXX.XXX.XXX  rack1  Up  Normal  11.29 GB  11.45%  Token(bytes[3353224ae20e902e5b2b243c8fc5ff97])
 XXX.XXX.XXX.XXX  rack1  Up  Normal  11.29 GB  11.45%  Token(bytes[350ed29fa9a1a377b8014beef1d160f0])
 XXX.XXX.XXX.XXX  rack1  Up  Normal  11.29 GB  11.45%  Token(bytes[3553ad83beaf91d98a692e22718e321d])
 XXX.XXX.XXX.XXX  rack1  Up  Normal  11.29 GB  11.45%  Token(bytes[35893a82c84982c467251115a7406f00])
 XXX.XXX.XXX.XXX  rack1  Up  Normal  11.29 GB  11.45%  Token(bytes[37fad1c7dbd8d66d75747699ce4d6d2e])
 XXX.XXX.XXX.XXX  rack1  Up  Normal  11.29 GB  11.45%  Token(bytes[388bcf470bd5c97e1f3cb45c01bd1f2c])
 XXX.XXX.XXX.XXX  rack1  Up  Normal  11.29 GB  11.45%  Token(bytes[38a0cdc654a9934e5a16e5242c26fc5f])
 XXX.XXX.XXX.XXX  rack1  Up  Normal  11.29 GB  11.45%  Token(bytes[393b8185b527f036cd44f5f6791484b9])
 XXX.XXX.XXX.XXX  rack1  Up  Normal  11.29 GB  11.45%  Token(bytes[39ae4356a22bbb5ea20d5c6fc83cd2de])
 XXX.XXX.XXX.XXX  rack1  Up  Normal  11.29 GB  11.45%  Token(bytes[39dd01bb66beeeb46627f0303671c30d])
 XXX.XXX.XXX.XXX  rack1  Up  Normal  11.29 GB  11.45%  Token(bytes[3a49f707a7cea045935524900094c4e4])
 XXX.XXX.XXX.XXX  rack1  Up  Normal  11.29 GB  11.45%  Token(bytes[3a58eba6a5730a75fd899cf77c93d6cb])

My question is: is there another tool or way to get the real load on each virtual node in a Cassandra cluster?

Thanks in advance!

2 answers

When you run nodetool ring without a keyspace argument, it calculates the load distribution as if SimpleStrategy were used for replication. If your tokens are set up correctly for NetworkTopologyStrategy, the result will look "off".

Since the replication strategy determines the load, and each keyspace can have a different replication strategy, you need to pass the keyspace name as the second argument to see the true load distribution for that keyspace.

If you use NetworkTopologyStrategy, nodetool ring <keyspace> takes the layout of data centers and racks into account when determining the token distribution, and gives an accurate load value.
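
To make the per-keyspace output easier to read, here is a minimal Python sketch that runs nodetool ring <keyspace> and groups the rows by node. It assumes rows shaped like the ones quoted in the question (address, rack, status, state, load value, load unit, owns, token); the function names and the keyspace name are made up for illustration. Note that it only summarizes what nodetool prints: the Load column is still a per-node figure repeated on every row, so this does not produce a per-virtual-node load.

```python
import subprocess
from collections import defaultdict


def parse_ring(text):
    """Group `nodetool ring` rows by node address.

    Assumes each vnode row has eight whitespace-separated fields:
    address, rack, status, state, load value, load unit, owns, token.
    """
    nodes = defaultdict(lambda: {"load": None, "owns": None, "tokens": []})
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, blank lines and anything that is not a vnode row.
        if len(fields) != 8 or not fields[6].endswith("%"):
            continue
        address, _rack, _status, _state, size, unit, owns, token = fields
        node = nodes[address]
        node["load"] = f"{size} {unit}"          # node-level load, repeated per row
        node["owns"] = float(owns.rstrip("%"))   # ownership as a percentage
        node["tokens"].append(token)             # one token per virtual node
    return dict(nodes)


def ring_for_keyspace(keyspace):
    """Run `nodetool ring <keyspace>` and parse its output (hypothetical helper)."""
    result = subprocess.run(
        ["nodetool", "ring", keyspace],
        capture_output=True, text=True, check=True,
    )
    return parse_ring(result.stdout)
```

On a host where nodetool is on the PATH, calling ring_for_keyspace("my_keyspace") (a placeholder keyspace name) should return one entry per node with its load, its ownership percentage for that keyspace, and the list of tokens it holds.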

Have you tried Cassandra OpsCenter? http://www.datastax.com/what-we-offer/products-services/datastax-opscenter

I'm not sure (I have never tried it) whether you can get the real load for each virtual node, but it is a great tool for monitoring and managing your database.

Source: https://habr.com/ru/post/1484941/

