Cassandra CQL range query rejected despite equality operator and secondary index

From the table below, I am trying to select all pH readings that are below 5.

I have followed these three pieces of advice:

  • Using ALLOW FILTERING
  • Including an equality comparison
  • Creating a secondary index on the reading_value column

Here is my query:

select * from todmorden_numeric where sensor_name = 'pHradio' and reading_value < 5 allow filtering; 

It was rejected with this message:

 Bad Request: No indexed columns present in by-columns clause with Equal operator 

I tried adding a secondary index on the sensor_name column, but was told that it is already part of the key and therefore already indexed.

I created the index after the table had been in use for a while. Could that be the problem? I ran a refresh with nodetool in the hope that it would make the index available, but that did not work. Here is the output of describe table todmorden_numeric:

 CREATE TABLE todmorden_numeric (
   sensor_name text,
   reading_time timestamp,
   reading_value float,
   PRIMARY KEY ((sensor_name), reading_time)
 ) WITH
   bloom_filter_fp_chance=0.010000 AND
   caching='KEYS_ONLY' AND
   comment='Data that suits being stored as floats' AND
   dclocal_read_repair_chance=0.000000 AND
   gc_grace_seconds=864000 AND
   index_interval=128 AND
   read_repair_chance=0.100000 AND
   replicate_on_write='true' AND
   populate_io_cache_on_flush='false' AND
   default_time_to_live=0 AND
   speculative_retry='99.0PERCENTILE' AND
   memtable_flush_period_in_ms=0 AND
   compaction={'class': 'SizeTieredCompactionStrategy'} AND
   compression={'sstable_compression': 'LZ4Compressor'};

 CREATE INDEX todmorden_numeric_reading_value_idx ON todmorden_numeric (reading_value);
1 answer

Cassandra allows range queries only on:

a) the partition key, and only if ByteOrderedPartitioner is used (the default is now Murmur3Partitioner).

b) a single clustering key, and ONLY IF every clustering key defined before the target column in the primary key definition is already restricted with the = operator in the predicate.

Range queries do not work on secondary indexes.

Consider the following table definition:

 CREATE TABLE tod1 (name text, time timestamp, val float, PRIMARY KEY (name, time)); 

In this case, you CANNOT run a range query on val.

Consider this:

 CREATE TABLE tod2 (name text, time timestamp, val float, PRIMARY KEY (name, time, val)); 

Then the following is valid:

 SELECT * FROM tod2 WHERE name='X' AND time='timehere' AND val < 5; 

That query is rather pointless, but the following is not valid:

 SELECT * from tod2 WHERE name='X' AND val < 5; 

It is not valid because you have not restricted the preceding clustering key in the primary key definition (in this case, time).
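
The clustering-key rule can be sketched as a small check. This is only a toy model of the rule as stated in this answer, not Cassandra's actual validation code; the function name range_allowed and its arguments are made up for illustration, and it assumes the partition key (name) is already restricted with =.

```python
def range_allowed(clustering_columns, equality_columns, range_column):
    """Return True if a range restriction on range_column fits the rule above.

    clustering_columns: clustering keys in primary-key declaration order.
    equality_columns:   clustering keys restricted with '=' in the WHERE clause.
    """
    if range_column not in clustering_columns:
        # A regular column (such as val in tod1) cannot take a range at all.
        return False
    position = clustering_columns.index(range_column)
    # Every clustering key declared before the target must use '='.
    return all(col in equality_columns for col in clustering_columns[:position])


# tod2: PRIMARY KEY (name, time, val), so the clustering keys are time, val.
tod2 = ["time", "val"]
print(range_allowed(tod2, {"time"}, "val"))  # time = ... AND val < 5 -> True
print(range_allowed(tod2, set(), "val"))     # val < 5 alone          -> False
```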

For your query, you could do this:

 CREATE TABLE tod3 (name text, time timestamp, val float, PRIMARY KEY (name, val, time)); 

Note the column order of the primary key: val before time.

This will allow you to do:

 SELECT * from tod3 WHERE name='asd' AND val < 5; 
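
One way to picture why this works: within a partition, rows are kept sorted by the clustering keys in declaration order, so with val first, "val < 5" is a single contiguous slice. The following is a rough in-memory sketch of that idea, not of Cassandra's storage engine; the readings and the helper val_below are invented for illustration.

```python
from bisect import bisect_left

# Rows of one partition (name='asd'), sorted by the clustering keys in
# declaration order: (val, time). The readings are made-up sample data.
rows = sorted([
    (4.2, "2014-05-01T10:00"),
    (6.8, "2014-05-01T11:00"),
    (3.9, "2014-05-01T12:00"),
    (7.1, "2014-05-01T13:00"),
    (4.9, "2014-05-01T14:00"),
])


def val_below(partition, limit):
    """Return the rows with val < limit as one contiguous slice."""
    # (limit,) sorts before every (limit, time) tuple, so bisect_left
    # lands on the first row whose val is >= limit.
    end = bisect_left(partition, (limit,))
    return partition[:end]


print(val_below(rows, 5))
```

With time declared first instead, the rows below 5 would be scattered through the partition, which is why the same range is rejected on tod2 unless time is fixed with =.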

On the other hand, how long are you going to keep the data? How often do you take readings? This could make your partition grow very quickly. You might want to split it across several partitions (manual sharding). Perhaps one partition per day? Of course, such things depend heavily on your access patterns.
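
A minimal sketch of the one-partition-per-day idea, on the client side. It assumes a table whose partition key is the pair (name, day), for example PRIMARY KEY ((name, day), val, time), where day is a hypothetical extra text column; the helper names bucket_key and buckets_between are likewise made up for illustration.

```python
from datetime import date, datetime, timedelta


def bucket_key(sensor_name, reading_time):
    """Composite partition key: one partition per sensor per day."""
    return (sensor_name, reading_time.date().isoformat())


def buckets_between(sensor_name, first_day, last_day):
    """List every partition key a read over [first_day, last_day] must visit."""
    days = (last_day - first_day).days
    return [
        (sensor_name, (first_day + timedelta(days=n)).isoformat())
        for n in range(days + 1)
    ]


print(bucket_key("pHradio", datetime(2014, 5, 1, 10, 30)))
print(buckets_between("pHradio", date(2014, 5, 1), date(2014, 5, 3)))
```

The trade-off is that a read spanning several days has to issue one query per bucket, so the bucket size should follow the time window you usually read.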

Hope this helps.


Source: https://habr.com/ru/post/1244214/

