How to query Cassandra with a timestamp column in the WHERE clause

I have the following Cassandra table:

    cqlsh:mydb> describe table events;

    CREATE TABLE mydb.events (
        id uuid PRIMARY KEY,
        country text,
        insert_timestamp timestamp
    ) WITH bloom_filter_fp_chance = 0.01
        AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
        AND comment = ''
        AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
        AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND dclocal_read_repair_chance = 0.1
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99.0PERCENTILE';

    CREATE INDEX country_index ON mydb.events (country);
    CREATE INDEX insert_timestamp_index ON mydb.events (insert_timestamp);

As you can see, an index has already been created on the insert_timestamp column.

I have read through https://stackoverflow.com/a/312969/

I believe this is the correct query, but it fails both with and without ALLOW FILTERING:

    cqlsh:mydb> select * from events where insert_timestamp >= '2016-03-01 08:27:22+0000';
    InvalidRequest: code=2200 [Invalid query] message="No secondary indexes on the restricted columns support the provided operators: 'insert_timestamp >= <value>'"

    cqlsh:mydb> select * from events where insert_timestamp >= '2016-03-01 08:27:22+0000' ALLOW FILTERING;
    InvalidRequest: code=2200 [Invalid query] message="No secondary indexes on the restricted columns support the provided operators: 'insert_timestamp >= <value>'"

But a query with the country column in the WHERE clause works:

    cqlsh:mydb> select * from events where country = 'my';

     id                                   | country | insert_timestamp
    --------------------------------------+---------+--------------------------
     53167d6a-e125-46ff-bacf-f5b267de0258 |      my | 2016-03-01 08:27:22+0000

Any idea why a query with a timestamp as a condition does not work? Is there something wrong with the query syntax?

4 answers

Any idea why a query with a timestamp as a condition does not work? Is there something wrong with the query syntax?

Cassandra's native secondary index is limited to the = predicate. To enable inequality predicates, you need to add ALLOW FILTERING, but it will perform a full cluster scan :-(

If you can afford to wait a couple of weeks, Cassandra 3.4 will be released with a new secondary index implementation, SASI, which is much more efficient for range queries: https://github.com/apache/cassandra/blob/trunk/doc/SASI.md
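As a rough sketch of what switching to SASI could look like on Cassandra 3.4 or later, the Python helper below only builds the CQL text; it does not talk to a cluster. The helper name and the new index name are hypothetical, the statements assume the existing regular index is dropped first, and the default SASI mode is used, so check the SASI document linked above before relying on it.

```python
# Fully qualified class name of the SASI index implementation (Cassandra 3.4+).
SASI_CLASS = "org.apache.cassandra.index.sasi.SASIIndex"


def sasi_statements(keyspace, table, column, old_index, new_index):
    """Return CQL text that replaces a regular secondary index with a SASI one.

    Hypothetical helper for illustration; it only formats strings.
    """
    return [
        # Drop the regular index, which only supports equality lookups.
        f"DROP INDEX IF EXISTS {keyspace}.{old_index};",
        # Create a SASI index on the same column (default mode).
        f"CREATE CUSTOM INDEX {new_index} ON {keyspace}.{table} ({column}) "
        f"USING '{SASI_CLASS}';",
    ]


if __name__ == "__main__":
    for statement in sasi_statements(
        "mydb", "events", "insert_timestamp",
        "insert_timestamp_index", "insert_timestamp_sasi",
    ):
        print(statement)
```

With such an index in place, the range query from the question should no longer be rejected for lack of a supporting index.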


Direct queries on secondary indexes support only =, CONTAINS or CONTAINS KEY restrictions.

Secondary index queries allow you to restrict the returned results using the =, >, >=, <= and <, CONTAINS and CONTAINS KEY restrictions on non-indexed columns using filtering.

So your query should work once you add ALLOW FILTERING to it:

 select * from events where insert_timestamp >= '2016-03-01 08:27:22+0000' ALLOW FILTERING; 

The link you mentioned in your question has the timestamp column as a clustering key. That is why it works there.

According to a comment, range queries on a secondary index are not supported until version 2.2.x.

FYI: When Cassandra needs to execute a secondary index query, it contacts all nodes to check the portion of the secondary index located on each node. Therefore, in Cassandra it is considered an anti-pattern to index a column with high cardinality, such as a timestamp. You should consider modifying the data model to suit your needs.
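One common way to remodel for time-range reads is to partition by a time bucket and make the timestamp a clustering key, so each range query hits a single partition. The sketch below is only an illustration under assumptions: the events_by_day table, its day column, and both helper functions are hypothetical and not part of the original schema. It computes the day buckets a range spans and builds one parameterized query per bucket, using %s placeholders in the style of the DataStax Python driver.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical table, not from the original post:
#   CREATE TABLE mydb.events_by_day (
#       day text, insert_timestamp timestamp, id uuid, country text,
#       PRIMARY KEY ((day), insert_timestamp, id));


def day_buckets(start, end):
    """Return the 'YYYY-MM-DD' partition keys covering [start, end] in UTC.

    Both arguments are expected to be timezone-aware datetimes.
    """
    if end < start:
        raise ValueError("end must not precede start")
    current = start.astimezone(timezone.utc).date()
    last = end.astimezone(timezone.utc).date()
    days = []
    while current <= last:
        days.append(current.isoformat())
        current += timedelta(days=1)
    return days


def range_queries(start, end):
    """Build one (cql, params) pair per day bucket in the range."""
    cql = (
        "SELECT id, country, insert_timestamp FROM mydb.events_by_day "
        "WHERE day = %s AND insert_timestamp >= %s AND insert_timestamp <= %s"
    )
    return [(cql, (day, start, end)) for day in day_buckets(start, end)]


if __name__ == "__main__":
    start = datetime(2016, 3, 1, 8, 27, 22, tzinfo=timezone.utc)
    end = datetime(2016, 3, 3, tzinfo=timezone.utc)
    for cql, params in range_queries(start, end):
        print(params[0], cql)
```

Each query restricts the partition key with = and the clustering key with a range, which Cassandra supports without a secondary index or ALLOW FILTERING. The bucket size (a day here) is a design choice that depends on write volume.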


An index in Cassandra is very different from an index in a relational database. One of the differences is that range queries on a Cassandra index are not allowed at all. Typically, a range query only works on clustering keys (it can also work on partition keys if ByteOrderedPartitioner is used, but that is uncommon), which means that you need to design your columns carefully around your expected query patterns. Fooobar.com/questions/1244214 / ... already covers this.

To understand when to use a Cassandra index (it is designed for fairly specific cases) and its limitations, this one is a good read.


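Using the cequel ORM:

    require 'date'

    now = DateTime.now
    # Midnight today, in the local UTC offset.
    today = DateTime.new(now.year, now.month, now.day, 0, 0, 0, now.zone)
    # DateTime#+ adds days, not seconds, so add 1 to get midnight tomorrow.
    tomorrow = today + 1
    MyObject.allow_filtering!.where("done_date" => today..tomorrow).select("*")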

Worked for me.


Source: https://habr.com/ru/post/1244201/

