Date and time query columns in Cassandra

We are trying to store and query data in a column family (CF) with the following structure: (datetime, datetime, integer).

e.g.

 03-22-2012 10.00, 03-22-2012 10.30  100
 03-22-2012 10.30, 03-22-2012 11.00  50
 03-22-2012 11.00, 03-22-2012 11.30  200

How can we model the above structure in Cassandra and execute the following queries through Hector?

 select * from <CF> where datetime1 > 03-22-2012 10.00 and datetime2 < 03-22-2012 10.30
 select * from <CF> where datetime1 > 03-22-2012 10.00 and datetime2 < 03-22-2012 11.00
 select * from <CF> where datetime = 03-22-2012 (i.e. for the entire day)
2 answers

This is a great introduction to working with dates and times in Cassandra: Basic time series with Cassandra.

In short, use timestamps (or v1 UUIDs) as the column names and set the comparator to LongType (or TimeUUIDType) so that the columns sort chronologically. It is then easy to fetch a slice of data between two points in time.
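The idea can be illustrated without a cluster. The following is a minimal plain-Java sketch, not Hector code: a `TreeMap` stands in for one wide row whose column names are epoch-millisecond timestamps kept in sorted order, and `subMap` stands in for a column slice. The class and method names are hypothetical, and the inclusive bounds reflect my understanding of how Thrift slice ranges behave.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.NavigableMap;
import java.util.TreeMap;

// Plain-Java model of one wide row whose column names are timestamps,
// kept in the sorted order a LongType comparator would give them.
class TimeSeriesRowSketch {

    // Column name (epoch milliseconds) -> column value (the integer from the question).
    private final NavigableMap<Long, Integer> columns = new TreeMap<>();

    // Helper: epoch milliseconds for a UTC date and time.
    static long millis(int year, int month, int day, int hour, int minute) {
        return LocalDateTime.of(year, month, day, hour, minute)
                .toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    void insert(long timestamp, int value) {
        columns.put(timestamp, value);
    }

    // Stand-in for a column slice between a start and a finish column name.
    // Both bounds are treated as inclusive here (an assumption of this sketch).
    NavigableMap<Long, Integer> slice(long start, long finish) {
        return columns.subMap(start, true, finish, true);
    }

    public static void main(String[] args) {
        TimeSeriesRowSketch row = new TimeSeriesRowSketch();
        row.insert(millis(2012, 3, 22, 10, 0), 100);
        row.insert(millis(2012, 3, 22, 10, 30), 50);
        row.insert(millis(2012, 3, 22, 11, 0), 200);

        // All columns from 10:00 up to and including 10:30.
        long start = millis(2012, 3, 22, 10, 0);
        long finish = millis(2012, 3, 22, 10, 30);
        System.out.println(row.slice(start, finish).values());
    }
}
```

The "entire day" query from the question is the same slice with the start at 00:00 of that day and the finish just before 00:00 of the next.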

Your question is not entirely clear, but if you want to get all the events that occurred during a given time-of-day range, regardless of the date, you will need to structure your data differently. In that case the column names can be CompositeType(LongType, AsciiType), where the first component is the normal timestamp modulo 86400 (the number of seconds in a day) and the second component is the date or something else that changes over time, such as a full timestamp. You would also want to split the row in this case, perhaps dedicating a different row to each hour.
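As a rough illustration of that layout, the sketch below computes the first composite component (seconds since midnight) and a per-hour row key from an epoch timestamp. It is plain Java with no Cassandra calls; the `hour-N` row key format and all names are hypothetical, and timestamps are assumed to be in UTC.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Derives the time-of-day column component and an hourly row key
// from an epoch timestamp in seconds (UTC assumed).
class TimeOfDayColumnSketch {

    static final long SECONDS_PER_DAY = 86400L;

    // First composite component: seconds elapsed since midnight.
    static long secondsOfDay(long epochSeconds) {
        return Math.floorMod(epochSeconds, SECONDS_PER_DAY);
    }

    // Hypothetical row key: one row per hour of the day, e.g. "hour-10".
    static String rowKeyForHour(long epochSeconds) {
        return "hour-" + (secondsOfDay(epochSeconds) / 3600);
    }

    public static void main(String[] args) {
        long ts = LocalDateTime.of(2012, 3, 22, 10, 30).toEpochSecond(ZoneOffset.UTC);
        // The composite column name would be (secondsOfDay, full timestamp).
        System.out.println(rowKeyForHour(ts) + " -> (" + secondsOfDay(ts) + ", " + ts + ")");
    }
}
```

With this layout, "everything between 10:00 and 10:30 on any date" becomes a slice over the first component within the row for hour 10.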


Unfortunately, there is no easy way to do this with just one column family in Cassandra. The problem is that you want Cassandra to sort on two different things: datetime1 and datetime2.

The obvious structure would be for your column names to be composites of type (TimeUUID, TimeUUID, Integer). In that case they are sorted by datetime1, then datetime2, then the integer.

But the results will always be ordered by datetime1, not datetime2 (datetime2 only decides the order among records that share the same datetime1).
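A small plain-Java sketch of that ordering, with a `Comparator` standing in for the composite comparator and small numbers standing in for timestamps (all names here are hypothetical):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Models composite column names as {datetime1, datetime2, value} and sorts
// them component by component, the way a composite comparator would.
class CompositeOrderSketch {

    static final Comparator<long[]> COMPOSITE_ORDER =
            Comparator.<long[]>comparingLong(c -> c[0])
                    .thenComparingLong(c -> c[1])
                    .thenComparingLong(c -> c[2]);

    // Returns a sorted copy; the input list is left untouched.
    static List<long[]> sorted(List<long[]> columns) {
        List<long[]> copy = new ArrayList<>(columns);
        copy.sort(COMPOSITE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        List<long[]> columns = new ArrayList<>();
        columns.add(new long[] {3, 4, 200});
        columns.add(new long[] {1, 9, 100});
        columns.add(new long[] {1, 2, 50});
        // Print in composite order as datetime1:datetime2:value.
        for (long[] c : sorted(columns)) {
            System.out.println(c[0] + ":" + c[1] + ":" + c[2]);
        }
    }
}
```

In the sorted result the datetime2 values are not in order overall, so a contiguous slice cannot express a range on datetime2 alone.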

A possible workaround is to have two column families with duplicate data (or even two rows for each logical row): one row where columns are inserted as (datetime1:datetime2:integer), and another where they are inserted as (datetime2:datetime1:integer). You can then run a multiget slice over these two rows and combine the data before passing it to the caller:

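    import me.prettyprint.cassandra.serializers.CompositeSerializer;
    import me.prettyprint.cassandra.serializers.StringSerializer;
    import me.prettyprint.hector.api.beans.Composite;
    import me.prettyprint.hector.api.beans.Rows;
    import me.prettyprint.hector.api.factory.HFactory;
    import me.prettyprint.hector.api.query.MultigetSliceQuery;
    import me.prettyprint.hector.api.query.QueryResult;

    // keyspace, startTime and endTime are assumed to be defined by the caller.
    final MultigetSliceQuery<String, Composite, String> query =
        HFactory.createMultigetSliceQuery(keyspace, StringSerializer.get(),
            CompositeSerializer.get(), StringSerializer.get());
    query.setColumnFamily("myColumnFamily");
    // Fetch both copies of the logical row in one call.
    query.setKeys("myRow.arrangedByDateTime1", "myRow.arrangedByDateTime2");
    query.setRange(new Composite(startTime), new Composite(endTime), false, Integer.MAX_VALUE);
    final QueryResult<Rows<String, Composite, String>> queryResult = query.execute();
    final Rows<String, Composite, String> rows = queryResult.get();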
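The answer does not say how the two result sets should be combined. One plausible reading, for a query of the form `datetime1 > X and datetime2 < Y`, is to keep only the columns that appear in both slices. The sketch below shows that intersection in plain Java on already-decoded columns; it does not use Hector, the names are hypothetical, and small numbers stand in for timestamps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Combines the slices read from the two duplicate rows by intersection.
class CombineSlicesSketch {

    // Each column is modelled as {datetime1, datetime2, value}.
    // Returns the columns present in both slices, in the order of the first.
    static List<long[]> intersect(List<long[]> fromRowByDateTime1,
                                  List<long[]> fromRowByDateTime2) {
        List<long[]> result = new ArrayList<>();
        for (long[] a : fromRowByDateTime1) {
            for (long[] b : fromRowByDateTime2) {
                if (Arrays.equals(a, b)) {
                    result.add(a);
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Slice of the datetime1-ordered row: columns with datetime1 >= 1000.
        List<long[]> byDateTime1 = new ArrayList<>();
        byDateTime1.add(new long[] {1000, 1030, 100});
        byDateTime1.add(new long[] {1030, 1100, 50});
        byDateTime1.add(new long[] {1100, 1130, 200});

        // Slice of the datetime2-ordered row: columns with datetime2 <= 1100,
        // already mapped back to {datetime1, datetime2, value} order.
        List<long[]> byDateTime2 = new ArrayList<>();
        byDateTime2.add(new long[] {1000, 1030, 100});
        byDateTime2.add(new long[] {1030, 1100, 50});

        for (long[] c : intersect(byDateTime1, byDateTime2)) {
            System.out.println(Arrays.toString(c));
        }
    }
}
```

The nested loop is fine for small slices; for large ones a hash set keyed on the column contents would avoid the quadratic scan.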

Source: https://habr.com/ru/post/1402620/