Although the DataStax documentation is generally very good, I could not find anything that discusses the details behind this. However, I did come across an article titled "A deep look at the CQL WHERE clause." Its section #2 is titled "The last column of the partition key supports the IN operator."
Paraphrasing, it basically says this:
For single-column partition keys, the IN operator is allowed without restriction. For composite partition keys, you have to use the = operator on the first N-1 columns of the partition key in order to use the IN operator on the last column.
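To make that rule concrete, here is a minimal Python sketch of it. This is not Cassandra's actual validation logic, only the paraphrased rule written out as a check; the function name `in_usage_ok` and the column names in the usage lines are hypothetical.

```python
def in_usage_ok(partition_key, restrictions):
    """Check every IN restriction against the paraphrased rule.

    partition_key: ordered list of partition key column names.
    restrictions:  dict mapping a column name to its operator ("=" or "IN").
    Hypothetical helper for illustration; not part of any Cassandra driver.
    """
    *leading, last = partition_key
    for column, operator in restrictions.items():
        if operator != "IN":
            continue
        # IN is only accepted on the last partition key column ...
        if column != last:
            return False
        # ... and only when all earlier partition key columns use "=".
        if any(restrictions.get(c) != "=" for c in leading):
            return False
    return True


# Single-column partition key (x): IN on x passes, IN on z does not.
print(in_usage_ok(["x"], {"x": "IN"}))
print(in_usage_ok(["x"], {"z": "IN"}))
# Composite partition key (a, b): IN is accepted only on b, with a = ...
print(in_usage_ok(["a", "b"], {"a": "=", "b": "IN"}))
print(in_usage_ok(["a", "b"], {"a": "IN", "b": "="}))
```

The four calls print True, False, True, False in that order.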
In your case, x is your partition key, which means that x is the only column that will support the CQL IN operator. If you really need to support IN queries on column z, you will have to denormalize your data and create a (redundant) table designed to support that query. For instance:
CREATE TABLE test ( x int, y int, z int, PRIMARY KEY (z) );
...would support the query, but the values of z may not be unique. Because z alone is the primary key here, rows that share a z value would overwrite one another. In that case, you could define x and/or y as LIST<int>, and that would do it.
Also, DataStax has documentation available on when not to use an index, and it states that the same conditions apply to using the IN operator.
Under most conditions, using IN in the WHERE clause is not recommended. Using IN can degrade performance because usually many nodes must be queried. For example, in a single, local data center cluster with 30 nodes, a replication factor of 3, and a consistency level of LOCAL_QUORUM, a single-key query goes out to two nodes, but if the query uses the IN condition, the number of nodes being queried is most likely even higher, up to 20 nodes depending on where the keys fall in the token range.
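The arithmetic behind those numbers can be sketched as follows. This assumes the worst case, where no two keys in the IN list share a replica, and it uses 10 keys purely as an illustrative value that reproduces the "up to 20 nodes" figure; the documentation quoted above does not state a key count. The function names are hypothetical.

```python
def quorum(replication_factor):
    # A quorum is a majority of replicas: floor(RF / 2) + 1.
    return replication_factor // 2 + 1


def max_nodes_contacted(cluster_size, replication_factor, keys_in_clause):
    """Upper bound on nodes queried, assuming each key hits distinct replicas."""
    per_key = quorum(replication_factor)
    # The bound can never exceed the number of nodes in the cluster.
    return min(cluster_size, per_key * keys_in_clause)


print(max_nodes_contacted(30, 3, 1))   # single key: 2
print(max_nodes_contacted(30, 3, 10))  # assumed 10 keys in the IN list: 20
```

In practice the real count is often lower, because keys that fall in nearby token ranges share replicas.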