Spark Streaming, Spark , , . , , . -:
Spark DataSource. , .
:
DataFrame numPartitions (concurrent.reads ). n ~ 50 " ", - where(dayIndex < 50 * factor * num_records).
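Push a CQL LIMIT down via SparkPartitionLimit: the limit is added to each CQL query (one per Spark partition), so it bounds the rows per partition, not the total. This requires a CassandraRDD rather than a plain RDD.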
:
filteredDataFrame.rdd.asInstanceOf[CassandraRDD[Row]].limit(n).take(n)
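This adds LIMIT $N to the generated CQL. Unlike a DataFrame, a CassandraRDD does not combine repeated limits: with .limit(10).limit(20) only the last call takes effect. In principle, n can be reduced to n / numPartitions + 1, provided the rows are spread evenly across the Spark and Cassandra partitions. The final take(n) then trims the <= numPartitions * n fetched rows down to n.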
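The arithmetic behind n / numPartitions + 1 and the numPartitions * n bound can be checked with a small sketch. It is plain Scala with no Spark or Cassandra involved; the names LimitMath, perPartitionLimit and maxRowsFetched are hypothetical helpers introduced only for illustration, and the even-distribution assumption is the same one stated above.

```scala
// Sketch only: plain arithmetic, no Spark or Cassandra required.
// LimitMath and its methods are illustrative names, not connector API.
object LimitMath {
  // Rows to request from each Spark partition when the overall target is n,
  // assuming rows are spread roughly evenly across partitions.
  def perPartitionLimit(n: Long, numPartitions: Int): Long = {
    require(n >= 0 && numPartitions > 0, "n must be >= 0 and numPartitions > 0")
    n / numPartitions + 1
  }

  // Upper bound on rows fetched from Cassandra before take(n) trims the result.
  def maxRowsFetched(limitPerPartition: Long, numPartitions: Int): Long =
    limitPerPartition * numPartitions
}

object LimitMathDemo extends App {
  val n = 50L
  val numPartitions = 8
  val reduced = LimitMath.perPartitionLimit(n, numPartitions)

  println(s"limit($n): at most ${LimitMath.maxRowsFetched(n, numPartitions)} rows fetched")
  println(s"limit($reduced): at most ${LimitMath.maxRowsFetched(reduced, numPartitions)} rows fetched")
}
```

With n = 50 and 8 partitions, the per-partition limit drops from 50 to 7, so the worst case goes from 400 fetched rows to 56, which is still enough for take(50) if the distribution is even.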
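Also make sure the where filter is actually pushed down to CQL (check with explain()); otherwise the LIMIT is applied before the filter.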
P.S. CQL , sparkSession.sql(...) ( ) .