java.lang.UnsupportedOperationException: "Writing to a non-empty Cassandra table is not allowed"

Question


I have a scenario where I receive streaming data that is processed by my Spark program, and the output for each interval is appended to my existing Cassandra table.

Currently, my Spark program generates a DataFrame that I need to save to my Cassandra table. The problem is that I cannot add rows to the existing Cassandra table when I use the command below:

dff.write.format("org.apache.spark.sql.cassandra").options(Map("table" -> "xxx", "yyy" -> "retail")).save()

I read the following post, http://rustyrazorblade.com/2015/08/migrating-from-mysql-to-cassandra-using-spark/ , which passes mode = "append" to the save method, but that gives me a syntax error.

I also could not work out what I needed to change from this thread: https://groups.google.com/a/lists.datastax.com/forum/#!topic/spark-connector-user/rlGGWQF2wnM

I need help fixing this problem. I am writing my Spark jobs in Scala.


cassandra apache-spark apache-spark-sql spark-streaming datastax-enterprise

Mohana Feb 11 '16 at 6:33


1 answer

Niemand · Answer 1 · 2016-02-11T09:30:35+0000

I think you should do it like this:

import org.apache.spark.sql.SaveMode

dff.write.format("org.apache.spark.sql.cassandra").mode(SaveMode.Append).options(Map("table" -> "xxx", "yyy" -> "retail")).save()

Because of the way Cassandra handles writes, every insert is effectively an "upsert": if the primary key of an inserted row matches the primary key of a row that is already stored, the insert overwrites the existing values. Cassandra is a write-optimized database, so it does not check whether a row already exists before writing.
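To make the upsert behaviour concrete, here is a minimal sketch in plain Scala that models a table as a map keyed by primary key. It is only a toy model and does not use the Cassandra or connector API; `Order`, its fields, and `upsert` are hypothetical names made up for illustration.

```scala
// Toy in-memory model of upsert semantics; this is NOT the Cassandra API.
// Order, its fields, and upsert are hypothetical names used only for illustration.
final case class Order(orderId: Int, customer: String, amount: Double)

object UpsertSketch {
  // The "table" is keyed by primary key (orderId). A write never checks
  // whether the key already exists: it simply stores the newest row.
  def upsert(table: Map[Int, Order], rows: Seq[Order]): Map[Int, Order] =
    rows.foldLeft(table)((acc, row) => acc.updated(row.orderId, row))

  def main(args: Array[String]): Unit = {
    // First batch: two new rows.
    val existing = upsert(Map.empty, Seq(Order(1, "alice", 10.0), Order(2, "bob", 20.0)))
    // Second batch: key 2 already exists, key 3 is new.
    val afterAppend = upsert(existing, Seq(Order(2, "bob", 25.0), Order(3, "carol", 30.0)))
    afterAppend.toSeq.sortBy(_._1).foreach { case (_, row) => println(row) }
  }
}
```

After the second batch the model holds three rows, and the row with key 2 carries the newer amount. So if each streaming interval can produce rows whose primary key was already written, appending will replace the earlier values instead of adding a second row.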
