java.lang.UnsupportedOperationException: Writing to a non-empty Cassandra table is not allowed

I have a scenario where I receive streaming data that is processed by my Spark Streaming program, and the output of each interval should be appended to my existing Cassandra table.

Currently, my Spark program generates a DataFrame that I need to save to my Cassandra table. The problem I am facing is that I cannot append data/rows to the existing Cassandra table when I use the command below:

 dff.write.format("org.apache.spark.sql.cassandra").options(Map("table" -> "xxx", "keyspace" -> "retail")).save()

I read the following post, http://rustyrazorblade.com/2015/08/migrating-from-mysql-to-cassandra-using-spark/ , where mode = "append" is passed to the save method, but it throws a syntax error for me.

I also could not work out what I needed to change from this thread: https://groups.google.com/a/lists.datastax.com/forum/#!topic/spark-connector-user/rlGGWQF2wnM

I need help fixing this problem. I am writing my Spark Streaming job in Scala.

1 answer

I think you should do it like this:

 import org.apache.spark.sql.SaveMode

 dff.write.format("org.apache.spark.sql.cassandra").mode(SaveMode.Append).options(Map("table" -> "xxx", "keyspace" -> "retail")).save()

The way Cassandra processes writes means that every insert is effectively an "upsert": if the primary key of the inserted row matches the primary key of an already stored row, the insert silently overwrites that row. Cassandra is a write-optimized database, so it does not check whether the data already exists before writing.
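This upsert behavior can be illustrated without Cassandra at all: a plain Scala `Map` keyed by the primary key behaves the same way, since adding an entry with an existing key replaces the old value without any existence check. This is only an illustrative sketch; the names (`UpsertDemo`, `insert`) are hypothetical and not part of the connector API.

```scala
// Sketch of Cassandra-style upsert semantics using an immutable Map.
// The Map key stands in for the primary key; the value for the other columns.
object UpsertDemo {
  // An "insert" never checks whether the key exists: a matching primary key
  // silently overwrites the previously stored row, just like a Cassandra write.
  def insert(table: Map[Int, String], id: Int, value: String): Map[Int, String] =
    table + (id -> value)

  def main(args: Array[String]): Unit = {
    var table = Map.empty[Int, String]
    table = insert(table, 1, "first")
    table = insert(table, 2, "second")
    table = insert(table, 1, "overwritten") // same primary key: upsert, no error
    println(table(1)) // prints "overwritten"
    println(table.size) // prints 2, not 3 -- the first row was replaced
  }
}
```

So when you write a DataFrame with `SaveMode.Append`, rows whose primary key already exists in the table are overwritten rather than duplicated, and no exception is raised.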


Source: https://habr.com/ru/post/1242746/
