I think you should do it like this:
import org.apache.spark.sql.SaveMode
dff.write.format("org.apache.spark.sql.cassandra").mode(SaveMode.Append).options(Map("table" -> "xxx", "keyspace" -> "retail")).save()
The way Cassandra processes data means that every write is effectively an "upsert": an insert overwrites any existing row whose primary key matches the primary key of the inserted record. Cassandra is a "quick write" database, so it does not check whether a row already exists before writing it.
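To make the upsert behaviour concrete, here is a minimal sketch in plain Scala. It does not use Spark or the Cassandra connector: the table is modelled as an in-memory `Map` keyed by primary key, and `Product`, `UpsertSketch`, and `write` are hypothetical names made up for this illustration. It also assumes whole rows are written, which is a simplification.

```scala
// Toy in-memory model of a table whose rows are keyed by primary key.
// Illustration only: this is NOT how the connector or Cassandra is implemented.
final case class Product(id: Int, name: String, price: Double)

object UpsertSketch {
  // "Writing" a row stores it under its primary key and replaces
  // whatever row was stored under that key before, without checking first.
  def write(table: Map[Int, Product], rows: Seq[Product]): Map[Int, Product] =
    rows.foldLeft(table)((acc, row) => acc.updated(row.id, row))

  // Initial contents of the table.
  val existing: Map[Int, Product] =
    write(Map.empty, Seq(Product(1, "pen", 1.0), Product(2, "ink", 3.5)))

  // Appending: id = 2 silently replaces the stored row, id = 3 is added as new.
  val afterAppend: Map[Int, Product] =
    write(existing, Seq(Product(2, "ink", 4.0), Product(3, "pad", 2.0)))
}
```

After the second write the table holds three rows, and the row with `id = 2` carries the new price. If you need to keep the old values, you have to read and compare them yourself before saving.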