How to insert in Cassandra without a null value in a column

I am trying to save some tweets in a Cassandra database using the DataStax Python driver.

Everything works well, but there is one thing I cannot figure out: how do I insert a row without getting a null value?

For example:

 CREATE TABLE tweets (
     id_tweet text PRIMARY KEY,
     texttweet text,
     hashtag text,
     url text
 );

If I want to insert a row without a url value, it works, but in Cassandra I will see "null" in the url column.

I checked this documentation:

http://datastax.imtqy.com/python-driver/getting_started.html#passing-parameters-to-cql-queries

So, I tried 2 different ways:

First, I build the complete query as a string and execute it.

 requete = "insert into Tweets(id_tweet,texttweet,hashtag,url) values ('%s','%s','%s','%s')" % (id_tweet, texttweet, hashtag, url)
 session.execute(requete)

Or I pass the parameters to the execute function:

 # The driver quotes and escapes the parameters itself, so the %s placeholders are left unquoted.
 requete2 = "insert into Tweets(id_tweet,texttweet,hashtag,url) values (%s,%s,%s,%s)"
 session.execute(requete2, (id_tweet, texttweet, hashtag, url))

The problem is that both methods give me a null value when, for example, the tweet has no URL or hashtag.

Is it possible for a column not to appear at all when it is empty in a row, as I see in many tutorials?


Thanks.

1 answer

This is something you can do if you are using Cassandra 2.2 or later. Cassandra 2.2 introduced the concept of "UNSET". It lets you use the same statement to insert a row even when you do not want to provide some of the values. Here is how you do it:

 from cassandra.query import UNSET_VALUE
 ps = session.prepare("insert into tweets(id_tweet,texttweet,hashtag,url) values (?,?,?,?)")
 session.execute(ps, ("id", "hello world!", UNSET_VALUE, UNSET_VALUE))

This tells Cassandra that you do not want to set these values to null; instead they are omitted altogether, so no nulls (which are stored internally as tombstones) are written to Cassandra.

On your side, I think you will need some preprocessing logic to convert any incoming None values to UNSET_VALUE. A pre-2.2 solution would be to tailor the query to the columns that are actually present, i.e. insert into tweets(id_tweet,texttweet) values (?,?) if the hashtag and url are None.
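Both approaches from the paragraph above can be sketched roughly as follows. This is only an illustration, not part of the driver: the names none_to_unset, build_insert, COLUMNS and the tweet dict are hypothetical, and the fallback sentinel exists only so the sketch runs without the driver installed. The session.execute calls are left as comments because they need a live cluster.

```python
try:
    # Real sentinel shipped with the DataStax driver (needs Cassandra 2.2+).
    from cassandra.query import UNSET_VALUE
except ImportError:
    # Stand-in so this sketch can run without the driver installed.
    UNSET_VALUE = object()

# Column order of the tweets table from the question (fixed, never user input).
COLUMNS = ("id_tweet", "texttweet", "hashtag", "url")


def none_to_unset(values):
    """Replace every None with UNSET_VALUE so that no null is written."""
    return tuple(UNSET_VALUE if v is None else v for v in values)


def build_insert(tweet):
    """Pre-2.2 fallback: name only the columns that actually have a value.

    Returns the CQL text (with ? markers, for session.prepare) and the
    matching tuple of values.
    """
    present = [c for c in COLUMNS if tweet.get(c) is not None]
    cql = "INSERT INTO tweets ({}) VALUES ({})".format(
        ", ".join(present), ", ".join("?" for _ in present)
    )
    return cql, tuple(tweet[c] for c in present)


tweet = {"id_tweet": "id", "texttweet": "hello world!", "hashtag": None, "url": None}

# Cassandra 2.2+: one prepared statement, missing values left unset.
params = none_to_unset(tweet[c] for c in COLUMNS)
# session.execute(ps, params)  # ps is the prepared statement shown earlier

# Older versions: a different statement per combination of present columns.
cql, values = build_insert(tweet)
# session.execute(session.prepare(cql), values)
```

With the first approach a single prepared statement covers every tweet; with the second, each distinct combination of present columns produces its own statement, so it may be worth caching the prepared statements by their CQL text.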

On the retrieval side, there should technically be a way to distinguish between null and unset values (I will look into this), but I do not think such a mechanism exists in the Python driver. I will open a ticket if the protocol makes this possible but the feature is missing from the driver. EDIT: It does not look like Cassandra distinguishes, when returning data, between values that were explicitly set to null (which are marked internally as tombstones) and those that were never set.

You can read more about "UNSET" and other 2.2 features in the Python driver in this blog post.


Source: https://habr.com/ru/post/1240008/

