Why is "IS NULL" 100x slower than "length() = 0" on a BLOB column?

I have a ~90 MB database consisting mainly of message attachments. The table includes a content BLOB column that stores the binary attachment data.

I assumed it would not be wise to create an index on a BLOB, so the table has no indexes other than an automatic index.

To get empty attachments, I compared the following queries:

 SELECT message_id FROM attachments WHERE content IS NULL; 

and

 SELECT message_id FROM attachments WHERE length(content) = 0; 

which return the same rows in my use case.

Why does the first take 250 ms and the second only 1-2 ms (on an SSD)? What is the reason for this? Is there a hidden length index or something similar? Any insight is appreciated.

Additional Information

  • EXPLAIN QUERY PLAN returns the same plan in both cases:

    0 | 0 | 0 | SCAN TABLE Attachments

  • The negated forms, IS NOT NULL vs. length() != 0, show the same performance difference: 250 ms versus 2 ms.

  • In combined queries, the order of the conditions matters: WHERE content IS NULL AND length(content) = 0; takes 250 ms, while WHERE length(content) = 0 AND content IS NULL; takes 2 ms.
1 answer

These are simply different queries. LENGTH is a scalar function that returns:

(i) NULL if input is NULL
(ii) 0 if the input is a zero-length string (or has zero length once converted to a string).

Therefore, the condition length(content) = 0 is true for content that is an empty string, and it is not true when content is NULL (a comparison with NULL yields NULL, which a WHERE clause never treats as true).
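To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory table. The table layout and the three sample rows are assumptions made up for illustration; only the column names come from the question.

```python
import sqlite3

# In-memory stand-in for the real table (assumed layout, sample data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attachments (message_id INTEGER, content BLOB)")
conn.executemany(
    "INSERT INTO attachments VALUES (?, ?)",
    [
        (1, None),         # NULL content
        (2, b""),          # zero-length blob
        (3, b"\x00\x01"),  # two bytes of data
    ],
)


def ids(where):
    """Return the message_ids matching the given WHERE condition."""
    sql = f"SELECT message_id FROM attachments WHERE {where} ORDER BY message_id"
    return [row[0] for row in conn.execute(sql)]


null_ids = ids("content IS NULL")          # only the NULL row
empty_ids = ids("length(content) = 0")     # only the zero-length row
nonzero_ids = ids("length(content) != 0")  # the NULL row is excluded here too

# length() itself returns NULL (None in Python) for the NULL row.
lengths = conn.execute(
    "SELECT length(content) FROM attachments ORDER BY message_id"
).fetchall()

print(null_ids, empty_ids, nonzero_ids, lengths)
```

With this sample data the two conditions select disjoint rows, so they can only return the same result on a table where neither matches anything or where the data happens to line up.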

Based on this, I assume that your table contains several NULL values and only some rows that actually hold content. This is supported by your second additional note, where you say that IS NOT NULL shows a comparable performance difference.
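One way to check that assumption is to count how many rows fall into each category. The sketch below is only an illustration: the helper name content_breakdown and the sample rows are hypothetical, and on the real database you would pass a connection to your own file instead.

```python
import sqlite3


def content_breakdown(conn):
    """Count NULL, zero-length, and non-empty content rows (hypothetical helper)."""
    total, null_rows, empty_rows = conn.execute(
        """
        SELECT
            COUNT(*),
            COALESCE(SUM(CASE WHEN content IS NULL THEN 1 ELSE 0 END), 0),
            COALESCE(SUM(CASE WHEN length(content) = 0 THEN 1 ELSE 0 END), 0)
        FROM attachments
        """
    ).fetchone()
    return {
        "total": total,
        "null": null_rows,
        "empty": empty_rows,
        "non_empty": total - null_rows - empty_rows,
    }


# Demo on made-up data; replace with sqlite3.connect("<your database file>").
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attachments (message_id INTEGER, content BLOB)")
conn.executemany(
    "INSERT INTO attachments VALUES (?, ?)",
    [(1, None), (2, None), (3, b""), (4, b"abc")],
)

breakdown = content_breakdown(conn)
print(breakdown)
```

If the "null" and "empty" counts differ on your data, the two queries from the question are not answering the same question.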


Source: https://habr.com/ru/post/1206399/

