Why the MySQL IN Keyword Does Not Consider NULL Values

I am using the following query:

select count(*) from Table1 where CurrentDateTime>'2012-05-28 15:34:02.403504' and Error not in ('Timeout','Connection Error'); 

Surprisingly, this statement does not include rows where the Error value is NULL. We intend to filter out only the rows whose Error value is "Timeout" or "Connection Error". I need to add an extra condition (OR Error is NULL) to get the correct result.

Why does MySQL filter out rows with NULL values? I thought the IN keyword returned a boolean result (1/0). I now understand that some MySQL operators do not return only booleans and can also return NULL, but why is NULL treated as special?

+6
6 answers

This:

 Error not in ('Timeout','Connection Error'); 

is semantically equivalent to:

 Error <> 'Timeout' AND Error <> 'Connection Error' 

The NULL comparison rules also apply to IN. Therefore, if the Error value is NULL, the database cannot evaluate the expression to true.

To fix this, you can do the following:

 COALESCE(Error,'') not in ('Timeout','Connection Error'); 

Or even better:

 Error IS NULL OR Error not in ('Timeout','Connection Error'); 

Or even better:

  CASE WHEN Error IS NULL THEN 1 ELSE Error not in ('Timeout','Connection Error') END = 1 

OR does not short-circuit; CASE can somehow short-circuit your query.


Perhaps a concrete example can show why a NULL NOT IN expression returns nothing:

Given this data: http://www.sqlfiddle.com/#!2/0d5da/11

 create table tbl (
   msg varchar(100) null,
   description varchar(100) not null
 );
 insert into tbl values ('hi', 'greet'), (null, 'nothing'); 

And you run this query:

 select 'hulk' as x, msg, description from tbl where msg not in ('bruce','banner'); 

This will only output hi.

NOT IN translates to:

 select 'hulk' as x, msg, description from tbl where msg <> 'bruce' and msg <> 'banner'; 

NULL <> 'bruce' cannot be determined; it is neither true nor false.

NULL <> 'banner' cannot be determined; it is neither true nor false.

So for a NULL value, the expression effectively resolves to:

 can't be determined AND can't be determined 

In fact, if your RDBMS supports boolean expressions in SELECT (e.g. MySQL, PostgreSQL), you can see why: http://www.sqlfiddle.com/#!2/d41d8/828

 select null <> 'Bruce' 

Returns null.

This also returns null:

 select null <> 'Bruce' and null <> 'Banner' 

Given that you are using NOT IN, which is basically an AND expression, this:

 NULL AND NULL 

results in NULL. So it is as if you ran: http://www.sqlfiddle.com/#!2/0d5da/12

 select * from tbl where null 

Nothing will be returned.
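
As a rough sketch of how to get the NULL row back, using the same tbl as above: either test for NULL explicitly, or use MySQL's NULL-safe equality operator <=>, which yields 0 or 1 and never NULL. The second form is MySQL-specific and is added here only as an illustration; it is not part of the original answer.

```sql
-- Explicit NULL check: the row where msg is NULL now passes the filter
select msg, description
from tbl
where msg is null or msg not in ('bruce', 'banner');

-- NULL-safe comparison (MySQL only): NULL <=> 'bruce' evaluates to 0, not NULL,
-- so NOT (...) evaluates to 1 for the NULL row
select msg, description
from tbl
where not (msg <=> 'bruce') and not (msg <=> 'banner');
```

With the two rows inserted above, both queries should return the 'hi' row and the row whose msg is NULL.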

+19

Because NULL is undefined, NULL is not even equal to NULL. You should always handle NULL explicitly.

+1

IN returns NULL if the expression on the left is NULL. To include the NULL values you need to do:

 select count(*) from Table1 where CurrentDateTime>'2012-05-28 15:34:02.403504' and (Error not in ('Timeout','Connection Error') or Error is null); 
+1

IN returns a three-valued BOOLEAN (one that accepts NULL as a value). NOT IN returns the plain negation of IN, and the negation of NULL is NULL.

Suppose we have a table with all numbers from 1 to 1,000,000 in id and this query:

 SELECT * FROM mytable WHERE id IN (1, 2, NULL) 

or its equivalent:

 SELECT * FROM mytable WHERE id = ANY ( SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT NULL ) 

The predicate returns TRUE for 1 and 2 and NULL for all other values, so 1 and 2 are returned.

In its opposite:

 SELECT * FROM mytable WHERE id NOT IN (1, 2, NULL) 

or

 SELECT * FROM mytable WHERE id <> ALL ( SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT NULL ) 

the predicate returns FALSE for 1 and 2 and NULL for all other values, so nothing is returned.

Note that logical negation not only changes the operator ( = to <> ), but also the quantifier ( ANY to ALL ).
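
To see the three values side by side, here is a minimal sketch that projects both predicates in the SELECT list. The four-row table is a hypothetical stand-in for the million-row mytable described above, and the column aliases are made up for illustration.

```sql
-- Hypothetical small stand-in for mytable
CREATE TABLE mytable (id INT NOT NULL);
INSERT INTO mytable VALUES (1), (2), (3), (4);

-- Show what each predicate evaluates to for every row
SELECT id,
       id IN (1, 2, NULL)     AS in_result,      -- 1 for ids 1 and 2, NULL otherwise
       id NOT IN (1, 2, NULL) AS not_in_result   -- 0 for ids 1 and 2, NULL otherwise
FROM mytable
ORDER BY id;
```

Because WHERE keeps only rows for which the predicate is TRUE, the NOT IN variant has no row it can keep: every value is either 0 or NULL.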

+1

@Michael Buen's answer was the right one for my case, but let me simplify why.

@Michael says in his post:

 Error not in ('Timeout','Connection Error');

is semantically equivalent to:

 Error <> 'Timeout' AND Error <> 'Connection Error'

The NULL comparison rules also apply to IN. Therefore, if the Error value is NULL, the database cannot evaluate the expression to true.

In [1] I found the sentence that confirms his most important statement for understanding why IN fails with NULL. In the specification ("specs") in [1] you will read: "If one or both arguments are NULL, the result of the comparison is NULL, except for the NULL-safe <=> operator."

So the fact is that, unfortunately, MySQL gets lost in this case. I think the MySQL designers should not have done this, because when I compare 2 with NULL, MySQL SHOULD be able to see that they are DIFFERENT instead of returning misleading results. For example, I did:

 select id from TABLE where id not in (COLUMN WITH NULLS); 

and it produces EMPTY results. BUT if I do

 select id from TABLE where id not in (COLUMN WITH OUT NULLS); 

it shows the correct result. Therefore, when using the IN operator, you must filter out the NULLs. This is not the behavior I want as a user, but it is documented in the specification in [1]. I think languages and technologies should be simpler, in the sense that you should be able to DEDUCE the behavior without having to read the specification. Indeed, 2 is DIFFERENT from NULL. I should be the one responsible for controlling and fixing errors at a higher level of abstraction, but MySQL SHOULD return a FALSE result when comparing NULL with a specific value.
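
The "filter out the NULLs" advice might look like the following sketch. The parent and child tables and their columns are hypothetical names chosen for illustration; they do not come from the post.

```sql
-- Hypothetical tables: child.parent_id is nullable
CREATE TABLE parent (id INT NOT NULL);
CREATE TABLE child (parent_id INT NULL);
INSERT INTO parent VALUES (1), (2), (3);
INSERT INTO child VALUES (1), (NULL);

-- Returns no rows: the NULL in the subquery makes NOT IN evaluate to NULL or 0
SELECT id FROM parent
WHERE id NOT IN (SELECT parent_id FROM child);

-- Option 1: filter the NULLs out of the subquery
SELECT id FROM parent
WHERE id NOT IN (SELECT parent_id FROM child WHERE parent_id IS NOT NULL);

-- Option 2: NOT EXISTS, which is unaffected by NULLs in child.parent_id
SELECT p.id FROM parent p
WHERE NOT EXISTS (SELECT 1 FROM child c WHERE c.parent_id = p.id);
```

With this sample data, options 1 and 2 should both return ids 2 and 3.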

Reference for the specification: [1] http://dev.mysql.com/doc/refman/5.6/en/type-conversion.html

0

Sorry to post twice in the same thread, but I want to illustrate another example:

I agree with @Wagner Bianchi in [2] when he says: << This is the trick when working with data and subqueries >>

In addition, this should NOT be the behavior; I think the MySQL designers were wrong when they made this decision, which is documented in [1]. The design should be different. Let me explain: you know that when you run

 select (2) not in (1, 4, 3);

you will get:

 +----------------------+
 | (2) not in (1, 4, 3) |
 +----------------------+
 |                    1 |
 +----------------------+
 1 row in set (0.00 sec)

BUT, if there is at least one NULL in the list, then:

 select (2) not in (1, NULL, 3);

returns:

 +-------------------------+
 | (2) not in (1, NULL, 3) |
 +-------------------------+
 |                    NULL |
 +-------------------------+
 1 row in set (0.00 sec)

This is pretty absurd.

We are not the first to get confused by this. See [2]

References:

[1] http://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html#function_in

[2] http://blog.9minutesnooze.com/sql-not-in-subquery-null/comment-page-1/#comment-86954

0

Source: https://habr.com/ru/post/916973/