MySQL: why isn't 'FOO' IS NULL optimized away?

MySQL 5.5.28. I have two tables, Person and Message, and the latter has a foreign key to the former. Each table has id as its primary key column, and the Person table also has a personId column with a unique index.

The query below should use the personId index, but for some reason MySQL scans the entire Message table instead:

    mysql> EXPLAIN SELECT `m`.*
        -> FROM
        ->   `Message` AS `m`
        -> LEFT JOIN
        ->   `Person` AS `p` ON (`m`.`person` = `p`.`id`)
        -> WHERE
        ->   'M002649397' IS NULL OR
        ->   `p`.`personId` = 'M002649397';
    +----+-------------+-------+--------+---------------+---------+---------+---------------+--------+-------------+
    | id | select_type | table | type   | possible_keys | key     | key_len | ref           | rows   | Extra       |
    +----+-------------+-------+--------+---------------+---------+---------+---------------+--------+-------------+
    |  1 | SIMPLE      | m     | ALL    | NULL          | NULL    | NULL    | NULL          | 273220 |             |
    |  1 | SIMPLE      | p     | eq_ref | PRIMARY       | PRIMARY | 8       | pcom.m.person |      1 | Using where |
    +----+-------------+-------+--------+---------------+---------+---------+---------------+--------+-------------+
    2 rows in set (0.00 sec)

But when I comment out the clause 'M002649397' IS NULL OR (which has no effect on the result), the query suddenly becomes far more efficient:

    mysql> EXPLAIN SELECT `m`.*
        -> FROM
        ->   `Message` AS `m`
        -> LEFT JOIN
        ->   `Person` AS `p` ON (`m`.`person` = `p`.`id`)
        -> WHERE
        ->   -- 'M002649397' IS NULL OR
        ->   `p`.`personId` = 'M002649397';
    +----+-------------+-------+-------+--------------------+--------------------+---------+-------+------+-------------+
    | id | select_type | table | type  | possible_keys      | key                | key_len | ref   | rows | Extra       |
    +----+-------------+-------+-------+--------------------+--------------------+---------+-------+------+-------------+
    |  1 | SIMPLE      | p     | const | PRIMARY,personId   | personId           | 767     | const |    1 | Using index |
    |  1 | SIMPLE      | m     | ref   | FK9C2397E7A0F6ED11 | FK9C2397E7A0F6ED11 | 9       | const |    3 | Using where |
    +----+-------------+-------+-------+--------------------+--------------------+---------+-------+------+-------------+
    2 rows in set (0.01 sec)

My question is: why isn't MySQL smart enough to recognize that 'M002649397' IS NULL is always false, optimize it away, and spare itself a needless scan of every row in a huge table?

In other words, does the MySQL optimizer not know that 'M002649397' IS NULL is always false, or does it fail to apply that knowledge when building its query plan?

+6
2 answers

Actually, what is more interesting is that the documentation says MySQL is smart enough to do this (see here).

This seems to fall under the heading "8.2.1.2. Eliminating 'Dead' Code".

I believe the reason is that the developers did not consider an expression of the form "constant IS NULL" when the code was written. The documentation gives numerous examples based on constant propagation (x1 = 2 AND x2 = x1 becomes x1 = 2 AND x2 = 2). IS NULL probably does not arise in that situation.
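To make the idea concrete, here is a minimal sketch of what such a rewrite could look like on a tiny predicate tree. The tuple representation and the simplify function are hypothetical, invented purely for illustration; this is not how MySQL represents or rewrites conditions internally.

```python
# Toy predicate simplifier: constant folding plus dead-code elimination.
# Nodes are tuples (hypothetical format, not MySQL's):
#   ("const", value), ("col", name), ("is_null", x), ("eq", a, b), ("or", a, b)

def is_const(node):
    return node[0] == "const"

def simplify(node):
    """Return an equivalent, possibly smaller, predicate tree."""
    kind = node[0]
    if kind == "is_null":
        arg = simplify(node[1])
        if is_const(arg):
            # Whether a constant is NULL is known before any row is read.
            return ("const", arg[1] is None)
        return ("is_null", arg)
    if kind == "or":
        left, right = simplify(node[1]), simplify(node[2])
        for a, b in ((left, right), (right, left)):
            if is_const(a) and a[1] is False:
                return b                 # FALSE OR x  ->  x
            if is_const(a) and a[1] is True:
                return ("const", True)   # TRUE OR x   ->  TRUE
        return ("or", left, right)
    if kind == "eq":
        return ("eq", simplify(node[1]), simplify(node[2]))
    return node  # "const" and "col" leaves are already as simple as possible

# WHERE 'M002649397' IS NULL OR p.personId = 'M002649397'
where = (
    "or",
    ("is_null", ("const", "M002649397")),
    ("eq", ("col", "p.personId"), ("const", "M002649397")),
)
simplified = simplify(where)
print(simplified)
```

After folding, only the equality on p.personId is left, which is the condition the second EXPLAIN in the question was run with.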

+1

This is a verified MySQL bug.

You should not get one execution plan for this condition:

WHERE (0 = 1) OR p.personId = 'string_constant';

and a different execution plan for:

WHERE p.personId = 'string_constant';

because (0 = 1) always evaluates to FALSE, which makes these two queries 100% identical.

In the bug report itself, you can see that the execution plan when (0 = 1) OR is present is much worse than the plan when the condition is only the equality between the column and the constant.

* Please note that this is fixed in MariaDB.
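The logical equivalence is easy to check mechanically. The sketch below uses Python's built-in sqlite3 module only as a convenient SQL evaluator; it is an assumption that SQLite is an acceptable stand-in here, and it is one only for SQL's three-valued logic. It says nothing about MySQL's planner. The sql_value helper is a made-up name for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def sql_value(expr):
    """Evaluate a constant SQL expression and return its single value."""
    return conn.execute("SELECT " + expr).fetchone()[0]

# A non-NULL string literal is never NULL, so this is false (0).
const_is_null = sql_value("'M002649397' IS NULL")

# FALSE OR x gives back x for each truth value x can take:
# true (1), false (0) and unknown (NULL, returned to Python as None).
or_results = {x: sql_value("(0 = 1) OR " + x) for x in ("1", "0", "NULL")}

print(const_is_null, or_results)
```

Since FALSE OR x returns x in all three cases, dropping the always-false branch cannot change which rows satisfy the WHERE clause.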

+1

Source: https://habr.com/ru/post/946226/

