Is using NOT EXISTS considered bad SQL practice?

I have heard many people over the years say that:

"JOIN" operators are preferred to "NOT EXISTS"

Why?

3 answers

In MySQL, Oracle, SQL Server and PostgreSQL, NOT EXISTS is as efficient as LEFT JOIN / IS NULL, or even more efficient.

Although it might seem that "the inner query must be executed for each record from the outer query" (which sounds bad for NOT EXISTS and even worse for NOT IN, since the latter query is not even correlated), it can be optimized just as well as any other query, using the appropriate anti-join methods.
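To make the three anti-join spellings concrete, here is a minimal sketch that runs them side by side. It uses Python's built-in sqlite3 module only because it needs no server; SQLite is not one of the engines discussed here, the tables t_outer and t_inner are made up for illustration, and the sketch shows only that the queries return the same rows, not how any engine plans them.

```python
import sqlite3

# In-memory SQLite database; t_outer and t_inner are hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_outer (id INTEGER PRIMARY KEY, val INTEGER NOT NULL);
    CREATE TABLE t_inner (id INTEGER PRIMARY KEY, val INTEGER NOT NULL);
    INSERT INTO t_outer (val) VALUES (1), (2), (3), (4);
    INSERT INTO t_inner (val) VALUES (2), (4), (4);
""")

# Three ways to ask for rows of t_outer that have no match in t_inner.
# The columns are NOT NULL on purpose: with NULLs in t_inner.val,
# NOT IN would no longer behave like the other two forms.
QUERIES = {
    "NOT EXISTS": """
        SELECT o.val FROM t_outer o
        WHERE NOT EXISTS (SELECT 1 FROM t_inner i WHERE i.val = o.val)
        ORDER BY o.val
    """,
    "NOT IN": """
        SELECT o.val FROM t_outer o
        WHERE o.val NOT IN (SELECT i.val FROM t_inner i)
        ORDER BY o.val
    """,
    "LEFT JOIN / IS NULL": """
        SELECT o.val FROM t_outer o
        LEFT JOIN t_inner i ON i.val = o.val
        WHERE i.val IS NULL
        ORDER BY o.val
    """,
}

results = {
    name: [row[0] for row in conn.execute(sql)]
    for name, sql in QUERIES.items()
}
for name, rows in results.items():
    print(name, rows)
```

All three queries return the values 1 and 3 here. Which one is cheaper depends on the engine and its optimizer, which is the point of the paragraphs that follow.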

In SQL Server, in fact, LEFT JOIN / IS NULL may be less efficient than NOT EXISTS / NOT IN when the column in the inner table is unindexed or has low cardinality.

It is often said that MySQL is "especially poor at handling subqueries."

This is rooted in the fact that MySQL is not capable of any join methods other than nested loops, which greatly limits its optimization capabilities.

The only case where a query benefits from rewriting a subquery as a join is this one:

 SELECT * FROM big_table WHERE big_table_column IN ( SELECT small_table_column FROM small_table ) 

small_table will not be queried in full for each record in big_table: although the subquery does not look correlated, the query optimizer will implicitly correlate it and in effect rewrite it as an EXISTS (using index_subquery to search for the first match if needed, provided small_table_column is indexed).

But big_table will always be the leading table, which makes the query complete in big * LOG(small) operations rather than small * LOG(big).
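To get a feel for the difference between the two orderings, here is a rough back-of-the-envelope sketch. The row counts are hypothetical, and the cost model (one index lookup of about log2(n) steps per row of the leading table) is a simplifying assumption, not a description of how MySQL actually costs a plan.

```python
import math

# Hypothetical table sizes, chosen only for illustration.
big_rows = 1_000_000
small_rows = 100


def nested_loop_cost(leading_rows: int, probed_rows: int) -> float:
    """Assumed cost: one ~log2(probed_rows) index lookup per leading row."""
    return leading_rows * math.log2(probed_rows)


# big_table leads: every big row probes the index on small_table.
big_leading = nested_loop_cost(big_rows, small_rows)
# small_table leads: every small row probes the index on big_table.
small_leading = nested_loop_cost(small_rows, big_rows)

print(f"big_table leading:   ~{big_leading:,.0f} steps")
print(f"small_table leading: ~{small_leading:,.0f} steps")
```

Under these assumptions the first plan costs roughly 6.6 million steps and the second about 2,000, which is why letting small_table lead matters in this one case.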

The query could be rewritten as

 SELECT DISTINCT bt.* FROM small_table st JOIN big_table bt ON bt.big_table_column = st.small_table_column 
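As a quick check that the rewrite returns the same rows, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a convenient stand-in and says nothing about MySQL's plans; the id column, the index name ix_big and the sample data are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The id primary key and the index name are assumptions for this sketch.
    CREATE TABLE big_table (
        id INTEGER PRIMARY KEY,
        big_table_column INTEGER NOT NULL
    );
    CREATE TABLE small_table (small_table_column INTEGER NOT NULL);
    CREATE INDEX ix_big ON big_table (big_table_column);
    INSERT INTO big_table (big_table_column) VALUES (1), (2), (2), (3), (5);
    INSERT INTO small_table (small_table_column) VALUES (2), (2), (5);
""")

# Original form: IN with an uncorrelated-looking subquery.
in_rows = conn.execute("""
    SELECT * FROM big_table
    WHERE big_table_column IN (SELECT small_table_column FROM small_table)
    ORDER BY id
""").fetchall()

# Rewritten form: small_table leads, DISTINCT removes the duplicates
# that repeated values in small_table would otherwise produce.
join_rows = conn.execute("""
    SELECT DISTINCT bt.*
    FROM small_table st
    JOIN big_table bt ON bt.big_table_column = st.small_table_column
    ORDER BY bt.id
""").fetchall()

# The same join without DISTINCT, to show why DISTINCT is needed.
plain_join_count = conn.execute("""
    SELECT COUNT(*)
    FROM small_table st
    JOIN big_table bt ON bt.big_table_column = st.small_table_column
""").fetchone()[0]

print(in_rows)
print(join_rows)
print(plain_join_count)
```

Both forms return the same three rows here, while the join without DISTINCT yields five because the value 2 appears twice in small_table. Note that DISTINCT bt.* only matches IN exactly when the rows of big_table are themselves unique, which the assumed primary key guarantees.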

However, this will not improve NOT IN (as opposed to IN). In MySQL, NOT EXISTS and LEFT JOIN / IS NULL are almost the same, since with nested loops the left table must always be the leading one in a LEFT JOIN.

You can read these articles:


Perhaps this is due to the optimization process... NOT EXISTS implies a subquery, and "optimizers" usually do not do subqueries justice. Joins, on the other hand, can be handled more easily...


I think this is a MySQL-specific case. MySQL does not optimize the subquery in IN / NOT IN / ANY / NOT EXISTS clauses and actually executes the subquery for each row matched by the outer query. Because of this, in MySQL you should use a join. In PostgreSQL, however, you can just use a subquery.


Source: https://habr.com/ru/post/893273/

