Why does STRAIGHT_JOIN improve this query so much, and what does it mean when it's written after the SELECT keyword?

I have the following MySQL query:

select t1.* from Table1 t1 inner join Table2 t2 on t1.CommonID = t2.CommonID where t1.FilterID = 1 

It takes about 30 seconds to run, which is strange, because if I comment out either the join or the where clause, it takes less than a second. That is:

 select t1.* from Table1 t1 where t1.FilterID = 1 

or

 select t1.* from Table1 t1 inner join Table2 t2 on t1.CommonID = t2.CommonID 

each takes less than a second.

Then there is the STRAIGHT_JOIN keyword, for which I can find only one reference, here: http://dev.mysql.com/doc/refman/5.0/en/join.html

STRAIGHT_JOIN is similar to JOIN, except that the left table is always read before the right table. This can be used for those (few) cases for which the join optimizer puts the tables in the wrong order.

What? I can write:

 select t1.* from Table1 t1 STRAIGHT_JOIN Table2 t2 on t1.CommonID = t2.CommonID where t1.FilterID = 1 

and the query runs in less than a second.

Even stranger, I can write:

 select STRAIGHT_JOIN t1.* from Table1 t1 inner join Table2 t2 on t1.CommonID = t2.CommonID where t1.FilterID = 1 

and it takes less than a second, and this syntax does not even seem legal.

I would guess that the second example means STRAIGHT_JOIN will be used wherever an INNER JOIN is written, but I cannot find any documentation about this.

What is going on here, and how can the "join optimizer" lead to such relatively poor performance? Should I always use STRAIGHT_JOIN? How can I tell when to use it and when not to?

Table1 and Table2 both have integer primary keys; FilterID is a foreign key to another table; the CommonID columns are both foreign keys to a third table. All of them are indexed. The database engine is InnoDB.

Thanks

Apr 28
1 answer

What is going on here, and how can the join optimizer lead to such relatively poor performance?

STRAIGHT_JOIN forces the join order of the tables, so Table1 is scanned in the outer loop and Table2 in the inner loop.
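
As for the form written right after SELECT: MySQL also documents STRAIGHT_JOIN as a SELECT modifier, which makes the optimizer join the tables in the order they are listed in the FROM clause. A minimal sketch of both forms, using the table and column names from the question:

```sql
-- Join-operator form: only this pair is constrained,
-- Table1 (left) is read before Table2 (right).
SELECT t1.*
FROM Table1 t1
STRAIGHT_JOIN Table2 t2 ON t1.CommonID = t2.CommonID
WHERE t1.FilterID = 1;

-- SELECT-modifier form: every table in the FROM clause
-- is joined in the order it is listed.
SELECT STRAIGHT_JOIN t1.*
FROM Table1 t1
INNER JOIN Table2 t2 ON t1.CommonID = t2.CommonID
WHERE t1.FilterID = 1;
```

With only two tables, both forms should pin the same order, which would explain why they perform alike here.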

The optimizer is not perfect (though still quite decent), and the most likely cause of the bad plan is outdated statistics.
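
If stale statistics are the suspect, refreshing them is worth trying before reaching for a hint. A sketch, assuming the table names from the question:

```sql
-- Recompute the key distribution statistics the optimizer relies on.
ANALYZE TABLE Table1, Table2;

-- Inspect the estimated cardinality of each index afterwards.
SHOW INDEX FROM Table1;
SHOW INDEX FROM Table2;
```

If the plain INNER JOIN becomes fast after this, the hint was probably never needed.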

Should I always use STRAIGHT_JOIN?

No, only when the optimizer is wrong. This can happen if your data distribution is severely skewed or cannot be estimated correctly (say, for spatial or full-text indexes).

How can I find out when to use it or not?

You should collect statistics, build the plans for both variants, and understand what those plans mean.

If you see that:

  • The automatically generated plan is not optimal and cannot be improved by standard means,

  • The STRAIGHT_JOIN version is better, you understand that it always will be, and you understand why it always will be,

then use STRAIGHT_JOIN.
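
One way to build the two plans for comparison is EXPLAIN. A sketch, again assuming the names from the question:

```sql
-- Plan the optimizer picks on its own.
EXPLAIN
SELECT t1.*
FROM Table1 t1
INNER JOIN Table2 t2 ON t1.CommonID = t2.CommonID
WHERE t1.FilterID = 1;

-- Plan with the join order forced to Table1, then Table2.
EXPLAIN
SELECT t1.*
FROM Table1 t1
STRAIGHT_JOIN Table2 t2 ON t1.CommonID = t2.CommonID
WHERE t1.FilterID = 1;
```

The order of the rows in the output is the join order; compare that along with the key and rows columns to see where the two plans diverge.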

Apr 28 '11 at 12:59