SQL performance: which is faster, IN () or JOIN?

This is a question I have never received a definitive answer to. I am using MySQL in this example.

Given a fairly large set of values (say, 500), which is faster: searching the table for those values with an IN () clause:

SELECT * FROM table WHERE field IN(values) 

Or creating a temporary table in memory, filling it with the values, and joining it to the target table:

 CREATE TEMPORARY TABLE `temp_table` (`field` varchar(255) NOT NULL) ENGINE=MyISAM DEFAULT CHARSET=latin1;
 INSERT INTO temp_table VALUES (values);
 SELECT * FROM table t1 JOIN temp_table t2 ON t1.field = t2.field;

Both methods will create the same result set.
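The two forms match only under some conditions: a JOIN returns one row per matching pair, so a value that appears twice in temp_table would duplicate rows, and SELECT * on the join also returns the temporary table's column. A minimal sketch of one way to keep the results aligned, assuming a hypothetical table name big_table in place of `table` and placeholder values 'a', 'b', 'c':

```sql
-- Hypothetical names: big_table stands in for the question's `table`;
-- the values 'a', 'b', 'c' are placeholders.
CREATE TEMPORARY TABLE temp_table (
  field VARCHAR(255) NOT NULL,
  PRIMARY KEY (field)   -- rejects duplicate values and gives the join an index
);

-- IGNORE skips values that are already present instead of raising an error.
INSERT IGNORE INTO temp_table (field) VALUES ('a'), ('b'), ('c'), ('a');

-- t1.* returns only the columns of the searched table, like the IN () form.
SELECT t1.*
FROM big_table AS t1
JOIN temp_table AS t2 ON t1.field = t2.field;
```

The primary key is a design choice for this sketch, not something the question specifies; SELECT DISTINCT or an EXISTS subquery would be alternatives.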

I ran some basic tests of my own and found that with more than 500 values, using a temporary table becomes faster than the IN () clause.

Can someone explain the inner workings of MySQL here, and what the correct answer to this question is?

Thanks, Leo

+4
2 answers

From the MySQL online documentation for IN ():

IN (value, ...)

If all values are constants, they are evaluated according to the type of the expression and sorted. The search for the item is then done using a binary search. This means IN is very quick if the IN value list consists entirely of constants. Otherwise, type conversion takes place according to the rules described in Section 11.2, “Type Conversion in Expression Evaluation”, but applied to all the arguments.

Given that, I believe it makes sense to use IN () with a set of constants; otherwise you should use a subquery against another table.

You can consider using EXISTS () instead of JOIN when the elements are retrieved from another table; this is significantly faster for a large dataset:
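One way to check how MySQL actually handles each form is to ask for the execution plan with EXPLAIN. A minimal sketch, assuming a hypothetical table big_table with an index on field (neither the name nor the index comes from the question) and the temp_table from the question:

```sql
-- Hypothetical: big_table stands in for the question's `table`,
-- and an index on big_table.field is assumed to exist.

-- Constant list: the case the documentation excerpt describes.
EXPLAIN
SELECT * FROM big_table WHERE field IN ('a', 'b', 'c');

-- Values taken from another table through a subquery.
EXPLAIN
SELECT t1.*
FROM big_table AS t1
WHERE t1.field IN (SELECT t2.field FROM temp_table AS t2);
```

The type, key and rows columns of the output show whether an index is used and how many rows MySQL estimates it will examine. The plan chosen depends on the MySQL version, the indexes and the data, so it is worth running this against the real tables.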

 SELECT * FROM table t1 WHERE EXISTS ( SELECT * FROM temp_table t2 WHERE t1.field = t2.field ) 
+2

The correct answer depends on many things.

You have already done the work - if your benchmarking tells you that using a temporary table is faster, then this is the way to go.

Remember to test again if you change the hardware or change the schema significantly.
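For repeating the comparison, a rough sketch using EXPLAIN ANALYZE, which is available from MySQL 8.0.18 and runs the query while reporting measured timings. The table name big_table and the values are hypothetical stand-ins; temp_table is the one from the question:

```sql
-- Hypothetical: big_table stands in for the question's `table`;
-- 'a', 'b', 'c' stand in for the real list of values.

-- Variant 1: constant list.
EXPLAIN ANALYZE
SELECT * FROM big_table WHERE field IN ('a', 'b', 'c');

-- Variant 2: join against the already populated temporary table.
EXPLAIN ANALYZE
SELECT t1.*
FROM big_table AS t1
JOIN temp_table AS t2 ON t1.field = t2.field;
```

EXPLAIN ANALYZE measures only the SELECT, so the time spent creating and filling temp_table has to be added to the second variant for a fair comparison. Running each variant several times also helps to even out caching effects.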

+1

Source: https://habr.com/ru/post/1379286/

