Alternative to NOT IN in MySQL

I have this query:

SELECT DISTINCT phoneNum FROM `Transaction_Register` WHERE phoneNum NOT IN (SELECT phoneNum FROM `Subscription`) LIMIT 0 , 1000000 

It takes too long to execute because the table Transaction_Register has millions of records. Is there an alternative to the above query? I would be grateful for any suggestions.

2 answers

An alternative could be to use a LEFT JOIN:

 select distinct t.phoneNum from Transaction_Register t left join Subscription s on t.phoneNum = s.phoneNum where s.phoneNum is null LIMIT 0 , 1000000; 

See SQL Fiddle with Demo
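
Another form worth trying is NOT EXISTS, which expresses the same anti-join. This is only a sketch: the index names below are hypothetical, and whether it is actually faster depends on your MySQL version and data, so compare the plans with EXPLAIN. Keep in mind that NOT IN returns no rows at all if the subquery yields a NULL phoneNum, while the LEFT JOIN and NOT EXISTS forms do not behave that way, so the results can differ when the column is nullable.

```sql
-- Assumption: neither table has an index on phoneNum yet.
-- The index names are made up for this example.
CREATE INDEX idx_subscription_phone ON Subscription (phoneNum);
CREATE INDEX idx_transaction_phone  ON Transaction_Register (phoneNum);

-- Same anti-join written with NOT EXISTS: keep each phoneNum from
-- Transaction_Register that has no matching row in Subscription.
SELECT DISTINCT t.phoneNum
FROM Transaction_Register AS t
WHERE NOT EXISTS (
    SELECT 1
    FROM Subscription AS s
    WHERE s.phoneNum = t.phoneNum
)
LIMIT 0, 1000000;
```

With an index on Subscription.phoneNum, each probe from Transaction_Register can be answered from the index instead of scanning the whole Subscription table.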


I doubt that LEFT JOIN really performs better than NOT IN. I ran a few tests with the following table structure (correct me if I am wrong):

 account (id, ....)               [42,884 rows, index by id]
 play (account_id, playdate, ...) [61,737 rows, index by account_id]

(1) Query with LEFT JOIN

 SELECT * FROM account LEFT JOIN play ON account.id = play.account_id WHERE play.account_id IS NULL 

(2) Query with NOT IN

 SELECT * FROM account WHERE account.id NOT IN (SELECT play.account_id FROM play) 

Speed check with LIMIT 0, ...

 LIMIT 0, ->   100      150      200      250
 ---------------------------------------------
 LEFT JOIN     3.213s   4.477s   5.881s   7.472s
 NOT IN        2.200s   3.261s   4.320s   5.647s
 ---------------------------------------------
 Difference    1.013s   1.216s   1.561s   1.825s

The difference keeps growing as I increase the limit.


With EXPLAIN

(1) Query with LEFT JOIN

 SELECT_TYPE   TABLE     TYPE   ROWS     EXTRA
 ----------------------------------------------------------------
 SIMPLE        account   ALL    42,884
 SIMPLE        play      ALL    61,737   Using where; not exists

(2) Query with NOT IN

 SELECT_TYPE          TABLE     TYPE    ROWS     EXTRA
 ------------------------------------------------------------------------
 SIMPLE               account   ALL     42,884   Using where
 DEPENDENT SUBQUERY   play      INDEX   61,737   Using where; Using index

The LEFT JOIN plan reads both tables with a full scan (type ALL), so it doesn't seem to use an index.
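
To reproduce the two plans above, prefix each query with EXPLAIN. This is just the same pair of queries from the test; the output you get will depend on your MySQL version and on which indexes actually exist, which SHOW INDEX lets you confirm.

```sql
-- Check which indexes the optimizer can choose from.
SHOW INDEX FROM account;
SHOW INDEX FROM play;

-- Plan for the LEFT JOIN form.
EXPLAIN
SELECT *
FROM account
LEFT JOIN play ON account.id = play.account_id
WHERE play.account_id IS NULL;

-- Plan for the NOT IN form.
EXPLAIN
SELECT *
FROM account
WHERE account.id NOT IN (SELECT play.account_id FROM play);
```

If the type column shows ALL for play in the first plan even though an index on account_id is listed, the join is not using that index.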

LOGIC

(1) Query with LEFT JOIN

Without an index, the LEFT JOIN between account and play has to go through up to 42,884 * 61,737 = 2,647,529,508 row combinations. Only then is play.account_id checked for NULL in the result.

(2) Query with NOT IN

A binary search over the index takes about log2(N) steps to check whether an element exists. Rounding log2(61,737) up to 16, this means about 42,884 * 16 = 686,144 steps.
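
The two estimates above can be checked directly in MySQL. This is only the arithmetic from the reasoning, not a measurement; it assumes log2(61,737) is rounded up to a whole number of steps per lookup.

```sql
SELECT
    42884 * 61737             AS row_pairs,         -- 2647529508
    CEIL(LOG2(61737))         AS steps_per_lookup,  -- 16
    42884 * CEIL(LOG2(61737)) AS total_steps;       -- 686144
```

By this rough count, the indexed lookups need several thousand times fewer steps than comparing every pair of rows.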


Source: https://habr.com/ru/post/1484246/

