SQL query on large tables is fast at first, then slow

The query below returns the first results quickly and then becomes very slow.

SELECT A.Id, B.Date1
FROM A
LEFT OUTER JOIN B
  ON A.Id = B.Id
 AND A.Flag = 'Y'
 AND (B.Date1 IS NOT NULL AND A.Date >= B.Date2 AND A.Date < B.Date1)

Table A contains 24 million records, and Table B contains 500 thousand records.

The index on table A is on the columns: Id and Date.

The index on table B is on the columns: Id, Date2, Date1. Date1 is nullable, and the index is unique.

The first 11 million records come back pretty fast, and then it suddenly becomes extremely slow. The execution plan shows that the indexes are used.
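For reference, the two indexes described above could be declared roughly as follows. This is only a sketch: the index names are hypothetical, and only the column lists and the uniqueness of the index on B come from the description.

```sql
-- Hypothetical index names; column order follows the description above.
CREATE INDEX IX_A_Id_Date ON A (Id, Date);

-- Unique index on B; Date1 is nullable.
CREATE UNIQUE INDEX UX_B_Id_Date2_Date1 ON B (Id, Date2, Date1);
```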

However, when I remove the A.Date < B.Date1 condition, the query becomes fast again.

Do you know what can be done to improve performance? Thanks

UPDATE: I updated the query to show that I need the fields of table B in the result. You might wonder why I used a left join when I have the condition "B.Date1 is not null". This is because I posted a simplified query. The performance issue occurs even with this simplified version.

2 answers

You can try using EXISTS. It may be faster, since the database stops looking for further rows in B as soon as a match is found, unlike a JOIN, where all matching rows must be retrieved and combined.

 select a.id
 from a
 where a.flag = 'Y'
   and exists (
     select 1
     from b
     where a.id = b.id
       and a.date >= b.date2
       and a.date < b.date1
       and b.date1 is not null
   );
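The EXISTS form returns only the rows of A that have a match and cannot return columns of B, while the question's update says B.Date1 is needed. One possible variant, given here as a sketch rather than a verified fix, keeps every row of A and fetches Date1 through a correlated subquery. It assumes at most one row of B matches a given row of A; MAX() is there only so the subquery returns a single value.

```sql
-- Sketch: keeps all rows of A, like the LEFT OUTER JOIN in the question.
-- Assumption: at most one B row matches each A row (MAX() guards against more).
SELECT a.Id,
       (SELECT MAX(b.Date1)
          FROM B b
         WHERE a.Flag = 'Y'
           AND b.Id = a.Id
           AND b.Date1 IS NOT NULL
           AND a.Date >= b.Date2
           AND a.Date < b.Date1) AS Date1
FROM A a;
```

Rows of A without a match, or with a Flag other than 'Y', get NULL in Date1, as they do with the LEFT OUTER JOIN. Whether this is actually faster depends on the database and its execution plan.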

In general, what I have noticed about SQL query performance is that it depends on the data you are joining: for example, a ONE to ONE relationship is much faster than a ONE to MANY relationship.

I have seen a ONE to MANY join between a table with 3,000 rows and a table with 30,000 rows take 11-15 seconds, even with LIMIT. The same query, written with only ONE to ONE relationships, takes less than 1 second.

So here is my suggestion for speeding up your query. According to Left Outer Join (desc), "LEFT JOIN and LEFT OUTER JOIN are the same," so it doesn't matter which one you use.

But ideally, you should use INNER JOIN, because in your question you specified B.Date1 IS NOT NULL.

Based on parent columns in the join selection (desc), you can reference a column of the parent table in the SELECT inside the JOIN.
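As a sketch, the INNER JOIN form of the query from the question might look like this. It assumes that rows of A without a matching row in B are not needed; unlike the LEFT OUTER JOIN, it drops those rows as well as rows whose Flag is not 'Y'.

```sql
-- Sketch: INNER JOIN version of the question's query.
-- Only A rows with a matching B row are returned.
SELECT a.Id, b.Date1
FROM A a
INNER JOIN B b
        ON a.Id = b.Id
WHERE a.Flag = 'Y'
  AND b.Date1 IS NOT NULL
  AND a.Date >= b.Date2
  AND a.Date < b.Date1;
```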

 SELECT a.Id
 FROM A a
 -- LATERAL (MySQL 8.0.14+) lets the derived table reference a.Id and a.Date
 INNER JOIN LATERAL (
     SELECT b.Id AS Id, COUNT(1) AS TotalLinks
     FROM B b
     WHERE b.Id = a.Id
       AND b.Date1 IS NOT NULL
       AND a.Date >= b.Date2
       AND a.Date < b.Date1
     GROUP BY b.Id
 ) AS ab ON a.Id = ab.Id
 WHERE a.Flag = 'Y'
   AND ab.TotalLinks > 0
 LIMIT 0, 500

Also try to LIMIT the result to the data you need; this reduces the amount of work the database has to do.


Source: https://habr.com/ru/post/1015232/

