SQL query on large tables fast, then slow

Question

The query below returns the first results quickly and then becomes very slow.

    SELECT A.Id, B.Date1
    FROM A
    LEFT OUTER JOIN B
        ON A.Id = B.Id
       AND A.Flag = 'Y'
       AND (B.Date1 IS NOT NULL AND A.Date >= B.Date2 AND A.Date < B.Date1)

Table A contains 24 million records, and Table B contains 500 thousand records.

The index on table A is on the columns Id and Date.

The index on table B is on the columns Id, Date2, Date1. Date1 is nullable, and the index is unique.
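For reference, the two indexes described above might be declared roughly as follows. This is only a sketch: the index names are hypothetical, and the index on A is assumed to be non-unique because the question does not say otherwise.

```sql
-- Hypothetical index names; column order follows the description in the question.
-- Assumption: the index on A is non-unique (the question does not say).
CREATE INDEX IX_A_Id_Date
    ON A (Id, [Date]);

-- The question states that this index is unique and that Date1 is nullable.
CREATE UNIQUE INDEX UX_B_Id_Date2_Date1
    ON B (Id, Date2, Date1);
```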

The first 11 million rows come back quite fast, and then it suddenly becomes extremely slow. The execution plan shows that the indexes are used.

However, when I remove the A.Date < B.Date1 condition, the query becomes fast again.

Do you know what can be done to improve performance? Thanks.

UPDATE: I updated the query to show that I need the fields of table B in the result. You might wonder why I used a left join when I have the condition "B.Date1 IS NOT NULL". This is because I posted a simplified query. The performance issue occurs even with this simplified version.

sql sql-server tsql

Bob Feb 24 '17 at 4:04

2 answers

Gurv · Answer 1 · 2017-02-24T04:18:48+0000

You can try using EXISTS. It should be faster, since it stops looking for further rows in B as soon as a match is found, unlike a JOIN, where every matching row must be fetched and joined.

    select id
    from a
    where flag = 'Y'
      and exists (
          select 1
          from b
          where a.id = b.id
            and a.date >= b.date2
            and a.date < b.date1
            and b.date1 is not null
      );
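Note that the EXISTS form returns only columns of A, while the updated question also needs B.Date1. A minimal sketch of a related form that keeps the "stop at the first match" behaviour and still returns B.Date1 uses OUTER APPLY with TOP (1). It assumes that at most one matching B row per A row is wanted; the ORDER BY that picks which row wins is an arbitrary choice made for illustration, and the aliases a, b and m are hypothetical.

```sql
-- Sketch only: returns every row of A (like the LEFT OUTER JOIN),
-- with Date1 = NULL when no row of B matches.
-- Assumption: one matching B row per A row is enough.
SELECT a.Id, m.Date1
FROM A AS a
OUTER APPLY (
    SELECT TOP (1) b.Date1
    FROM B AS b
    WHERE a.Flag = 'Y'
      AND b.Id = a.Id
      AND a.[Date] >= b.Date2
      AND a.[Date] < b.Date1      -- also excludes rows where Date1 IS NULL
    ORDER BY b.Date2 DESC         -- arbitrary tie-breaker, chosen for illustration
) AS m;
```

Unlike the original join, this never returns more than one row per row of A, so the results differ whenever several rows of B match the same row of A.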

Killrawr · Answer 2 · 2017-02-24T04:26:43+0000

In general, what I have noticed about query performance is that it depends on the data you are joining: a one-to-one relationship is much faster than a one-to-many relationship.

I have seen a one-to-many join between a table of 3,000 rows and a table of 30,000 rows take 11-15 seconds even with LIMIT, while the same query with only one-to-one relationships takes less than 1 second.

So here is my suggestion for speeding up your query. According to "Left Outer Join", LEFT JOIN and LEFT OUTER JOIN are the same, so it doesn't matter which one you use.

But ideally, you should use INNER JOIN, because your question specifies B.Date1 IS NOT NULL.
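As a rough sketch of that suggestion, the question's query rewritten with INNER JOIN could look like the following. The aliases are hypothetical, and the result is not identical to the original: rows of A without a matching row in B are dropped instead of being returned with a NULL Date1.

```sql
-- Sketch only: INNER JOIN version of the query from the question.
-- Rows of A with no matching row in B are NOT returned.
SELECT a.Id, b.Date1
FROM A AS a
INNER JOIN B AS b
    ON  b.Id = a.Id
    AND a.[Date] >= b.Date2
    AND a.[Date] < b.Date1   -- never true when Date1 IS NULL, so no separate NULL check is needed
WHERE a.Flag = 'Y';
```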

Based on "Parent columns in the join selection", you can reference the outer table's columns inside a subquery in the join. In SQL Server (the question's tag) a plain derived table cannot see a.Date, so the correlated form is written with CROSS APPLY, and TOP takes the place of LIMIT:

    SELECT TOP (500) a.Id
    FROM A AS a
    CROSS APPLY (
        SELECT COUNT(*) AS TotalLinks
        FROM B AS b
        WHERE b.Id = a.Id
          AND b.Date1 IS NOT NULL
          AND a.Date >= b.Date2
          AND a.Date < b.Date1
    ) AS ab
    WHERE a.Flag = 'Y'
      AND ab.TotalLinks > 0;

Also try to limit the result to the rows you actually need (LIMIT, or TOP in SQL Server); this reduces the work the server has to do.
