How join order affects query performance

I am seeing large timing differences in my query, and it seems that the order in which the joins (inner and left outer) appear in the query makes a difference. Are there any "basic rules" for the order in which joins should be written?

Both queries are part of a larger query. The only difference between the two is that, in the faster one, the left join is placed last.

Slow query (> 10 minutes):

SELECT [t0].[Ref], [t1].[Key], [t1].[Name],
    (CASE WHEN [t3].[test] IS NULL THEN CONVERT(NVarChar(250),@p0) ELSE CONVERT(NVarChar(250),[t3].[Key]) END) AS [value],
    (CASE WHEN 0 = 1 THEN CONVERT(NVarChar(250),@p1) ELSE CONVERT(NVarChar(250),[t4].[Key]) END) AS [value2]
FROM [dbo].[tblA] AS [t0]
INNER JOIN [dbo].[tblB] AS [t1] ON [t0].[RefB] = [t1].[Ref]
LEFT OUTER JOIN (
    SELECT 1 AS [test], [t2].[Ref], [t2].[Key]
    FROM [dbo].[tblC] AS [t2]
    ) AS [t3] ON [t0].[RefC] = ([t3].[Ref])
INNER JOIN [dbo].[tblD] AS [t4] ON [t0].[RefD] = ([t4].[Ref])

Faster query (~ 30 seconds):

SELECT [t0].[Ref], [t1].[Key], [t1].[Name],
    (CASE WHEN [t3].[test] IS NULL THEN CONVERT(NVarChar(250),@p0) ELSE CONVERT(NVarChar(250),[t3].[Key]) END) AS [value],
    (CASE WHEN 0 = 1 THEN CONVERT(NVarChar(250),@p1) ELSE CONVERT(NVarChar(250),[t4].[Key]) END) AS [value2]
FROM [dbo].[tblA] AS [t0]
INNER JOIN [dbo].[tblB] AS [t1] ON [t0].[RefB] = [t1].[Ref]
INNER JOIN [dbo].[tblD] AS [t4] ON [t0].[RefD] = ([t4].[Ref])
LEFT OUTER JOIN (
    SELECT 1 AS [test], [t2].[Ref], [t2].[Key]
    FROM [dbo].[tblC] AS [t2]
    ) AS [t3] ON [t0].[RefC] = ([t3].[Ref])
+6
3 answers

Typically, the order of INNER JOINs is not significant, because inner joins are commutative and associative. In both queries you still have t0 inner join t4, so logically the two return the same result.
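The claim that both join orders are logically equivalent can be checked on a toy data set. The following is a minimal sketch, assuming SQLite (through Python's standard `sqlite3` module) as a stand-in for SQL Server; the table names are borrowed from the question, while the column `Val` and the sample rows are made up for illustration. It only shows that the two orders return the same rows, and says nothing about timing.

```python
import sqlite3

# In-memory database with simplified versions of the question's tables.
# The Val column and all sample rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblA (Ref INTEGER PRIMARY KEY, RefB INTEGER, RefC INTEGER, RefD INTEGER);
    CREATE TABLE tblB (Ref INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE tblC (Ref INTEGER PRIMARY KEY, Val TEXT);
    CREATE TABLE tblD (Ref INTEGER PRIMARY KEY, Val TEXT);
    INSERT INTO tblA VALUES (1, 10, 100, 1000), (2, 10, NULL, 1000),
                            (3, 11, 101, 1001), (4, 12, 100, 9999);
    INSERT INTO tblB VALUES (10, 'b10'), (11, 'b11'), (12, 'b12');
    INSERT INTO tblC VALUES (100, 'c100');
    INSERT INTO tblD VALUES (1000, 'd1000'), (1001, 'd1001');
""")

# Left outer join written between the two inner joins (the "slow" shape).
left_join_in_middle = """
    SELECT a.Ref, b.Name, c.Val, d.Val
    FROM tblA AS a
    INNER JOIN tblB AS b ON a.RefB = b.Ref
    LEFT OUTER JOIN tblC AS c ON a.RefC = c.Ref
    INNER JOIN tblD AS d ON a.RefD = d.Ref
    ORDER BY a.Ref
"""

# Left outer join written last (the "fast" shape).
left_join_last = """
    SELECT a.Ref, b.Name, c.Val, d.Val
    FROM tblA AS a
    INNER JOIN tblB AS b ON a.RefB = b.Ref
    INNER JOIN tblD AS d ON a.RefD = d.Ref
    LEFT OUTER JOIN tblC AS c ON a.RefC = c.Ref
    ORDER BY a.Ref
"""

rows_middle = conn.execute(left_join_in_middle).fetchall()
rows_last = conn.execute(left_join_last).fetchall()

# Both orders yield the same rows; row 4 is dropped by the inner join to tblD.
print(rows_middle == rows_last)
```

The equivalence holds here because every join condition refers only to the first table and the table being joined. If the inner join's condition depended on a column from the left-joined table, moving the left join could change the result.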

To repeat: SQL is declarative. You say "what you want", not "how". The optimizer works out the "how" and will reorder JOINs as needed, also taking WHERE clauses and the like into account.

In complex queries, however, a cost-based query optimizer does not exhaustively evaluate every permutation, so the written order can sometimes make a difference.

So, I would check the following:

  • You said that this is part of a larger query, so this section matters less on its own, because it is the whole query that matters.
  • Complexity can be hidden by views, if any of the tables is actually a view.
  • Is the difference repeatable, no matter in which order the two queries are run?
  • What are the differences between the two query plans?

+9

If you have more than two tables, the order in which you join them is important and can make a big difference. The first table should be the leading table of the join: the one whose filter is most selective. For example: if you have a member table with 1,000,000 people and you only want the female members, and this is the first table, you join only 500,000 rows to the next table. If this table is at the end of the join order (maybe table 4, 5 or 6), then every row (worst case 1,000,000) is joined first. This applies to inner and outer joins alike.

Rule: start with the most selective table, then join the next most selective table that logically connects to it.

Conversion functions and formatting should be applied last. Sometimes it is better to wrap the core SQL in parentheses as a subquery and apply expressions and functions in the outer SELECT statement.
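The reasoning behind this rule can be illustrated with a deliberately naive model. The sketch below assumes a plain nested-loop join with no indexes and no optimizer, which is not how a real database necessarily executes the query; the `members` and `orders` data and the helper `nested_loop_join` are hypothetical. It counts how many row comparisons are made when the filter is applied before the join versus after it.

```python
def nested_loop_join(left, right, match):
    """Naive nested-loop join that also counts row comparisons."""
    comparisons = 0
    joined = []
    for l in left:
        for r in right:
            comparisons += 1
            if match(l, r):
                joined.append((l, r))
    return joined, comparisons

# Hypothetical data: 1,000 members (half female) and 50 orders.
members = [(i, "F" if i % 2 == 0 else "M") for i in range(1000)]  # (member_id, gender)
orders = [(o, o % 1000) for o in range(50)]                       # (order_id, member_id)

def same_member(member, order):
    return member[0] == order[1]

# Filter first, then join: only the 500 female members enter the join.
females = [m for m in members if m[1] == "F"]
filter_first_rows, filter_first_cost = nested_loop_join(females, orders, same_member)

# Join first, then filter: all 1,000 members enter the join.
all_rows, join_first_cost = nested_loop_join(members, orders, same_member)
join_first_rows = [(m, o) for (m, o) in all_rows if m[1] == "F"]

print(filter_first_rows == join_first_rows)  # same result either way
print(filter_first_cost, join_first_cost)    # 25000 50000
```

In this model the early filter halves the work. A cost-based optimizer will often push such a filter down on its own, so the written order matters mainly when the optimizer does not find that plan.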

+1

At least in SQLite, I found that this makes a huge difference. The query did not even have to be very complex for the difference to show itself. My JOIN statements were inside a nested (inline) statement.

In principle, you should begin with the most specific restrictions, as Christian pointed out.
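One way to see what SQLite actually does with a given join order is to ask it for the query plan. The following is a minimal sketch using `EXPLAIN QUERY PLAN` through Python's standard `sqlite3` module; the schema is a simplified, assumed version of the tables from the question, and the helper `plan` is hypothetical. The plan text varies between SQLite versions, so no particular output is shown.

```python
import sqlite3

# Simplified, assumed schema; empty tables are enough to obtain a plan.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblA (Ref INTEGER PRIMARY KEY, RefC INTEGER, RefD INTEGER);
    CREATE TABLE tblC (Ref INTEGER PRIMARY KEY, Val TEXT);
    CREATE TABLE tblD (Ref INTEGER PRIMARY KEY, Val TEXT);
""")

def plan(sql):
    """Return the human-readable steps of SQLite's plan for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the description

left_join_first = """
    SELECT a.Ref, c.Val, d.Val
    FROM tblA AS a
    LEFT OUTER JOIN tblC AS c ON a.RefC = c.Ref
    INNER JOIN tblD AS d ON a.RefD = d.Ref
"""

left_join_last = """
    SELECT a.Ref, c.Val, d.Val
    FROM tblA AS a
    INNER JOIN tblD AS d ON a.RefD = d.Ref
    LEFT OUTER JOIN tblC AS c ON a.RefC = c.Ref
"""

plan_first = plan(left_join_first)
plan_last = plan(left_join_last)

# Print both plans side by side to compare the table access order.
for label, steps in (("left join first", plan_first), ("left join last", plan_last)):
    print(label)
    for step in steps:
        print("   ", step)
```

If the two plans visit the tables in a different order, the written join order influenced the planner for that query.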

0

Source: https://habr.com/ru/post/899598/

