LEFT JOIN gives different data depending on the position of the WHERE clause

The four counts below give a brief idea of the table data. Every skCitizen in [dbo].[LUEducation] is also present in [dbo].[LUCitizen].

    SELECT COUNT(*) FROM [dbo].[LUCitizen]    -- 115000 rows

    SELECT COUNT(*) FROM [dbo].[LUEducation]  -- 201846 rows

    SELECT COUNT(*)                           -- 212695 rows
    FROM [dbo].[LUCitizen] C
    LEFT JOIN [dbo].[LUEducation] E ON C.skCitizen = E.skCitizen

    SELECT COUNT(*) FROM [dbo].[LUEducation] WHERE skSchool = 24417  -- 4 rows

Now consider the following two queries:

    SELECT C.skCitizen, E.skCitizen
    FROM [dbo].[LUCitizen] C
    LEFT JOIN [dbo].[LUEducation] E ON C.skCitizen = E.skCitizen
    WHERE E.skSchool = 24417
    -- 4 rows

    SELECT C.skCitizen, E.skCitizen
    FROM [dbo].[LUCitizen] C
    LEFT JOIN (SELECT * FROM [dbo].[LUEducation] WHERE skSchool = 24417) E
        ON C.skCitizen = E.skCitizen
    -- 115000 rows

Of these two queries, the first one confuses me. I expected 115000 rows, but only 4 were returned. As I understand it, all rows from [dbo].[LUCitizen] should be returned, with the 4 matching rows from [dbo].[LUEducation] left-joined to them.

Why do the two queries return different results?

Forgive me if this is a duplicate question.

+5
2 answers

If the left join cannot find a match in E, the columns from E are NULL for that row. The WHERE condition:

 E.skSchool = 24417 

becomes:

 null = 24417 

This does not evaluate to true (a comparison with NULL is UNKNOWN), so the WHERE clause filters out every row that has no match in E.
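
As a small illustration of the comparison above, the following sketch assumes SQL Server with its default ANSI_NULLS ON setting. The variable name @skSchool is made up for the example; it stands in for E.skSchool on an unmatched row.

```sql
-- Hypothetical stand-in for E.skSchool on a row with no match in E.
DECLARE @skSchool int = NULL;

-- Neither the comparison nor its negation is true, so the ELSE branch is taken.
SELECT CASE
           WHEN @skSchool = 24417       THEN 'true'
           WHEN NOT (@skSchool = 24417) THEN 'false'
           ELSE 'unknown'
       END AS comparison_result;  -- returns 'unknown'
```

A WHERE clause keeps only rows for which its condition is true, which is why rows with an UNKNOWN result are dropped.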

+3

When you do this:

    SELECT C.skCitizen, E.skCitizen
    FROM [dbo].[LUCitizen] C
    LEFT JOIN [dbo].[LUEducation] E ON C.skCitizen = E.skCitizen
    WHERE E.skSchool = 24417;

You turn the left join into an inner join, because E.skSchool is NULL for the non-matching rows and the WHERE clause discards them. The correct way to put a condition on the second table in a left join is to place it in the ON clause:

    SELECT C.skCitizen, E.skCitizen
    FROM [dbo].[LUCitizen] C
    LEFT JOIN [dbo].[LUEducation] E
        ON C.skCitizen = E.skCitizen AND E.skSchool = 24417;

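To see the difference on a small scale, here is a minimal sketch using temporary tables. The table names #Citizen and #Education and their three and two rows are invented for illustration; they are not the asker's data.

```sql
-- Hypothetical miniature of the two tables from the question.
CREATE TABLE #Citizen   (skCitizen int PRIMARY KEY);
CREATE TABLE #Education (skCitizen int, skSchool int);

INSERT INTO #Citizen   VALUES (1), (2), (3);
INSERT INTO #Education VALUES (1, 24417), (2, 99999);

-- Filter in WHERE: citizen 2 (other school) and citizen 3 (no match,
-- E.skSchool is NULL) are removed after the join.
SELECT C.skCitizen, E.skCitizen AS eduCitizen
FROM #Citizen C
LEFT JOIN #Education E ON C.skCitizen = E.skCitizen
WHERE E.skSchool = 24417;
-- 1 row: (1, 1)

-- Filter in ON: every citizen is kept; E columns are NULL where no
-- education row satisfies both conditions.
SELECT C.skCitizen, E.skCitizen AS eduCitizen
FROM #Citizen C
LEFT JOIN #Education E
    ON C.skCitizen = E.skCitizen AND E.skSchool = 24417;
-- 3 rows: (1, 1), (2, NULL), (3, NULL)

DROP TABLE #Education;
DROP TABLE #Citizen;
```

The second form returns the same rows as the derived-table version in the question, since both apply the school filter before unmatched citizens are padded with NULLs.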
+7

Source: https://habr.com/ru/post/1201070/