Avoid subqueries when selecting related records from the same table based on the date of the base record

I have a StudentScores table, shown below, in SQL Server 2012. The grading system is weighted using special rules. The result set must contain one row for each student's MATHS result. That row may or may not have values in the SCIENCE and LITERATURE columns, depending on whether a score is available "within two months of the MATHS result date" for SCIENCE and "within one month of the MATHS result date" for LITERATURE.

Note: the script below is a simplified version of my actual business domain problem.

I created the following query with subqueries. Is there a way to rewrite it without subqueries and more efficiently?

TABLE

    DECLARE @StudentScores TABLE
    (
        StudentMarkID INT IDENTITY(1,1) NOT NULL,
        StudentID     INT,
        SubjectCode   VARCHAR(10),
        ResultDate    DATETIME,
        Score         DECIMAL(5,2)
    )

    INSERT INTO @StudentScores (StudentID, SubjectCode, ResultDate, Score)
    SELECT 1, 'MATHS',      '2016-01-10', 35 UNION ALL
    SELECT 1, 'LITERATURE', '2016-01-10', 62 UNION ALL
    SELECT 1, 'SCIENCE',    '2016-01-30', 65 UNION ALL
    SELECT 1, 'SCIENCE',    '2016-02-02', 61 UNION ALL
    SELECT 1, 'LITERATURE', '2016-02-03', 60 UNION ALL
    SELECT 1, 'MATHS',      '2016-03-25', 55 UNION ALL
    SELECT 2, 'LITERATURE', '2016-01-10', 12 UNION ALL
    SELECT 2, 'SCIENCE',    '2016-01-30', 14 UNION ALL
    SELECT 2, 'SCIENCE',    '2016-02-14', 12 UNION ALL
    SELECT 2, 'LITERATURE', '2016-02-14', 15 UNION ALL
    SELECT 2, 'MATHS',      '2016-03-25', 18

QUERY

    SELECT SS.StudentID,
           Score      AS MathsScore,
           ResultDate AS MathsResultDate,
           (SELECT TOP 1 Score
            FROM @StudentScores S2
            WHERE S2.StudentID = SS.StudentID
              AND S2.SubjectCode = 'SCIENCE'
              AND S2.ResultDate >= DATEADD(MONTH, -2, SS.ResultDate)
            ORDER BY S2.ResultDate DESC
           ) AS ScienceScore,
           (SELECT TOP 1 ResultDate
            FROM @StudentScores S2
            WHERE S2.StudentID = SS.StudentID
              AND S2.SubjectCode = 'SCIENCE'
              AND S2.ResultDate >= DATEADD(MONTH, -2, SS.ResultDate)
            ORDER BY S2.ResultDate DESC
           ) AS ScienceResultDate,
           (SELECT TOP 1 Score
            FROM @StudentScores S2
            WHERE S2.StudentID = SS.StudentID
              AND S2.SubjectCode = 'LITERATURE'
              AND S2.ResultDate >= DATEADD(MONTH, -1, SS.ResultDate)
            ORDER BY S2.ResultDate DESC
           ) AS LiteratureScore,
           (SELECT TOP 1 ResultDate
            FROM @StudentScores S2
            WHERE S2.StudentID = SS.StudentID
              AND S2.SubjectCode = 'LITERATURE'
              AND S2.ResultDate >= DATEADD(MONTH, -1, SS.ResultDate)
            ORDER BY S2.ResultDate DESC
           ) AS LiteratureResultDate
    FROM @StudentScores SS
    WHERE SS.SubjectCode = 'MATHS'

Expected Result

[Image: expected result set; not reproduced]

1 answer

I was able to reduce the query to two reads of the table: one to get the MATHS rows, since their dates drive the lookup for the other subjects, and a second to get the rows for those other subjects:

    WITH DataSource_Maths AS
    (
        SELECT SS.[StudentID]
              ,SS.[Score] AS [MathsScore]
              ,SS.[ResultDate] AS [MathsResultDate]
              -- this internal ID is used later, in the final join between the two CTEs,
              -- to know which record belongs to which date period
              ,ROW_NUMBER() OVER (ORDER BY SS.[StudentID], SS.[ResultDate]) AS InternalID
        FROM @StudentScores SS
        WHERE SS.[SubjectCode] = 'MATHS'
    ),
    DataSource_Others AS
    (
        SELECT DS.[StudentID]
              ,DS.[SubjectCode]
              ,DS.[Score]
              ,DS.[ResultDate]
              ,DS.[RowID]
              ,SS.[InternalID]
        FROM DataSource_Maths SS
        OUTER APPLY
        (
            SELECT *
                  -- rank the records per student and subject (only the latest ones are kept);
                  -- this is what TOP 1 achieves in your example
                  ,DENSE_RANK() OVER (PARTITION BY [StudentID], [SubjectCode] ORDER BY [ResultDate] DESC) AS [RowID]
            FROM @StudentScores
            WHERE
            (
                [ResultDate] >= DATEADD(MONTH, -2, SS.[MathsResultDate]) AND [SubjectCode] = 'SCIENCE'
                OR
                [ResultDate] >= DATEADD(MONTH, -1, SS.[MathsResultDate]) AND [SubjectCode] = 'LITERATURE'
            )
            AND [StudentID] = SS.[StudentID]
        ) DS
    )
    SELECT FDS_M.[StudentID]
          ,FDS_M.[MathsScore] AS [MathsScore]
          ,FDS_M.[MathsResultDate] AS [MathsResultDate]
          ,FDS_S.[Score] AS [ScienceScore]
          ,FDS_S.[ResultDate] AS [ScienceResultDate]
          ,FDS_L.[Score] AS [LiteratureScore]
          ,FDS_L.[ResultDate] AS [LiteratureResultDate]
    FROM DataSource_Maths FDS_M
    LEFT JOIN DataSource_Others FDS_S
        ON FDS_M.[InternalID] = FDS_S.[InternalID]
        AND FDS_S.[SubjectCode] = 'SCIENCE'
        AND FDS_S.[RowID] = 1
    LEFT JOIN DataSource_Others FDS_L
        ON FDS_M.[InternalID] = FDS_L.[InternalID]
        AND FDS_L.[SubjectCode] = 'LITERATURE'
        AND FDS_L.[RowID] = 1;
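For comparison, here is a shorter variant that is not part of the answer above: a minimal sketch that keeps the TOP 1 logic from the question but does one OUTER APPLY per subject, so each lookup returns the score and the date together (two correlated lookups instead of four). The aliases M, Sci, and Lit are made up for illustration, and the sample data is repeated only so the batch runs on its own. Strictly speaking this still uses correlated subqueries, so treat it as a simplification rather than a way to remove them.

```sql
-- Sample data from the question, repeated so this batch is self-contained.
DECLARE @StudentScores TABLE
(
    StudentMarkID INT IDENTITY(1,1) NOT NULL,
    StudentID     INT,
    SubjectCode   VARCHAR(10),
    ResultDate    DATETIME,
    Score         DECIMAL(5,2)
);

INSERT INTO @StudentScores (StudentID, SubjectCode, ResultDate, Score)
VALUES (1, 'MATHS',      '2016-01-10', 35),
       (1, 'LITERATURE', '2016-01-10', 62),
       (1, 'SCIENCE',    '2016-01-30', 65),
       (1, 'SCIENCE',    '2016-02-02', 61),
       (1, 'LITERATURE', '2016-02-03', 60),
       (1, 'MATHS',      '2016-03-25', 55),
       (2, 'LITERATURE', '2016-01-10', 12),
       (2, 'SCIENCE',    '2016-01-30', 14),
       (2, 'SCIENCE',    '2016-02-14', 12),
       (2, 'LITERATURE', '2016-02-14', 15),
       (2, 'MATHS',      '2016-03-25', 18);

-- One row per MATHS result; each OUTER APPLY fetches the latest matching
-- row for one subject, or NULLs when no row falls inside the date window.
SELECT M.StudentID,
       M.Score        AS MathsScore,
       M.ResultDate   AS MathsResultDate,
       Sci.Score      AS ScienceScore,
       Sci.ResultDate AS ScienceResultDate,
       Lit.Score      AS LiteratureScore,
       Lit.ResultDate AS LiteratureResultDate
FROM @StudentScores AS M
OUTER APPLY
(
    -- same filter as the question: results dated no earlier than
    -- two months before the MATHS result, latest first
    SELECT TOP (1) S.Score, S.ResultDate
    FROM @StudentScores AS S
    WHERE S.StudentID = M.StudentID
      AND S.SubjectCode = 'SCIENCE'
      AND S.ResultDate >= DATEADD(MONTH, -2, M.ResultDate)
    ORDER BY S.ResultDate DESC
) AS Sci
OUTER APPLY
(
    -- same filter as the question, with a one-month window
    SELECT TOP (1) S.Score, S.ResultDate
    FROM @StudentScores AS S
    WHERE S.StudentID = M.StudentID
      AND S.SubjectCode = 'LITERATURE'
      AND S.ResultDate >= DATEADD(MONTH, -1, M.ResultDate)
    ORDER BY S.ResultDate DESC
) AS Lit
WHERE M.SubjectCode = 'MATHS';
```

One difference worth knowing: TOP (1) returns at most one row per MATHS result, whereas filtering on DENSE_RANK() = 1 returns every row tied for the latest date, which would duplicate the MATHS row if a student had two results for the same subject on the same day.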

Of course, in your more complex real-world case, you can materialize the CTEs into temporary tables (for example) to simplify and optimize the query.
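A rough sketch of what that materialization could look like, under some assumptions: the temp table names #Maths and #Others are hypothetical, only the latest rows (RowID = 1) are stored, and the sample data is repeated so the batch runs on its own. Whether this is actually faster depends on your real data volume and indexes, so compare execution plans before adopting it.

```sql
-- Sample data from the question, repeated so this batch is self-contained.
DECLARE @StudentScores TABLE
(
    StudentMarkID INT IDENTITY(1,1) NOT NULL,
    StudentID     INT,
    SubjectCode   VARCHAR(10),
    ResultDate    DATETIME,
    Score         DECIMAL(5,2)
);

INSERT INTO @StudentScores (StudentID, SubjectCode, ResultDate, Score)
VALUES (1, 'MATHS',      '2016-01-10', 35),
       (1, 'LITERATURE', '2016-01-10', 62),
       (1, 'SCIENCE',    '2016-01-30', 65),
       (1, 'SCIENCE',    '2016-02-02', 61),
       (1, 'LITERATURE', '2016-02-03', 60),
       (1, 'MATHS',      '2016-03-25', 55),
       (2, 'LITERATURE', '2016-01-10', 12),
       (2, 'SCIENCE',    '2016-01-30', 14),
       (2, 'SCIENCE',    '2016-02-14', 12),
       (2, 'LITERATURE', '2016-02-14', 15),
       (2, 'MATHS',      '2016-03-25', 18);

-- SQL Server 2012 has no DROP TABLE IF EXISTS, so check OBJECT_ID instead.
IF OBJECT_ID('tempdb..#Others') IS NOT NULL DROP TABLE #Others;
IF OBJECT_ID('tempdb..#Maths')  IS NOT NULL DROP TABLE #Maths;

-- Step 1: materialize the MATHS rows once (replaces the first CTE).
SELECT SS.StudentID,
       SS.Score      AS MathsScore,
       SS.ResultDate AS MathsResultDate,
       ROW_NUMBER() OVER (ORDER BY SS.StudentID, SS.ResultDate) AS InternalID
INTO #Maths
FROM @StudentScores AS SS
WHERE SS.SubjectCode = 'MATHS';

-- Step 2: materialize the latest SCIENCE / LITERATURE row per MATHS row
-- (replaces the second CTE; only rows ranked first are kept).
SELECT M.InternalID,
       DS.SubjectCode,
       DS.Score,
       DS.ResultDate
INTO #Others
FROM #Maths AS M
CROSS APPLY
(
    SELECT S.SubjectCode,
           S.Score,
           S.ResultDate,
           DENSE_RANK() OVER (PARTITION BY S.SubjectCode
                              ORDER BY S.ResultDate DESC) AS RowID
    FROM @StudentScores AS S
    WHERE S.StudentID = M.StudentID
      AND (   (S.SubjectCode = 'SCIENCE'
               AND S.ResultDate >= DATEADD(MONTH, -2, M.MathsResultDate))
           OR (S.SubjectCode = 'LITERATURE'
               AND S.ResultDate >= DATEADD(MONTH, -1, M.MathsResultDate)))
) AS DS
WHERE DS.RowID = 1;

-- Step 3: the final join now reads only the two small temp tables.
SELECT M.StudentID,
       M.MathsScore,
       M.MathsResultDate,
       Sci.Score      AS ScienceScore,
       Sci.ResultDate AS ScienceResultDate,
       Lit.Score      AS LiteratureScore,
       Lit.ResultDate AS LiteratureResultDate
FROM #Maths AS M
LEFT JOIN #Others AS Sci
    ON Sci.InternalID = M.InternalID AND Sci.SubjectCode = 'SCIENCE'
LEFT JOIN #Others AS Lit
    ON Lit.InternalID = M.InternalID AND Lit.SubjectCode = 'LITERATURE'
ORDER BY M.InternalID;

DROP TABLE #Others;
DROP TABLE #Maths;
```

With larger tables, an index on #Others (InternalID, SubjectCode) may help the final joins, though that is something to verify against your own workload.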


Source: https://habr.com/ru/post/1272855/

