SQL Server (2014) Odd Behavior with a Variant of IN (List)

I was looking into optimizing (or at least changing) some EF code in C# to use stored procedures, and found what seems to be an anomaly (or something new to me) when searching for rows matching a constant list. The typical query I would write by hand looks like...

SELECT Something FROM Table WHERE ID IN (one, two, others);

We had an EF query that we were replacing with a stored procedure call, so I looked at the generated SQL, saw that it was complex, and assumed that my simpler query (similar to the one above) would be better. It was not. Here is a brief demo that reproduces this.

Can anyone explain why the execution plan for the final version, which uses the

...WHERE EXISTS(... (SELECT 1 AS X) AS Alias UNION ALL...) AS Alias...)

construct, is better? Presumably this is because it omits the expensive SORT operation, even though the plan includes TWO index scans rather than the one in the simpler queries.

Here is a self-contained example script (hopefully)...

USE SandBox;  -- a dummy database, not a live one!
-- create our dummy table, dropping first if it exists
IF EXISTS (SELECT NULL FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'Test')
    DROP TABLE Test;
CREATE TABLE Test (Id INT IDENTITY(1,1) NOT NULL PRIMARY KEY, FormId INT NOT NULL, DateRead DATE NULL);
-- populate with some data
INSERT INTO Test VALUES (1, NULL), (1, GETDATE()), (1, NULL), (4, NULL), (5, NULL), (6, GETDATE());

-- Simple query that I might typically use
-- how many un-read entries are there for a set of 'forms' of interest, 1, 5 and 6
-- (we're happy to omit forms with none)
SELECT  T.FormId, COUNT(*) AS TheCount
  FROM  Test AS T
 WHERE  T.FormId IN (1, 5, 6)
   AND  T.DateRead IS NULL
 GROUP BY T.FormId;

-- This is the first step towards the EF-generated code
-- using an EXISTS gives basically the same plan but with constants
SELECT  T.FormId, COUNT(*) AS TheCount
  FROM  Test T
 WHERE  EXISTS (    SELECT NULL
                      FROM (VALUES (1), (5), (6)
                            ) AS X(FormId) 
                     WHERE X.FormId = T.FormId
               )
   AND  T.DateRead IS NULL
 GROUP BY T.FormId;

-- A step closer, using UNION ALL instead of VALUES to generate the 'table'
-- still the same plan
SELECT  T.FormId, COUNT(*) AS TheCount
  FROM  Test T
 WHERE  EXISTS (    SELECT NULL
                      FROM (    SELECT 1 
                                UNION ALL
                                SELECT 5 
                                UNION ALL
                                SELECT 6 
                            ) AS X(FormId) 
                     WHERE X.FormId = T.FormId
               )
   AND  T.DateRead IS NULL
 GROUP BY T.FormId;

-- Now what the EF actually generated (cleaned up a bit)
-- Adding in the "FROM (SELECT 1 as X) AS alias" changes the execution plan considerably and apparently costs less to run
SELECT  T.FormId, COUNT(*) AS TheCount
  FROM  Test T
 WHERE  EXISTS (    SELECT NULL
                      FROM (    SELECT 1 FROM (SELECT 1 AS X) AS X1
                                UNION ALL
                                SELECT 5 FROM (SELECT 1 AS X) AS X2
                                UNION ALL
                                SELECT 6 FROM (SELECT 1 AS X) AS X3
                            ) AS X(FormId) 
                     WHERE X.FormId = T.FormId
               )
   AND  T.DateRead IS NULL
 GROUP BY T.FormId;
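
To compare the variants on more than the estimated cost shown in the plan, the actual plan and the I/O and timing figures can be captured per statement. This is a minimal sketch, assuming the Test table created by the script above. The SET options are standard T-SQL session settings, and the query is the first one from the script; any of the four can be substituted.

```sql
-- Sketch: capture the actual plan plus I/O and timing figures for one variant.
SET STATISTICS IO ON;    -- logical reads per table, reported as messages
SET STATISTICS TIME ON;  -- CPU and elapsed time per statement
SET STATISTICS XML ON;   -- returns the actual execution plan as an extra XML result set

SELECT  T.FormId, COUNT(*) AS TheCount
  FROM  Test AS T
 WHERE  T.FormId IN (1, 5, 6)
   AND  T.DateRead IS NULL
 GROUP BY T.FormId;

-- Switch the session settings back off
SET STATISTICS XML OFF;
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

With only six rows in the table, the relative cost percentages may not reflect what happens at realistic data volumes, so it is worth repeating the comparison on a larger data set.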

Can someone help me understand why, and whether there is any benefit in using this query format more widely?

I searched for anything special about the (SELECT 1 AS X) construct, and although many posts show that it is common in EF output, I have not seen anything about this particular apparent benefit.

Thanks in advance,

Whale

1 answer

In the last of these queries, the predicate for each of the index scans is a range, id >= 1 and id <= 6, together with DateRead IS NULL; that is, a range between 1 and 6.

"select 1", . , ( 1) , UNION ALL, .

Otherwise the predicate is DateRead IS NULL combined with ORs: 1 or 5 or 6.
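
Logically, an IN list is shorthand for a chain of OR'ed equality tests, which would be consistent with the OR form of the predicate. A sketch of the first query from the question with the list written out by hand, assuming the same Test table:

```sql
-- Returns the same rows as the T.FormId IN (1, 5, 6) query in the question.
SELECT  T.FormId, COUNT(*) AS TheCount
  FROM  Test AS T
 WHERE  (T.FormId = 1 OR T.FormId = 5 OR T.FormId = 6)
   AND  T.DateRead IS NULL
 GROUP BY T.FormId;
```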

The same query with the list held in a table variable:

declare @tmp table (formid int not null primary key)
insert into @tmp values (1),(5),(6);

SELECT  T.FormId, COUNT(*) AS TheCount
  FROM  Test T
 WHERE  EXISTS (    SELECT NULL
                      FROM @tmp X
                     WHERE X.FormId = T.FormId
               )
   AND  T.DateRead IS NULL
 GROUP BY T.FormId;


dbfiddle.uk xml ( ), , : dbfiddle


Source: https://habr.com/ru/post/1688909/

