Entity Framework returning bad data

I have an Entity Framework 6.1 project that queries a SQL Server 2012 database table and returns incorrect results.

To illustrate what is happening, I created two queries that should return the same results. The ProjectTable table has 23 columns and about 20,500 rows:

    var test1 = db.ProjectTable
        .GroupBy(t => t.ProjectOwner)
        .Select(g => g.Key)
        .ToArray();

    var test2 = db.ProjectTable
        .ToArray()
        .GroupBy(t => t.ProjectOwner)
        .Select(g => g.Key)
        .ToArray();

The queries are meant to return the list of distinct project owners in the table. The first query does the heavy lifting on SQL Server, whereas the second loads the entire table into memory and then processes it on the client side.

The first variable test1 has a length of about 300 elements. The second variable test2 has a length of 5.

Here are the raw SQL queries that EF generates:

    -- test1
    SELECT
        [Distinct1].[ProjectOwner] AS [ProjectOwner]
    FROM ( SELECT DISTINCT
            [Extent1].[ProjectOwner] AS [ProjectOwner]
        FROM [dbo].[ProjectTable] as [Extent1]
    ) AS [Distinct1]

    -- test2
    SELECT
        Col1, Col2 ... ProjectOwner, ... Col23
    FROM [dbo].[ProjectTable]

When I run the following query and inspect the returned objects, I see that all 20,500 or so rows are there, but the ProjectOwner column has been overwritten with one of 5 different users!

 var test = db.ProjectTable.ToArray(); 

I thought the problem might be SQL Server, so I ran a packet trace and filtered on TDS. Skimming through the raw streams, I see many names that are not in the list of 5, so I know the data is being sent over the wire correctly.

How can I see the raw data that EF receives? Could something be corrupting a cache and producing the incorrect results?

If I run the queries in SSMS or Visual Studio, the list comes back correctly. The problem occurs only with EF.

EDIT

OK, I added another test to check my sanity. I took the raw SQL query from test2 and did the following:

    var test3 = db.Database
        .SqlQuery<ProjectTable>(@"SELECT Col1..Col23")
        .ToArray()
        .Select(t => t.ProjectOwner)
        .Distinct()
        .ToArray();

and I get back the correct 300ish names!

So, in short:

  • Having EF send a projected DISTINCT query to SQL Server returns the correct results
  • Having EF select the entire table, and then using LINQ to project and DISTINCT the data, returns incorrect results
  • Giving EF THE EXACT SAME QUERY!!! that bullet #2 generates, and running it as a raw SQL query, returns the correct results

1 answer

After downloading the Entity Framework source and stepping through many an enumerator, I found the problem.

In the Shaper.HandleEntityAppendOnly method, on line 187, the Context.ObjectStateManager.FindEntityEntry method is called. To my surprise, a non-null value was returned! Wait a minute, there shouldn't be any cached results, since I'm returning all rows?!

That's when I discovered that my table doesn't have a primary key!

In my defense, the table is actually a cache of a view I'm working with; I just ran SELECT * INTO CACHETABLE FROM USERVIEW.

Then I looked at which column Entity Framework thought was my primary key (they call it a singleton key), and it just so happened that the column it picked had only... drum roll, please... 5 unique values!
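To make the effect concrete, here is a minimal, self-contained sketch of what tracked materialization does when the key is not unique. It is a simulation, not EF's actual code: the Row type, the Region column (standing in for whichever column was picked as the key), and the generated owner names are all made up for illustration. The first entity seen for a given key is cached, and every later row with the same key gets that cached instance back instead of its own values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type; "Region" stands in for the column wrongly used as the key.
public class Row
{
    public int Region { get; set; }          // assumed key column with only 5 distinct values
    public string ProjectOwner { get; set; }
}

public static class IdentityMapDemo
{
    // Simulates append-only tracking: the first entity seen for a key wins,
    // and later rows with the same key are replaced by that cached instance.
    public static List<Row> Materialize(IEnumerable<Row> rows)
    {
        var tracked = new Dictionary<int, Row>();
        var result = new List<Row>();
        foreach (var row in rows)
        {
            Row existing;
            if (!tracked.TryGetValue(row.Region, out existing))
            {
                tracked[row.Region] = row;
                existing = row;
            }
            result.Add(existing);
        }
        return result;
    }

    public static void Main()
    {
        // 20500 rows, 300 distinct owners, but only 5 distinct "key" values.
        var rows = Enumerable.Range(0, 20500)
            .Select(i => new Row { Region = i % 5, ProjectOwner = "owner" + (i % 300) });

        var materialized = Materialize(rows);

        Console.WriteLine(materialized.Count);                                          // 20500
        Console.WriteLine(materialized.Select(r => r.ProjectOwner).Distinct().Count()); // 5
    }
}
```

The row count is intact, but only as many distinct owners survive as there are distinct key values, which matches what I was seeing.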

When I looked at the model EF had generated, sure enough, that column was specified as the primary key. I changed the key to the appropriate column, and now everything works as it should!
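For anyone in a similar spot, here is a rough sketch of what the fix could look like in a Code First model; it is an assumption that your model is configured this way, and ProjectId and ProjectContext are hypothetical names for a column that really is unique per row and for the context class. The second part shows AsNoTracking, which as far as I understand skips the state manager lookup for read-only queries, so rows are not matched against already-tracked entities. It needs a reference to the EntityFramework package.

```csharp
using System.Data.Entity;
using System.Linq;

public class ProjectTable
{
    public int ProjectId { get; set; }        // hypothetical column that is unique per row
    public string ProjectOwner { get; set; }
    // ... remaining columns omitted
}

// Hypothetical context name.
public class ProjectContext : DbContext
{
    public DbSet<ProjectTable> ProjectTable { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Tell EF explicitly which column identifies a row
        // instead of relying on whatever key was inferred.
        modelBuilder.Entity<ProjectTable>().HasKey(t => t.ProjectId);
    }
}

public static class OwnerQueries
{
    // Read-only alternative: load the rows without change tracking,
    // then project and de-duplicate on the client.
    public static string[] DistinctOwners(ProjectContext db)
    {
        return db.ProjectTable
            .AsNoTracking()
            .ToArray()
            .Select(t => t.ProjectOwner)
            .Distinct()
            .ToArray();
    }
}
```

Fixing the key is the real cure; AsNoTracking only sidesteps the symptom for queries that never need to save changes.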


Source: https://habr.com/ru/post/982956/