EF Code First Bulk Insert

I need to insert about 2500 rows using EF Code First.

My original code looked something like this:

 foreach (var item in listOfItemsToBeAdded)
 {
     // biz logic
     context.MyStuff.Add(item);
 }

It took a lot of time. Each DbSet.Add() call took about 2.2 seconds, which works out to approximately 90 minutes for all 2500 rows.

I reorganized the code:

 var tempItemList = new List<MyStuff>();
 foreach (var item in listOfItemsToBeAdded)
 {
     // biz logic
     tempItemList.Add(item);
 }
 context.MyStuff.ToList().AddRange(tempItemList);

It only takes about 4 seconds. However, .ToList() retrieves every element currently in the table, which is entirely unnecessary and can be dangerous or even more time-consuming. One workaround would be to do something like context.MyStuff.Where(x => x.ID == *empty guid*).AddRange(tempItemList), because then I know that nothing will be returned.

But I'm curious if anyone knows of an efficient mass insert method using EF Code First?

+4
9 answers

Validation is usually a very expensive part of EF, I had big performance improvements by disabling it:

 context.Configuration.AutoDetectChangesEnabled = false;
 context.Configuration.ValidateOnSaveEnabled = false;

I believe I found this in a similar SO question; it may have been this answer.

Another answer on this question rightly points out that if you really need more insert performance, you should look at using System.Data.SqlClient.SqlBulkCopy . The choice between EF and ADO.NET for this problem really comes down to your priorities.
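A minimal sketch of how those two configuration flags are typically applied around a bulk add. This is illustrative only: MyDbContext and items are assumed names, and restoring the flags in a finally block only matters if the context is reused afterwards.

```csharp
// Sketch (assumed names): disable change detection and validation for the
// duration of the bulk add, then restore the defaults.
using (var context = new MyDbContext())
{
    context.Configuration.AutoDetectChangesEnabled = false;
    context.Configuration.ValidateOnSaveEnabled = false;
    try
    {
        foreach (var item in items)
        {
            context.MyStuff.Add(item);
        }
        context.SaveChanges();
    }
    finally
    {
        // Restore defaults so later operations on this context behave normally.
        context.Configuration.AutoDetectChangesEnabled = true;
        context.Configuration.ValidateOnSaveEnabled = true;
    }
}
```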

+13

I have a crazy idea, but I think it will help you.

Call SaveChanges after every 100 items you add. I have a feeling that EF's change tracking performs very poorly with large amounts of data.
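The suggestion above can be sketched like this (a rough sketch; context and listOfItemsToBeAdded are the names from the question, and the batch size of 100 is arbitrary):

```csharp
// Sketch: flush the change tracker every 100 items so it never grows large.
int count = 0;
foreach (var item in listOfItemsToBeAdded)
{
    // biz logic
    context.MyStuff.Add(item);
    if (++count % 100 == 0)
    {
        context.SaveChanges();
    }
}
context.SaveChanges(); // flush the remainder
```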

+2

I would recommend this article on how to do massive inserts using EF.

Entity Framework and slow bulk INSERTs

It explores these areas and compares performance:

  • Default EF (57 minutes to add 30,000 records)
  • Replacing it with ADO.NET code (25 seconds for the same 30,000)
  • Context bloat - keep the active context graph small by using a new context for each unit of work (the same 30,000 inserts take 33 seconds)
  • Large lists - disable AutoDetectChangesEnabled (cuts the time to about 20 seconds)
  • Batching (down to 16 seconds)
  • DbTable.AddRange() (brings performance into the 12-second range)
+2

EF is not really suited to batch/bulk operations (and I think ORMs in general are not).

The specific reason it is so slow is EF's change tracker. Almost every EF API call invokes DetectChanges() internally, including DbSet.Add(). When you add 2500 items, that function is called 2500 times, and each call gets slower the more data you add. So disabling change tracking in EF should help a lot:

 dataContext.Configuration.AutoDetectChangesEnabled = false; 

A better solution would be to split the large bulk operation into 2500 smaller transactions, each working with its own data context. You could use MSMQ or some other reliable messaging engine to trigger each of those smaller transactions.

But if your system is built around bulk operations, I would suggest finding a different solution for your data access layer than EF.
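The batch-per-context idea can be sketched like this (illustrative only; MyDbContext, MyStuff, and listOfItemsToBeAdded are assumed names, and the batch size is arbitrary):

```csharp
// Sketch: commit in small batches, each with its own short-lived context,
// so the change tracker never accumulates thousands of tracked entities.
const int batchSize = 100;
foreach (var batch in listOfItemsToBeAdded
    .Select((item, i) => new { item, i })
    .GroupBy(x => x.i / batchSize, x => x.item))
{
    using (var context = new MyDbContext())
    {
        context.Configuration.AutoDetectChangesEnabled = false;
        foreach (var item in batch)
        {
            context.MyStuff.Add(item);
        }
        context.SaveChanges(); // one transaction per batch
    }
}
```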

+1

As STW pointed out, the DetectChanges method, called every time you call the Add method, is VERY expensive.

The common solution:

  • Use AddRange instead of Add
  • Set AutoDetectChangesEnabled to false
  • Split SaveChanges into multiple batches

See: Enhance Entity Framework Add Performance

It is important to note that using AddRange does not perform a bulk insert; it simply calls the DetectChanges method once (after all entities have been added), which significantly improves performance.
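A minimal sketch of the AddRange approach described above, reusing the names from the question (context, MyStuff, listOfItemsToBeAdded):

```csharp
// Sketch: build the list in memory, then hand it to the DbSet in one call.
// AddRange triggers DetectChanges once instead of once per entity.
var tempItemList = new List<MyStuff>();
foreach (var item in listOfItemsToBeAdded)
{
    // biz logic
    tempItemList.Add(item);
}

context.Configuration.AutoDetectChangesEnabled = false;
context.MyStuff.AddRange(tempItemList); // on the DbSet itself, not on ToList()
context.SaveChanges();
```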

But I'm curious if anyone knows of an efficient mass insert method using EF Code First

There is a third-party library that supports bulk insert:

See: Entity Framework Bulk Insert library


Disclaimer: I'm the owner of Entity Framework Extensions

This library lets you perform all the bulk operations your scenarios require:

  • Bulk SaveChanges
  • Bulk Insert
  • Bulk Delete
  • Bulk Update
  • Bulk Merge

Example

 // Easy to use
 context.BulkSaveChanges();

 // Easy to customize
 context.BulkSaveChanges(bulk => bulk.BatchSize = 100);

 // Perform bulk operations
 context.BulkDelete(customers);
 context.BulkInsert(customers);
 context.BulkUpdate(customers);

 // Customize primary key
 context.BulkMerge(customers, operation => {
     operation.ColumnPrimaryKeyExpression = customer => customer.Code;
 });
+1

Although this is a late reply, I'm posting it because I had the same pain. I created a new GitHub project just for this; at the moment it supports transparent bulk insert/update/delete for SQL Server using SqlBulkCopy.

https://github.com/MHanafy/EntityExtensions

There are other goodies too, and hopefully it will be extended to do more down the track.

Using it is as simple as

 var insertsAndupdates = new List<object>();
 var deletes = new List<object>();
 context.BulkUpdate(insertsAndupdates, deletes);

Hope this helps!

+1

EF6 beta 1 has an AddRange method that may suit your purpose:

INSERT many rows in Entity Framework 6 beta 1

EF6 will be released this year (2013)

0

Although it's a bit late, and the answers and comments posted above are very helpful, I'll leave this here in the hope that it helps people who had the same problem as me and arrive at this post looking for answers. At the time this answer was posted, this post still ranked high on Google when searching for how to bulk insert records using Entity Framework.

I had a similar problem using Entity Framework with Code First in an MVC 5 application. A user submitted a form that caused tens of thousands of records to be inserted into a table. The user had to wait more than two and a half minutes while 60,000 records were inserted.

After much searching, I came across BulkInsert-EF6 , which is also available as a NuGet package. Rewriting the OP's code:

 var tempItemList = new List<MyStuff>();
 foreach (var item in listOfItemsToBeAdded)
 {
     // biz logic
     tempItemList.Add(item);
 }

 using (var transaction = context.Transaction())
 {
     try
     {
         context.BulkInsert(tempItemList);
         transaction.Commit();
     }
     catch (Exception ex)
     {
         // Handle exception
         transaction.Rollback();
     }
 }

My code went from over 2 minutes to under 1 second for 60,000 records.

0
source
 public static void BulkInsert(IList list, string tableName)
 {
     var conn = (SqlConnection)Db.Connection;
     if (conn.State != ConnectionState.Open)
         conn.Open();

     using (var bulkCopy = new SqlBulkCopy(conn))
     {
         bulkCopy.BatchSize = list.Count;
         bulkCopy.DestinationTableName = tableName;
         var table = ListToDataTable(list);
         bulkCopy.WriteToServer(table);
     }
 }

 public static DataTable ListToDataTable(IList list)
 {
     var dt = new DataTable();
     if (list.Count <= 0)
         return dt;

     var properties = list[0].GetType().GetProperties();
     foreach (var pi in properties)
     {
         dt.Columns.Add(pi.Name, Nullable.GetUnderlyingType(pi.PropertyType) ?? pi.PropertyType);
     }

     foreach (var item in list)
     {
         DataRow row = dt.NewRow();
         properties.ToList().ForEach(p => row[p.Name] = p.GetValue(item, null) ?? DBNull.Value);
         dt.Rows.Add(row);
     }
     return dt;
 }
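A hypothetical usage of the helper above. Note the assumptions: Db.Connection is whatever exposes your open SqlConnection, MyStuff is the entity type from the question, and the destination table name must match your actual database schema:

```csharp
// Hypothetical usage (names are placeholders, not from the original answer):
var rows = new List<MyStuff>();
// ... populate rows with your business logic ...
BulkInsert(rows, "dbo.MyStuffs");
```

This bypasses EF entirely, so none of the inserted entities end up in the change tracker; column mapping relies on property names matching the table's column names.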
0

Source: https://habr.com/ru/post/1500376/

