I need to process about 1 million objects to create facts, and I end up with roughly the same number of facts in total (1 million).
The first problem I ran into was that the bulk insert was slow with Entity Framework, so I used the pattern from Fastest Way of Inserting in Entity Framework (the answer from Slauma). With it I can insert entities pretty fast, around 100K per minute.
The other issue I ran into was not having enough memory to process everything at once, so I "paged" the processing to avoid the out-of-memory exception I get if I build a list with all 1 million facts.
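To make the approach concrete, this is a stripped-down sketch of what I mean by "paging"; the full code with all the details follows below, this only shows the intent:

    // Simplified skeleton of the batching ("paging") loop, details omitted.
    var batchSize = 1000;
    var skip = 0;
    var total = recon.Count();
    while (skip < total)
    {
        // load only one batch of recons into memory
        var toProcess = recon.Skip(skip).Take(batchSize).ToList();
        // ... generate the facts for this batch and save them to the database ...
        skip += batchSize;
    }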
The problem is that the memory keeps growing even with the paging, and I don't understand why. After each batch the memory is not released. I find this strange, because on each iteration of the loop I fetch the recons, generate the facts, and save them to the database. Once the iteration is done, these should be released from memory, but that does not seem to be the case, since memory is not freed after each iteration.
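For what it's worth, this is roughly how the growth per batch could be logged (the LogMemory helper is just an illustration, not part of my actual code). GC.GetTotalMemory(true) forces a full collection first, so whatever it reports is memory that is still reachable after the batch:

    // Hypothetical helper to log memory at the end of each while iteration.
    static void LogMemory(int batch)
    {
        // Force a full collection so we only see memory that is still reachable.
        long managed = GC.GetTotalMemory(forceFullCollection: true);
        long privateBytes = System.Diagnostics.Process.GetCurrentProcess().PrivateMemorySize64;
        Console.WriteLine("batch {0}: managed {1:N0} bytes, private {2:N0} bytes",
            batch, managed, privateBytes);
    }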
Could you tell me if you see something wrong, before I dig any deeper? More specifically, why is memory not freed after each iteration of the while loop?
static void Main(string[] args)
{
    ReceiptsItemCodeAnalysisContext db = new ReceiptsItemCodeAnalysisContext();

    var recon = db.Recons
        .Where(r => r.Transacs.Where(t => t.ItemCodeDetails.Count > 0).Count() > 0)
        .OrderBy(r => r.ReconNum);

    // used for "paging" the processing
    var processed = 0;
    var total = recon.Count();
    var batchSize = 1000; //100000;
    var batch = 1;
    var skip = 0;
    var doBatch = true;

    while (doBatch)
    {
        // list to store facts processed during the batch
        List<ReconFact> facts = new List<ReconFact>();

        // get the Recon items to process in this batch and put them in a list
        List<Recon> toProcess = recon.Skip(skip).Take(batchSize)
            .Include(r => r.Transacs.Select(t => t.ItemCodeDetails))
            .ToList();

        // to process real fast
        Parallel.ForEach(toProcess, r =>
        {
            // processing a recon and adding its facts to the list
            var thisReconFacts = ReconFactGenerator.Generate(r);
            thisReconFacts.ForEach(f => facts.Add(f));
            Console.WriteLine(processed += 1);
        });

        // saving the facts using the pattern provided by Slauma
        using (TransactionScope scope = new TransactionScope(TransactionScopeOption.Required, new System.TimeSpan(0, 15, 0)))
        {
            ReceiptsItemCodeAnalysisContext context = null;
            try
            {
                context = new ReceiptsItemCodeAnalysisContext();
                context.Configuration.AutoDetectChangesEnabled = false;

                int count = 0;
                foreach (var fact in facts.Where(f => f != null))
                {
                    count++;
                    Console.WriteLine(count);
                    context = ContextHelper.AddToContext(context, fact, count, 250, true);
                    //context.AddToContext(context, fact, count, 250, true);
                }
                context.SaveChanges();
            }
            finally
            {
                if (context != null)
                    context.Dispose();
            }
            scope.Complete();
        }

        Console.WriteLine("batch {0} finished, continuing", batch);

        // continuing with the next batch
        batch++;
        skip = batchSize * (batch - 1);
        doBatch = skip < total;

        // AFTER THIS facts AND toProcess SHOULD BE RESET
        // BUT IT LOOKS LIKE THEY ARE NOT OR AT LEAST SOMETHING
        // IS GROWING IN MEMORY
    }
    Console.WriteLine("Processing is done, {0} recons processed", processed);
}
The method provided by Slauma to optimize bulk inserts with Entity Framework:
class ContextHelper
{
    public static ReceiptsItemCodeAnalysisContext AddToContext(
        ReceiptsItemCodeAnalysisContext context,
        ReconFact entity, int count, int commitCount, bool recreateContext)
    {
        context.Set<ReconFact>().Add(entity);

        if (count % commitCount == 0)
        {
            context.SaveChanges();
            if (recreateContext)
            {
                context.Dispose();
                context = new ReceiptsItemCodeAnalysisContext();
                context.Configuration.AutoDetectChangesEnabled = false;
            }
        }
        return context;
    }
}