How to speed up LINQ insertion with SQL CE?

History

I have a list of "records" (3,500 of them) that I save to XML and compress on exiting the program. Since:

  • the number of entries keeps growing,
  • only about 50 records need to be updated on exit,
  • saving takes about 3 seconds,

I needed another solution - an embedded database. I chose SQL CE because it works with VS without any problems and its license suits me (I compared it against Firebird, SQLite, EffiProz, db4o and BerkeleyDB).

Data

Record structure: 11 fields, of which 2 form the primary key (nvarchar + byte). The remaining fields are bytes, dates, doubles and ints.

I do not use any relationships, joins, indexes (apart from the primary key), triggers, views, etc. This is essentially a flat dictionary of key + value pairs. I modify some of them and then have to update them in the database. From time to time I add new "records", and I need to store (insert) them. That's all.
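For illustration, the structure described above could be mapped in LINQ to SQL roughly like this. This is a sketch, not the actual entity: the column names (Text, Direction, Status) are assumptions based on the code shown later in the question, and only a few of the 11 columns appear.

```csharp
using System.Data.Linq.Mapping;

// Hypothetical sketch of the entity mapping for the table described above.
[Table(Name = "Progress")]
public class Progress
{
    // Composite primary key: nvarchar + byte.
    [Column(IsPrimaryKey = true)]
    public string Text { get; set; }

    [Column(IsPrimaryKey = true)]
    public byte Direction { get; set; }

    [Column]
    public byte Status { get; set; }

    // ... plus the remaining byte/date/double/int columns
}
```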

LINQ Approach

I have an empty database (file), so I do 3,500 inserts in a loop (one after another). I don't even check whether a record already exists, because the database is empty.

Execution time? 4 minutes, 52 seconds. I nearly fainted (note: XML + compression = 3 seconds).

SQL CE Raw Approach

I did a bit of googling, and despite claims such as the one here: LINQ to SQL (CE) speed compared to SqlCe, saying that SQL CE itself is to blame, I gave it a try.

The same loop, but this time the inserts are performed with SqlCeResultSet (in DirectTable mode, see Bulk Insertion in SQL Server CE) and SqlCeUpdatableRecord.

The result? Are you sitting down? Well... 0.3 seconds (yes, a fraction of a second!).

Problem

LINQ is very readable, while the raw operations are exactly the opposite. I could write a mapper that translates all the column indices into meaningful names, but that feels like reinventing the wheel - after all, this has already been done in... LINQ.

So maybe there is a way to tell LINQ to speed things up? QUESTION: how do I do it?

The code

LINQ

    foreach (var entry in dict.Entries.Where(it => it.AlteredByLearning))
    {
        var record = new PrimLibrary.Database.Progress();
        record.Text = entry.Text;
        record.Direction = (byte)entry.dir;
        db.Progress.InsertOnSubmit(record);
        record.Status = (byte)entry.LastLearningInfo.status.Value;
        // ... and so on

        db.SubmitChanges();
    }

Raw operations

    SqlCeCommand cmd = conn.CreateCommand();
    cmd.CommandText = "Progress";
    cmd.CommandType = System.Data.CommandType.TableDirect;
    SqlCeResultSet rs = cmd.ExecuteResultSet(ResultSetOptions.Updatable);

    foreach (var entry in dict.Entries.Where(it => it.AlteredByLearning))
    {
        SqlCeUpdatableRecord record = rs.CreateRecord();
        int col = 0;
        record.SetString(col++, entry.Text);
        record.SetByte(col++, (byte)entry.dir);
        record.SetByte(col++, (byte)entry.LastLearningInfo.status.Value);
        // ... and so on
        rs.Insert(record);
    }
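As an aside, the ordinal-based SetXxx calls above can be made more readable without giving up SqlCeResultSet's speed by looking up the ordinals once, by name, outside the loop. A minimal sketch (the column names "Text", "Direction" and "Status" are assumptions; GetOrdinal is inherited from SqlCeDataReader):

```csharp
// Resolve column ordinals once by name, then reuse them in the hot loop.
int colText = rs.GetOrdinal("Text");
int colDir = rs.GetOrdinal("Direction");
int colStatus = rs.GetOrdinal("Status");

foreach (var entry in dict.Entries.Where(it => it.AlteredByLearning))
{
    SqlCeUpdatableRecord record = rs.CreateRecord();
    record.SetString(colText, entry.Text);
    record.SetByte(colDir, (byte)entry.dir);
    record.SetByte(colStatus, (byte)entry.LastLearningInfo.status.Value);
    // ... and so on
    rs.Insert(record);
}
```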
2 answers

Do more work per transaction.

Commits are usually very expensive operations for a typical relational database, because the database must wait for disk flushes to ensure the data is not lost (ACID guarantees and all that). A regular hard drive without a special controller is very slow at this: the data must be flushed all the way to the physical platter, so perhaps only 30-60 commits per second are possible with this I/O synchronization!

See the SQLite FAQ: INSERT is really slow - I can only do few dozen INSERTs per second. That is a different database engine, but the problem is exactly the same.

Normally, LINQ2SQL creates a new implicit transaction inside SubmitChanges. To avoid this implicit transaction/commit (commits are expensive operations), either:

  • call SubmitChanges less often (say, once, outside the loop); or

  • set up an explicit transaction scope (see TransactionScope).
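A sketch of the first option, applied to the question's loop (entity and property names are taken from the question's code; only a few of the property assignments are shown):

```csharp
foreach (var entry in dict.Entries.Where(it => it.AlteredByLearning))
{
    var record = new PrimLibrary.Database.Progress();
    record.Text = entry.Text;
    record.Direction = (byte)entry.dir;
    record.Status = (byte)entry.LastLearningInfo.status.Value;
    // ... and so on
    db.Progress.InsertOnSubmit(record);
}

// One SubmitChanges -> one implicit transaction -> one commit/disk sync,
// instead of 3,500 of them.
db.SubmitChanges();
```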

One example of using a larger transaction context:

    using (var ts = new TransactionScope())
    {
        // LINQ2SQL will automatically enlist in the transaction scope.
        // SubmitChanges will now NOT create a new transaction/commit each time.
        DoImportStuffThatRunsWithinASingleTransaction();

        // Important: make sure to COMMIT the transaction.
        // (The transaction used for SubmitChanges is committed to the DB.)
        // This is when the disk sync actually has to happen,
        // but it happens only once, not 3500 times!
        ts.Complete();
    }

However, the semantics of the single-transaction (or single-SubmitChanges) approach differ from those of the original code, which calls SubmitChanges 3,500 times and thus creates 3,500 separate implicit transactions. In particular, the size of the atomic operations (with respect to the database) is different, and that may not suit every task.

For LINQ2SQL updates, changing the optimistic concurrency model (for example, disabling it, or simply using a timestamp field) can yield small performance improvements. The biggest improvement, however, comes from reducing the number of commits that must be performed.
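For example, both variants can be expressed in the attribute-based mapping (a sketch; the property names here are hypothetical, not from the question):

```csharp
using System.Data.Linq;
using System.Data.Linq.Mapping;

public class ProgressMappingSketch
{
    // UpdateCheck.Never tells LINQ to SQL not to include this column in the
    // WHERE clause it generates for optimistic-concurrency checks on UPDATE.
    [Column(UpdateCheck = UpdateCheck.Never)]
    public double Score { get; set; }

    // Alternatively, add a single row-version column and let it carry the
    // whole concurrency check instead of comparing every column.
    [Column(IsVersion = true)]
    public Binary Version { get; set; }
}
```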

Happy coding.


I'm not sure about this, but it seems the call to db.SubmitChanges() should be moved outside the loop. Maybe that will speed things up?


Source: https://habr.com/ru/post/887058/
