Recommendations for working with files in C#

I am working on an application that creates several hundred CSV files every 15 minutes, and the back end of the application picks these files up and processes them (updating the database with their values). One problem is database locking.

What are some best practices for working with several thousand files so as to avoid locking and process the files efficiently?

Would it be more efficient to create one large file and process it, or to process the files one at a time?

What are some general guidelines?

Edit: The database is not a relational DBMS. It is a NoSQL, object-oriented DBMS that works in memory.

+4
6 answers

So, assuming you have files coming from N machines, and each file is similar in the sense that it generally ends up in the same tables in the database...

I would set up a queue: have all the machines write their files to the queue, and then have a consumer on the other side pull items off the queue and process them into the database. That way you process one file at a time. You could probably even optimize away the file operations entirely by writing the data directly to the queue.
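A minimal in-process sketch of that pattern using BlockingCollection (the paths, queue capacity, and ProcessIntoDatabase body are illustrative; with multiple machines the queue would be something out-of-process such as MSMQ):

    using System;
    using System.Collections.Concurrent;
    using System.Threading.Tasks;

    class CsvQueueProcessor
    {
        // Bounded queue: producers block rather than flooding memory.
        static readonly BlockingCollection<string> Queue =
            new BlockingCollection<string>(boundedCapacity: 1000);

        static void Main()
        {
            // A single consumer means the database sees one update
            // stream, so updates never contend with each other for locks.
            var consumer = Task.Run(() =>
            {
                foreach (var csvPath in Queue.GetConsumingEnumerable())
                    ProcessIntoDatabase(csvPath);
            });

            // Producers (one per machine/source) just enqueue paths.
            Queue.Add(@"C:\feeds\machine1\batch-001.csv");
            Queue.Add(@"C:\feeds\machine2\batch-001.csv");

            Queue.CompleteAdding();   // signal that no more work is coming
            consumer.Wait();
        }

        static void ProcessIntoDatabase(string csvPath)
        {
            // Placeholder: parse the CSV and apply its updates here.
            Console.WriteLine($"Processing {csvPath}");
        }
    }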

+3

If you are having locking problems, the tables being updated probably don't have the appropriate indexes. Get the SQL code that performs the update and look at its execution plan; if you are using MSSQL, you can do this in SSMS. If the UPDATE triggers a table scan, you need to add an index that helps isolate the records being updated (unless you really are updating every record in the table, in which case a scan is hard to avoid).
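To illustrate, the goal is for the UPDATE's WHERE clause to hit an index so only the touched rows are locked; a hedged sketch (the Records table, RecordId column, and connection string are hypothetical):

    using System.Data.SqlClient;

    class UpdateExample
    {
        static void UpdateRow(string connectionString, int recordId, decimal value)
        {
            using (var conn = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand(
                // With an index on Records(RecordId), this UPDATE does an
                // index seek instead of a table scan, locking only the
                // rows it actually touches.
                "UPDATE Records SET Value = @value WHERE RecordId = @id", conn))
            {
                cmd.Parameters.AddWithValue("@value", value);
                cmd.Parameters.AddWithValue("@id", recordId);
                conn.Open();
                cmd.ExecuteNonQuery();
            }
        }
    }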

+2

With limited knowledge of your exact scenario ...

Performance-wise, opening and closing a file is probably the most time-expensive operation you perform, so my advice would be: if you can go the single-file route, that will be the most efficient approach.
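In concrete terms, that means opening the file once and appending, rather than paying the open/close cost per batch; a small sketch with an illustrative path:

    using System.Collections.Generic;
    using System.IO;

    class SingleFileWriter
    {
        static void WriteAll(IEnumerable<string> csvLines)
        {
            // One open/close for the whole run, instead of one per file.
            using (var writer = new StreamWriter(@"C:\feeds\current-batch.csv"))
            {
                foreach (var line in csvLines)
                    writer.WriteLine(line);
            }   // flushed and closed exactly once here
        }
    }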

+1

A lock will keep the other files from being processed until the first one is complete.

class ThreadSafe
{
    static readonly object _locker = new object();
    static int _val1, _val2;

    static void Go()
    {
        // Only one thread at a time can hold the lock, so the zero check
        // and the division cannot be interleaved with another thread's write.
        lock (_locker)
        {
            if (_val2 != 0) Console.WriteLine(_val1 / _val2);
            _val2 = 0;
        }
    }
}
0

It sounds like you want either a single-file mechanism, or to drop all the files into a single directory and have a process continually check for the oldest CSV file and run it through your code. Either way, that may be the "cheapest" solution. If you are actually generating files faster than you can process them, I would rethink the overall system architecture rather than reach for a band-aid approach.
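A rough sketch of the polling variant (the inbox path, poll interval, and delete-after-processing policy are all assumptions):

    using System;
    using System.IO;
    using System.Linq;
    using System.Threading;

    class OldestFirstPoller
    {
        static void Main()
        {
            const string inbox = @"C:\feeds\inbox";   // illustrative path

            while (true)
            {
                // Pick the oldest CSV so files are handled in arrival order.
                var oldest = new DirectoryInfo(inbox)
                    .GetFiles("*.csv")
                    .OrderBy(f => f.CreationTimeUtc)
                    .FirstOrDefault();

                if (oldest == null)
                {
                    Thread.Sleep(1000);   // nothing waiting; poll again shortly
                    continue;
                }

                Process(oldest.FullName);
                File.Delete(oldest.FullName);   // done; don't re-run it
            }
        }

        static void Process(string path)
        {
            // Placeholder: parse the CSV and update the database here.
            Console.WriteLine($"Processing {path}");
        }
    }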

0

You could try to handle concurrency at the level of your application code and have the DBMS avoid locking objects during updates.

(In a relational DBMS, you would set the transaction isolation level to the lowest available, i.e. read uncommitted.)
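For reference, in a relational DBMS such as SQL Server that would look roughly like this (the connection string and table are hypothetical):

    using System.Data;
    using System.Data.SqlClient;

    class LowIsolationExample
    {
        static void UpdateWithMinimalLocking(string connectionString)
        {
            using (var conn = new SqlConnection(connectionString))
            {
                conn.Open();

                // Lowest isolation level: reads take no shared locks,
                // at the cost of possibly seeing uncommitted (dirty) data.
                using (var tx = conn.BeginTransaction(IsolationLevel.ReadUncommitted))
                using (var cmd = new SqlCommand(
                    "UPDATE Records SET Value = Value + 1", conn, tx))
                {
                    cmd.ExecuteNonQuery();
                    tx.Commit();
                }
            }
        }
    }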

If you can do that, another option is to truncate all the old objects and insert the new values in their place.

0

Source: https://habr.com/ru/post/1308674/

