Multiple threads filling their results into one DataTable in C#

I am just starting to learn the concept of threading, and I'm stuck on this one problem; it's driving me crazy ...

What I actually need to do:

I have about 300 text files in a local directory that need to be parsed for specific values ... After I find these "values" in each text file, I need to store them in a database. So I followed the plain approach: access each text file in the directory, parse it, and add the resulting values as a row to a local DataTable. When all the files are parsed and the DataTable holds 300 rows, I do a SqlBulkCopy of the DataTable into my database. This approach works fine, except that the code takes about 10 minutes to run!

What I am trying to do now:

Create a new thread for each file and keep the number of threads below 4 at any given time ... each thread would then parse its file and return a row to add to the local DataTable.

Where I am stuck: I don't understand how to update this single DataTable when the rows come from multiple threads ...

That's quite a long explanation .. I hope someone here can suggest a good idea for this ...

Thanks, Nidhi

+3
5 answers

, . ( , ), datatable 25% .

, - , :

lock(YourTable.Rows.SyncRoot){
  // add rows to table
}

, , , @David B.
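To make the lock above concrete, here is a minimal sketch in which several threads add rows to one DataTable and every write goes through `Rows.SyncRoot`. The class name `LockedTableDemo`, the single `Value` column, and the generated strings are placeholders for illustration only; real code would put the parsed values there.

```csharp
using System;
using System.Data;
using System.Threading;

public static class LockedTableDemo
{
    // Starts threadCount threads; each adds rowsPerThread rows to the same table.
    // DataTable is not safe for concurrent writes, so every Add is inside the lock.
    public static DataTable Fill(int threadCount, int rowsPerThread)
    {
        var table = new DataTable("Results");
        table.Columns.Add("Value", typeof(string)); // hypothetical schema

        var threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++)
        {
            int id = t; // copy the loop variable so each closure sees its own value
            threads[t] = new Thread(() =>
            {
                for (int i = 0; i < rowsPerThread; i++)
                {
                    // Stand-in for the slow parsing work; it runs outside the lock.
                    string value = "thread " + id + ", row " + i;

                    lock (table.Rows.SyncRoot)
                    {
                        table.Rows.Add(value);
                    }
                }
            });
            threads[t].Start();
        }

        foreach (Thread thread in threads)
        {
            thread.Join();
        }
        return table;
    }
}
```

The point of the layout is that only the `Rows.Add` call is serialized; the expensive part (parsing) stays outside the lock, so the threads rarely wait on each other.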

+4

, , .

, . , DataTable , DataTable. DataTable ( ), - .

, , , . - : .

, , , .

:

  • , .
  • ( ), , , .
  • .

... , - , .

, . read/process , . , . .

, :

using System;
using System.Collections.Generic;
using System.Threading;

class Program
{
    // GetFileNamesToProcess, ProcessFileAndGetResult and InsertIntoDatabase
    // are left for you to implement.
    // Everything is static so that it can be used from the static Main.
    static readonly Queue<string> _filesToProcess = new Queue<string>();
    static readonly Queue<string> _results = new Queue<string>();
    static readonly Thread _fileProcessingThread = new Thread( ProcessFiles );
    static readonly Thread _databaseUpdatingThread = new Thread( UpdateDatabase );
    // volatile: written by one thread, read by another
    static volatile bool _finished = false;

    static void Main()
    {
        foreach( string fileName in GetFileNamesToProcess() )
        {
            _filesToProcess.Enqueue( fileName );
        }

        _fileProcessingThread.Start();
        _databaseUpdatingThread.Start();

        // if we want to wait until they're both finished
        _fileProcessingThread.Join();
        _databaseUpdatingThread.Join();

        Console.WriteLine( "Done" );
    }

    static void ProcessFiles()
    {
        bool filesLeft;

        lock( _filesToProcess ){ filesLeft = _filesToProcess.Count > 0; }

        while( filesLeft )
        {
            string fileToProcess;
            lock( _filesToProcess ){ fileToProcess = _filesToProcess.Dequeue(); }

            string resultAsString = ProcessFileAndGetResult( fileToProcess );

            lock( _results ){ _results.Enqueue( resultAsString ); }

            Thread.Sleep( 1 ); // prevent the CPU from being 100%

            lock( _filesToProcess ){ filesLeft = _filesToProcess.Count > 0; }
        }

        // tell the database thread that no more results will arrive
        _finished = true;
    }

    static void UpdateDatabase()
    {
        bool pendingResults;

        lock( _results ){ pendingResults = _results.Count > 0; }

        // keep going until the parser is done AND the queue is drained
        while( !_finished || pendingResults )
        {
            if( pendingResults )
            {
                string resultsAsString;
                lock( _results ){ resultsAsString = _results.Dequeue(); }

                InsertIntoDatabase( resultsAsString ); // implement this however
            }

            Thread.Sleep( 1 ); // prevents the CPU usage from being 100%

            lock( _results ){ pendingResults = _results.Count > 0; }
        }
    }
}

, "", , , .

, , ( ) Start().

, , . , , . , , , , , Queues.

, .
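For comparison, the same two-stage idea (one thread parsing, one thread storing) can be sketched with `BlockingCollection<T>` from `System.Collections.Concurrent`, which replaces the manual locks, the `Sleep(1)` polling and the finished flag. This is a swapped-in technique, not the code from the answer above, and the names `PipelineDemo`, `parse` and `store` are hypothetical stand-ins for your own parsing and database code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class PipelineDemo
{
    // parse: turns a file name into a result string (hypothetical delegate).
    // store: writes one result to the database (hypothetical delegate).
    // Returns the number of results handed to store.
    public static int Run(IEnumerable<string> files,
                          Func<string, string> parse,
                          Action<string> store)
    {
        int stored = 0;

        // Bounded so the parser cannot run arbitrarily far ahead of the writer.
        using (var results = new BlockingCollection<string>(boundedCapacity: 100))
        {
            Task producer = Task.Run(() =>
            {
                try
                {
                    foreach (string file in files)
                    {
                        results.Add(parse(file)); // blocks while the collection is full
                    }
                }
                finally
                {
                    // Signals "no more items", even if parse threw.
                    results.CompleteAdding();
                }
            });

            Task consumer = Task.Run(() =>
            {
                // Blocks while empty; ends once CompleteAdding was called
                // and the collection has been drained.
                foreach (string result in results.GetConsumingEnumerable())
                {
                    store(result);
                    stored++; // only this one task touches the counter
                }
            });

            Task.WaitAll(producer, consumer);
        }

        return stored;
    }
}
```

With a single producer and a single consumer the results arrive in the order they were parsed. If either delegate throws, `Task.WaitAll` rethrows the failure wrapped in an `AggregateException`.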

+6

, ? , .

, , . .

+1

SQLBulkCopy - 300 .

Smart Thread Pool. , 4 . 300 , SQL , .

0

, . #:

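// The lock object must be created; lock(null) throws ArgumentNullException.
private readonly object tableLock = new object();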

/*
Later in code.
*/

private void UpdateDataTable(object data)
{
    lock(tableLock)
    {
          //Add or update table rows
    }
}

As for actually managing the threads and keeping their number in check, just use the ThreadPool class, set the maximum number of threads to your limit, and let its queue take care of the rest. For additional control, you can add some logic that uses an array of WaitHandle objects. That may actually be a good idea, given that you want to queue 300 separate work items.
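One way this could look, as a rough sketch rather than the only way to do it: work items are queued on the ThreadPool, a `SemaphoreSlim` keeps at most `maxConcurrent` of them in flight, and a `CountdownEvent` lets the caller wait for all of them. Those two types are my substitution for the raw WaitHandle array mentioned above, and `ThrottledParser`, `parse` and the single `Value` column are hypothetical names for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Threading;

public static class ThrottledParser
{
    // parse is a hypothetical delegate that turns a file name into one value.
    // It should handle its own errors: an exception escaping a ThreadPool
    // work item terminates the process.
    public static DataTable ParseAll(IList<string> files,
                                     Func<string, string> parse,
                                     int maxConcurrent)
    {
        var table = new DataTable("Results");
        table.Columns.Add("Value", typeof(string)); // hypothetical schema
        var tableLock = new object();

        using (var slots = new SemaphoreSlim(maxConcurrent))
        using (var done = new CountdownEvent(files.Count))
        {
            foreach (string file in files)
            {
                // Blocks the queuing thread while maxConcurrent items are running.
                slots.Wait();

                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try
                    {
                        string value = parse(file); // slow part, outside the lock
                        lock (tableLock)
                        {
                            table.Rows.Add(value);
                        }
                    }
                    finally
                    {
                        slots.Release(); // free the slot first ...
                        done.Signal();   // ... then report completion
                    }
                });
            }

            done.Wait(); // returns once every work item has signalled
        }

        return table;
    }
}
```

Calling `ThrottledParser.ParseAll(fileNames, ParseFile, 4)` would match the "below 4 threads" requirement from the question, with `ParseFile` being whatever method extracts the values from one file. The resulting table can then go to SqlBulkCopy in one call, as in the original single-threaded version.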

0

Source: https://habr.com/ru/post/1710372/

