This is a C# / VSTO program. I am working on a data collection project. The scope is basically "process Excel files sent by various third-party companies." In practice, this means:
- Find the columns containing the data I want using a search method.
- Extract the data from the workbooks.
- Clean the data, do some calculations, etc.
- Output the cleaned data to a new workbook.
The program I wrote works great for small data sets: ~25 workbooks with a total of ~1000 rows of relevant data. I grab 7 columns of data from these workbooks. However, I have one edge case where I sometimes need to run a much larger data set: ~50 workbooks with a total of ~8000 rows of relevant data (and maybe another 2000 rows of duplicate data that I also need to remove).
Currently, I put the list of files through a Parallel.ForEach loop, inside which I open a new Excel.Application() to process each file with multiple ActiveSheets. On the smaller data set, the parallel version runs much faster than going through each file sequentially. But on a large data set, I seem to hit a wall.
I start getting the message "Microsoft Excel is waiting for another application to complete an OLE action", and in the end it just fails. Switching to a sequential foreach allows the program to finish, but it just grinds: from 1-3 minutes for a parallel run on an average-size data set to 20+ minutes for a sequential run on the large data set. If I set ParallelOptions.MaxDegreeOfParallelism to 10, it completes the loop, but still takes 15 minutes. If I set it to 15, it fails. I also really don't like messing with TPL settings if I don't need to. I also tried inserting Thread.Sleep to slow things down manually, but that only made it take longer to fail.
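For reference, the capped loop looks roughly like the sketch below. ProcessWorkbook is a hypothetical stand-in for my real per-file work (open Excel, extract the rows, close), and the BoundedRunner class name and the integer result are placeholders so the snippet runs without Excel installed.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BoundedRunner
{
    // Hypothetical placeholder for the real per-file work
    // (open Excel, extract the relevant rows, close the workbook).
    public static int ProcessWorkbook(string path)
    {
        return path.Length; // dummy result so the sketch is runnable
    }

    // Runs ProcessWorkbook over all files, never using more than
    // maxParallel concurrent workers.
    public static ConcurrentDictionary<string, int> Run(
        IEnumerable<string> files, int maxParallel)
    {
        var results = new ConcurrentDictionary<string, int>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(files, options, file =>
        {
            // Each iteration writes to its own key, so the
            // thread-safe dictionary needs no extra locking.
            results[file] = ProcessWorkbook(file);
        });

        return results;
    }
}
```

Calling BoundedRunner.Run(files, 10) corresponds to the setting that completes for me; 15 is the one that fails.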
At the end of each iteration I close the workbook, quit the application, call ReleaseComObject on the Excel objects, and then call GC.Collect and GC.WaitForPendingFinalizers.
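A minimal sketch of that cleanup sequence, assuming a reference to Microsoft.Office.Interop.Excel. The ExcelCleanup class and the WithWorkbook helper are hypothetical names for illustration, not my actual code, and holding the Workbooks collection in its own variable so it can be released too is an extra assumption on top of what I described.

```csharp
using System;
using System.Runtime.InteropServices;
using Excel = Microsoft.Office.Interop.Excel;

public static class ExcelCleanup
{
    // Hypothetical helper: opens one workbook in its own Excel instance,
    // runs the caller's work, then tears everything down.
    public static void WithWorkbook(string path, Action<Excel.Workbook> work)
    {
        Excel.Application app = null;
        Excel.Workbooks books = null;
        Excel.Workbook book = null;
        try
        {
            app = new Excel.Application();
            books = app.Workbooks;   // kept in a variable so it can be released
            book = books.Open(path);
            work(book);
        }
        finally
        {
            // Close the workbook without saving, then quit the instance.
            // Note: if Close throws, the steps after it are skipped.
            if (book != null) book.Close(false);
            if (app != null) app.Quit();

            // Release every COM reference that was taken above.
            if (book != null) Marshal.ReleaseComObject(book);
            if (books != null) Marshal.ReleaseComObject(books);
            if (app != null) Marshal.ReleaseComObject(app);

            // Force finalization of any remaining runtime callable wrappers.
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }
    }
}
```

The try/finally is there so that the Excel instance is still shut down when the per-file work throws.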
My ideas at the moment:
new Excel.Application() , Excel ( # 1, )- /, ,
:
- , ( ,
Process.Id ?) - - , "" , .
: http://reedcopsey.com/2010/01/26/parallelism-in-net-part-5-partitioning-of-work/, : " , , Partitioner". , / .
!
UPDATE
, , Excel 2010, 2010, 2013 . 2013 , - 4 , , . 2010 , ? 2010 - 64- 64- Office, 2013 - 64- 32- Office. ?