Huge memory allocation when using the EPPlus Excel library

Context

I used EPPLUS as my Excel report automation tool, using C # as the library client language.

Problem:

After you try to write a really big report (SQL Query answer), with pivot tables, charts, etc., I get an Out of Memory Exception .

Problem solving

To fix the problems, I decided to open an existing report with 138 MB and use the GC object to try to look at what is happening with my memory, and here are the results.

 ExcelPackage pkg = new ExcelPackage (new FileInfo (@"PATH TO THE REPORT.xlsx")); ExcelWorkbook wb = pkg.Workbook; 

Garbage collection results before and after the second line of code.

The ammount of memory in use is too damn high

So, I have no idea what to do from now on. All I do is open a report that consumes almost 10 (9.98%) times the size of the report from memory.

~ 138 MB of the excel file takes 1,370.817.264 bytes of RAM.

Update one:

There is a fairly recent one that has a change log on it:

 New Cell store * Less memory consumtion * Insert columns (not on the range level) * Faster row inserts 

After updating Nuget, I still have the same exception that is thrown after the first line, instead of being raised in the second line.

+6
source share
2 answers

Modern Excel files i.e. Xlsx files with zip compression and often achieve compression of up to 10%. I just uncompressed 1.6 MB file generated using a similar tool and found it extracted up to 18.8 MB of data.

You have a 0.138 GB file that uses 1.370 GB of memory, which is almost exactly 10%. An uncompressed view in memory is what eats your memory.

If you're interested, you can use a tool like 7-Zip to extract the Xlsx files, or you can rename the file to the end of the .zip and view it on Windows.

+4
source

Since I also came across this and did not find a real solution, I had to come up with this solution myself. It comes as a new library: https://github.com/danielgindi/SpreadsheetStreams.net

This is based on taking my very old code that supports csv and xml, interface refactoring, adding xlsx support and publishing as a separate library.

This is not a replacement for EPPlus or other spreadsheet libraries, it is just a streaming report generation. Not all Excel features are there.

0
source

Source: https://habr.com/ru/post/971726/


All Articles