EPPlus Lack of Memory Large Data Set Problem

Removing a system from memory. I see that Stream memory is only cleared when saved. We have 1.5 - 2GB Datasets.

I am using EPPlus Version 3.1.3.0

In the code we do the following.

We loop through

--> Create a Package --> each table in the datareader --> Add WorkSheet to the Package --> Dispose Each table. --> Save the Package. 

Each Datatable has a size of 300Mg to 15 tables from the system.

This causes a problem, I wrote it down in detail @ https://epplus.codeplex.com/workitem/15085

I still want to be able to use EPPlus with a very nice API. but is there a better way to free the worksheet as soon as we add it to the package.

Thanks for the help.

+6
source share
4 answers

Unfortunately, this is apparently the main limitation of EPPlus - you can find others who posted about this on their Codeplex page. I faced a similar problem when exporting a large dataset - single tables with 115+ columns wide and 60K + rows. Usually about 30-35 thousand. Lines - this is when he runs out of memory. What happens is each cell created is its own object, which is suitable for a small data set, but in my case it will be 115x60K = ~ 7 million. Since each cell is an object with content (mostly rows), its memory capacity is added up quickly.

At some point in the future, I planned to create XML files manually using Linq2Xml. Xlsx is just a zip file renamed to XML files that make up the contents of the workbook and worksheets. This way you can create an empty xlsx using EPP, save it, open it as zip, pull out sheet1.xml and add the contents of the data using string manipulations. You will also have to work with the sharedstring.xml file that Excel uses to reduce the file size. Perhaps there are other xml files that need updating, as well as with keys or a name.

If you rename any xlxs to a .zip extension, you can see it.

Example sheet1.xml:

Simple Excel File Example

 <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" mc:Ignorable="x14ac" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac"> <dimension ref="A1:C2"/> <sheetViews> <sheetView tabSelected="1" workbookViewId="0"> <selection activeCell="C5" sqref="C5"/> </sheetView> </sheetViews> <sheetFormatPr defaultRowHeight="15" x14ac:dyDescent="0.25"/> <sheetData> <row r="1" spans="1:3" x14ac:dyDescent="0.25"> <cr="A1" t="s"> <v>0</v> </c><cr="B1" t="s"> <v>1</v> </c><cr="C1" t="s"> <v>0</v> </c> </row> <row r="2" spans="1:3" x14ac:dyDescent="0.25"> <cr="A2" t="s"> <v>1</v> </c><cr="B2" t="s"> <v>0</v> </c><cr="C2" t="s"> <v>1</v> </c> </row> </sheetData> <pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3"/> </worksheet> 

Example sharedstrings.xml:

 <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="6" uniqueCount="2"> <si> <t>AA</t> </si> <si> <t>BB</t> </si> </sst> 

You can see how I manipulated xml in another post:

Create Pivot Table Filters with EPPLUS

Sorry, I could not give you a better answer, but I hope this helps.

+7
source

I had this problem, but I fixed it by switching the " Platform target " option, from x86 to x64 or " Any CPU ". (right-click on the project, then select "Properties", then the "Assembly" tab, then in the "Target platform" select "x64")

The problem is that for the x86 platform you can only use about 1.8 GB of RAM. For the x64 platform, you do not have this limitation.

+7
source

@Ernie correctly refers to some limitations of the current version of EPPlus. They recognized this and worked to fix it. This gives you one of two possible options for working:

1) Go to the beta version of EPPlus 4.0, where they fixed this problem, as well as some other things (although you will use the beta version).

2) The ExcelPackage and ExcelWorksheet implement IDisposable , so you can start to get better performance if you used them in the using() statement.

0
source

Please note if you pass streams to ExcelPackage. In my case, I had a windows service downloading packages using memystream. Now the service has worked several times with OutOfMemory exception.

Reason: ExcelPackage utility does not kill the stream!

Decision:

 using (MemoryStream ms = new MemoryStream(Convert.FromBase64String(excelSheetBase64))) using (ExcelPackage excelPackage = new ExcelPackage(ms)) { // Your code } 
0
source

Source: https://habr.com/ru/post/977848/


All Articles