Copy part of a byte[] array to PdfReader

This is a continuation of the ongoing struggle to reduce the memory load mentioned in How to replenish a byte array using SqlDataReader?

So, I have a byte array with a fixed size; for this example, say new byte[400000]. Into this array I place PDFs of different sizes (each smaller than 400,000 bytes).

Pseudo code:

    public void Run()
    {
        byte[] fileRetrievedFromDatabase = new byte[400000];
        foreach (var document in documentArray)
        {
            // Refill the file with data from the database
            var currentDocumentSize = PopulateFileWithPDFDataFromDatabase(fileRetrievedFromDatabase);
            var reader = new iTextSharp.text.pdf.PdfReader(fileRetrievedFromDatabase.Take((int)currentDocumentSize).ToArray());
            pageCount = reader.NumberOfPages;
            // DO ADDITIONAL WORK
        }
    }

    private int PopulateFileWithPDFDataFromDatabase(byte[] fileRetrievedFromDatabase)
    {
        // DataAccessCode Goes here
        int documentSize = 0;
        int bufferSize = 100; // Size of the BLOB buffer.
        byte[] outbyte = new byte[bufferSize]; // The BLOB byte[] buffer to be filled by GetBytes.
        myReader = logoCMD.ExecuteReader(CommandBehavior.SequentialAccess);
        Array.Clear(fileRetrievedFromDatabase, 0, fileRetrievedFromDatabase.Length);
        if (myReader == null)
        {
            return;
        }
        while (myReader.Read())
        {
            documentSize = myReader.GetBytes(0, 0, null, 0, 0);
            // Reset the starting byte for the new BLOB.
            startIndex = 0;
            // Read the bytes into outbyte[] and retain the number of bytes returned.
            retval = myReader.GetBytes(0, startIndex, outbyte, 0, bufferSize);
            // Continue reading and writing while there are bytes beyond the size of the buffer.
            while (retval == bufferSize)
            {
                Array.Copy(outbyte, 0, fileRetrievedFromDatabase, startIndex, retval);
                // Reposition the start index to the end of the last buffer and fill the buffer.
                startIndex += retval;
                retval = myReader.GetBytes(0, startIndex, outbyte, 0, bufferSize);
            }
        }
        return documentSize;
    }

The problem with the above code is that I keep getting the error "Rebuild trailer not found. Original error: PDF startxref not found" when trying to use the PdfReader. I believe this is because the byte array is too long and ends in trailing zeros. But I reuse a single byte array precisely so that I do not constantly create new objects on the LOH (Large Object Heap), so I need to do it this way.

So, how do I get only the part of the array that I need and send it to PDFReader?

Update

So, I looked at the source and realized that I had left in some variables from my actual code, which was confusing. I basically reuse the fileRetrievedFromDatabase object in each iteration of the loop. Since the array is passed by reference, it is cleared (set to all zeros) and then populated in the PopulateFileWithPDFDataFromDatabase method. This object is then used to create a new PDF.

If I didn't do this, a new large byte array would be created on each iteration, and the Large Object Heap would fill up and eventually cause an OutOfMemory exception.

+4
2 answers

Apparently, the way the while loop is currently structured, it does not copy the data from the last read, which is shorter than the buffer. You need to add this after the inner while loop:

    if (outbyte != null && outbyte.Length > 0 && retval > 0)
    {
        Array.Copy(outbyte, 0, currentDocument.Data, startIndex, retval);
    }

Now it works, although I will definitely need to refactor.
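One possible refactoring, as a minimal sketch: loop while the read returns more than zero bytes instead of while it returns a full buffer, so the final partial chunk is copied inside the loop. The `BlobCopy` class, the `FillBuffer` name, and the use of a `Stream` as a stand-in for the `SqlDataReader` column are assumptions made only to keep the example self-contained.

```csharp
using System;
using System.IO;

public static class BlobCopy
{
    // Copies everything from `source` into the reusable `target` buffer in
    // fixed-size chunks and returns the number of bytes written.
    // `source` is a hypothetical stand-in for the SqlDataReader column.
    public static int FillBuffer(Stream source, byte[] target, int chunkSize = 100)
    {
        byte[] chunk = new byte[chunkSize];
        int written = 0;
        int read;

        // Looping on "read > 0" rather than "read == chunkSize" means the
        // last, shorter chunk is copied as well.
        while ((read = source.Read(chunk, 0, chunk.Length)) > 0)
        {
            if (written + read > target.Length)
            {
                throw new InvalidOperationException("Document is larger than the reusable buffer.");
            }

            Array.Copy(chunk, 0, target, written, read);
            written += read;
        }

        return written;
    }
}
```

The return value is the number of valid bytes in `target`; everything past that index is left over from clearing or from a previous document and should not be handed to the PDF reader.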

+1

You have at least two options:

  • Treat your buffer like a circular buffer with two indices for the start and end positions. You need the index of the last byte written into outbyte, and you need to stop reading when you reach that index.
  • Just read the same number of bytes as there are in your data array, so as not to read the "unknown" parts of the buffer that do not belong to the same file.

In other words, instead of bufferSize, use data.Length as the last parameter.

    // Read the bytes into outbyte[] and retain the number of bytes returned.
    retval = myReader.GetBytes(0, startIndex, outbyte, 0, data.Length);

If the data length is 10 and your outbyte buffer holds 15 bytes, you should only read data.Length bytes, not bufferSize.
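The idea of never requesting more bytes than the document still has can be sketched as follows. This is only an illustration under stated assumptions: the `BoundedRead` class, the `GetBytes` delegate (a stand-in shaped like `SqlDataReader.GetBytes` for a single column), and the `FromArray` helper are hypothetical names introduced here so the example runs without a database.

```csharp
using System;

public static class BoundedRead
{
    // Hypothetical stand-in for SqlDataReader.GetBytes on one column:
    // (fieldOffset, buffer, bufferOffset, length) -> number of bytes copied.
    public delegate long GetBytes(long fieldOffset, byte[] buffer, int bufferOffset, int length);

    // Reads `totalLength` bytes straight into `target`, asking for at most
    // `chunkSize` bytes per call and never for more than remain.
    public static int ReadInto(GetBytes getBytes, long totalLength, byte[] target, int chunkSize = 100)
    {
        if (totalLength > target.Length)
        {
            throw new InvalidOperationException("Document is larger than the reusable buffer.");
        }

        int offset = 0;
        while (offset < totalLength)
        {
            int toRead = (int)Math.Min(chunkSize, totalLength - offset);
            int read = (int)getBytes(offset, target, offset, toRead);
            if (read == 0)
            {
                break; // the source ended earlier than announced
            }
            offset += read;
        }

        return offset;
    }

    // Builds a GetBytes stand-in backed by an in-memory array (demo only).
    public static GetBytes FromArray(byte[] blob)
    {
        return (fieldOffset, buffer, bufferOffset, length) =>
        {
            int available = (int)Math.Max(0, Math.Min(length, blob.Length - fieldOffset));
            Array.Copy(blob, (int)fieldOffset, buffer, bufferOffset, available);
            return available;
        };
    }
}
```

Because each request is capped at the remaining length, the intermediate outbyte buffer is not needed in this variant; the bytes land directly in the reusable array.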

However, I still don't see how you reuse the "outbyte" buffer, if that is what you are doing ... I just can't follow it from what you provided in your answer. Maybe you can clarify what exactly is being reused.
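For reference, the reuse described in the question (one large array, refilled for every document) can be combined with a view over only the valid bytes, so that no second array is allocated per document. The sketch below assumes the consumer accepts a `Stream`; whether a given PDF library then copies the stream internally is something to verify against that library. `BufferSlice` and `AsStream` are hypothetical names.

```csharp
using System;
using System.IO;

public static class BufferSlice
{
    // Wraps the first `length` bytes of a reusable buffer in a read-only
    // stream. No bytes are copied; the stream is a view over `buffer`,
    // so trailing zeros beyond `length` are not visible to the consumer.
    public static MemoryStream AsStream(byte[] buffer, int length)
    {
        return new MemoryStream(buffer, 0, length, writable: false);
    }
}
```

Compared with `Take(n).ToArray()`, this avoids creating a second document-sized array on every iteration, which is the allocation the question is trying to eliminate.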

+1

Source: https://habr.com/ru/post/1394475/

