Can I use .NET DeflateStream to create a PDF?

I am experimenting with creating PDF files from C# code. I studied the PDF specification and was able to produce a working PDF file by building the data as strings and encoding them into byte arrays with UTF-8.

The problem comes when I try to use DeflateStream for PDF stream objects. It just doesn't work.

Here is the text version of the PDF object in question (each line ends with \r\n, which is not visible here):

    5 0 obj
    <</Length 45>>
    stream
    BT 70 50 TD /F1 12 Tf (Hello, world!) Tj ET
    endstream
    endobj

When I use the DeflateStream class to compress the string BT 70 50 TD /F1 12 Tf (Hello, world!) Tj ET, the resulting PDF file does not work. I noticed that many other libraries, such as iTextSharp, ship their own Deflate implementation.

Is there a reason Microsoft's DeflateStream class doesn't work here? Am I using it incorrectly, is it implemented incorrectly, or what?


I know that PDF files are binary (not text), but as long as I don't encrypt anything, the whole file can be viewed as text. Here is the entire PDF file for reference (in plain text; again, each line ends with \r\n, which is not visible here):

    %PDF-1.7
    1 0 obj
    <</Type /Catalog /Pages 2 0 R>>
    endobj
    2 0 obj
    <</Type /Pages /MediaBox [ 0 0 200 200 ] /Count 1 /Kids [ 3 0 R ]>>
    endobj
    3 0 obj
    <</Type /Page /Parent 2 0 R /Resources <</Font <</F1 4 0 R>>>> /Contents 5 0 R>>
    endobj
    4 0 obj
    <</Type /Font /Subtype /Type1 /BaseFont /Times-Roman>>
    endobj
    5 0 obj
    <</Length 45>>
    stream
    BT 70 50 TD /F1 12 Tf (Hello, world!) Tj ET
    endstream
    endobj
    xref
    0 6
    0000000000 65535 f
    0000000017 00000 n
    0000000067 00000 n
    0000000153 00000 n
    0000000252 00000 n
    0000000325 00000 n
    trailer
    <</Size 6/Root 1 0 R>>
    startxref
    422
    %%EOF
1 answer

Is there a reason Microsoft's DeflateStream class doesn't work here? Am I using it incorrectly, is it implemented incorrectly, or what?

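DeflateStream implements raw DEFLATE (RFC 1951), whereas PDF expects its compressed streams in the zlib format (RFC 1950). The zlib format wraps the same DEFLATE data in a two-byte header and a trailing Adler-32 checksum, so the output of DeflateStream alone is not what a PDF reader expects. This is described in detail in a related Microsoft Connect bug report.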

A simple workaround is to use a third-party compression library such as DotNetZip, which supports the zlib format. The Connect report also says that skipping the first two bytes (the zlib header) works in most cases. That trick applies when reading zlib data with DeflateStream; when writing, the header and checksum have to be added around the DeflateStream output instead.
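If a third-party library is not an option, the zlib wrapper can be added by hand. Below is a minimal sketch, not a tested production implementation: the class name PdfFlate and its methods are hypothetical helpers made up for illustration, and the header bytes 0x78 0x9C are simply one valid zlib header (deflate, 32K window, default compression level). On .NET 6 and later, the built-in System.IO.Compression.ZLibStream class writes this format directly and is probably the simpler choice.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper, not part of .NET and not from the original answer.
public static class PdfFlate
{
    // Wraps the raw DEFLATE output of DeflateStream in a zlib (RFC 1950) container:
    // 2-byte header, compressed data, 4-byte Adler-32 checksum.
    public static byte[] ZlibCompress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // zlib header: CMF = 0x78 (deflate, 32K window), FLG = 0x9C.
            output.WriteByte(0x78);
            output.WriteByte(0x9C);

            // leaveOpen = true keeps the MemoryStream usable after the
            // DeflateStream is disposed; disposing it writes the final block.
            using (var deflate = new DeflateStream(output, CompressionMode.Compress, true))
            {
                deflate.Write(data, 0, data.Length);
            }

            // Trailer: Adler-32 of the uncompressed data, most significant byte first.
            uint adler = Adler32(data);
            output.WriteByte((byte)(adler >> 24));
            output.WriteByte((byte)(adler >> 16));
            output.WriteByte((byte)(adler >> 8));
            output.WriteByte((byte)adler);

            return output.ToArray();
        }
    }

    // Adler-32 checksum as defined in RFC 1950.
    public static uint Adler32(byte[] data)
    {
        const uint Mod = 65521;
        uint a = 1, b = 0;
        foreach (byte value in data)
        {
            a = (a + value) % Mod;
            b = (b + a) % Mod;
        }
        return (b << 16) | a;
    }
}
```

To use the result in the PDF from the question, the stream dictionary also has to declare the filter and the compressed size, for example <</Length N /Filter /FlateDecode>> where N is the length of the byte array returned by ZlibCompress, and the bytes must be written to the file as-is rather than passed through a text encoding.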
