Debugging a PDF that triggers an error in Adobe Reader

I create PDF files using the PDFClown Java library.

Sometimes when I open these files using Adobe Acrobat Reader, I get a known error message:

"There is an error on this page. Acrobat may not display the page correctly. Please contact the person who created the PDF to fix the problem."

The error appears when reading the attached file (with Adobe) only after scrolling to the 8th page and then scrolling back to page 3. Alternatively, zooming out to 33.3% also triggers the message.

For the record, Foxit Reader reads the file flawlessly, as do other PDF viewers such as browsers.

My questions:

  • What happened to my file? (file attached)

  • How can I find out what's wrong with it? Is there a tool that tells you where the error lies?

Thanks!

+6
3 answers

Well, it wasn't easy.

Due to a bug in PDFClown, the main content stream of my PDF page was corrupted. After its proper end, it contained a copy of part of an earlier instance of itself. This produced a partial text section without the "BT" start operator, which left a single "ET" without a "BT" at the end of the stream.

Once I fixed that, the file worked fine.

Thank you all for your help. It would be much harder for me to debug it without the RUPS tool that @Bruno suggested.

Edit:

The bug was in Buffer.java, in clone() (line 217):

instead of line:

clone.append (data);

should be:

clone.append(data, 0, this.length);

Without this correction, clone() copies the entire data[] array and sets the cloned buffer's length to data[].length. This is very problematic if Buffer.length is less than data[].length. The result in my case was garbage at the end of the stream.
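To make the effect of that one-line change concrete, here is a minimal sketch. SimpleBuffer is a hypothetical stand-in written for illustration only; it is not PDFClown's actual Buffer class, and it assumes nothing more than a backing array (data) whose capacity can exceed the number of valid bytes (length).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical stand-in for a growable byte buffer; not PDFClown's real class. */
public class BufferCloneSketch {
    static class SimpleBuffer {
        byte[] data = new byte[8]; // backing array; capacity may exceed bytes in use
        int length = 0;            // number of valid bytes in data

        void append(byte[] src) {
            append(src, 0, src.length);
        }

        void append(byte[] src, int offset, int count) {
            if (length + count > data.length) {
                data = Arrays.copyOf(data, Math.max(data.length * 2, length + count));
            }
            System.arraycopy(src, offset, data, length, count);
            length += count;
        }

        SimpleBuffer cloneBuggy() {
            SimpleBuffer clone = new SimpleBuffer();
            clone.append(data); // copies the whole backing array, unused slack included
            return clone;
        }

        SimpleBuffer cloneFixed() {
            SimpleBuffer clone = new SimpleBuffer();
            clone.append(data, 0, this.length); // copies only the valid bytes
            return clone;
        }
    }

    public static void main(String[] args) {
        SimpleBuffer buffer = new SimpleBuffer();
        buffer.append("BT ET".getBytes(StandardCharsets.US_ASCII)); // 5 valid bytes, capacity 8
        System.out.println(buffer.cloneBuggy().length); // prints 8
        System.out.println(buffer.cloneFixed().length); // prints 5
    }
}
```

In this sketch the three extra bytes happen to be zeros. With a reused backing array they could just as well be stale earlier content, which would fit the garbage described above.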

+4

The error appears when reading the attached file (with Adobe) only after scrolling to the 8th page and then scrolling back to page 3. Alternatively, zooming out to 33.3% also triggers the message.

For me it is even easier to reproduce: I just open the PDF file and scroll down using the arrow keys. As soon as the top 2 cm of page 3 come into view, the message appears.

What happened to my file?

The contents of pages 1 and 2 look fine, so let's look at the contents of page 3.

My original explanation, that the use of text operations (especially Tf and Tw) outside a text object was the cause, was incorrect, as Stefano Cizzolini pointed out: some text-related operations, namely the text state operations, are indeed allowed outside text objects, cf. Figure 9 from the PDF specification:

(Figure 9: Graphics Objects)

Thus, while less common, text state operations at the page description level are completely fine.

After my wrong attempt to explain the problem, the OP's own answer showed that

The main content stream of the PDF page was corrupted. After its proper end, it contained a copy of part of an earlier instance of itself. This produced a partial text section without the "BT" start operator, which left a single "ET" without a "BT" at the end of the stream.

An ET without a preceding BT would indeed be an error, and would most likely be accompanied by operations at the wrong level ... Checking the content stream of the third page (the page this question focuses on), however, I could not find any unmatched ET. During that check I did find that the content stream contains more than 2000 trailing 0 bytes! Adobe Reader does not seem to cope with these 0 bytes.

The bug the OP found may explain this:

in Buffer.java, in clone() (line 217)

instead of line:

 clone.append(data); 

should be:

 clone.append(data, 0, this.length); 

Without this correction, clone() copies the entire data[] array and sets the cloned Buffer's length to data[].length. This is very problematic if Buffer.length is less than data[].length.

Trailing 0 bytes may be the result of such a buffer copy error.
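A check for such trailing bytes is easy to script. The following is a minimal sketch, assuming the decoded content stream is already available as a byte array (how it is extracted is outside the sketch); the class and method names are made up for illustration.

```java
public class TrailingZeroCount {
    /** Returns how many consecutive 0x00 bytes end the given array. */
    static int countTrailingZeroBytes(byte[] stream) {
        int i = stream.length;
        while (i > 0 && stream[i - 1] == 0) {
            i--;
        }
        return stream.length - i;
    }

    public static void main(String[] args) {
        // Tiny stand-in for a decoded content stream followed by buffer slack.
        byte[] sample = {'B', 'T', ' ', 'E', 'T', 0, 0, 0};
        System.out.println(countTrailingZeroBytes(sample)); // prints 3
    }
}
```

A large count on a page's content stream would point toward a buffer-length problem like the one described above rather than toward a syntax error in the operators themselves.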

In addition, the symptom the OP observed (after its end, the stream contained a copy of an earlier instance of itself) can also result from such a bug. I therefore assume that the OP observed that symptom on a page other than page 3, and that fixing the bug cured all the symptoms.

How can I find out what's wrong with it? Is there a tool that tells you where the error lies?

There are PDF checkers, e.g. the Preflight tool included with Adobe Acrobat, but even that fails on your file.

Thus, you need to extract the page contents (using a PDF object browser such as RUPS) and check them manually, with the PDF specification open on another screen.
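Part of that manual check can be automated once the content stream has been extracted. The sketch below is a deliberately naive scanner, not a full PDF content parser: it assumes the decoded stream bytes are at hand, skips literal strings, names and comments, and does not handle inline image data (BI ... ID ... EI), whose binary payload could produce false findings. All names in it are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class TextObjectBalanceCheck {
    /** Returns human-readable findings; an empty list means every BT has its ET. */
    static List<String> check(byte[] content) {
        List<String> findings = new ArrayList<>();
        boolean inText = false;
        int i = 0;
        while (i < content.length) {
            int c = content[i] & 0xFF;
            if (c == '(') {                       // literal string: skip to matching ')'
                int depth = 1;
                i++;
                while (i < content.length && depth > 0) {
                    int s = content[i] & 0xFF;
                    if (s == '\\') i++;           // skip the escaped character too
                    else if (s == '(') depth++;
                    else if (s == ')') depth--;
                    i++;
                }
            } else if (c == '%') {                // comment: skip to end of line
                while (i < content.length && content[i] != '\n' && content[i] != '\r') i++;
            } else if (c == '/') {                // name object such as /F1: skip it
                i++;
                while (i < content.length && !isDelimiterOrSpace(content[i] & 0xFF)) i++;
            } else if (isDelimiterOrSpace(c)) {
                i++;
            } else {                              // number or operator token
                int start = i;
                while (i < content.length && !isDelimiterOrSpace(content[i] & 0xFF)) i++;
                String token = new String(content, start, i - start, StandardCharsets.ISO_8859_1);
                if (token.equals("BT")) {
                    if (inText) findings.add("nested BT at byte " + start);
                    inText = true;
                } else if (token.equals("ET")) {
                    if (!inText) findings.add("ET without BT at byte " + start);
                    inText = false;
                }
            }
        }
        if (inText) findings.add("BT without ET at end of stream");
        return findings;
    }

    /** PDF white-space and delimiter characters. */
    static boolean isDelimiterOrSpace(int c) {
        return c == 0 || c == '\t' || c == '\n' || c == '\f' || c == '\r' || c == ' '
            || c == '(' || c == ')' || c == '<' || c == '>' || c == '[' || c == ']'
            || c == '{' || c == '}' || c == '/' || c == '%';
    }

    public static void main(String[] args) {
        byte[] sample = "BT /F1 12 Tf (Hi) Tj ET ET".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(check(sample)); // prints [ET without BT at byte 24]
    }
}
```

A scan like this only narrows the search. Anything it flags still has to be read against the specification, and a clean result does not prove the stream is valid.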

+3

A general thread on PDF debugging could also help, since RUPS, pdfstreamdump, etc. are mentioned there: How do you debug PDF files?

0

Source: https://habr.com/ru/post/953859/

