PDFBox 1.8.10: Fill and Sign PDF creates invalid signatures

I programmatically fill out a form (AcroForm) in a PDF document and sign the document afterwards. I start with doc.pdf and create doc_filled.pdf using the setFields.java PDFBox example. Then I sign doc_filled.pdf, creating doc_filled_signed.pdf, using some code based on the signature examples, and open the PDF in Acrobat Reader. The entered field data is visible, and the signature panel tells me

"There are errors in the formatting or the information contained in this signature (invalid array of signature bytes)"

So far I know that:

  • Signing alone (i.e. creating doc_signed.pdf directly from doc.pdf) produces a valid signature.
  • The problem occurs for invisible signatures, visible signatures, and visible signatures added to existing signature fields.
  • The problem even arises if I do not fill out the form but only open and save it, i.e.:

    PDDocument doc = PDDocument.load(new File("doc.pdf"));
    doc.save(new File("doc_filled.pdf"));
    doc.close();

is enough to break the subsequent signing code.

On the other hand, if I take the same doc.pdf and enter the field values manually in Adobe, the signing code produces valid signatures.

What am I doing wrong?

Update:

@mkl asked me to provide the files, so here they are (I do not have enough reputation at present to post all the files as links, sorry for the inconvenience):

The last one was created by filling out and signing the document in a single run, using

  doc.saveIncremental(); 

As I wrote in the comments, some

  setNeedToBeUpdate(true); 

seems to be missing. Regarding @mkl's second comment, I found this SO question: The value of the saved text field does not display correctly in PDF generated using PDFBOX, which also deals with entered text that is not displayed. I gave it a first try by applying

  setBoolean(COSName.getPDFName("NeedAppearances"), true); 

to the dictionaries of the field and the form, which makes the content of the fields show up, but then the signature is not added at the end. So I have to look further.

Update: The story continues here: PDFBox 1.8.10: Fill and sign the document, filling again fails

1 answer

The cause of the OP's original problem, i.e. that a PDF which was loaded with PDFBox (to fill out the form) and then saved cannot be signed successfully with the PDFBox signing code, has already been explained in detail in this answer. In a nutshell:

  • When saving documents regularly, PDFBox does so using a cross-reference table.

    • If the document to be saved regularly was loaded from a PDF with a cross-reference stream, all entries of the cross-reference stream dictionary are saved in the trailer dictionary.
  • When saving documents while applying a signature, PDFBox creates an incremental update; since an incremental update has to use the same kind of cross reference as the original revision, PDFBox in this case tries to use the same technique.

    • To recognize the technique used, PDFBox looks at the Type entry of the dictionary in its document representation into which the trailer or the cross-reference stream dictionary has been loaded: if there is a Type entry with the value XRef (which is specified for cross-reference streams), a stream is assumed, otherwise a table.
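The two rules above can be illustrated with a small toy model. This is not PDFBox code: the class `XrefModel`, its methods, and the use of a plain `Map` as a stand-in for the trailer dictionary are all assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model (NOT PDFBox code) of the two save behaviours described above. */
final class XrefModel {

    enum XrefKind { TABLE, STREAM }

    /**
     * Models a regular save: the file is written with a cross-reference table,
     * but every entry of the loaded dictionary is copied into the new trailer.
     */
    static Map<String, String> regularSave(Map<String, String> loadedDictionary) {
        return new LinkedHashMap<>(loadedDictionary);
    }

    /**
     * Models the check made before an incremental save: a Type entry with
     * the value XRef is taken to mean "the existing file uses a stream".
     */
    static XrefKind kindAssumedForIncrementalSave(Map<String, String> loadedDictionary) {
        return "XRef".equals(loadedDictionary.get("Type")) ? XrefKind.STREAM : XrefKind.TABLE;
    }

    public static void main(String[] args) {
        // Dictionary of a file that originally used a cross-reference stream.
        Map<String, String> original = new LinkedHashMap<>();
        original.put("Type", "XRef");
        original.put("Size", "42");

        // After a regular save the file has a table, yet Type is still present.
        Map<String, String> resaved = regularSave(original);
        System.out.println(kindAssumedForIncrementalSave(resaved)); // prints STREAM
    }
}
```

In this model the re-saved file really uses a table, but the copied Type entry makes the later check answer STREAM, which is the mismatch the answer describes.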

Thus, in the case of the OP's original PDF doc.pdf, which has a cross-reference stream:

  • After loading and filling out the form, the document is saved regularly, i.e. using a cross-reference table, but all entries of the former cross-reference stream dictionary, including Type, are copied to the trailer. ( doc_filled.pdf )

  • After loading this saved PDF (which now has a cross-reference table) for signing, it is saved again using an incremental update. PDFBox assumes (because of the Type entry in the trailer) that the existing file has a cross-reference stream and therefore also uses a cross-reference stream at the end of the incremental update. ( doc_filled_signed.pdf )

  • Thus, as a result, the filled and then signed PDF has two revisions: an inner one with a cross-reference table and an outer one with a cross-reference stream.

  • Since this is not valid, Adobe Reader repairs the file in its internal representation of the document when loading it. The repair changes the bytes of the document. Thus, in the eyes of Adobe Reader the signature is broken.

  • Most other signature validators do not perform such repairs but verify the signature of the document as is. They validate the signature successfully.

The answer mentioned above also offers several ways out:

  • A: After loading the PDF to fill out the form, remove the Type entry from the trailer before saving regularly. If a signature is then applied to this file, PDFBox will assume a cross-reference table (since the misleading Type entry is gone). Thus, the incremental update of the signature will be valid.

  • B: Use an incremental update to save the changes made by filling out the form, either in a separate run or in the same run as the signing. This also results in a valid incremental update.

In general, I would suggest the latter option, because the former option will probably break if the PDFBox saving routines are ever made compatible with each other.

Unfortunately, the latter option requires marking the added and changed objects as updated, including a path to them from the document catalog. If this is not feasible, or at least too cumbersome, the former option may be preferable.
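As a minimal sketch of option A, assuming the PDFBox 1.8.x API (`PDDocument.getDocument()`, `COSDocument.getTrailer()`, `COSDictionary.removeItem`): the class name `SaveWithoutXrefType` and the helper `dropXrefType` are hypothetical names introduced here, and the file names are the ones used in the question.

```java
import java.io.File;

import org.apache.pdfbox.cos.COSDictionary;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.pdmodel.PDDocument;

/** Sketch of option A (hypothetical helper class, not part of PDFBox). */
final class SaveWithoutXrefType {

    /**
     * Removes the Type entry from the given trailer dictionary so that a later
     * incremental save no longer sees Type = XRef and assumes a table.
     */
    static void dropXrefType(COSDictionary trailer) {
        trailer.removeItem(COSName.TYPE);
    }

    public static void main(String[] args) throws Exception {
        PDDocument doc = PDDocument.load(new File("doc.pdf"));
        try {
            // ... fill out the form fields here ...

            // Remove the misleading entry before the regular (non-incremental) save.
            dropXrefType(doc.getDocument().getTrailer());
            doc.save(new File("doc_filled.pdf"));
        } finally {
            doc.close();
        }
    }
}
```

The resulting doc_filled.pdf would then be passed to the signing code unchanged. As noted above, this relies on how the two save routines currently behave, so it may stop being necessary, or stop working, in other PDFBox versions.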


In this case, the OP tried to use the latter option ( doc_filled_and_signed.pdf ):

"At the moment, the content of the text field is visible only when the text field is selected (Acrobat Reader and Preview show the same behaviour). I mark the PDField, all its parents, the AcroForm, the catalog, as well as the page on which it is displayed."

He marked the changed field as updated, but not the associated appearance stream, which PDFBox creates automatically when the form field value is set.

Thus, in the resulting PDF the field has the new value but still the old, empty appearance stream. Only when you click into the field does Adobe Reader create a new appearance based on the value, for editing.

Thus, the OP must also mark the new normal appearance stream as updated (the form field dictionary contains an AP entry referencing a dictionary in which N references the normal appearance stream). Alternatively (if finding the changed or added entries becomes too cumbersome), he may try the other option.
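A minimal sketch of that marking step, assuming the PDFBox 1.8.x COS API (`getDictionaryObject`, `setNeedToBeUpdate`): the class `AppearanceUpdateMarker` and its method are hypothetical names, and the sketch assumes that the AP entry sits directly in the field dictionary (a merged field/widget), which may not hold for every form.

```java
import org.apache.pdfbox.cos.COSBase;
import org.apache.pdfbox.cos.COSDictionary;
import org.apache.pdfbox.cos.COSName;

/** Sketch (hypothetical helper, not part of PDFBox). */
final class AppearanceUpdateMarker {

    private static final COSName AP = COSName.getPDFName("AP");
    private static final COSName N = COSName.getPDFName("N");

    /**
     * Marks the field dictionary, its AP dictionary and the normal appearance
     * (entry N) as updated, so an incremental save writes them out.
     *
     * @return true if a normal appearance was found and marked
     */
    static boolean markNormalAppearance(COSDictionary fieldDict) {
        fieldDict.setNeedToBeUpdate(true);

        // getDictionaryObject resolves indirect references before returning.
        COSBase ap = fieldDict.getDictionaryObject(AP);
        if (!(ap instanceof COSDictionary)) {
            return false;
        }
        COSDictionary apDict = (COSDictionary) ap;
        apDict.setNeedToBeUpdate(true);

        // The normal appearance is usually a stream; COSStream extends COSDictionary.
        COSBase normal = apDict.getDictionaryObject(N);
        if (!(normal instanceof COSDictionary)) {
            return false;
        }
        ((COSDictionary) normal).setNeedToBeUpdate(true);
        return true;
    }
}
```

With a `PDField` at hand this would be called as `AppearanceUpdateMarker.markNormalAppearance(field.getDictionary())` after setting the value and before `saveIncremental`. The path from the document catalog down to the field still has to be marked as well, as the OP already does.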


Source: https://habr.com/ru/post/1232693/

