Problem
As already explained in this answer, the problem here is that, when storing the document with the added image non-incrementally, PDFBox 1.8.9 writes a cross-reference table regardless of whether the source file used a table or a stream. If the source file used a stream, the entries of the cross-reference stream dictionary are copied into the trailer dictionary:
    ...
    0000033667 00000 n
    0000033731 00000 n
    trailer
    <<
    /DecodeParms << /Columns 4 /Predictor 12 >>
    /Filter /FlateDecode
    /ID [<5BD95916CAE5E84E9D964396022CBDCD> <6420B4547602C943AF37DD6C77496BE8>]
    /Info 6 0 R
    /Length 61
    /Root 1 0 R
    /Size 35
    /Type /XRef
    /W [1 2 1]
    /Index [20 22]
    >>
    startxref
    35917
    %%EOF
(Most of these trailer entries here are useless or even misleading, see below.)
Then, when incrementally saving the signature, COSWriter.doWriteXRefInc uses COSDocument.isXRefStream to determine whether the existing document (the one we stored as described above) uses a cross-reference stream. As mentioned above, it does not. Unfortunately, however, COSDocument.isXRefStream in PDFBox 1.8.9 is implemented as
    public boolean isXRefStream()
    {
        if (trailer != null)
        {
            return COSName.XREF.equals(trailer.getItem(COSName.TYPE));
        }
        return false;
    }
Thus, the misleading trailer entry Type makes PDFBox think it has to use a cross-reference stream here.
The result is a document whose initial revision ends with a cross-reference table and strange trailer entries, and whose second revision ends with a cross-reference stream. This is not valid.
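To check which kind of cross-reference section each revision of a given file ends with, a small stand-alone probe can help. The following is only a sketch using the Java standard library, not PDFBox code; the class and method names are hypothetical. It assumes that every occurrence of the keyword startxref in the file belongs to a revision (a binary stream that happens to contain the keyword would confuse it) and that the offsets are valid. It relies on the fact that a cross-reference table starts with the keyword xref, while a cross-reference stream is an indirect object.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of PDFBox. */
final class XrefSectionProbe {

    /** Returns "table" or "stream" for every startxref in the file, in file order. */
    static List<String> kinds(byte[] pdf) {
        // ISO-8859-1 maps each byte to one char, so string indices equal byte offsets.
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        List<String> result = new ArrayList<>();
        int pos = 0;
        while ((pos = text.indexOf("startxref", pos)) >= 0) {
            pos += "startxref".length();
            int i = pos;
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            int start = i;
            while (i < text.length() && Character.isDigit(text.charAt(i))) {
                i++;
            }
            if (i == start) {
                continue; // keyword without an offset: skip it
            }
            int offset = Integer.parseInt(text.substring(start, i));
            // A table begins with the keyword "xref"; anything else is
            // treated as a cross-reference stream object ("12 0 obj ...").
            result.add(text.startsWith("xref", offset) ? "table" : "stream");
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java XrefSectionProbe <file.pdf>");
            return;
        }
        System.out.println(kinds(Files.readAllBytes(Paths.get(args[0]))));
    }
}
```

For a file broken in the way described above, one would expect the probe to report a table for the first revision and a stream for the second.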
A work-around
Fortunately, though, understanding how the problem arises allows for a work-around: removing the troublesome trailer entry, e.g. like this:
    inputBytes = os.toByteArray();
    pdDocument = PDDocument.load(new ByteArrayInputStream(inputBytes));
    pdDocument.getDocument().getTrailer().removeItem(COSName.TYPE);
With this work-around, both revisions in the signed document use cross-reference tables, and the signature is valid.
Beware: if upcoming versions of PDFBox change to save documents loaded from sources with cross-reference streams using xref streams, too, the work-around must be removed again.
I would assume, though, that this will not happen in the 1.x.x versions, and version 2.0.0 will introduce a fundamentally changed API, so the original code will not work out of the box then anyway.
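Why removing a single entry is enough can be seen in a toy model of the check. This sketch uses a plain map as a hypothetical stand-in for the trailer dictionary; it mirrors only the shape of the isXRefStream logic quoted earlier and is not PDFBox code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model; the names here are hypothetical and do not exist in PDFBox. */
final class TrailerTypeModel {

    /** Same shape as the quoted check: only the Type entry is looked at. */
    static boolean looksLikeXRefStream(Map<String, String> trailer) {
        return trailer != null && "XRef".equals(trailer.get("Type"));
    }

    public static void main(String[] args) {
        // A trailer that inherited entries from a cross-reference stream dictionary.
        Map<String, String> trailer = new LinkedHashMap<>();
        trailer.put("Root", "1 0 R");
        trailer.put("Size", "35");
        trailer.put("Type", "XRef");

        System.out.println(looksLikeXRefStream(trailer)); // true: the leftover entry misleads the check
        trailer.remove("Type");
        System.out.println(looksLikeXRefStream(trailer)); // false: a table will be chosen
    }
}
```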
Other ideas
I tried other ways to get around this problem by trying to
- save the first manipulation as an incremental update, too, or
- add the image during the same incremental update as the signature,
cf. SignLikeUnOriginalToo.java, but failed. It seems that incremental updates in PDFBox 1.8.9 only work properly for adding signatures.
Other ideas revisited
After looking further into how additional revisions are created with PDFBox, I tried the other ideas again and now succeeded!
The crucial part is to mark the added and changed objects as updated, including a path from the document catalog.
Applying the first idea (adding the image as an explicit intermediate revision) amounts to this change in doSign:
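The role of the path from the catalog can be illustrated with a toy model. The sketch below is not PDFBox code and all names in it are hypothetical. It ASSUMES a writer that emits only flagged objects and descends only through flagged ones; whether PDFBox 1.8.9 behaves exactly like this is not confirmed here, and the model says nothing about flags that PDFBox calls may set on their own.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model only; none of these names exist in PDFBox. */
final class UpdatePathModel {

    /** Stand-in for an object carrying a "needs update" flag. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        boolean needsUpdate;

        Node(String name) {
            this.name = name;
        }

        /** Adds a child and returns it, so chains can be built top-down. */
        Node add(String childName) {
            Node child = new Node(childName);
            children.add(child);
            return child;
        }
    }

    /** ASSUMPTION: flagged objects are written, and only they are descended into. */
    static Set<String> objectsWritten(Node root) {
        Set<String> written = new LinkedHashSet<>();
        visit(root, written);
        return written;
    }

    private static void visit(Node node, Set<String> written) {
        if (!node.needsUpdate || !written.add(node.name)) {
            return; // unflagged or already seen: nothing below is reached
        }
        for (Node child : node.children) {
            visit(child, written);
        }
    }

    public static void main(String[] args) {
        Node catalog = new Node("catalog");
        Node pages = catalog.add("pages");
        Node page = pages.add("page");
        Node image = page.add("image");

        // Flagging only the new leaf: the traversal stops at the unflagged catalog.
        image.needsUpdate = true;
        System.out.println(objectsWritten(catalog));

        // Flagging the whole path makes the new object reachable.
        catalog.needsUpdate = pages.needsUpdate = page.needsUpdate = true;
        System.out.println(objectsWritten(catalog));
    }
}
```

In this model a flagged leaf below an unflagged parent is never written, which is one way to read the advice to flag the whole chain and not just the new image.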
    ...
    FileOutputStream fos = new FileOutputStream(intermediateDocument);
    FileInputStream fis = new FileInputStream(intermediateDocument);

    byte inputBytes[] = IOUtils.toByteArray(inputStream);

    PDDocument pdDocument = PDDocument.load(new ByteArrayInputStream(inputBytes));

    // Draw the logo onto the first page.
    PDJpeg ximage = new PDJpeg(pdDocument, ImageIO.read(logoStream));
    PDPage page = (PDPage) pdDocument.getDocumentCatalog().getAllPages().get(0);
    PDPageContentStream contentStream = new PDPageContentStream(pdDocument, page, true, true);
    contentStream.drawXObject(ximage, 50, 50, 356, 40);
    contentStream.close();

    // Mark the path from the catalog down to the new image as updated.
    pdDocument.getDocumentCatalog().getCOSObject().setNeedToBeUpdate(true);
    pdDocument.getDocumentCatalog().getPages().getCOSObject().setNeedToBeUpdate(true);
    page.getCOSObject().setNeedToBeUpdate(true);
    page.getResources().getCOSObject().setNeedToBeUpdate(true);
    page.getResources().getCOSDictionary().getDictionaryObject(COSName.XOBJECT).setNeedToBeUpdate(true);
    ximage.getCOSObject().setNeedToBeUpdate(true);

    // Write the original bytes, then save the image as an incremental update.
    fos.write(inputBytes);
    pdDocument.saveIncremental(fis, fos);
    pdDocument.close();

    // Reload the intermediate revision and continue with signing.
    pdDocument = PDDocument.load(intermediateDocument);

    PDSignature signature = new PDSignature();
    ...
(as in the SignLikeUnOriginalToo.java method doSignTwoRevisions)
Applying the second idea (adding the image as part of the signature revision) amounts to this change in doSign:
    ...
    byte inputBytes[] = IOUtils.toByteArray(inputStream);
    PDDocument pdDocument = PDDocument.load(new ByteArrayInputStream(inputBytes));

    // Draw the logo onto the first page.
    PDJpeg ximage = new PDJpeg(pdDocument, ImageIO.read(logoStream));
    PDPage page = (PDPage) pdDocument.getDocumentCatalog().getAllPages().get(0);
    PDPageContentStream contentStream = new PDPageContentStream(pdDocument, page, true, true);
    contentStream.drawXObject(ximage, 50, 50, 356, 40);
    contentStream.close();

    // Mark the resources, the XObject dictionary and the image as updated.
    page.getResources().getCOSObject().setNeedToBeUpdate(true);
    page.getResources().getCOSDictionary().getDictionaryObject(COSName.XOBJECT).setNeedToBeUpdate(true);
    ximage.getCOSObject().setNeedToBeUpdate(true);

    PDSignature signature = new PDSignature();
    ...
(as in the SignLikeUnOriginalToo.java method doSignOneStep)
Both options are clearly preferable to the initial approach.