In my work, I sometimes have to merge anywhere from several to several hundred PDF files. Until now I have always used the PdfWriter and PdfImportedPage classes. But when I merge all the files into one, the resulting file is huge, roughly the sum of all the input file sizes, because the fonts are embedded with each page and are not shared across the whole document.
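For reference, the merge I describe above looks roughly like the sketch below. It assumes iTextSharp 5.x. The names PdfMergeSketch, MergeWithWriter, inputPaths and outputPath are placeholders I made up for illustration, and page rotation is ignored to keep it short.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class PdfMergeSketch
{
    // Hypothetical helper (not part of iTextSharp): merges the input files
    // into one document using PdfWriter and imported pages.
    public static void MergeWithWriter(string[] inputPaths, string outputPath)
    {
        using (FileStream fs = new FileStream(outputPath, FileMode.Create))
        {
            Document doc = new Document();
            PdfWriter writer = PdfWriter.GetInstance(doc, fs);
            doc.Open();
            PdfContentByte cb = writer.DirectContent;

            foreach (string path in inputPaths)
            {
                PdfReader reader = new PdfReader(path);
                for (int page = 1; page <= reader.NumberOfPages; page++)
                {
                    // Match the source page size (rotation is ignored here).
                    doc.SetPageSize(reader.GetPageSize(page));
                    doc.NewPage();

                    // Each source page is drawn as a template on a new page.
                    PdfImportedPage imported = writer.GetImportedPage(reader, page);
                    cb.AddTemplate(imported, 0, 0);
                }
                // Write the imported objects now so the reader can be closed.
                writer.FreeReader(reader);
                reader.Close();
            }
            doc.Close();
        }
    }
}
```

Called as PdfMergeSketch.MergeWithWriter(pdfPath, outputFile), this produces the oversized output I mention: resources are imported separately for each reader, so identical fonts in different source files end up in the result more than once.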
Not long ago I learned about the PdfSmartCopy class, which reuses embedded fonts and images. And this is where the problem begins. Very often, before merging the files, I have to add extra content (images, text) to them. For this I usually use the PdfContentByte from the PdfWriter object:
    Document doc = new Document();
    // Verbatim string so that "\t" is not read as a tab character
    PdfWriter writer = PdfWriter.GetInstance(doc, new FileStream(@"C:\test.pdf", FileMode.Create));
    PdfContentByte cb = writer.DirectContent;
    cb.Rectangle(100, 100, 100, 100);
    cb.SetColorStroke(BaseColor.RED);
    cb.SetColorFill(BaseColor.RED);
    cb.FillStroke();
When I do the same thing with a PdfSmartCopy object, the pages are merged, but the additional content is not added. Here is the full code of my test with PdfSmartCopy:
    using (Document doc = new Document())
    {
        using (PdfSmartCopy copy = new PdfSmartCopy(doc, new FileStream(Path.GetDirectoryName(pdfPath[0]) + "\\testas.pdf", FileMode.Create)))
        {
            doc.Open();
            PdfContentByte cb = copy.DirectContent;
            for (int i = 0; i < pdfPath.Length; i++)
            {
                PdfReader reader = new PdfReader(pdfPath[i]);
                for (int ii = 0; ii < reader.NumberOfPages; ii++)
                {
                    PdfImportedPage import = copy.GetImportedPage(reader, ii + 1);
                    copy.AddPage(import);
                    // This rectangle never shows up in the merged file
                    cb.Rectangle(100, 100, 100, 100);
                    cb.SetColorStroke(BaseColor.RED);
                    cb.SetColorFill(BaseColor.RED);
                    cb.FillStroke();
                    doc.NewPage();
                }
            }
        }
    }
Now I have a few questions:
- Is it possible to edit the PdfSmartCopy DirectContent object?
- If not, is there another way to merge multiple PDF files into one without inflating its size, while still being able to add extra content to the pages during the merge?