Adding text to a PDF: merging with PyPDF2 is too slow

I want to add text to each page of a PDF. The text is HTML such as <p style="color: #ff0000">blabla</p>, so that it appears red in the final document. I convert it to PDF (html2pdf lib), then I merge it (PyPDF2 lib) into every page of my PDF, but the merge is very slow!

My question is: is there a faster way to merge PDFs than PyPDF2's page.mergePage method? (Or maybe there is a faster way to add my text to this PDF?)

Thanks! (Using Python 2.7.5 on Windows 8.)

1 answer

Since all you are doing is adding some text to the page, you can probably speed things up by editing the pages' content streams directly. Merging has to deal with fonts, other resources, clipping boxes, etc., which slows the process down significantly. If you really do need to change any of those things, the solution becomes more complex. Code example:

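    # Assumes the PyPDF2 1.x API (the PyPDF2.pdf module).
    from PyPDF2.pdf import ContentStream
    from PyPDF2.generic import DecodedStreamObject, NameObject

    TEXT_STREAM = []  # The PDF content-stream operations that draw your text

    def add_text(page):
        "Add the required text to the page."
        contents = page.getContents()
        if contents is None:
            # The page has no content yet: start from an empty stream.
            contents = DecodedStreamObject()
            contents.setData(b"")
        # Parse the existing content into operations and append ours.
        stream = ContentStream(contents, page.pdf)
        stream.operations.extend(TEXT_STREAM)
        page[NameObject("/Contents")] = stream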
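The answer leaves TEXT_STREAM empty. As a rough illustration of what such operations look like, here is a minimal sketch that builds the raw PDF operators for one line of coloured text, using the #ff0000 colour from the question. The helper names (hex_to_rgb, escape_pdf_string, text_operators) are made up for this example, and the font name /F1 is an assumption: the page's /Resources must already contain a font under that name, otherwise the text will not render. It only handles simple Latin text, and it produces operator source text, not PyPDF2's in-memory operation list.

```python
def hex_to_rgb(color):
    """Convert '#ff0000' to three floats in the 0..1 range used by PDF."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) / 255.0 for i in (0, 2, 4))


def escape_pdf_string(text):
    """Escape the characters that are special inside a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def text_operators(text, x, y, color="#ff0000", font="/F1", size=12):
    """Return content-stream source that draws `text` at (x, y).

    `font` is a hypothetical resource name; it must exist on the page.
    """
    r, g, b = hex_to_rgb(color)
    return "\n".join([
        "q",                                  # save the graphics state
        "BT",                                 # begin a text object
        "%g %g %g rg" % (r, g, b),            # fill colour (non-stroking RGB)
        "%s %g Tf" % (font, size),            # select font resource and size
        "%g %g Td" % (x, y),                  # move to the text position
        "(%s) Tj" % escape_pdf_string(text),  # show the string
        "ET",                                 # end the text object
        "Q",                                  # restore the graphics state
    ])


if __name__ == "__main__":
    print(text_operators("blabla", 72, 720))
```

Wrapping the operators in q ... Q keeps the colour and font changes from leaking into whatever is drawn on the page afterwards.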

Source: https://habr.com/ru/post/1495751/

