Is there a way to programmatically remove all blank pages from a PDF file?

It's more practical to buy an e-book these days than the dead-tree version. But PDF files often contain the blank pages used in print. I usually see 10-30 blank pages (or pages with the text "This page intentionally left blank.") in an e-book. Is it possible to programmatically delete these blank pages? Currently, I identify the blank pages manually and then run the file through:

pdftops orig.pdf - | psselect "$range_of_non_blank_pages" | ps2pdf - new.pdf

So the hard part is identifying the blank pages. pdftotext will work for the most part, unless a page has only images and no text.
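For what it's worth, the pdftotext idea could be sketched roughly as below. This assumes pdftotext separates pages with a form feed character in its output; nonblank_pages is a made-up helper name, not part of any tool. It only looks at extracted text, so an image-only page would still be reported as blank.

```shell
#!/bin/sh
# Hypothetical helper (not part of poppler or psutils): reads pdftotext
# output on stdin and prints a comma-separated list of the page numbers
# that contain text. Assumes pages are separated by form feeds.
nonblank_pages() {
    awk 'BEGIN { RS = "\f" }
    {
        text = $0
        gsub(/[ \t\r\n]+/, " ", text)      # collapse all whitespace runs
        sub(/^ /, "", text)                # trim leading space
        sub(/ $/, "", text)                # trim trailing space
        low = tolower(text)
        # Keep the page unless it is empty or only carries the usual notice.
        if (low != "" && low !~ /^this page intentionally left blank\.?$/)
            list = list (list == "" ? "" : ",") NR
    }
    END { print list }'
}
```

Something like range_of_non_blank_pages=$(pdftotext orig.pdf - | nonblank_pages) would then feed the psselect pipeline above, since psselect takes a comma-separated page list.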

Also, even after removing many pages and shrinking both the original and the new version (using various methods found on the internet), the original file is usually smaller by a few hundred KB or more. So it seems that the method I use to remove blank pages does not produce an optimal PDF. I have also tried various GUI programs and see the same results in this regard.

+3
2 answers

Partial answer: you don't need to go through PostScript (that is probably why you end up with a larger file). One possibility is:

pdftk orig.pdf cat "$range_of_non_blank_pages" output new.pdf

The Perl modules CAM::PDF and PDF::API2 may also be useful here.

+1

Another option is Apago's commercial PDF Enhancer.

0

Source: https://habr.com/ru/post/1754519/

