Acrobat PDF Optimization vs. Ghostscript

I have a PDF file that I would like to optimize. I get the file from an external source, so I don’t have the means to recreate it from the very beginning.

When I open the file in Acrobat and check its resource usage, it reports that the fonts take up more than 90% of the space. If I save the file as PostScript and then save that PostScript file as an optimized PDF, the result is significantly smaller (up to 80% smaller) and the fonts are still embedded.

I am trying to reproduce these results with Ghostscript. I have tried various parameter combinations with pswrite and pdfwrite, but during the initial conversion from PDF to PostScript the text is converted to an image. When I convert back to PDF, the font references are gone, so I end up with a PDF in which the text is rendered as images rather than set in the actual fonts.

The file contains 22 embedded custom Type 1 fonts, and I have the font files. I added the fonts to the Ghostscript search path and confirmed that Ghostscript can find them with:

 gs \
   -I/home/nauc01 -sFONTPATH=/home/nauc01/fonts/Type1 \
   -o 3783QP.pdf \
   -sDEVICE=pdfwrite \
   -g5950x8420 \
   -c "200 700 moveto" \
   -c "/3783QP findfont 60 scalefont setfont" \
   -c "(TESTING !!!!!!) show showpage"

The resulting file has a correctly embedded font.

I also tried using Ghostscript to go directly from PDF to PDF as follows:

 gs \
   -sDEVICE=pdfwrite \
   -sNOPAUSE \
   -I/home/nauc01 \
   -dBATCH \
   -dCompatibilityLevel=1.4 \
   -dPDFSETTINGS=/printer \
   -CompressFonts=true \
   -dSubsetFonts=true \
   -sOutputFile=output.pdf \
   input.pdf

but the output is usually larger than the input, and I can’t view the file with anything other than Ghostscript (Adobe Reader reports "the object label is poorly formatted").

I cannot provide the source file because it contains confidential information, but I will try to answer any questions about it.

Any ideas? Thanks in advance.

2 answers

Do not use pswrite. As you discovered, it renders text as an image. Instead, use the ps2write device, which preserves fonts and text.
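For reference, the PostScript round trip with ps2write could look roughly like the sketch below. The file names are placeholders, and the GS variable is only a hypothetical convenience for pointing at a different Ghostscript binary; -o sets the output file and also implies -dNOPAUSE -dBATCH.

```shell
#!/bin/sh
# Sketch only: PDF -> PostScript via ps2write, then PostScript -> PDF via pdfwrite.
# GS is a hypothetical override for the Ghostscript executable (defaults to "gs").

# pdf_to_ps INPUT.pdf OUTPUT.ps
pdf_to_ps() {
    "${GS:-gs}" -sDEVICE=ps2write -o "$2" "$1"
}

# ps_to_pdf INPUT.ps OUTPUT.pdf
ps_to_pdf() {
    "${GS:-gs}" -sDEVICE=pdfwrite -o "$2" "$1"
}

# Example (placeholder file names):
#   pdf_to_ps input.pdf intermediate.ps && ps_to_pdf intermediate.ps output.pdf
```

Unlike pswrite, ps2write should keep the text as text set in the embedded fonts, so the second step has real fonts to work with.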

You do not say which version of Ghostscript you are using, but I would recommend you use a recent one.

One point: Ghostscript does not “optimize” a PDF the way Acrobat does; it re-creates it. The original PDF is fully interpreted to produce a sequence of operations that mark the page, and pdfwrite (or ps2write) then creates a new file that contains only those operations.

If you choose to subset fonts, only the glyphs that are actually used will be included. If the original PDF contains extraneous information (for example, Adobe Illustrator typically includes a full copy of the .ai file), it will be discarded. This may result in a smaller file, or it may not.

Please note that pdfwrite does not support compressed xref and some other later PDF features, so some files can grow significantly.
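Since the rewritten file is not guaranteed to be smaller, it may be worth comparing sizes before replacing anything. A minimal sketch in POSIX shell; the helper names file_size and smaller_of are made up for illustration:

```shell
#!/bin/sh
# Sketch only: pick whichever of two files is smaller.

# file_size FILE: print the size of FILE in bytes.
file_size() {
    wc -c < "$1" | tr -d ' '
}

# smaller_of ORIGINAL CANDIDATE: print the path of the smaller file.
# On a tie the original is preferred.
smaller_of() {
    if [ "$(file_size "$2")" -lt "$(file_size "$1")" ]; then
        printf '%s\n' "$2"
    else
        printf '%s\n' "$1"
    fi
}

# Example (placeholder file names):
#   best=$(smaller_of input.pdf output.pdf)
```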

I personally would not go through ps2write, as that simply adds another layer of processing and discards information. I would just use pdfwrite to create a new PDF file. If you find files for which this does not work (using the current code), you should file a bug report at http://bugs.ghostscript.com so that someone can address the problem.
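A direct PDF-to-PDF run with pdfwrite might be sketched as follows. The parameters are the ones from the question, each written with its -d or -s prefix (the question's command has a bare -CompressFonts=true, which Ghostscript is unlikely to read as a parameter definition). The GS variable is a hypothetical override for the executable name, and the file names are placeholders.

```shell
#!/bin/sh
# Sketch only: re-create a PDF with pdfwrite, subsetting and compressing fonts.
# GS is a hypothetical override for the Ghostscript executable (defaults to "gs").

# rewrite_pdf INPUT.pdf OUTPUT.pdf
rewrite_pdf() {
    "${GS:-gs}" -sDEVICE=pdfwrite \
        -dCompatibilityLevel=1.4 \
        -dSubsetFonts=true \
        -dCompressFonts=true \
        -o "$2" "$1"
}

# Example (placeholder file names):
#   rewrite_pdf input.pdf output.pdf
```

Whether this produces a smaller file depends on the input, as noted above.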

You might want to try Multivalent's Compress tool. It has an (experimental) option to subset the embedded fonts, which can make your PDF much smaller. It also has many switches that provide better compression, sometimes at the expense of quality (such as JPEG compression of bitmaps).

Unfortunately, the latest version of Multivalent no longer includes the tools. Search for Multivalent20060102.jar; that version still includes them. To run the compression:

 java -classpath /path/to/Multivalent20060102.jar tool.pdf.Compress [options] <pdf file> 

Source: https://habr.com/ru/post/1381817/