Image Conversion Library: Word, PDF, Excel to Images

We have a requirement to convert any incoming documents that are in Excel, PDF and Word into images. Any recommendations?

I am not sure whether ImageMagick will do this. As I understand it, it only converts between image formats, although I think it can also process PDF. What about Excel and Word?

Thank you in advance

3 answers

First, you can convert everything to PDF using LibreOffice, where libreofficeextension stands for any file extension that LibreOffice can open:

$ libreoffice --headless --invisible --convert-to pdf *.libreofficeextension

and then use ImageMagick ...

You may run into some formatting problems with Word documents and especially with PowerPoint.
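
The two steps above can be sketched as a single shell helper. This is a minimal sketch, not a tested production script: doc_to_images is a hypothetical name, the 400 dpi density is borrowed from the ImageMagick answer in this thread, and it assumes both libreoffice and convert are on the PATH.

```shell
#!/bin/sh
# Sketch: Office document -> PDF (LibreOffice) -> one JPEG per page (ImageMagick).
# doc_to_images is a hypothetical helper name, not part of either tool.
doc_to_images() {
    src=$1                      # input document, e.g. report.docx
    outdir=${2:-.}              # directory for the PDF and the images
    base=$(basename "$src")
    base=${base%.*}             # strip the extension: report.docx -> report

    # Step 1: LibreOffice writes "$outdir/$base.pdf".
    libreoffice --headless --invisible --convert-to pdf \
        --outdir "$outdir" "$src" || return 1

    # Step 2: ImageMagick renders the PDF pages to JPEG at 400 dpi.
    convert -density 400 "$outdir/$base.pdf" "$outdir/$base.jpeg"
}
```

For example, doc_to_images incoming/report.docx out would leave out/report.pdf plus the page images in out. On ImageMagick 7 installations the command is usually magick rather than convert.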


You are right: ImageMagick will not process MS Office formats, since it only handles image format conversion.

For PDF files, you can simply use ImageMagick directly:

 convert -density 400 filename.pdf filename.jpeg 

This produces one file per page:

  • filename-0.jpeg
  • filename-1.jpeg
  • ...
  • filename-(N-1).jpeg

Where N is the number of pages in your document. pdf2ps will achieve the same, but you will need to play with command line options to get the same output quality.
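
If you want control over the output names or only need one page, the same command can be wrapped as follows. This is a sketch under assumptions: pdf_to_jpegs and pdf_first_page are hypothetical helper names, and the -cover suffix is only an illustration. The %03d output pattern and the [0] page selector are standard ImageMagick filename syntax.

```shell
#!/bin/sh
# Render every page of a PDF to a zero-padded, numbered JPEG.
# pdf_to_jpegs is a hypothetical helper; the density defaults to 400 dpi.
pdf_to_jpegs() {
    pdf=$1
    density=${2:-400}
    out=${pdf%.pdf}             # invoice.pdf -> invoice
    # ImageMagick replaces %03d with the page index, counting from 0:
    # invoice-000.jpeg, invoice-001.jpeg, ...
    convert -density "$density" "$pdf" "$out-%03d.jpeg"
}

# Render only the first page; the [0] suffix selects page index 0 of the input.
pdf_first_page() {
    pdf=$1
    convert -density 400 "${pdf}[0]" "${pdf%.pdf}-cover.jpeg"
}
```

A lower density such as pdf_to_jpegs invoice.pdf 150 runs faster and gives smaller files, at the cost of sharpness.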

For MS Office products, I remember there being some kind of API (for MS Office 2007, from memory) that gives you access to a set of functions, for example to open a file and export it to PDF. If you can get the document into PDF format, you can use the method above to convert it to images. Some caveats:

  • It was many years ago at my previous job, and I can't remember exactly what it was called or how to use it.
  • I remember that the formatting of the output PDF was not great (not 100% the same as on screen), but it was readable. Perhaps this has improved since I last used it.
  • I vaguely remember it launching an Excel window in the background, so it is not exactly a command-line solution (and may not be suitable for servers).

This is a pretty old question, but here is how I solved it:

Hope this helps someone.


Source: https://habr.com/ru/post/1336057/

