Extract images from a PDF with PHP

I am trying to extract images from a PDF using PHP.

I have partially succeeded: I get a grayscale image ...

[Image: grayscale image extracted from the PDF]

... and I know that I need to apply a formula to get the colors!

But first I need to convert the binary image data to numbers, and then apply the formula from the Adobe PDF specification.

Basically, suppose you have the attached image (with all of its data taken from the PDF, unchanged), and that it is:

1. a CMYK image
2. with 8 bits per component

You need to convert it to a color image in PHP by applying the "Image" section of the Adobe spec.

What can I do to solve this problem?

2 answers

You can use pdfimages.

It is installed as part of the xpdf package. The man page describes it as follows:

Pdfimages saves images from a Portable Document Format (PDF) file as Portable Pixmap (PPM), Portable Bitmap (PBM), or JPEG files.

Pdfimages reads the PDF file, scans one or more pages, and writes one PPM, PBM, or JPEG file for each image, image-root-nnn.xxx, where nnn is the image number and xxx is the image type (.ppm, .pbm, .jpg).

NB: pdfimages extracts the raw image data from the PDF file, without performing any additional transforms. Any rotation, clipping, color inversion, etc. done by the PDF content stream is ignored.
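
As a rough sketch, pdfimages could be driven from PHP along these lines. The helper names `build_pdfimages_command` and `extract_pdf_images` are hypothetical, and the sketch assumes that `pdfimages` is on the PATH and that `exec` is enabled on the server. The `-j` option asks pdfimages to write JPEG-encoded images as .jpg files.

```php
<?php
// Hypothetical helper: builds the shell command line for pdfimages.
// -j writes DCT (JPEG) encoded images as .jpg instead of converting to PPM.
function build_pdfimages_command(string $pdfPath, string $imageRoot): string
{
    return 'pdfimages -j ' . escapeshellarg($pdfPath) . ' ' . escapeshellarg($imageRoot);
}

// Hypothetical helper: runs pdfimages and returns the files it wrote.
// Assumes the pdfimages binary is installed and exec() is allowed.
function extract_pdf_images(string $pdfPath, string $imageRoot): array
{
    exec(build_pdfimages_command($pdfPath, $imageRoot), $output, $exitCode);
    if ($exitCode !== 0) {
        throw new RuntimeException("pdfimages failed with exit code $exitCode");
    }
    // Output files are named <image-root>-nnn.<ext>
    return glob($imageRoot . '-*') ?: [];
}
```

Calling `extract_pdf_images('input.pdf', '/tmp/img')` would then return the list of extracted files. Because the raw data is written without any transform, a CMYK image may still need the color fix discussed in the other answer.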


The image you are showing is not grayscale; it is just inverted. Try inverting the color bytes and you will get the right colors. There is an issue with how CMYK colors are stored in JPEG images, because Photoshop stores 100% of a color as 0x0.

Edit: here is how to invert an image in PHP, taken from this blog post. It works with RGB data and needs to be adapted to work with CMYK.

    <?php
    // Inverts every pixel of a GD image in place (RGB data only).
    function image_filter_invert(&$image) {
        $width = imagesx($image);
        $height = imagesy($image);
        for ($x = 0; $x < $width; $x++) {
            for ($y = 0; $y < $height; $y++) {
                // Split the packed color into channels and invert each one
                $rgb = imagecolorat($image, $x, $y);
                $r = 0xFF - (($rgb >> 16) & 0xFF);
                $g = 0xFF - (($rgb >> 8) & 0xFF);
                $b = 0xFF - ($rgb & 0xFF);
                $color = imagecolorallocate($image, $r, $g, $b);
                imagesetpixel($image, $x, $y, $color);
            }
        }
    }
    ?>
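
One possible adaptation to CMYK is sketched below. It assumes the image data is already uncompressed, with 4 bytes per pixel in C, M, Y, K order (a JPEG stream would have to be decoded first). It uses the simple DeviceCMYK to DeviceRGB conversion from the PDF specification, red = 1 - min(1, c + k) and likewise for green and blue, which does no color management. The function names and the `$inverted` flag are hypothetical; the flag covers the inverted storage described above.

```php
<?php
// Hypothetical helper: simple DeviceCMYK -> DeviceRGB conversion for
// 8-bit components (0..255), with no ICC color management.
function cmyk_to_rgb(int $c, int $m, int $y, int $k): array
{
    return [
        255 - min(255, $c + $k),
        255 - min(255, $m + $k),
        255 - min(255, $y + $k),
    ];
}

// Hypothetical helper: converts raw CMYK bytes into a list of [r, g, b] pixels.
// Set $inverted to true when the samples are stored inverted.
function cmyk_bytes_to_rgb(string $cmyk, bool $inverted = false): array
{
    if (strlen($cmyk) % 4 !== 0) {
        throw new InvalidArgumentException('Expected 4 bytes per pixel');
    }
    if ($inverted) {
        // Bitwise NOT on a string flips every byte: 0x00 <-> 0xFF
        $cmyk = ~$cmyk;
    }
    $pixels = [];
    // unpack('C*') turns the binary string into unsigned byte values
    foreach (array_chunk(unpack('C*', $cmyk), 4) as [$c, $m, $y, $k]) {
        $pixels[] = cmyk_to_rgb($c, $m, $y, $k);
    }
    return $pixels;
}
```

Each resulting pixel can then be written to a truecolor GD image with `imagesetpixel`, using `($r << 16) | ($g << 8) | $b` as the color value.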

Source: https://habr.com/ru/post/1443383/

