Reading JPEG in Python (PIL) with broken header

I am trying to open a JPEG file in Python 2.7:

from PIL import Image
im = Image.open(filename)

This did not work:

>>> im = Image.open(filename)
Traceback (most recent call last):
  File "<pyshell#810>", line 1, in <module>
    im = Image.open(filename)
  File "/usr/lib/python2.7/dist-packages/PIL/Image.py", line 1980, in open
    raise IOError("cannot identify image file")
IOError: cannot identify image file

External viewers, however, open the same file without problems. It turns out that the method JpegImageFile._open in the PIL file JpegImagePlugin.py raises a SyntaxError because of several extraneous 0x00 bytes in front of the marker 0xFFDA in the header of the JPEG file:

Corrupt JPEG data: 5 extraneous bytes before marker 0xda

That is, while the other programs I tried simply ignored the unknown 0x00 marker at the end of the header, PIL chose to raise an exception, which prevents me from opening the image.

QUESTION: Other than editing the PIL code directly, is there a way to open JPEG files with problematic headers?

For reference, this is the relevant code from JpegImageFile that causes the problem:

def _open(self):

    s = self.fp.read(1)

    if ord(s[0]) != 255:
        raise SyntaxError("not a JPEG file")

    # Create attributes
    self.bits = self.layers = 0

    # JPEG specifics (internal)
    self.layer = []
    self.huffman_dc = {}
    self.huffman_ac = {}
    self.quantization = {}
    self.app = {} # compatibility
    self.applist = []
    self.icclist = []

    while 1:

        s = s + self.fp.read(1)

        i = i16(s)

        if MARKER.has_key(i):
            name, description, handler = MARKER[i]
            # print hex(i), name, description
            if handler is not None:
                handler(self, i)
            if i == 0xFFDA: # start of scan
                rawmode = self.mode
                if self.mode == "CMYK":
                    rawmode = "CMYK;I" # assume adobe conventions
                self.tile = [("jpeg", (0,0) + self.size, 0, (rawmode, ""))]
                # self.__offset = self.fp.tell()
                break
            s = self.fp.read(1)
        elif i == 0 or i == 65535:
            # padded marker or junk; move on
            s = "\xff"
        else:
            raise SyntaxError("no marker found")
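
To see why this loop gives up on such a file, here is a reduced model of it. This is only a sketch: it is written for Python 3, it has no segment handlers, and the names KNOWN_MARKERS and scan_markers are made up for the illustration and are not part of PIL.

```python
import io
import struct

# Hypothetical, reduced marker table: only the two codes needed for the demo.
KNOWN_MARKERS = {0xFFD8: "SOI", 0xFFDA: "SOS"}

def scan_markers(data):
    """Mimic the marker loop quoted above on a bytes object (no handlers)."""
    fp = io.BytesIO(data)
    s = fp.read(1)
    if s != b"\xff":
        raise SyntaxError("not a JPEG file")
    found = []
    while True:
        s = s + fp.read(1)
        if len(s) < 2:
            raise SyntaxError("unexpected end of data")
        i = struct.unpack(">H", s)[0]  # same idea as i16(s)
        if i in KNOWN_MARKERS:
            found.append(KNOWN_MARKERS[i])
            if i == 0xFFDA:  # start of scan: header is done
                break
            s = fp.read(1)
        elif i == 0 or i == 0xFFFF:
            # padded marker or junk; pretend the next marker prefix was read
            s = b"\xff"
        else:
            raise SyntaxError("no marker found")
    return found

clean = b"\xff\xd8\xff\xda"
junk = b"\xff\xd8" + b"\x00" * 5 + b"\xff\xda"
print(scan_markers(clean))
try:
    scan_markers(junk)
except SyntaxError as exc:
    print("failed:", exc)
```

With five zero bytes, the first pair 0x0000 is accepted as junk and s is reset to 0xFF, but the next byte read is again 0x00, so the loop sees 0xFF00, which is neither a known marker nor 0 or 65535, and it raises.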

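PIL does not tolerate corrupt data in the header, as you have found.

There is a pull request for Pillow (the maintained fork of PIL) that addresses this problem.

It has not been merged yet, but it may make it into version 2.5.0. In the meantime, see: https://github.com/python-imaging/Pillow/pull/647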
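
Without touching the library, one option is to clean the bytes before PIL sees them. The following is only a sketch under assumptions: strip_header_junk is a hypothetical helper (not a PIL function), it assumes the junk sits between header segments and contains no 0xFF byte (true for the 0x00 padding in the question), and it does not handle standalone markers in the header.

```python
import struct

def strip_header_junk(data):
    """Return JPEG bytes with junk between header segments removed.

    Assumption: the junk bytes never contain 0xFF, so the next 0xFF
    found after a segment really starts the next marker.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file (missing SOI marker)")
    out = bytearray(data[:2])
    pos = 2
    n = len(data)
    while pos < n:
        # Jump over anything that is not a marker prefix (the junk).
        start = data.find(b"\xff", pos)
        if start < 0 or start + 1 >= n:
            break
        code = data[start + 1:start + 2]
        if code == b"\xff":      # fill byte before a marker
            pos = start + 1
            continue
        if code == b"\x00":      # 0xFF00 is not a marker; treat as junk
            pos = start + 2
            continue
        if code == b"\xda":      # start of scan: copy the rest verbatim
            out += data[start:]
            return bytes(out)
        # Ordinary segment: marker, then a 2-byte big-endian length that
        # counts itself plus the payload.
        if start + 4 > n:
            break
        (length,) = struct.unpack(">H", data[start + 2:start + 4])
        end = start + 2 + length
        if length < 2 or end > n:
            raise ValueError("corrupt segment length")
        out += data[start:end]
        pos = end
    raise ValueError("no SOS marker found")
```

The cleaned bytes can then be wrapped in io.BytesIO and passed to Image.open, which accepts file-like objects as well as filenames. Whether this is enough for a particular damaged file depends on where exactly the extra bytes are.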

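As a workaround, you can use something like ImageMagick to convert the problematic images to PNG, and then open those with PIL/Pillow.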
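
The conversion can be driven from Python. A minimal sketch, assuming ImageMagick is installed and its convert command is on the PATH (ImageMagick 7 uses magick as its main command); build_convert_command and convert_to_png are hypothetical helper names.

```python
import subprocess

def build_convert_command(src, dst, executable="convert"):
    """Build the ImageMagick command line; the output format follows dst's extension."""
    return [executable, src, dst]

def convert_to_png(src, dst, executable="convert"):
    """Run ImageMagick and return the output path; raises CalledProcessError on failure."""
    subprocess.check_call(build_convert_command(src, dst, executable))
    return dst
```

After convert_to_png("broken.jpg", "fixed.png") succeeds, Image.open("fixed.png") should work as usual; the file names here are placeholders.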


Source: https://habr.com/ru/post/1533169/

