When I decode a FlateDecode image from a PDF with iTextSharp, the resulting image is distorted and I cannot work out why.
The bits-per-component value is 1, which my code maps to Format1bppIndexed. If I change the PixelFormat value to Format4bppIndexed, the image becomes recognizable to some extent (squashed and with the colors off, but readable) and is repeated 4 times horizontally. If I set the pixel format to Format8bppIndexed, it is again recognizable to some extent and is repeated 8 times horizontally.
Below is the image produced with the Format1bppIndexed pixel format. Unfortunately, I cannot show the others due to security restrictions.

Below is the code. It is essentially the only solution I have come across, and it is repeated all over SO and the rest of the Internet.
    int xrefIdx = ((PRIndirectReference)obj).Number;
    PdfObject pdfObj = doc.GetPdfObject(xrefIdx);
    PdfStream str = (PdfStream)(pdfObj);
    byte[] bytes = PdfReader.GetStreamBytesRaw((PRStream)str);

    string filter = ((PdfArray)tg.Get(PdfName.FILTER))[0].ToString();
    string width = tg.Get(PdfName.WIDTH).ToString();
    string height = tg.Get(PdfName.HEIGHT).ToString();
    string bpp = tg.Get(PdfName.BITSPERCOMPONENT).ToString();

    if (filter == "/FlateDecode")
    {
        bytes = PdfReader.FlateDecode(bytes, true);

        System.Drawing.Imaging.PixelFormat pixelFormat;
        switch (int.Parse(bpp))
        {
            case 1:
                pixelFormat = System.Drawing.Imaging.PixelFormat.Format1bppIndexed;
                break;
            case 8:
                pixelFormat = System.Drawing.Imaging.PixelFormat.Format8bppIndexed;
                break;
            case 24:
                pixelFormat = System.Drawing.Imaging.PixelFormat.Format24bppRgb;
                break;
            default:
                throw new Exception("Unknown pixel format " + bpp);
        }

        var bmp = new System.Drawing.Bitmap(Int32.Parse(width), Int32.Parse(height), pixelFormat);
        System.Drawing.Imaging.BitmapData bmd = bmp.LockBits(
            new System.Drawing.Rectangle(0, 0, Int32.Parse(width), Int32.Parse(height)),
            System.Drawing.Imaging.ImageLockMode.WriteOnly,
            pixelFormat);
        Marshal.Copy(bytes, 0, bmd.Scan0, bytes.Length);
        bmp.UnlockBits(bmd);
        bmp.Save(@"C:\temp\my_flate_picture-" + DateTime.Now.Ticks.ToString() + ".png", ImageFormat.Png);
    }
What do I need to do so that my image extraction works as desired when working with FlateDecode?
NOTE: I do not want to use another library to extract images. I am looking for a solution using ONLY iTextSharp and the .NET Framework. If a solution exists in Java (iText) and is easily ported to .NET, that would also be sufficient.
UPDATE: The ImageMask property is set to true, which implies that there is no color space and that the image is therefore implicitly black and white. With a bpp of 1, the PixelFormat should be Format1bppIndexed, which, as mentioned earlier, produces the image shown above.
UPDATE: To check the image size, I extracted the image using Acrobat X Pro, which reported a size of 2403x3005 for this particular example. When retrieved via iTextSharp, the size was reported as 2544x3300. I changed the bitmap size in the debugger to 2403x3005 to match; however, when calling Marshal.Copy(bytes, 0, bmd.Scan0, bytes.Length); I get an exception:
Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
My guess is that this is caused by the resize: the bitmap no longer matches the size of the byte data being copied into it.
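To make the size mismatch concrete, here is a minimal sketch of the row arithmetic as I understand it. It assumes that the decoded PDF samples pad each row only to a whole byte, while a bitmap locked through GDI+ pads each row to a multiple of 4 bytes. The class and method names (StrideSketch, GdiStride, PdfRowBytes, PadRows) are hypothetical helpers written for illustration, not part of iTextSharp or System.Drawing.

```csharp
using System;

public static class StrideSketch
{
    // Row size GDI+ uses for a locked bitmap: rounded up to a multiple of 4 bytes.
    public static int GdiStride(int width, int bitsPerPixel)
    {
        return ((width * bitsPerPixel + 31) / 32) * 4;
    }

    // Row size of the decoded samples, ASSUMING rows are padded only to a whole byte.
    public static int PdfRowBytes(int width, int bitsPerPixel)
    {
        return (width * bitsPerPixel + 7) / 8;
    }

    // Copies tightly packed rows into a buffer laid out with the GDI+ stride.
    public static byte[] PadRows(byte[] samples, int width, int height, int bitsPerPixel)
    {
        int srcRow = PdfRowBytes(width, bitsPerPixel);
        int stride = GdiStride(width, bitsPerPixel);
        if (samples.Length < (long)srcRow * height)
            throw new ArgumentException("Not enough sample bytes for the given dimensions.");

        byte[] padded = new byte[stride * height];
        for (int y = 0; y < height; y++)
        {
            // Each source row lands at the start of a stride-aligned destination row.
            Buffer.BlockCopy(samples, y * srcRow, padded, y * stride, srcRow);
        }
        return padded;
    }
}
```

Under these assumptions, a 2403x3005 bitmap at 1 bpp has a stride of 304 bytes, so its buffer holds 304 * 3005 = 913520 bytes, which is less than the 1049401 bytes being copied; that would be consistent with the access violation. At 2544x3300 the source rows are 318 bytes but the stride is 320, so a single Marshal.Copy of the whole array would put every row after the first at the wrong offset. In real code the value of bmd.Stride should be treated as authoritative rather than the computed one.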
UPDATE: As recommended by Jimmy, I checked whether calling PdfReader.GetStreamBytes returns a byte[] whose length equals width * height / 8, since GetStreamBytes should call FlateDecode itself. Both the manual FlateDecode call and the PdfReader.GetStreamBytes call produced a byte[] of length 1049401, whereas width * height / 8 is 2544 * 3300 / 8 = 1049400, so there is a difference of 1. I am not sure whether this off-by-one is the root cause, nor how to determine whether it is.
UPDATE: When trying the approach mentioned by kuujinbo, I get an IndexOutOfRangeException when calling renderInfo.GetImage(); inside RenderImage. The fact that width * height / 8, as mentioned above, is off by 1 compared to the byte[] length returned by FlateDecode makes me think that the two problems are one and the same; however, the solution still eludes me.
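For reference, this is the check I am doing, written out as a small sketch. It assumes a single component per pixel (which seems to fit an image mask) and that each row of samples is padded to a whole byte; SampleLengthCheck and its methods are hypothetical names used only for illustration.

```csharp
using System;

public static class SampleLengthCheck
{
    // Expected number of sample bytes, ASSUMING every row is padded to a whole byte
    // and there is one component per pixel.
    public static long ExpectedLength(int width, int height, int bitsPerComponent)
    {
        long rowBytes = ((long)width * bitsPerComponent + 7) / 8;
        return rowBytes * height;
    }

    // Positive result: surplus bytes after the last row. Negative: the data is short.
    public static long Surplus(long decodedLength, int width, int height, int bitsPerComponent)
    {
        return decodedLength - ExpectedLength(width, height, bitsPerComponent);
    }
}
```

For 2544x3300 at 1 bit per component, ExpectedLength gives 1049400 and Surplus(1049401, 2544, 3300, 1) gives 1, matching the difference described above. Note that the per-row rounding only agrees with width * height / 8 when the width is a multiple of 8, as 2544 is; for a width of 2403 each row would take 301 bytes.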
    at System.util.zlib.Adler32.adler32(Int64 adler, Byte[] buf, Int32 index, Int32 len)
    at System.util.zlib.ZStream.read_buf(Byte[] buf, Int32 start, Int32 size)
    at System.util.zlib.Deflate.fill_window()
    at System.util.zlib.Deflate.deflate_slow(Int32 flush)
    at System.util.zlib.Deflate.deflate(ZStream strm, Int32 flush)
    at System.util.zlib.ZStream.deflate(Int32 flush)
    at System.util.zlib.ZDeflaterOutputStream.Write(Byte[] b, Int32 off, Int32 len)
    at iTextSharp.text.pdf.codec.PngWriter.WriteData(Byte[] data, Int32 stride)
    at iTextSharp.text.pdf.parser.PdfImageObject.DecodeImageBytes()
    at iTextSharp.text.pdf.parser.PdfImageObject..ctor(PdfDictionary dictionary, Byte[] samples)
    at iTextSharp.text.pdf.parser.PdfImageObject..ctor(PRStream stream)
    at iTextSharp.text.pdf.parser.ImageRenderInfo.PrepareImageObject()
    at iTextSharp.text.pdf.parser.ImageRenderInfo.GetImage()
    at cyos.infrastructure.Core.MyImageRenderListener.RenderImage(ImageRenderInfo renderInfo)
UPDATE: Trying the various methods listed here, both my original solution and the solution offered by kuujinbo, against a different page in the PDF does produce images; however, the problems always appear when the filter type is /FlateDecode, and no image is produced in that case.