Jpeg from Tiff (jpeg-compressed)

How can I extract the image from a JPEG-compressed TIFF file?

I read the bytes according to the StripOffsets and StripByteCounts fields, but I could not load an image from them.

4 answers

The old TIFF-JPEG style (compression type 6) basically embeds a regular JFIF file inside a TIFF wrapper. The new TIFF-JPEG style (compression type 7) allows the JPEG table data (Huffman, quantization) to be stored in a separate tag (0x015B, JPEGTables). This makes it possible to put strips of JPEG data with SOI/EOI markers in the file without repeating the Huffman and quantization tables in each one. This is probably what you are seeing in your file: the individual strips begin with the FFD8 sequence, but they contain no Huffman or quantization tables. This is how Photoshop products usually write such files.
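To illustrate the new-style layout, here is a minimal sketch of how one strip could be turned back into a standalone JPEG stream without recompressing it. It assumes you have already read the raw value of the JPEGTables tag and the raw bytes of one strip (via StripOffsets and StripByteCounts). The class and method names are hypothetical, and the sketch only splices the byte streams together: it does not validate the segments in between.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical helper, not part of any library. */
final class JpegTablesMerger {
    private JpegTablesMerger() {}

    /**
     * Splices the shared tables in front of the strip data.
     *
     * @param jpegTables raw value of tag 0x015B: SOI, table segments, EOI (may be null or empty)
     * @param strip      raw bytes of one strip: SOI, frame and scan data, EOI
     * @return SOI + tables + strip content + EOI as one byte array
     */
    static byte[] merge(byte[] jpegTables, byte[] strip) {
        if (!startsWithSoi(strip)) {
            throw new IllegalArgumentException("strip does not start with FFD8");
        }
        if (jpegTables == null || jpegTables.length == 0) {
            return strip.clone(); // nothing to splice in
        }
        if (!startsWithSoi(jpegTables) || !endsWithEoi(jpegTables)) {
            throw new IllegalArgumentException("JPEGTables is not wrapped in FFD8 ... FFD9");
        }
        ByteArrayOutputStream out =
                new ByteArrayOutputStream(jpegTables.length + strip.length);
        // Keep the SOI and the table segments, drop the trailing EOI (2 bytes).
        out.write(jpegTables, 0, jpegTables.length - 2);
        // Drop the strip's own SOI (2 bytes), keep everything else including its EOI.
        out.write(strip, 2, strip.length - 2);
        return out.toByteArray();
    }

    private static boolean startsWithSoi(byte[] d) {
        return d != null && d.length >= 2
                && (d[0] & 0xFF) == 0xFF && (d[1] & 0xFF) == 0xD8;
    }

    private static boolean endsWithEoi(byte[] d) {
        int n = d.length;
        return n >= 4 && (d[n - 2] & 0xFF) == 0xFF && (d[n - 1] & 0xFF) == 0xD9;
    }
}
```

The result of `JpegTablesMerger.merge(tables, strip)` can be written to a `.jpg` file as is. Two caveats: a file with several strips yields one JPEG per strip, and the TIFF tags that describe the color space (such as PhotometricInterpretation) are not carried over, so a generic JPEG decoder may guess the colors wrongly for some files.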


Using JAI:

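    import java.awt.image.RenderedImage;
    import com.sun.media.jai.codec.ByteArraySeekableStream;
    import com.sun.media.jai.codec.ImageCodec;
    import com.sun.media.jai.codec.ImageDecoder;
    import com.sun.media.jai.codec.SeekableStream;
    import com.sun.media.jai.codec.TIFFDirectory;

    int TAG_COMPRESSION = 259;
    int TAG_JPEG_INTERCHANGE_FORMAT = 513;
    int COMP_JPEG_OLD = 6;
    int COMP_JPEG_TTN2 = 7;

    // imageData is the byte[] holding the whole TIFF file
    SeekableStream stream = new ByteArraySeekableStream(imageData);
    TIFFDirectory tdir = new TIFFDirectory(stream, 0);
    int compression = tdir.getField(TAG_COMPRESSION).getAsInt(0);

    // Decoder name
    String decoder2use = "tiff";
    if (compression == COMP_JPEG_OLD) {
        // Special handling for old/unsupported JPEG-in-TIFF format:
        // {@link: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4929147 }
        // Skip to the embedded JPEG stream and decode it as plain JPEG.
        stream.seek(tdir.getField(TAG_JPEG_INTERCHANGE_FORMAT).getAsLong(0));
        decoder2use = "jpeg";
    }

    // Decode image
    ImageDecoder dec = ImageCodec.createImageDecoder(decoder2use, stream, null);
    RenderedImage img = dec.decodeAsRenderedImage();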

Great solution, it helped me a lot. One addition: if the TIFF contains several pages, you need to read the stream again with a different directory number in the TIFFDirectory constructor and repeat all of the above for each page.

    TIFFDirectory tdir = new TIFFDirectory(stream, 1);
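To know how many directory numbers there are to loop over, the pages can be counted by following the chain of image file directories (IFDs) in the file. The following is a minimal sketch using only the Java standard library; it assumes a classic TIFF (not BigTIFF) held entirely in a byte array, and the class and method names are hypothetical. If I recall the JAI API correctly, `TIFFDirectory.getNumDirectories(stream)` gives the same count without manual parsing.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of JAI. */
final class TiffIfdWalker {
    private TiffIfdWalker() {}

    /** Returns the file offset of every IFD (one per page), in file order. */
    static List<Long> ifdOffsets(byte[] tiff) {
        if (tiff.length < 8) {
            throw new IllegalArgumentException("too short for a TIFF header");
        }
        ByteBuffer buf = ByteBuffer.wrap(tiff);
        if (tiff[0] == 'I' && tiff[1] == 'I') {
            buf.order(ByteOrder.LITTLE_ENDIAN);
        } else if (tiff[0] == 'M' && tiff[1] == 'M') {
            buf.order(ByteOrder.BIG_ENDIAN);
        } else {
            throw new IllegalArgumentException("missing II/MM byte order mark");
        }
        if (buf.getShort(2) != 42) {
            throw new IllegalArgumentException("not a classic TIFF (magic number is not 42)");
        }

        List<Long> offsets = new ArrayList<>();
        long next = buf.getInt(4) & 0xFFFFFFFFL; // offset of the first IFD
        while (next != 0) {
            // Guard against truncated files and circular IFD chains.
            if (next + 2 > tiff.length || offsets.contains(next)) {
                throw new IllegalArgumentException("invalid IFD offset " + next);
            }
            offsets.add(next);
            int entryCount = buf.getShort((int) next) & 0xFFFF;
            // Each entry is 12 bytes; the pointer to the next IFD follows the last entry.
            long pointerPos = next + 2 + 12L * entryCount;
            if (pointerPos + 4 > tiff.length) {
                throw new IllegalArgumentException("truncated IFD at offset " + next);
            }
            next = buf.getInt((int) pointerPos) & 0xFFFFFFFFL;
        }
        return offsets;
    }
}
```

The size of the returned list is the page count, so the directory numbers to pass to `new TIFFDirectory(stream, n)` run from 0 to `size() - 1`.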

The problem with the libtiff library mentioned in another answer is that it decodes the image and then saves it again, which means a loss of quality in the case of JPEG. However, I can do the same without any third-party library, just by calling the GDI+ methods of the .NET Framework.

The original author of this thread is trying to get the JPEG binary without having to compress it again, and that is exactly what I am trying to do.

This is a possible solution if you can live with the loss of quality and do not want to use any classes outside the .NET Framework library.

    using System;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.IO;

    public static int SplitMultiPage(string sourceFileName, string targetPath)
    {
        using (Image multipageTIFF = Image.FromFile(sourceFileName))
        {
            int pageCount = multipageTIFF.GetFrameCount(FrameDimension.Page);
            if (pageCount > 1)
            {
                string sFileName = Path.GetFileNameWithoutExtension(sourceFileName);
                for (int i = 0; i < pageCount; i++)
                {
                    multipageTIFF.SelectActiveFrame(FrameDimension.Page, i);

                    // A single frame could also have a different format, e.g. JPG, PNG, BMP.
                    // To give the file the correct extension, we take one from the codec description.
                    // Interestingly, for TIFF (the only multi-frame case) RawFormat always yields
                    // the TIFF codec rather than the codec of the frame.
                    ImageCodecInfo codec = Helpers.GetEncoder(multipageTIFF.RawFormat);
                    string sExtension = codec.FilenameExtension.Split(new char[] { ';' })[0];
                    sExtension = sExtension.Substring(sExtension.IndexOf('.') + 1);
                    string newFileName = Path.Combine(targetPath, string.Format("{0}_{1}.{2}", sFileName, i + 1, sExtension));

                    EncoderParameters encoderParams = new EncoderParameters(2);
                    encoderParams.Param[0] = new EncoderParameter(System.Drawing.Imaging.Encoder.SaveFlag, (long)EncoderValue.LastFrame);

                    // For 1-bit TIFF we use CompressionCCITT4, since it gives the best results.
                    switch (GetCompressionType(multipageTIFF))
                    {
                        case 1: // No compression -> BMP?
                            encoderParams.Param[1] = new EncoderParameter(System.Drawing.Imaging.Encoder.Compression, (long)EncoderValue.CompressionNone);
                            break;
                        case 2: // CCITT modified Huffman RLE (32773 = PackBits compression, aka Macintosh RLE)
                            encoderParams.Param[1] = new EncoderParameter(System.Drawing.Imaging.Encoder.Compression, (long)EncoderValue.CompressionRle);
                            break;
                        case 3: // CCITT Group 3 fax encoding
                            encoderParams.Param[1] = new EncoderParameter(System.Drawing.Imaging.Encoder.Compression, (long)EncoderValue.CompressionCCITT3);
                            break;
                        case 4: // CCITT Group 4 fax encoding
                            encoderParams.Param[1] = new EncoderParameter(System.Drawing.Imaging.Encoder.Compression, (long)EncoderValue.CompressionCCITT4);
                            break;
                        case 5: // LZW
                            encoderParams.Param[1] = new EncoderParameter(System.Drawing.Imaging.Encoder.Compression, (long)EncoderValue.CompressionLZW);
                            break;
                        case 6: // JPEG ('old-style' JPEG, later overridden in Technote2)
                        case 7: // Technote2 overrides old-style JPEG compression, and defines 7 = JPEG ('new-style' JPEG)
                            {
                                // Note: newFileName above still carries the TIFF extension.
                                codec = Helpers.GetEncoder(ImageFormat.Jpeg);
                                // Quality must be passed as a long, hence the L suffix.
                                encoderParams.Param[1] = new EncoderParameter(System.Drawing.Imaging.Encoder.Quality, 90L);
                            }
                            break;
                    }

                    multipageTIFF.Save(newFileName, codec, encoderParams);
                }
            }
            return pageCount;
        }
    }

Helper method used:

    public static ImageCodecInfo GetEncoder(ImageFormat format)
    {
        // Search the encoders, since the result is passed to Image.Save.
        ImageCodecInfo[] codecs = ImageCodecInfo.GetImageEncoders();
        foreach (ImageCodecInfo codec in codecs)
        {
            if (codec.FormatID == format.Guid)
            {
                return codec;
            }
        }
        return null;
    }

Reading the compression flag:

    public static int GetCompressionType(Image image)
    {
        /* TIFF Tag Compression

           IFD          Image
           Code         259 (hex 0x0103)
           Name         Compression
           LibTiff name TIFFTAG_COMPRESSION
           Type         SHORT
           Count        1
           Default      1 (No compression)

           Description: compression scheme used on the image data.

           The specification defines these values to be baseline:
               1     = No compression
               2     = CCITT modified Huffman RLE
               32773 = PackBits compression, aka Macintosh RLE

           Additionally, the specification defines these values as part of the TIFF extensions:
               3 = CCITT Group 3 fax encoding
               4 = CCITT Group 4 fax encoding
               5 = LZW
               6 = JPEG ('old-style' JPEG, later overridden in Technote2)

           Technote2 overrides old-style JPEG compression, and defines:
               7 = JPEG ('new-style' JPEG)

           Adobe later added the deflate compression scheme:
               8 = Deflate ('Adobe-style')

           The TIFF-F specification (RFC 2301) defines:
               9  = Defined by TIFF-F and TIFF-FX standard (RFC 2301) as ITU-T Rec. T.82 coding,
                    using ITU-T Rec. T.85 (which boils down to JBIG on black and white).
               10 = Defined by TIFF-F and TIFF-FX standard (RFC 2301) as ITU-T Rec. T.82 coding,
                    using ITU-T Rec. T.43 (which boils down to JBIG on color).
        */
        int compressionTagIndex = Array.IndexOf(image.PropertyIdList, 0x103);
        PropertyItem compressionTag = image.PropertyItems[compressionTagIndex];
        // The tag is an unsigned SHORT; ToInt16 would turn 32773 (PackBits) into a negative number.
        return BitConverter.ToUInt16(compressionTag.Value, 0);
    }
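The same tag can also be read straight from the file bytes, without going through an imaging API. Below is a minimal Java sketch (Java to match the JAI example earlier in the thread) that looks up tag 259 in the first IFD. It assumes a classic TIFF held in a byte array and a well-formed Compression entry (type SHORT, count 1); the class and method names are hypothetical. Since the tag is an unsigned 16-bit value, it is masked with 0xFFFF so that 32773 (PackBits) does not come out negative.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper, not part of any library. */
final class TiffCompressionReader {
    static final int TAG_COMPRESSION = 259; // 0x0103

    private TiffCompressionReader() {}

    /** Returns the Compression value of the first page, or 1 (none) if the tag is absent. */
    static int firstPageCompression(byte[] tiff) {
        ByteBuffer buf = ByteBuffer.wrap(tiff);
        if (tiff.length >= 8 && tiff[0] == 'I' && tiff[1] == 'I') {
            buf.order(ByteOrder.LITTLE_ENDIAN);
        } else if (tiff.length >= 8 && tiff[0] == 'M' && tiff[1] == 'M') {
            buf.order(ByteOrder.BIG_ENDIAN);
        } else {
            throw new IllegalArgumentException("not a TIFF header");
        }
        if (buf.getShort(2) != 42) {
            throw new IllegalArgumentException("not a classic TIFF (magic number is not 42)");
        }

        int ifd = buf.getInt(4);                     // offset of the first IFD
        int entryCount = buf.getShort(ifd) & 0xFFFF; // number of 12-byte entries
        for (int i = 0; i < entryCount; i++) {
            int entry = ifd + 2 + 12 * i;
            int tag = buf.getShort(entry) & 0xFFFF;
            if (tag == TAG_COMPRESSION) {
                // Entry layout: tag (2), type (2), count (4), value (4).
                // A single SHORT sits in the first two bytes of the value field.
                return buf.getShort(entry + 8) & 0xFFFF;
            }
        }
        return 1; // the tag's default value: no compression
    }
}
```

A return value of 6 or 7 then tells you which of the two JPEG-in-TIFF styles you are dealing with before deciding how to extract the data.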

If you are trying to extract the actual image from a TIFF, JPEG-compressed or otherwise, you are better off using a library such as libtiff. TIFF is a very complex specification, and although you can do it yourself and get one or two classes of images right, you will most likely fail to handle other cases that come up often, especially "old-style" JPEG, a subformat that was bolted onto TIFF and does not fit the overall specification very well.

My company, Atalasoft, makes a .NET product that contains a very good TIFF codec. If you only need to handle single-page images, our free product will work just fine for you.

In the .NET world, you can also look at Bit Miracle's managed version of libtiff. It is a pretty decent port of the library.


Source: https://habr.com/ru/post/886331/

