How to check if InputStream is Gzipped?

Is there any way to check whether an InputStream is gzipped? Here is the code:

    public static InputStream decompressStream(InputStream input) {
        try {
            GZIPInputStream gs = new GZIPInputStream(input);
            return gs;
        } catch (IOException e) {
            logger.info("Input stream not in the GZIP format, using standard format");
            return input;
        }
    }

I tried it this way, but it does not work properly: the values read from the stream are invalid.

EDIT: Added the method that I use to compress the data:

    public static byte[] compress(byte[] content) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try {
            GZIPOutputStream gs = new GZIPOutputStream(baos);
            gs.write(content);
            gs.close();
        } catch (IOException e) {
            logger.error("Fatal error occured while compressing data");
            throw new RuntimeException(e);
        }
        double ratio = (1.0f * content.length / baos.size());
        if (ratio > 1) {
            logger.info("Compression ratio equals " + ratio);
            return baos.toByteArray();
        }
        logger.info("Compression not needed");
        return content;
    }
+45
java inputstream gzip
Jan 27 '11 at 15:43
10 answers

It is not foolproof, but it is probably the simplest approach and does not depend on any external data. Like all decent formats, GZip starts with a magic number, which can be checked quickly without reading the entire stream.

    public static InputStream decompressStream(InputStream input) throws IOException {
        // A PushbackInputStream lets us look ahead and then return the bytes to the stream.
        PushbackInputStream pb = new PushbackInputStream(input, 2);
        byte[] signature = new byte[2];
        int len = pb.read(signature); // read the signature; -1 on an empty stream
        if (len > 0) {
            pb.unread(signature, 0, len); // push the signature back onto the stream
        }
        // Check whether the first two bytes match the standard gzip magic number.
        if (len == 2 && signature[0] == (byte) 0x1f && signature[1] == (byte) 0x8b) {
            return new GZIPInputStream(pb);
        }
        return pb;
    }

(Source for magic number: GZip file format specification )

Update: I just confirmed that GZIPInputStream has a constant called GZIP_MAGIC that contains this value, so if you really want to, you can use its lower two bytes.
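A likely reason the code in the question returns invalid data: the GZIPInputStream constructor reads the header from the underlying stream before it throws, so those bytes are already gone when the original stream is handed back. The sketch below only illustrates that effect; the class and method names are hypothetical, and it does not rely on exactly how many bytes the constructor consumes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;

final class HeaderConsumptionDemo {

    /**
     * Tries to open the given bytes as gzip and reports how many bytes are
     * still unread in the underlying stream afterwards.
     */
    static int remainingAfterProbe(byte[] data) {
        ByteArrayInputStream in = new ByteArrayInputStream(data);
        try {
            // For non-gzip data the constructor throws, but only after it
            // has already read the leading bytes while checking the header.
            new GZIPInputStream(in).close();
        } catch (IOException e) {
            // Not gzip data: fall through and inspect what is left.
        }
        return in.available();
    }
}
```

For plain input such as the bytes of "hello", the returned count is smaller than the input length, which is why pushing the signature back (as above) or using mark/reset is needed before falling back to the raw stream.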

+58
Jan 27 '11 at 16:24

InputStream comes from HttpURLConnection#getInputStream()

In that case, you need to check whether the Content-Encoding header of the HTTP response is equal to gzip.

    URLConnection connection = url.openConnection();
    InputStream input = connection.getInputStream();
    if ("gzip".equals(connection.getContentEncoding())) {
        input = new GZIPInputStream(input);
    }
    // ...

All of this is clearly stated in the HTTP specification.




Update: regarding the way you compress the source of the stream: this ratio check is pretty... crazy. Get rid of it. The same length does not necessarily mean that the bytes are the same. Let the method always return gzipped data, so that you can always expect a gzipped stream and simply apply GZIPInputStream without nasty checks.
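A minimal sketch of that advice, assuming the data is held in memory as byte arrays as in the question: the writer always gzips, so the reader never has to guess. The class name AlwaysGzip is hypothetical, and wrapping IOException in UncheckedIOException is just one possible choice for in-memory streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class AlwaysGzip {

    /** Always returns gzip data, even when it is larger than the input. */
    static byte[] compress(byte[] content) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(baos)) {
            gz.write(content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed at this point, so the gzip trailer is written.
        return baos.toByteArray();
    }

    /** Counterpart of compress(): the input is always expected to be gzip. */
    static byte[] decompress(byte[] compressed) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gz.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this pairing, decompress(compress(x)) returns x for any input, including an empty array, at the cost of a few bytes of overhead on data that does not compress well.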

+36
Jan 27 '11 at 15:58

I found this useful example that provides a clean implementation of isCompressed():

    /*
     * Determines if a byte array is compressed. The java.util.zip GZip
     * implementation does not expose the GZip header so it is difficult to determine
     * if a string is compressed.
     *
     * @param bytes an array of bytes
     * @return true if the array is compressed or false otherwise
     * @throws java.io.IOException if the byte array couldn't be read
     */
    public boolean isCompressed(byte[] bytes) throws IOException {
        if ((bytes == null) || (bytes.length < 2)) {
            return false;
        } else {
            return ((bytes[0] == (byte) (GZIPInputStream.GZIP_MAGIC))
                    && (bytes[1] == (byte) (GZIPInputStream.GZIP_MAGIC >> 8)));
        }
    }

I tested this successfully:

    @Test
    public void testIsCompressed() {
        assertFalse(util.isCompressed(originalBytes));
        assertTrue(util.isCompressed(compressed));
    }
+19
Dec 23 '11 at 9:28 a.m.

I believe this is the easiest way to check whether a byte array is in gzip format. It does not depend on any HTTP or MIME type support.

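    public static boolean isGzipStream(byte[] bytes) {
        // Fewer than two bytes cannot hold the gzip magic number.
        if (bytes == null || bytes.length < 2) {
            return false;
        }
        // Combine the first two bytes in little-endian order.
        int head = ((int) bytes[0] & 0xff) | ((bytes[1] << 8) & 0xff00);
        return (GZIPInputStream.GZIP_MAGIC == head);
    }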
+8
Mar 11 2018-11-11T00:

Wrap the source stream in a BufferedInputStream, call "mark" on it, and then wrap it in a GZIPInputStream. If the constructor succeeds, the data is gzipped; if it throws, it is not. (ZipEntry belongs to ZipInputStream and zip archives, not to gzip streams.) You can then use "reset" on the BufferedInputStream to return to the original position in the stream after the check.
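A variant of this mark/reset idea, as a sketch only: instead of constructing a stream just to probe, it peeks at the first two bytes and compares them with GZIPInputStream.GZIP_MAGIC, then rewinds. The names MarkResetProbe and maybeGunzip are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

final class MarkResetProbe {

    /**
     * Returns a decompressing stream when the source starts with the gzip
     * magic number, otherwise the (buffered) source with nothing consumed.
     */
    static InputStream maybeGunzip(InputStream source) throws IOException {
        BufferedInputStream buffered = new BufferedInputStream(source);
        buffered.mark(2);            // remember the current position
        int b0 = buffered.read();    // -1 if the stream is empty
        int b1 = buffered.read();    // -1 if there is only one byte
        buffered.reset();            // rewind to the marked position

        boolean gzip = b0 != -1 && b1 != -1
                && ((b0 & 0xff) | ((b1 & 0xff) << 8)) == GZIPInputStream.GZIP_MAGIC;
        return gzip ? new GZIPInputStream(buffered) : buffered;
    }
}
```

Because the check reads single bytes, it also behaves sensibly for empty and one-byte streams, which are returned unchanged.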

+1
Jan 27 '11 at 15:50

Not quite what you are asking for, but may be an alternative approach if you are using HttpClient:

    private static InputStream getInputStream(HttpEntity entity) throws IOException {
        Header encoding = entity.getContentEncoding();
        if (encoding != null) {
            if (encoding.getValue().equals("gzip")
                    || encoding.getValue().equals("zip")
                    || encoding.getValue().equals("application/x-gzip-compressed")) {
                return new GZIPInputStream(entity.getContent());
            }
        }
        return entity.getContent();
    }
+1
Jan 27 2018-11-11T00:

This function works fine in Java:

    public static boolean isGZipped(File f) throws IOException {
        // try-with-resources closes the file even if a read fails
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            int magic = raf.read() & 0xff | ((raf.read() << 8) & 0xff00);
            return GZIPInputStream.GZIP_MAGIC == magic;
        }
    }

In Scala:

    def isGZip(file: File): Boolean = {
      val raf = new RandomAccessFile(file, "r")
      try {
        val gzip = raf.read() & 0xff | ((raf.read() << 8) & 0xff00)
        gzip == GZIPInputStream.GZIP_MAGIC
      } finally {
        raf.close()
      }
    }
+1
Aug 22 '16 at 13:21

Based on @biziclop's answer - this version uses the GZIP_MAGIC header and is additionally safe for empty or single-byte data streams.

    public static InputStream maybeDecompress(InputStream input) throws IOException {
        final PushbackInputStream pb = new PushbackInputStream(input, 2);

        int header = pb.read();
        if (header == -1) {
            return pb; // empty stream
        }

        int b = pb.read();
        if (b == -1) {
            pb.unread(header); // single-byte stream
            return pb;
        }

        pb.unread(new byte[]{(byte) header, (byte) b});

        // The magic number is stored in little-endian order.
        header = (b << 8) | header;

        if (header == GZIPInputStream.GZIP_MAGIC) {
            return new GZIPInputStream(pb);
        } else {
            return pb;
        }
    }
+1
Dec 28 '16 at 19:56

Here's how to read a file that may or may not be gzipped:

    private void read(final File file) throws IOException {
        InputStream stream = null;
        try (final InputStream inputStream = new FileInputStream(file);
                final BufferedInputStream bInputStream = new BufferedInputStream(inputStream)) {
            bInputStream.mark(1024);
            try {
                stream = new GZIPInputStream(bInputStream);
            } catch (final ZipException e) {
                // not gzipped OR not supported zip format
                bInputStream.reset();
                stream = bInputStream;
            }
            // USE STREAM HERE
        } finally {
            if (stream != null) {
                stream.close();
            }
        }
    }
0
Nov 03 '15 at 11:19

SimpleMagic is a Java library for resolving content types:

    <!-- pom.xml -->
    <dependency>
        <groupId>com.j256.simplemagic</groupId>
        <artifactId>simplemagic</artifactId>
        <version>1.8</version>
    </dependency>



    import com.j256.simplemagic.ContentInfo;
    import com.j256.simplemagic.ContentInfoUtil;
    import com.j256.simplemagic.ContentType;
    // ...

    public class SimpleMagicSmokeTest {

        private final static Logger log = LoggerFactory.getLogger(SimpleMagicSmokeTest.class);

        @Test
        public void smokeTestSimpleMagic() throws IOException {
            ContentInfoUtil util = new ContentInfoUtil();
            InputStream possibleGzipInputStream = getGzipInputStream();
            ContentInfo info = util.findMatch(possibleGzipInputStream);

            log.info(info.toString());
            assertEquals(ContentType.GZIP, info.getContentType());
        }
    }
0
Sep 28 '16 at 16:59


