How to decompress a .lzo_deflate file?

I wanted to use LZO to compress my reduce output. I tried Kevin Weil's Hadoop-LZO project and then used the LzoCodec class in my job:

 TextOutputFormat.setOutputCompressorClass(job, LzoCodec.class); 

Now compression works just fine.

My problem is that the result of the compression is a .lzo_deflate file, which I cannot decompress.
The lzop utility does not seem to support this file type.
LzopCodec should produce a .lzo file, but it doesn't work for me. It is in the same package as LzoCodec (org.apache.hadoop.io.compress), which may point to a compatibility issue, since I used the old API (0.19) to do the compression.

The answers to this question offer Python solutions, but I need one in Java.
I am using Hadoop 1.1.2 and Java 6.

+4
2 answers

A .lzo_deflate file is an LZO stream without the usual header and trailer. So you would need to wrap the raw .lzo_deflate stream with the header and trailer that lzop expects, or at least with the header, and then ignore the errors caused by the missing trailer. You will need to look at the lzop documentation for the header and trailer formats.

The "deflate" in the name is an odd choice, but it refers to an analogy with gzip, where the raw compressed data format without the gzip header and trailer is called deflate.
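
To make the gzip analogy concrete, here is a minimal sketch that uses only java.util.zip. It shows gzip and raw deflate, not LZO, and the class and method names (RawVsWrapped, rawDeflate, gzip, rawInflate, hasGzipMagic) are made up for illustration. The wrapped output starts with the gzip magic bytes, the raw stream does not, and the raw stream can only be read by a decoder that is told not to expect a header. It avoids Java 7 features so that it should also compile on Java 6.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.GZIPOutputStream;
import java.util.zip.Inflater;

// Hypothetical demo class: gzip vs. raw deflate, as an analogy for .lzo vs. .lzo_deflate.
class RawVsWrapped {

    // Raw deflate: nowrap=true leaves out the header and trailer.
    static byte[] rawDeflate(byte[] input) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Wrapped form: the same kind of compressed data plus gzip header and trailer.
    static byte[] gzip(byte[] input) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        GZIPOutputStream gz = new GZIPOutputStream(out);
        try {
            gz.write(input);
        } finally {
            gz.close(); // close() finishes the stream and writes the trailer
        }
        return out.toByteArray();
    }

    // Decode a raw stream; nowrap=true tells the Inflater not to expect a header.
    static byte[] rawInflate(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater(true);
        // The Inflater Javadoc asks for one extra dummy byte of input in nowrap mode.
        inflater.setInput(Arrays.copyOf(compressed, compressed.length + 1));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!inflater.finished()) {
            int n = inflater.inflate(buf);
            if (n == 0 && inflater.needsInput()) {
                break; // truncated input: stop instead of looping forever
            }
            out.write(buf, 0, n);
        }
        inflater.end();
        return out.toByteArray();
    }

    // gzip files start with the two magic bytes 0x1f 0x8b.
    static boolean hasGzipMagic(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xff) == 0x1f
                && (data[1] & 0xff) == 0x8b;
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "hello hello hello".getBytes("UTF-8");
        System.out.println("gzip has magic: " + hasGzipMagic(gzip(data)));
        System.out.println("raw has magic:  " + hasGzipMagic(rawDeflate(data)));
        byte[] back = rawInflate(rawDeflate(data));
        System.out.println("round trip ok:  " + Arrays.equals(data, back));
    }
}
```

The situation with .lzo_deflate is similar: a tool such as lzop that looks for its own header will reject the raw stream, while a decoder that expects the raw format can read it.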

+6

I ran into the same problem. The reason is that I did not use the correct codec. Check your codec in the job configuration.

 job.getConfiguration().set("mapred.output.compression.codec","com.hadoop.compression.lzo.LzopCodec"); 
+2

Source: https://habr.com/ru/post/1482018/

