Raw Stream has data, Deflate returns Zero Bytes

I am reading data (an adCenter report, as it happens) that is supposed to be zipped. Reading the contents with an ordinary stream, I get a couple of thousand bytes of gibberish, so that seems reasonable. So I feed the stream to DeflateStream.

First, it reports that the block length does not match its complement. A quick search reveals that there is a two-byte prefix, and indeed, if I call ReadByte() twice before opening the DeflateStream, the exception goes away.

However, DeflateStream now returns nothing at all. I have spent most of the day chasing this, with no luck. Help me, Stack Overflow, you're my only hope! Can anyone tell me what I am missing?

Here is the code. Naturally, I only enabled one of the two commented-out blocks at a time when testing.

    _results = new List<string[]>();
    using (Stream compressed = response.GetResponseStream())
    {
        // Skip the zlib prefix, which conflicts with the deflate specification
        compressed.ReadByte();
        compressed.ReadByte();

        // Reports reading 3,000-odd bytes, followed by random characters
        /*byte[] buffer = new byte[4096];
        int bytesRead = compressed.Read(buffer, 0, 4096);
        Console.WriteLine("Read {0} bytes.", bytesRead.ToString("#,##0"));
        string content = Encoding.ASCII.GetString(buffer, 0, bytesRead);
        Console.WriteLine(content);*/

        using (DeflateStream decompressed = new DeflateStream(compressed, CompressionMode.Decompress))
        {
            // Reports reading 0 bytes, and no output
            /*byte[] buffer = new byte[4096];
            int bytesRead = decompressed.Read(buffer, 0, 4096);
            Console.WriteLine("Read {0} bytes.", bytesRead.ToString("#,##0"));
            string content = Encoding.ASCII.GetString(buffer, 0, bytesRead);
            Console.WriteLine(content);*/

            using (StreamReader reader = new StreamReader(decompressed))
                while (reader.EndOfStream == false)
                    _results.Add(reader.ReadLine().Split('\t'));
        }
    }

As you can guess from the last line, the unpacked content should be tab-delimited text.

Just for fun, I tried decompressing with GZipStream, but it reports that the magic number is incorrect. The MS docs simply say: "The downloaded report is compressed using zip compression. You must unzip the report before you can use its contents."
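The three formats in play here (ZIP archive, gzip, zlib) can be told apart by their first few bytes, which is a quick way to see why DeflateStream and GZipStream both reject the data. Below is a minimal sketch, not part of the original post; the class and method names are hypothetical. It assumes a non-empty ZIP archive, which starts with the local file header signature `PK\x03\x04`. The question never shows the report's leading bytes, so whether they are actually `PK` is an assumption, though the fact that a ZIP library later reads the file suggests it.

```csharp
using System;

// Hypothetical helper: names are illustrative, not from any library.
public static class CompressionSniffer
{
    // Returns "zip", "gzip", "zlib" or "unknown" based on the leading bytes.
    public static string Identify(byte[] head)
    {
        if (head == null || head.Length < 2)
            return "unknown";

        // ZIP local file header signature: 'P' 'K' 0x03 0x04
        if (head.Length >= 4 && head[0] == 0x50 && head[1] == 0x4B
            && head[2] == 0x03 && head[3] == 0x04)
            return "zip";

        // gzip magic number: 0x1F 0x8B
        if (head[0] == 0x1F && head[1] == 0x8B)
            return "gzip";

        // zlib header: the low nibble of the first byte is 8 (deflate), and
        // the two bytes read as a big-endian number are a multiple of 31.
        if ((head[0] & 0x0F) == 8 && ((head[0] << 8) | head[1]) % 31 == 0)
            return "zlib";

        return "unknown";
    }
}
```

If the result is "zip", skipping two bytes only discards the `PK` signature, and what follows is still a ZIP header rather than deflate data.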


Here is the code that finally worked. I had to save the content to a file and read it back in. That does not seem reasonable, but for the small amounts of data I am working with it is acceptable, so I will take it!

    WebRequest request = HttpWebRequest.Create(reportURL);
    WebResponse response = request.GetResponse();

    _results = new List<string[]>();
    using (Stream compressed = response.GetResponseStream())
    {
        // Save the content to a temporary location
        string zipFilePath = @"\\Server\Folder\adCenter\Temp.zip";
        using (StreamWriter file = new StreamWriter(zipFilePath))
        {
            compressed.CopyTo(file.BaseStream);
            file.Flush();
        }

        // Get the first file from the temporary zip
        ZipFile zipFile = ZipFile.Read(zipFilePath);
        if (zipFile.Entries.Count > 1)
            throw new ApplicationException("Found " + zipFile.Entries.Count.ToString("#,##0") + " entries in the report; expected 1.");
        ZipEntry report = zipFile[0];

        // Extract the data
        using (MemoryStream decompressed = new MemoryStream())
        {
            report.Extract(decompressed);
            decompressed.Position = 0; // Note that the stream does NOT start at the beginning
            using (StreamReader reader = new StreamReader(decompressed))
                while (reader.EndOfStream == false)
                    _results.Add(reader.ReadLine().Split('\t'));
        }
    }
2 answers

You will find that DeflateStream is extremely limited in what data it can decompress. In fact, if you are expecting whole files, it will be of no use at all. There are many (mostly minor) variations of ZIP files, and DeflateStream will only get along with two or three of them.

The best way is probably to use a dedicated library for reading ZIP files and streams, such as DotNetZip or SharpZipLib (somewhat unsupported).
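As an illustration of the library approach without the temporary file, here is a rough sketch that swaps in the framework's own `System.IO.Compression.ZipArchive` (available from .NET 4.5) in place of the libraries named above. The class and method names are hypothetical, and it assumes, as in the question, that the archive holds exactly one tab-delimited entry.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical helper: names are illustrative only.
public static class ReportReader
{
    // Reads the single entry of a ZIP archive as tab-delimited rows.
    public static List<string[]> ReadSingleEntry(Stream zipStream)
    {
        var rows = new List<string[]>();

        // Buffer first: a network stream cannot seek, and the ZIP
        // central directory sits at the end of the archive.
        using (var buffer = new MemoryStream())
        {
            zipStream.CopyTo(buffer);
            buffer.Position = 0;

            using (var archive = new ZipArchive(buffer, ZipArchiveMode.Read))
            {
                if (archive.Entries.Count != 1)
                    throw new InvalidDataException(
                        "Found " + archive.Entries.Count + " entries in the report; expected 1.");

                using (var reader = new StreamReader(archive.Entries[0].Open()))
                {
                    string line;
                    while ((line = reader.ReadLine()) != null)
                        rows.Add(line.Split('\t'));
                }
            }
        }
        return rows;
    }
}
```

In the question's code this would be called as `ReportReader.ReadSingleEntry(response.GetResponseStream())`, which is fine for reports small enough to hold in memory.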


You can write the stream to a file and try Precomp on it. If you use it as follows:

 precomp -c- -v [name of input file] 

any ZIP / gzip streams inside the file will be detected and some detailed information will be reported (position and length of each stream). In addition, if they can be decompressed and recompressed bit-for-bit identically, the output file will contain the decompressed streams.

Precomp detects ZIP / gzip streams (and some others) anywhere in the file, so you don't have to worry about header or garbage bytes at the beginning of the file.

If it does not detect such a stream, try adding -slow, which detects deflate streams even if they have no ZIP / gzip header. If that fails too, you can try -brute, which even detects deflate streams that lack the two-byte header, but it will be very slow and can produce false positives.

After that, you will know whether there is a (valid) deflate stream in the file, and if so, the additional information should help you decompress other reports using zlib decompression routines or similar.
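Once the position of a deflate stream is known, decompressing it from .NET could look roughly like the sketch below. It is only an illustration: the class and method names are hypothetical, and it assumes the offset points at the raw deflate data itself, that is, past any two-byte zlib header or ZIP local file header.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical helper: names are illustrative only.
public static class RawDeflate
{
    // Inflates a raw deflate stream that starts at 'offset' within 'data'.
    public static byte[] InflateAt(byte[] data, int offset)
    {
        using (var input = new MemoryStream(data, offset, data.Length - offset))
        using (var inflater = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            inflater.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

If the offset is wrong, DeflateStream typically either throws InvalidDataException or returns no data, which matches the two symptoms described in the question.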


Source: https://habr.com/ru/post/1340400/

