DotNetZip: create a zip from a subset of the entries in another zip file

I have a large zip file that I need to split into several zip files. In the method I'm writing now, I have a List<ZipEntry> object.

This is the code I have:

    // All files have the same basefilename
    string basefilename = Path.GetFileNameWithoutExtension(entries[0].FileName);
    MemoryStream memstream = new MemoryStream();
    ZipFile zip = new ZipFile();
    foreach (var entry in entries)
    {
        string newFileName = basefilename + Path.GetExtension(entry.FileName);
        zip.AddEntry(newFileName, entry.OpenReader());
    }
    zip.Save(memstream);

    // this will later go in a file-io handler class.
    FileStream outstream = File.OpenWrite(@"c:\files\" + basefilename + ".zip");
    memstream.WriteTo(outstream);
    outstream.Flush();
    outstream.Close();

And this is the error I get when Save() is called:

    {Ionic.Zlib.ZlibException: Bad state (invalid block type)
       at Ionic.Zlib.InflateManager.Inflate(FlushType flush)
       at Ionic.Zlib.ZlibCodec.Inflate(FlushType flush)
       at Ionic.Zlib.ZlibBaseStream.Read(Byte[] buffer, Int32 offset, Int32 count)
       at Ionic.Zlib.DeflateStream.Read(Byte[] buffer, Int32 offset, Int32 count)
       at Ionic.Crc.CrcCalculatorStream.Read(Byte[] buffer, Int32 offset, Int32 count)
       at Ionic.Zip.SharedUtilities.ReadWithRetry(Stream s, Byte[] buffer, Int32 offset, Int32 count, String FileName)
       at Ionic.Zip.ZipEntry._WriteEntryData(Stream s)
       at Ionic.Zip.ZipEntry.Write(Stream s)
       at Ionic.Zip.ZipFile.Save()
       at Ionic.Zip.ZipFile.Save(Stream outputStream)
       at

What am I doing wrong?

+3
3 answers

Here's what you are doing wrong: you have several pending calls to ZipEntry.OpenReader() on a single ZipFile instance. You can have at most one pending ZipEntry.OpenReader() at a time.

Here's why: only one Stream object is created when you instantiate a zip file with ZipFile.Read(), or with new ZipFile() and the name of an existing file. When you call ZipEntry.OpenReader(), it calls Seek() on that Stream object to move the file pointer to the beginning of the compressed data for that particular entry. When you call ZipEntry.OpenReader() again, it causes another Seek() to a different place in the stream. So by adding entries and calling OpenReader() in succession, you call Seek() repeatedly, but only the last one takes effect. The stream cursor ends up at the beginning of the data for the entry corresponding to the last call to ZipEntry.OpenReader().
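The effect can be reproduced without DotNetZip. The following is a minimal sketch, not the library's actual implementation: `EntryReader` is a hypothetical stand-in for the reader returned by OpenReader(), and the eight-byte buffer stands in for the zip file. Each reader seeks the shared stream when it is created, so opening a second reader before reading the first leaves the first one reading from the wrong position.

```csharp
using System;
using System.IO;
using System.Text;

static class SharedStreamDemo
{
    // Hypothetical stand-in for an entry reader: it seeks the shared stream
    // to the entry's offset when opened, then reads from wherever the shared
    // cursor happens to be at read time.
    private sealed class EntryReader
    {
        private readonly Stream _shared;
        private readonly int _length;

        public EntryReader(Stream shared, long offset, int length)
        {
            _shared = shared;
            _length = length;
            _shared.Seek(offset, SeekOrigin.Begin); // moves the cursor for every reader
        }

        public string ReadAll()
        {
            var buffer = new byte[_length];
            int read = _shared.Read(buffer, 0, _length);
            return Encoding.ASCII.GetString(buffer, 0, read);
        }
    }

    public static string[] Run()
    {
        // Two "entries" stored back to back: "AAAA" at offset 0, "BBBB" at offset 4.
        var shared = new MemoryStream(Encoding.ASCII.GetBytes("AAAABBBB"));

        var first = new EntryReader(shared, 0, 4);
        var second = new EntryReader(shared, 4, 4); // cursor is now at offset 4

        string a = first.ReadAll();  // reads "BBBB", the second entry's bytes
        string b = second.ReadAll(); // cursor is at the end, so this reads ""
        return new[] { a, b };
    }

    static void Main()
    {
        string[] result = Run();
        Console.WriteLine("first read: '{0}', second read: '{1}'", result[0], result[1]);
    }
}
```

With real compressed data, reading from the wrong offset hands the inflater bytes that do not begin a valid deflate block, which is consistent with the "invalid block type" error in the question.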

To fix this: scrap your approach. The easiest way to create a new zip file with fewer entries than an existing zip file is this: create a ZipFile instance by reading the existing file, remove the entries you do not need, then call ZipFile.Save() with the new path.

    using (var zip = ZipFile.Read("c:\\dir\\path\\to\\existing\\zipfile.zip"))
    {
        foreach (var name in namesToRemove) // IEnumerable<String>
        {
            zip.RemoveEntry(name);
        }
        zip.Save("c:\\path\\to\\new\\Archive.zip");
    }

EDIT
What happens when Save() is called: the library reads the raw compressed data for the entries you did NOT remove from the file on disk and writes them to the new archive file. This is very fast, because it does not decompress and recompress each entry in order to place it in the new, smaller zip file. It basically reads chunks of binary data from the source zip file and concatenates them to form the new, smaller zip file.

To create multiple smaller files, you can do this again with the original zip file; just wrap it in a loop and change the files you delete and the file name of the new, smaller archive. Reading an existing zip file is also pretty fast.
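As a rough sketch of the bookkeeping for that loop, the helper below works out, for each smaller archive, which entry names would have to be removed. It is plain .NET with no DotNetZip calls. It assumes, as in the question, that the entries belonging together share a base file name and that the archive is flat (no directory entries); `SplitPlanner`, `NamesToRemovePerArchive`, and the sample names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class SplitPlanner
{
    // Maps each base file name to the entry names that do NOT belong to it,
    // i.e. the names to remove before saving that smaller archive.
    // Assumption: entries are grouped by file name without extension.
    public static Dictionary<string, List<string>> NamesToRemovePerArchive(
        IEnumerable<string> entryNames)
    {
        List<string> all = entryNames.ToList();
        return all
            .GroupBy(n => Path.GetFileNameWithoutExtension(n))
            .ToDictionary(
                g => g.Key,
                g => all.Where(n => Path.GetFileNameWithoutExtension(n) != g.Key)
                        .ToList());
    }

    static void Main()
    {
        // Hypothetical entry names standing in for the original archive's contents.
        var names = new[] { "report.txt", "report.xml", "invoice.txt", "invoice.pdf" };

        foreach (var plan in NamesToRemovePerArchive(names))
        {
            Console.WriteLine("{0}.zip: remove {1}",
                plan.Key, string.Join(", ", plan.Value));
        }
    }
}
```

For each key in the result, you would read the original archive again, remove the listed names as in the snippet above, and save under a file name derived from the key.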


Alternatively, you can decompress and extract each entry, then recompress it and write it to a new zip file. This is the long way around, but it is possible. In this case, for each smaller zip file you want to create, you need two ZipFile instances. Open the first by reading the original zip archive. For each entry you want to keep, create a MemoryStream, extract the contents of the entry into it, and do not forget to call Seek() on the MemoryStream to reset its cursor to the beginning. Then, on the second ZipFile instance, call AddEntry() with that MemoryStream as the source for the added entry. Call ZipFile.Save() only on the second instance.

    using (var orig = ZipFile.Read("C:\\whatever\\OriginalArchive.zip"))
    {
        using (var smaller = new ZipFile())
        {
            foreach (var name in entriesToKeep)
            {
                var ms = new MemoryStream();
                orig[name].Extract(ms); // extract into stream
                ms.Seek(0, SeekOrigin.Begin);
                smaller.AddEntry(name, ms);
            }
            smaller.Save("C:\\location\\of\\SmallerZip.zip");
        }
    }

This works, but it decompresses and recompresses every entry that goes into the smaller zip, which is inefficient and unnecessary.


If you don't mind the inefficiency of decompression and recompression, there is another alternative: call the ZipFile.AddEntry() overload that accepts opener and closer delegates. This defers the call to OpenReader() until the entry is actually written to the new, smaller zip file. The effect is that you have only one pending OpenReader() at a time.

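    using (ZipFile original = ZipFile.Read("C:\\path.to\\original\\Archive.zip"),
                   smaller = new ZipFile())
    {
        foreach (var name in entriesToKeep)
        {
            // The opener is invoked only when this entry is written during Save(),
            // so just one reader is open at a time. No closer delegate is supplied.
            smaller.AddEntry(name, (entryName) => original[entryName].OpenReader(), null);
        }
        smaller.Save("C:\\path.to\\smaller\\Archive.zip");
    }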

This is still inefficient, because each entry is decompressed and recompressed, but it is slightly less so.

+7

Chaseo pointed out to me that I cannot have several readers open at once, although his solution of removing entries was not what I needed. So I tried to solve the problem with that new knowledge, and this is what I came up with.

    string basefilename = Path.GetFileNameWithoutExtension(entries[0].FileName);
    ZipFile zip = new ZipFile();
    foreach (var entry in entries)
    {
        CrcCalculatorStream reader = entry.OpenReader();
        MemoryStream memstream = new MemoryStream();
        reader.CopyTo(memstream);
        byte[] bytes = memstream.ToArray();

        string newFileName = basefilename + Path.GetExtension(entry.FileName);
        zip.AddEntry(newFileName, bytes);
    }
    zip.Save(@"c:\files\" + basefilename + ".zip");
+1

EDIT 2: I think you need a double backslash when specifying a path. I updated my code to reflect this. A double backslash is the escape sequence for a single backslash in a regular string literal.

EDIT: Does the variable "newFileName" hold the path where the file is currently located? If it holds something else, that may be your problem. Without seeing more of the surrounding code, I can't be sure.

I use the same library to create .zip files all the time in my code, but I have never done it exactly the way you are trying to. I don't know why your code throws an exception, but maybe this will work? (Assuming your strings/paths are all correct, and the zip library really is what caused the problem.)

    using (ZipFile zip = new ZipFile())
    {
        zip.CompressionLevel = CompressionLevel.BestCompression;
        foreach (var entry in entries)
        {
            try
            {
                string newFileName = basefilename + Path.GetExtension(entry.FileName);
                zip.AddFile(newFileName, "");
            }
            catch (Exception) { }
        }
        zip.Save("c:\\files\\" + basefilename + ".zip");
    }
0

Source: https://habr.com/ru/post/917005/

