Removing files from a ZIP archive without decompression in Java or possibly Python

Hello,

I work with large ZIP files containing hundreds of compressed text files. Unzipping one of them can take some time and easily consume up to 20 GB of disk space. I would like to delete specific files from these ZIP files without having to unpack everything and recompress only the files I want to keep.

Of course, this can be done the long way, but that is very inefficient.

I would rather do it in Java, but I will consider Python.

+4
source share
5 answers

I have no code for this, but the basic idea is simple and should translate to almost any language the same way. A ZIP file is laid out as a series of blocks, one per file (a local header followed by the compressed data), followed by a central directory that holds the metadata for all entries. Here's the process:

  1. Scan forward in the file until you find the first file you want to delete.
  2. Scan forward in the file until you find the first file that you do not want to delete, or you hit the central directory.
  3. Scan forward in the file until you find the next file you want to delete, or you hit the central directory.
  4. Copy all the data you scanned over in step 3 back onto the data you skipped in step 2.
  5. Go to step 2 unless you have hit the central directory.
  6. Copy the central directory to where you stopped copying, leaving out the entries for the deleted files and changing the offsets to reflect how far you moved each file.

For more information on ZIP file structures, see http://en.wikipedia.org/wiki/ZIP_%28file_format%29 .

As bestsss suggests, you can copy to another file to prevent data loss in the event of a failure.
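The steps above can be sketched in Java using only the standard library. This is a minimal sketch, not a tested tool: the class name `ZipRawDelete` and the method `deleteEntries` are hypothetical, and it assumes a single-disk archive without ZIP64 extensions, entries stored back to back with no gaps, and UTF-8 entry names. It writes to a second file, as bestsss suggests, and copies each kept entry's bytes verbatim, so nothing is decompressed or recompressed.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: copies a ZIP to a new file, skipping the named entries. */
final class ZipRawDelete {
    /** One central directory record plus the offset of its local entry. */
    private static final class Rec {
        final String name; final long offset; final byte[] bytes;
        Rec(String name, long offset, byte[] bytes) {
            this.name = name; this.offset = offset; this.bytes = bytes;
        }
    }

    public static void deleteEntries(Path src, Path dst, Set<String> doomed) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                     StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE)) {
            // Find the end-of-central-directory record by scanning back from the end.
            int tailLen = (int) Math.min(in.size(), 22 + 65_535);
            ByteBuffer tail = read(in, in.size() - tailLen, tailLen);
            int eocd = tailLen - 22;
            while (eocd >= 0 && tail.getInt(eocd) != 0x06054b50) eocd--;
            if (eocd < 0) throw new IOException("end of central directory not found");
            int count = tail.getShort(eocd + 10) & 0xFFFF;
            long cdSize = tail.getInt(eocd + 12) & 0xFFFFFFFFL;
            long cdStart = tail.getInt(eocd + 16) & 0xFFFFFFFFL;

            // Split the central directory into one record per entry.
            ByteBuffer cd = read(in, cdStart, (int) cdSize);
            List<Rec> recs = new ArrayList<>();
            for (int i = 0, p = 0; i < count; i++) {
                if (cd.getInt(p) != 0x02014b50) throw new IOException("bad central directory");
                int nameLen = cd.getShort(p + 28) & 0xFFFF;
                int len = 46 + nameLen + (cd.getShort(p + 30) & 0xFFFF)
                        + (cd.getShort(p + 32) & 0xFFFF);
                String name = new String(cd.array(), p + 46, nameLen, StandardCharsets.UTF_8);
                long offset = cd.getInt(p + 42) & 0xFFFFFFFFL;
                recs.add(new Rec(name, offset, Arrays.copyOfRange(cd.array(), p, p + len)));
                p += len;
            }

            // Assumption: an entry spans from its local header to the next entry's header.
            recs.sort(Comparator.comparingLong(r -> r.offset));
            List<Rec> kept = new ArrayList<>();
            for (int i = 0; i < recs.size(); i++) {
                Rec r = recs.get(i);
                if (doomed.contains(r.name)) continue;
                long end = i + 1 < recs.size() ? recs.get(i + 1).offset : cdStart;
                // Store the entry's new offset in its directory record, then copy raw bytes.
                ByteBuffer.wrap(r.bytes).order(ByteOrder.LITTLE_ENDIAN)
                        .putInt(42, (int) out.position());
                for (long pos = r.offset; pos < end; ) pos += in.transferTo(pos, end - pos, out);
                kept.add(r);
            }

            // Write the trimmed central directory and a patched end record.
            long newCdStart = out.position();
            for (Rec r : kept) write(out, ByteBuffer.wrap(r.bytes));
            long newCdSize = out.position() - newCdStart;
            ByteBuffer end = ByteBuffer.wrap(Arrays.copyOfRange(tail.array(), eocd, tailLen))
                    .order(ByteOrder.LITTLE_ENDIAN);
            end.putShort(8, (short) kept.size()).putShort(10, (short) kept.size());
            end.putInt(12, (int) newCdSize).putInt(16, (int) newCdStart);
            write(out, end);
        }
    }

    /** Reads exactly len bytes starting at pos into a little-endian buffer. */
    private static ByteBuffer read(FileChannel ch, long pos, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len).order(ByteOrder.LITTLE_ENDIAN);
        while (buf.hasRemaining()) {
            if (ch.read(buf, pos + buf.position()) < 0) throw new EOFException();
        }
        buf.flip();
        return buf;
    }

    /** Writes the whole buffer at the channel's current position. */
    private static void write(FileChannel ch, ByteBuffer buf) throws IOException {
        while (buf.hasRemaining()) ch.write(buf);
    }
}
```

A call might look like `ZipRawDelete.deleteEntries(Path.of("big.zip"), Path.of("big-trimmed.zip"), Set.of("unwanted.txt"))`, where the file names are placeholders. Once the new archive has been checked, it can replace the original.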

+2
source

I found this on the Internet: a clean solution that uses only the standard library, though I'm not sure whether it is available in the Android SDK.

    import java.net.URI;
    import java.nio.file.FileSystem;
    import java.nio.file.FileSystems;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.HashMap;
    import java.util.Map;

    public class ZPFSDelete {
        public static void main(String[] args) throws Exception {
            /* Define ZIP File System Properties in HashMap */
            Map<String, String> zip_properties = new HashMap<>();
            /* We want to read an existing ZIP File, so we set this to False */
            zip_properties.put("create", "false");

            /* Specify the path to the ZIP File that you want to read as a File System */
            URI zip_disk = URI.create("jar:file:/my_zip_file.zip");

            /* Create ZIP file System */
            try (FileSystem zipfs = FileSystems.newFileSystem(zip_disk, zip_properties)) {
                /* Get the Path inside ZIP File to delete the ZIP Entry */
                Path pathInZipfile = zipfs.getPath("source.sql");
                System.out.println("About to delete an entry from ZIP File " + pathInZipfile.toUri());
                /* Execute Delete */
                Files.delete(pathInZipfile);
                System.out.println("File successfully deleted");
            }
        }
    }
+2
source

Well, I think I found a potential solution from www.javaer.org. It definitely deletes files inside the ZIP. It does not unpack anything to disk, although it does decompress and recompress each kept entry in memory while copying. Here is the code:

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipInputStream;
    import java.util.zip.ZipOutputStream;

    public static void deleteZipEntry(File zipFile, String[] files) throws IOException {
        // get a temp file
        File tempFile = File.createTempFile(zipFile.getName(), null);
        // delete it, otherwise you cannot rename your existing zip to it.
        tempFile.delete();
        tempFile.deleteOnExit();
        boolean renameOk = zipFile.renameTo(tempFile);
        if (!renameOk) {
            throw new RuntimeException("could not rename the file " + zipFile.getAbsolutePath()
                    + " to " + tempFile.getAbsolutePath());
        }
        byte[] buf = new byte[1024];

        // try-with-resources closes both streams (and completes the ZIP) even if copying fails
        try (ZipInputStream zin = new ZipInputStream(new FileInputStream(tempFile));
             ZipOutputStream zout = new ZipOutputStream(new FileOutputStream(zipFile))) {
            ZipEntry entry = zin.getNextEntry();
            while (entry != null) {
                String name = entry.getName();
                boolean toBeDeleted = false;
                for (String f : files) {
                    if (f.equals(name)) {
                        toBeDeleted = true;
                        break;
                    }
                }
                if (!toBeDeleted) {
                    // Add ZIP entry to output stream.
                    zout.putNextEntry(new ZipEntry(name));
                    // Transfer bytes from the ZIP file to the output file
                    int len;
                    while ((len = zin.read(buf)) > 0) {
                        zout.write(buf, 0, len);
                    }
                }
                entry = zin.getNextEntry();
            }
        }
        tempFile.delete();
    }

0
source

Yes, for Java you can use a library called TrueZIP.

TrueZIP is a Java-based virtual file system (VFS) that allows client applications to perform CRUD operations (create, read, update, delete) on archive files as if they were virtual directories, even with nested archive files and in multi-threaded environments.

See the link below for more information: https://truezip.java.net/

0
source

This may be old, but here is one way. I use it all the time, and it works great.

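    // Note: java.io.File treats zip_dir as an ordinary directory on disk,
    // so this only applies to an archive that has already been unpacked.
    public boolean deleteFile(String zip_dir, String subfile) {
        File target = new File(zip_dir, subfile);
        delete(target);
        // report whether the target is gone
        return !target.exists();
    }

    private void delete(File file) {
        if (file == null || !file.exists())
            return;
        if (file.isFile()) {
            file.delete();
            return;
        }
        // listFiles() returns null if the directory cannot be read
        File[] children = file.listFiles();
        if (children != null) {
            for (File child : children) {
                if (child.isFile())
                    child.delete();
                else
                    delete(child);
            }
        }
        file.delete();
    }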
-2
source

Source: https://habr.com/ru/post/1342978/

