Zipping a huge folder with ZipFileSystem results in an OutOfMemoryError

The java.nio package has a great way to process zip files: it can treat them as file systems, which allows us to work with the contents of a zip file like regular files. Thus, zipping an entire folder can be achieved simply by using Files.copy to copy all the files into the zip file. Since subfolders must be copied too, we need a visitor:

    private static class CopyFileVisitor extends SimpleFileVisitor<Path> {
        private final Path targetPath;
        private Path sourcePath = null;

        public CopyFileVisitor(Path targetPath) {
            this.targetPath = targetPath;
        }

        @Override
        public FileVisitResult preVisitDirectory(final Path dir, final BasicFileAttributes attrs) throws IOException {
            if (sourcePath == null) {
                sourcePath = dir;
            } else {
                Files.createDirectories(targetPath.resolve(sourcePath.relativize(dir).toString()));
            }
            return FileVisitResult.CONTINUE;
        }

        @Override
        public FileVisitResult visitFile(final Path file, final BasicFileAttributes attrs) throws IOException {
            Files.copy(file, targetPath.resolve(sourcePath.relativize(file).toString()),
                    StandardCopyOption.REPLACE_EXISTING);
            return FileVisitResult.CONTINUE;
        }
    }

This is a simple visitor for copying a directory recursively. However, with the help of ZipFileSystem, we can also use it to copy a directory into a zip file, like this:

    public static void zipFolder(Path zipFile, Path sourceDir) throws ZipException, IOException {
        // Initialize the zip file system and get its root
        Map<String, String> env = new HashMap<>();
        env.put("create", "true");
        URI uri = URI.create("jar:" + zipFile.toUri());
        FileSystem fileSystem = FileSystems.newFileSystem(uri, env);
        Iterable<Path> roots = fileSystem.getRootDirectories();
        Path root = roots.iterator().next();

        // Simply copy the directory into the root of the zip file system
        Files.walkFileTree(sourceDir, new CopyFileVisitor(root));
    }
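For completeness, here is a minimal, self-contained sketch of how I wire the two pieces together. The temp-directory test data is invented for illustration; this version also closes the file system with try-with-resources, since the zip entries are only flushed to disk when the file system is closed:

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.HashMap;
import java.util.Map;

public class ZipFolderDemo {

    static class CopyFileVisitor extends SimpleFileVisitor<Path> {
        private final Path targetPath;
        private Path sourcePath = null;

        CopyFileVisitor(Path targetPath) {
            this.targetPath = targetPath;
        }

        @Override
        public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) throws IOException {
            if (sourcePath == null) {
                sourcePath = dir; // remember the root so later paths can be relativized
            } else {
                Files.createDirectories(targetPath.resolve(sourcePath.relativize(dir).toString()));
            }
            return FileVisitResult.CONTINUE;
        }

        @Override
        public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
            Files.copy(file, targetPath.resolve(sourcePath.relativize(file).toString()),
                    StandardCopyOption.REPLACE_EXISTING);
            return FileVisitResult.CONTINUE;
        }
    }

    static void zipFolder(Path zipFile, Path sourceDir) throws IOException {
        Map<String, String> env = new HashMap<>();
        env.put("create", "true");
        URI uri = URI.create("jar:" + zipFile.toUri());
        // Closing the zip file system writes the archive out
        try (FileSystem fs = FileSystems.newFileSystem(uri, env)) {
            Path root = fs.getRootDirectories().iterator().next();
            Files.walkFileTree(sourceDir, new CopyFileVisitor(root));
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny throwaway tree, zip it, and check the archive exists
        Path src = Files.createTempDirectory("ziptest-src");
        Files.createDirectories(src.resolve("sub"));
        Files.write(src.resolve("a.txt"), "hello".getBytes());
        Files.write(src.resolve("sub").resolve("b.txt"), "world".getBytes());

        Path zip = Files.createTempDirectory("ziptest-out").resolve("out.zip");
        zipFolder(zip, src);
        System.out.println("created=" + Files.exists(zip));
    }
}
```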

This is what I would call the elegant way of zipping an entire folder. However, when using this method on a huge folder (about 3 GB), I get an OutOfMemoryError (heap space). When using a normal zip processing library, this error does not occur. It therefore seems that the way ZipFileSystem handles the copy is very inefficient: too much of the data to be written is held in memory, so the OutOfMemoryError appears.

Why is this so? Is ZipFileSystem generally inefficient in terms of memory consumption, or am I doing something wrong here?

+9
java zip nio
May 25 '14 at 18:42
2 answers

I looked at ZipFileSystem.java and I believe I found the source of the memory consumption. By default, the implementation uses a ByteArrayOutputStream as the buffer for compressing files, which means it is limited by the amount of memory allocated to the JVM.

There is an (undocumented) property, "useTempFile", that we can put in the env map to make the implementation use temporary files instead. It works as follows:

    Map<String, Object> env = new HashMap<>();
    env.put("create", "true");
    env.put("useTempFile", Boolean.TRUE);

More details here: http://www.docjar.com/html/api/com/sun/nio/zipfs/ZipFileSystem.java.html ; the interesting lines are 96, 1358 and 1362.

+17
May 26 '14 at 2:32

You need to configure the JVM to allow that much memory, with -Xms{memory} -Xmx{memory}.

I recommend computing the directory's size on disk first and setting a limit: below 1 GB, use the in-memory zip file system; above 1 GB, buffer the zip file system on disk.

Another thing: check your concurrency; you would not want more than one thread zipping 3 GB of files at the same time.
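A sketch of the size check suggested above (the 1 GB threshold is this answer's suggestion, and the class and method names are made up for illustration):

```java
import java.io.IOException;
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributes;

public class DirSize {

    // Sum the sizes of all regular files under dir.
    public static long directorySize(Path dir) throws IOException {
        final long[] total = {0};
        Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                total[0] += attrs.size();
                return FileVisitResult.CONTINUE;
            }
        });
        return total[0];
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sizetest");
        Files.write(dir.resolve("a.bin"), new byte[1024]);

        long size = directorySize(dir);
        long threshold = 1L << 30; // 1 GB cutoff, as suggested above
        System.out.println(size > threshold
                ? "buffer the zip on disk"
                : "in-memory zip file system is fine");
    }
}
```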

-2
May 25 '14 at 18:59
