ZipEntry.STORED for files that are already compressed?

I use ZipOutputStream to zip a bunch of files, which are a mix of already compressed formats as well as many large, highly compressible formats such as plain text.

Most of the already compressed formats are large files, and there is no point in spending CPU and memory on recompressing them, since they never get smaller and in rare cases get slightly bigger.

I try to use .setMethod(ZipEntry.STORED) when I detect a pre-compressed file, but it complains that I need to provide the size, compressedSize and crc for these files.

I can get it to work with the following approach, but it requires me to read the file twice: once to calculate the CRC32, and once more to copy the file to the ZipOutputStream.

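// code that determines the value of method omitted for brevity
if (STORED == method)
{
    // STORED entries need their size and CRC set before putNextEntry()
    fze.setMethod(STORED);
    fze.setSize(fe.attributes.size());
    fze.setCompressedSize(fe.attributes.size());
    // first pass: read the whole file only to compute the CRC32
    final HashingInputStream his = new HashingInputStream(Hashing.crc32(), fis);
    ByteStreams.copy(his, ByteStreams.nullOutputStream());
    fze.setCrc(his.hash().padToLong());
}
else
{
    fze.setMethod(DEFLATED);
}
zos.putNextEntry(fze);
// second pass: copy the file contents into the zip, closing the stream afterwards
try (InputStream in = new FileInputStream(fe.path.toFile()))
{
    ByteStreams.copy(in, zos);
}
zos.closeEntry();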
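
For readers without Guava, the same two-pass idea can be sketched with JDK classes only. This is a minimal illustration, assuming Java 11 or later; the class name StoredTwoPass and the helper addStored are hypothetical and not part of the original code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class StoredTwoPass {
    // Hypothetical helper: adds one file as a STORED (uncompressed) entry.
    static void addStored(ZipOutputStream zos, Path file, String name) throws IOException {
        // Pass 1: read the file once, only to compute its CRC32.
        CRC32 crc = new CRC32();
        try (InputStream in = new CheckedInputStream(Files.newInputStream(file), crc)) {
            in.transferTo(OutputStream.nullOutputStream());
        }

        // STORED entries must carry size and CRC before putNextEntry().
        long size = Files.size(file);
        ZipEntry entry = new ZipEntry(name);
        entry.setMethod(ZipEntry.STORED);
        entry.setSize(size);
        entry.setCompressedSize(size);
        entry.setCrc(crc.getValue());
        zos.putNextEntry(entry);

        // Pass 2: copy the file contents into the archive.
        Files.copy(file, zos);
        zos.closeEntry();
    }
}
```

The second read is what the question is trying to avoid; the sketch only makes the cost explicit.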

Is there a way to provide this information without having to read the input stream twice?

1 answer

Short answer:

I could not find a way to read the files only once and still compute the CRC with the standard library, given the time I had to solve this problem.

I found an optimization that reduced the average time by about 50%.

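I pre-calculate the CRC of the files to be stored concurrently, using an ExecutorCompletionService limited to Runtime.getRuntime().availableProcessors() threads, and wait until they are all done. How much this helps depends on the number of files that need a CRC: the more files, the greater the benefit.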
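
A minimal sketch of this kind of concurrent CRC pre-calculation, using an ExecutorCompletionService on a pool sized by Runtime.getRuntime().availableProcessors(). The class name CrcPrecalc and the methods crc32Of and precalculate are hypothetical names chosen for illustration; the answer does not show its own implementation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.zip.CRC32;

public class CrcPrecalc {
    // Reads the file once and returns its CRC32 value.
    static long crc32Of(Path file) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buf = new byte[64 * 1024];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                crc.update(buf, 0, n);
            }
        }
        return crc.getValue();
    }

    // Computes the CRC32 of every file in parallel and waits for all results.
    static Map<Path, Long> precalculate(List<Path> files)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        try {
            ExecutorCompletionService<Map.Entry<Path, Long>> ecs =
                    new ExecutorCompletionService<>(pool);
            for (Path p : files) {
                ecs.submit(() -> Map.entry(p, Long.valueOf(crc32Of(p))));
            }
            Map<Path, Long> result = new HashMap<>();
            // take() returns tasks in completion order, not submission order.
            for (int i = 0; i < files.size(); i++) {
                Map.Entry<Path, Long> done = ecs.take().get();
                result.put(done.getKey(), done.getValue());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

The resulting map can then be consulted when building each STORED ZipEntry, so the zip-writing pass only has to read each file once.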

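Then, in .postVisitDirectories(), I wrap a ZipOutputStream around the PipedOutputStream of a PipedInputStream/PipedOutputStream pair, running on a separate Thread. This turns the ZipOutputStream into an InputStream that I can pass to the HttpRequest, so the output of the ZipOutputStream is uploaded while all the pre-calculated ZipEntry/Path pairs are written serially.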
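
The piped-stream arrangement can be sketched roughly as follows. This is an illustration only: the class PipedZip, the method zipAsInputStream and its single-entry payload are hypothetical, and the HTTP upload itself is left out because the answer does not name the client library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.UncheckedIOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class PipedZip {
    // Returns an InputStream that yields a zip archive produced on a background thread.
    static InputStream zipAsInputStream(String entryName, byte[] content) throws IOException {
        PipedInputStream in = new PipedInputStream(64 * 1024);
        PipedOutputStream out = new PipedOutputStream(in);
        Thread writer = new Thread(() -> {
            // Closing the ZipOutputStream also closes the pipe, which signals EOF to the reader.
            try (ZipOutputStream zos = new ZipOutputStream(out)) {
                zos.putNextEntry(new ZipEntry(entryName));
                zos.write(content);
                zos.closeEntry();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, "zip-writer");
        writer.setDaemon(true);
        writer.start();
        return in;
    }
}
```

The writer and the reader must run on different threads; using both ends of a pipe from a single thread can deadlock once the pipe buffer fills.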

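This is good enough for now to process the 300+GB I need immediately, but when I get to the 10TB job I will revisit it and look for further gains.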


Long answer:

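I ended up writing my own ZipOutputStream for producing the zip files; it chooses between compression and STORE per entry and calculates the CRC as the data is read.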


Why switching ZipOutputStream.setLevel() does not help:

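Switching ZipOutputStream.setLevel(NO_COMPRESSION/DEFAULT_COMPRESSION) between entries is not a viable approach. In my tests it gained nothing: calculating the CRC for STORED files was no slower than pushing them through at NO_COMPRESSION. The level-switching variant was actually slower!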
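
For reference, the level-switching approach discussed here looks roughly like the sketch below. The class LevelSwitch, the method zip and the entry names are hypothetical. Note that both entries are still written with the DEFLATED method; only the compression level changes, so the data still passes through the Deflater.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.Deflater;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class LevelSwitch {
    // Writes one pre-compressed payload at level 0 and one text payload at the default level.
    static byte[] zip(byte[] precompressed, byte[] text) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bos)) {
            // Level 0: no CRC or size needed up front, but the entry is still DEFLATED.
            zos.setLevel(Deflater.NO_COMPRESSION);
            zos.putNextEntry(new ZipEntry("data.bin"));
            zos.write(precompressed);
            zos.closeEntry();

            // Back to the default level for compressible content.
            zos.setLevel(Deflater.DEFAULT_COMPRESSION);
            zos.putNextEntry(new ZipEntry("notes.txt"));
            zos.write(text);
            zos.closeEntry();
        }
        return bos.toByteArray();
    }
}
```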

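In my tests, reading the already compressed files twice, once to calculate the CRC and once to add them to the ZipOutputStream, was as fast as or faster than processing every file once as DEFLATED and changing .setLevel() on the ZipOutputStream.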


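So this hack is a naive approach. Even at NO_COMPRESSION the data is still run through the compression code, and that overhead is higher than reading the files twice.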


Source: https://habr.com/ru/post/1626932/

