Java zip character encoding

I use the following method to compress a file into a zip file:

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.util.zip.CRC32;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipOutputStream;

    public static void doZip(final File inputfis, final File outputfis) throws IOException {
        FileInputStream fis = null;
        FileOutputStream fos = null;
        final CRC32 crc = new CRC32();
        crc.reset();
        try {
            fis = new FileInputStream(inputfis);
            fos = new FileOutputStream(outputfis);
            final ZipOutputStream zos = new ZipOutputStream(fos);
            zos.setLevel(6);
            final ZipEntry ze = new ZipEntry(inputfis.getName());
            zos.putNextEntry(ze);
            final int BUFSIZ = 8192;
            final byte inbuf[] = new byte[BUFSIZ];
            int n;
            while ((n = fis.read(inbuf)) != -1) {
                zos.write(inbuf, 0, n);
                // only the n bytes just read belong in the checksum
                crc.update(inbuf, 0, n);
            }
            ze.setCrc(crc.getValue());
            zos.finish();
            zos.close();
        } catch (final IOException e) {
            throw e;
        } finally {
            if (fis != null) {
                fis.close();
            }
            if (fos != null) {
                fos.close();
            }
        }
    }

My problem is that I have flat text files containing, for example, N°TICKET. When I unzip the result, that text comes back with garbled characters instead of N°TICKET. Characters such as é and à are not handled correctly either.

I assume this is due to character encoding, but I don't know how to set it to ISO-8859-1 in my zip method.

(I am running on Windows 7, Java 6)

+4
3 answers

You are using streams, which write exactly the bytes they are given. Writers interpret character data and convert it to the corresponding bytes, while Readers do the opposite. Java (at least in version 6) does not provide an easy way to mix operations on zipped data with writing characters.
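To make the byte/character distinction concrete, here is a minimal sketch of how the same text becomes different bytes under different encodings, and how decoding with the wrong charset produces garbled output. The class name `EncodingDemo` and the helper `misdecode` are made up for this illustration, and the assumption that the file is UTF-8 while the viewer expects ISO-8859-1 is only one possible cause of the symptom in the question.

```java
import java.nio.charset.Charset;

class EncodingDemo {
    static final Charset LATIN1 = Charset.forName("ISO-8859-1");
    static final Charset UTF8 = Charset.forName("UTF-8");

    /** Encodes text as UTF-8 bytes, then (wrongly) decodes those bytes as ISO-8859-1. */
    static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(UTF8);
        return new String(utf8Bytes, LATIN1);
    }

    public static void main(String[] args) {
        // "N°TICKET", written with an escape so the source file's own encoding does not matter
        String text = "N\u00B0TICKET";

        System.out.println(text.getBytes(LATIN1).length); // 8: one byte per character
        System.out.println(text.getBytes(UTF8).length);   // 9: the degree sign takes two bytes

        // The two UTF-8 bytes of the degree sign (0xC2 0xB0) are read as two separate
        // Latin-1 characters, so an extra character appears before the degree sign.
        System.out.println(misdecode(text));
    }
}
```

This only illustrates where the stray characters can come from; it is separate from the zip approach that the answer goes on to describe.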

The approach below works, although it is a bit clunky.

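    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.InputStreamReader;
    import java.io.OutputStreamWriter;
    import java.nio.charset.Charset;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipOutputStream;

    File inputFile = new File("utf-8-data.txt");
    File outputFile = new File("latin-1-data.zip");
    ZipEntry entry = new ZipEntry("latin-1-data.txt");

    // decode the input explicitly as UTF-8; FileReader would use the platform default charset
    BufferedReader reader = new BufferedReader(
            new InputStreamReader(new FileInputStream(inputFile), Charset.forName("UTF-8")));
    ZipOutputStream zipStream = new ZipOutputStream(new FileOutputStream(outputFile));
    BufferedWriter writer = new BufferedWriter(
            new OutputStreamWriter(zipStream, Charset.forName("ISO-8859-1")));

    zipStream.putNextEntry(entry);

    // this is the important part:
    // all character data is written via the writer and not the zip output stream
    String line = null;
    while ((line = reader.readLine()) != null) {
        writer.append(line).append('\n');
    }

    // i've used a buffered writer, so make sure to flush to the
    // underlying zip output stream
    writer.flush();

    zipStream.closeEntry();
    zipStream.finish();
    reader.close();
    writer.close();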
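One way to check that an entry written through a writer really holds ISO-8859-1 bytes is to read the archive back and inspect the raw content. The following is a minimal in-memory sketch of that round trip, not the answer author's code; the class name `Latin1ZipCheck` and its two helper methods are hypothetical names chosen for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class Latin1ZipCheck {
    static final Charset LATIN1 = Charset.forName("ISO-8859-1");

    /** Writes text into a single-entry zip, encoding the characters as ISO-8859-1. */
    static byte[] zipAsLatin1(String entryName, String text) throws IOException {
        ByteArrayOutputStream archive = new ByteArrayOutputStream();
        ZipOutputStream zipStream = new ZipOutputStream(archive);
        zipStream.putNextEntry(new ZipEntry(entryName));
        Writer writer = new OutputStreamWriter(zipStream, LATIN1);
        writer.write(text);
        writer.flush();         // push the encoded bytes into the current entry
        zipStream.closeEntry();
        zipStream.close();      // finishes the archive
        return archive.toByteArray();
    }

    /** Returns the raw, uncompressed bytes of the first entry in the archive. */
    static byte[] firstEntryBytes(byte[] archive) throws IOException {
        ZipInputStream zipIn = new ZipInputStream(new ByteArrayInputStream(archive));
        try {
            if (zipIn.getNextEntry() == null) {
                throw new IOException("archive has no entries");
            }
            ByteArrayOutputStream content = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = zipIn.read(buffer)) != -1) {
                content.write(buffer, 0, n);
            }
            return content.toByteArray();
        } finally {
            zipIn.close();
        }
    }

    public static void main(String[] args) throws IOException {
        String text = "N\u00B0TICKET \u00E9 \u00E0"; // "N°TICKET é à"
        byte[] stored = firstEntryBytes(zipAsLatin1("latin-1-data.txt", text));
        System.out.println(stored.length);                             // 12: one byte per character
        System.out.println(new String(stored, LATIN1).equals(text));   // true
    }
}
```

If the stored length matched the UTF-8 length instead (15 bytes for this text), the content would not have gone through the ISO-8859-1 writer.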
+3

As far as I know, this is not available in Java 6.

However, I believe Apache Commons Compress (http://commons.apache.org/compress/) can provide a solution.

Switching to Java 7 gives you a new constructor that takes the encoding as an additional parameter.

https://blogs.oracle.com/xuemingshen/entry/non_utf_8_encoding_in

    zipStream = new ZipInputStream(
            new BufferedInputStream(new FileInputStream(archiveFile), BUFFER_SIZE),
            Charset.forName("ISO-8859-1"));
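Note that, according to the JDK documentation, the charset passed to these Java 7 constructors is used for entry names and comments, not for the contents of the entries. The sketch below, which requires Java 7 or later, round-trips an accented entry name through an in-memory archive; the class name `ZipNameCharsetDemo` and its helpers are hypothetical names used only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ZipNameCharsetDemo {
    static final Charset LATIN1 = Charset.forName("ISO-8859-1");

    /** Creates an archive with one empty entry whose name is encoded as ISO-8859-1. */
    static byte[] zipWithLatin1Name(String entryName) throws IOException {
        ByteArrayOutputStream archive = new ByteArrayOutputStream();
        try (ZipOutputStream zipOut = new ZipOutputStream(archive, LATIN1)) {
            zipOut.putNextEntry(new ZipEntry(entryName));
            zipOut.closeEntry();
        }
        return archive.toByteArray();
    }

    /** Reads the first entry name, decoding it with the same charset it was written with. */
    static String firstEntryName(byte[] archive) throws IOException {
        try (ZipInputStream zipIn =
                new ZipInputStream(new ByteArrayInputStream(archive), LATIN1)) {
            ZipEntry entry = zipIn.getNextEntry();
            return entry == null ? null : entry.getName();
        }
    }

    public static void main(String[] args) throws IOException {
        String name = "r\u00E9sum\u00E9.txt"; // "résumé.txt"
        String readBack = firstEntryName(zipWithLatin1Name(name));
        System.out.println(name.equals(readBack)); // true
    }
}
```

For the text inside an entry, the encoding is still decided by whatever bytes are written into the entry, for example through an OutputStreamWriter wrapped around the zip stream.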
+4

Try using org.apache.commons.compress.archivers.zip.ZipFile instead of Java's own library, so that you can specify the encoding:

    import org.apache.commons.compress.archivers.zip.ZipFile;

    // filePath is the path to the archive; encoding is a charset name such as "ISO-8859-1"
    ZipFile zipFile = new ZipFile(filePath, encoding);

0

Source: https://habr.com/ru/post/1438465/

