I have some general questions regarding the java.util.zip library. We mainly import and export many small components. Previously, these components were imported and exported using one large file, for example:
<component-type-a id="1"/>
<component-type-a id="2"/>
<component-type-a id="N"/>
<component-type-b id="1"/>
<component-type-b id="2"/>
<component-type-b id="N"/>
Note that the order of the components during import matters.
Now each component should live in its own file, which should be externally versioned, QA-ed, and so on. We decided that the output of our export should be a zip file (containing all these files), and the input of our import should be a similar zip file. We do not want to extract the zip file on our system, and we do not want to open a separate stream for each of the small files. My current questions are:
Q1. Does ZipInputStream guarantee that zip entries (the small files) are read in the same order in which our export, which uses ZipOutputStream, wrote them? I assume the reading looks something like this:
ZipInputStream zis = new ZipInputStream(new BufferedInputStream(fis));
ZipEntry entry;
while ((entry = zis.getNextEntry()) != null) {
    // read from zis until the end of the current entry
}
I know that the central directory is placed at the end of the zip file, but the entries themselves are still stored sequentially inside the file. I also know that relying on order is an ugly idea, but I just want to have all the facts in mind.
Q2. If I use ZipFile (which I would prefer), what is the performance impact of calling getInputStream() hundreds of times? Will it be much slower than the ZipInputStream solution? The zip file is opened only once, and ZipFile is backed by a RandomAccessFile. Is that correct? I assume the reading looks something like this:
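To make Q1 concrete, here is a minimal sketch of the round trip I have in mind, done entirely in memory. The class name, method names, and entry names are made up for illustration. As far as I understand, ZipInputStream walks the entries from the start of the stream, so it should see them in the order they were physically written, but this is only a check under that assumption and not proof of a documented guarantee.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipOrderRoundTrip {

    // Writes one entry per name, in the given order, into an in-memory zip.
    static byte[] export(List<String> names) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            for (String name : names) {
                zos.putNextEntry(new ZipEntry(name));
                zos.write(("<component name=\"" + name + "\"/>")
                        .getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    // Collects the entry names in the order ZipInputStream yields them.
    static List<String> importOrder(byte[] zip) throws IOException {
        List<String> seen = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zip))) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                seen.add(entry.getName());
            }
        }
        return seen;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical entry names, deliberately not in alphabetical order.
        List<String> written = Arrays.asList(
                "component-type-b-1.xml", "component-type-a-2.xml", "component-type-a-1.xml");
        List<String> read = importOrder(export(written));
        System.out.println("same order: " + written.equals(read));
    }
}
```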
ZipFile zipfile = new ZipFile(argv[0]);
Enumeration<? extends ZipEntry> e = zipfile.entries(); // TODO: assure the order of the entries
while (e.hasMoreElements()) {
    ZipEntry entry = e.nextElement();
    InputStream is = zipfile.getInputStream(entry);
}
Q3. Are input streams obtained from the same ZipFile thread-safe (for example, can I read different entries from different threads at the same time)? Are there any performance penalties?
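For Q2, this is roughly how I would measure it: a small sketch that opens the archive once and calls getInputStream() for every entry. The class name, the helper writeSample, the entry names, and the entry count are all hypothetical, and the timing is a crude single run rather than a proper benchmark.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class ZipFileManyStreams {

    // Creates a temporary zip with `count` tiny entries (hypothetical names).
    static Path writeSample(int count) throws IOException {
        Path zip = Files.createTempFile("components", ".zip");
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zip))) {
            for (int i = 1; i <= count; i++) {
                zos.putNextEntry(new ZipEntry("component-" + i + ".xml"));
                zos.write(("<component id=\"" + i + "\"/>")
                        .getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        }
        return zip;
    }

    // Opens the archive once and calls getInputStream() for every entry.
    static List<String> readAll(Path zip) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zipFile = new ZipFile(zip.toFile())) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            byte[] buffer = new byte[8192];
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                try (InputStream is = zipFile.getInputStream(entry)) {
                    while (is.read(buffer) != -1) {
                        // Discard the bytes; only the cost of reading matters here.
                    }
                }
                names.add(entry.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path zip = writeSample(500);
        long start = System.nanoTime();
        List<String> names = readAll(zip);
        long micros = (System.nanoTime() - start) / 1_000;
        System.out.println(names.size() + " entries read in " + micros + " us");
        Files.delete(zip);
    }
}
```

Running the same archive through a ZipInputStream loop and comparing the two timings would be the obvious next step.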
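To illustrate what I mean in Q3, here is a sketch of the access pattern: one shared ZipFile, with each entry read by a task on a small thread pool. All names (ZipFileConcurrentRead, writeSample, readEntry, readAllConcurrently) and the 10-byte payload are invented for the example. I am not claiming the Javadoc promises this is safe; that is exactly what I am asking.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class ZipFileConcurrentRead {

    // Creates a temporary zip with `count` entries of exactly 10 bytes each.
    static Path writeSample(int count) throws IOException {
        Path zip = Files.createTempFile("components", ".zip");
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zip))) {
            for (int i = 1; i <= count; i++) {
                zos.putNextEntry(new ZipEntry("component-" + i + ".xml"));
                zos.write("0123456789".getBytes(StandardCharsets.US_ASCII));
                zos.closeEntry();
            }
        }
        return zip;
    }

    // Reads one entry to the end and returns the number of bytes seen.
    static int readEntry(ZipFile zipFile, String name) throws IOException {
        int total = 0;
        try (InputStream is = zipFile.getInputStream(zipFile.getEntry(name))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = is.read(buffer)) != -1) {
                total += n;
            }
        }
        return total;
    }

    // Shares one ZipFile across `threads` workers; returns total bytes read.
    static int readAllConcurrently(Path zip, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try (ZipFile zipFile = new ZipFile(zip.toFile())) {
            List<Future<Integer>> results = new ArrayList<>();
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                final String name = entries.nextElement().getName();
                results.add(pool.submit(() -> readEntry(zipFile, name)));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get(); // wait for every task before closing the ZipFile
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path zip = writeSample(200);
        System.out.println("bytes read: " + readAllConcurrently(zip, 4));
        Files.delete(zip);
    }
}
```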
Thank you for your responses!