I was trying to unzip/extract zip files using Python, and the low-level approach ("create a ZipFile object, loop through its .namelist(), read the files and write them to the file system") does not seem very Pythonic. So I started digging into ZipFile objects, which in my opinion are not very well documented, and listed all the methods of the object:
>>> from zipfile import ZipFile
>>> filepath = '/srv/pydocfiles/packages/ebook.zip'
>>> zip = ZipFile(filepath)
>>> dir(zip)
['NameToInfo', '_GetContents', '_RealGetContents', '__del__', '__doc__', '__enter__', '__exit__', '__init__', '__module__', '_allowZip64', '_didModify', '_extract_member', '_filePassed', '_writecheck', 'close', 'comment', 'compression', 'debug', 'extract', 'extractall', 'filelist', 'filename', 'fp', 'getinfo', 'infolist', 'mode', 'namelist', 'open', 'printdir', 'pwd', 'read', 'setpassword', 'start_dir', 'testzip', 'write', 'writestr']
And there it is: an "extractall" method, just like tarfile's extractall! (It is available in Python 2.6 and 2.7, but not in 2.5.)
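For comparison with the low-level loop, here is a minimal sketch of the extractall approach. The helper name extract_archive and the throwaway demo archive are made up for illustration, and the with-statement form assumes Python 2.7 or 3.2 and later, where ZipFile works as a context manager:

```python
import os
import tempfile
from zipfile import ZipFile


def extract_archive(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir; return the member names.

    Hypothetical helper, not part of the zipfile module.
    """
    with ZipFile(zip_path) as archive:
        # extractall creates dest_dir and any sub-directories as needed.
        archive.extractall(path=dest_dir)
        return archive.namelist()


# Build a tiny throwaway archive so the sketch runs anywhere.
work_dir = tempfile.mkdtemp()
demo_zip = os.path.join(work_dir, "demo.zip")
with ZipFile(demo_zip, "w") as archive:
    archive.writestr("docs/readme.txt", "hello")

out_dir = os.path.join(work_dir, "out")
extracted = extract_archive(demo_zip, out_dir)
```

One call replaces the whole "loop over namelist(), read, write" routine, including creating the intermediate directories.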
Next I was concerned about performance. The ebook.zip file is 84.6 MB (mostly PDF files) and the uncompressed folder is 103 MB, as extracted by the default "Archive Utility" on Mac OS X 10.5. So I did the same extraction with Python's timeit module:
>>> from timeit import Timer
>>> t = Timer("filepath = '/srv/pydocfiles/packages/ebook.zip'; \
... extract_to = '/tmp/pydocnet/build'; \
... from zipfile import ZipFile; \
... ZipFile(filepath).extractall(path=extract_to)")
>>>
>>> t.timeit(1)
1.8670060634613037
That took less than 2 seconds on a heavily loaded machine, with 90% of its memory in use by other applications.
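The same measurement can also be written without packing the statements into a string, since Timer accepts a callable (Python 2.6 and later). This is only a sketch: the function name time_extraction and the small demo archive are assumptions for illustration, so the number it reports will not match the figure above.

```python
import os
import tempfile
from timeit import Timer
from zipfile import ZipFile


def time_extraction(zip_path, dest_dir, runs=1):
    """Return the total seconds spent extracting zip_path into dest_dir `runs` times.

    Hypothetical helper for illustration only.
    """
    def extract():
        with ZipFile(zip_path) as archive:
            archive.extractall(path=dest_dir)

    # Timer accepts a zero-argument callable instead of a statement string.
    return Timer(extract).timeit(number=runs)


# Small throwaway archive standing in for a real file such as ebook.zip.
bench_dir = tempfile.mkdtemp()
bench_zip = os.path.join(bench_dir, "bench.zip")
with ZipFile(bench_zip, "w") as archive:
    archive.writestr("book.txt", "x" * 100000)

elapsed = time_extraction(bench_zip, os.path.join(bench_dir, "out"))
print("extracted in %.4f seconds" % elapsed)
```

Passing a larger runs value repeats the extraction over the same destination and returns the total time, not the average.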
Hope this helps someone.