How to read the contents of a 7z file using python

Question

How can I read and save the contents of a 7z file? I am using Python 2.7.9. I can extract or create archives like this, but I cannot read the contents from Python; I can only list the contents of the file in CMD.

    import subprocess
    import os

    source = 'filename.7z'
    directory = r'C:\Directory'
    pw = '123456'
    subprocess.call(r'"C:\Program Files (x86)\7-Zip\7z.exe" x ' + source + ' -o' + directory + ' -p' + pw)

+14

python-2.7 7zip

Ken kem Sep 26 '15 at 13:45

3 answers

mr nick · Answer 1 · 2015-09-26T14:02:43+0000

You can use libarchive or pylzma. If you can upgrade to Python 3.3+, you can also use lzma, which is in the standard library, although that module handles .xz and .lzma streams rather than the 7z container format itself.
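For the pylzma route, a minimal sketch might look like the following. It assumes the `py7zlib` module that ships with pylzma, and that `Archive7z` exposes `getnames()` and `getmember(name).read()` as in commonly posted examples; the helper names `read_archive_members` and `read_7z` are made up for illustration.

```python
def read_archive_members(archive):
    """Return {member name: bytes} for an already opened archive object.

    Assumption: the object offers getnames() and getmember(name).read(),
    which is how pylzma's py7zlib.Archive7z is commonly used.
    """
    contents = {}
    for name in archive.getnames():
        contents[name] = archive.getmember(name).read()
    return contents


def read_7z(path, password=None):
    """Open a .7z file with py7zlib and read every member into memory."""
    # Imported lazily so the helper above can be used without pylzma installed.
    import py7zlib  # third-party module, installed with the pylzma package

    with open(path, 'rb') as fp:
        archive = py7zlib.Archive7z(fp, password=password)
        return read_archive_members(archive)
```

Calling `read_7z('filename.7z', password='123456')` would then give a dictionary of file names to raw bytes, which you can save or decode as needed. Everything is held in memory, so this suits small archives best.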

Kyle heuton · Answer 2 · 2018-12-01T22:12:26+0000

I found myself in a situation where I had to use 7z, and I also needed to know exactly which files were extracted from each zip archive. To handle this, you can check the output of the 7z call and look for the file names. Here is the output of 7z:

    $ 7z l sample.zip

    7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
    p7zip Version 16.02 (locale=utf8,Utf16=on,HugeFiles=on,64 bits,8 CPUs x64)

    Scanning the drive for archives:
    1 file, 472 bytes (1 KiB)

    Listing archive: sample.zip

    --
    Path = sample.zip
    Type = zip
    Physical Size = 472

       Date      Time    Attr         Size   Compressed  Name
    ------------------- ----- ------------ ------------ ------------------------
    2018-12-01 17:09:59 .....            0            0  sample1.txt
    2018-12-01 17:10:01 .....            0            0  sample2.txt
    2018-12-01 17:10:03 .....            0            0  sample3.txt
    ------------------- ----- ------------ ------------ ------------------------
    2018-12-01 17:10:03                  0            0  3 files

and how to parse this output using Python:

    import subprocess

    def find_header(split_line):
        return 'Name' in split_line and 'Date' in split_line

    def all_hyphens(line):
        return set(line) == set('-')

    def parse_lines(lines):
        found_header = False
        found_first_hyphens = False
        files = []
        for line in lines:

            # After the header is a row of hyphens
            # and the data ends with a row of hyphens
            if found_header:
                is_hyphen = all_hyphens(''.join(line.split()))

                if not found_first_hyphens:
                    found_first_hyphens = True
                    # now the data starts
                    continue

                # Finding a second row of hyphens means we're done
                if found_first_hyphens and is_hyphen:
                    return files

            split_line = line.split()

            # Check for the column headers
            if find_header(split_line):
                found_header = True
                continue

            if found_header and found_first_hyphens:
                files.append(split_line[-1])
                continue

        raise ValueError("We parsed this zipfile without finding a second row of hyphens")

    byte_result = subprocess.check_output('7z l sample.zip', shell=True)
    str_result = byte_result.decode('utf-8')
    line_result = str_result.splitlines()
    files = parse_lines(line_result)
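Taking the last whitespace-separated token as the file name breaks on names that contain spaces. One possible alternative, sketched here under the assumption that your 7z build supports the `-slt` switch for the `l` command and prints one `Path = ...` line per entry after a `----------` separator, is to parse that output instead. The function names are hypothetical.

```python
import subprocess


def parse_slt_listing(text):
    """Extract entry paths from the text printed by `7z l -slt`."""
    paths = []
    in_entries = False
    for line in text.splitlines():
        if not in_entries:
            # Assumption: a row of ten hyphens separates the archive's own
            # properties (which include a "Path = <archive>" line) from
            # the per-entry blocks.
            if line.startswith('----------'):
                in_entries = True
            continue
        if line.startswith('Path = '):
            # Keep everything after the prefix, spaces included.
            paths.append(line[len('Path = '):])
    return paths


def list_archive(archive_path, exe='7z'):
    """Run 7z and return the paths stored in the archive."""
    # Passing a list avoids shell quoting problems with unusual file names.
    output = subprocess.check_output([exe, 'l', '-slt', archive_path])
    return parse_slt_listing(output.decode('utf-8'))
```

With this, `list_archive('sample.zip')` should return the same three names as the parser above, and it keeps a name such as `my file.txt` in one piece.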

EyePeaSea · Answer 3 · 2015-09-26T13:56:28+0000

Shelling out and calling 7z will extract the files, and then you can open() the extracted files as usual.
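A rough sketch of that extract-then-open approach is below. It assumes Python 3 and a `7z` executable on the PATH (on Windows you would pass the full path to `7z.exe` as `exe`); the helper names are invented for this example, and `-y` is added so that 7z does not stop to ask questions.

```python
import os
import shutil
import subprocess
import tempfile


def build_extract_command(archive_path, out_dir, password=None, exe='7z'):
    """Build the argument list for `7z x` instead of concatenating a string."""
    cmd = [exe, 'x', archive_path, '-o' + out_dir, '-y']
    if password is not None:
        cmd.append('-p' + password)
    return cmd


def read_tree(root):
    """Read every file under root into a {relative path: bytes} dictionary."""
    contents = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            with open(full_path, 'rb') as fp:
                contents[os.path.relpath(full_path, root)] = fp.read()
    return contents


def extract_and_read(archive_path, password=None, exe='7z'):
    """Extract into a temporary directory, read the files, then clean up."""
    out_dir = tempfile.mkdtemp()
    try:
        subprocess.check_call(
            build_extract_command(archive_path, out_dir, password, exe))
        return read_tree(out_dir)
    finally:
        shutil.rmtree(out_dir, ignore_errors=True)
```

For the archive in the question this would be called as `extract_and_read('filename.7z', password='123456')`. Passing the arguments as a list also means paths with spaces need no extra quoting.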

If you want to look inside the 7z archive directly in Python, you need to use a library. Here is one: https://pypi.python.org/pypi/libarchive. I cannot vouch for it, since I am not a Python user, but using a third-party library is usually quite simple in any language.

Generally, 7z support seems limited. If you can use alternative formats (zip / gzip), then I think that you will find the range of Python libraries (and sample code) more complete.

Hope this helps.
