The Python documentation recommends never extracting a tar archive without inspecting it first. What is the best way to make sure an archive is safe using the Python tarfile module? Should I just iterate over all the file names and check whether any of them contain absolute paths?
Is something like the following enough?
    import sys
    import tarfile

    with tarfile.open('sample.tar', 'r') as tarf:
        # Reject the archive if any member name is absolute or starts with '..'
        for n in tarf.getnames():
            if n.startswith('/') or n.startswith('..'):
                print('sample.tar contains unsafe filenames')
                sys.exit(1)
        tarf.extractall()
Edit
The script above is incompatible with Python versions prior to 2.7, because tarfile objects only support the with statement from 2.7 onward (cf. "with and tarfile").
Now I iterate over the elements:
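    import os
    import sys
    import tarfile
    from contextlib import closing

    target_dir = "/target/"
    # closing() is used because TarFile is not a context manager before 2.7
    with closing(tarfile.open('sample.tar', mode='r:gz')) as tarf:
        for m in tarf:
            # Resolve where this member would land and check it stays under target_dir
            pathn = os.path.abspath(os.path.join(target_dir, m.name))
            if not pathn.startswith(target_dir):
                print('The tar file contains unsafe filenames. Aborting.')
                sys.exit(1)
            tarf.extract(m, path=target_dir)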
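The loop above extracts members one at a time, so an unsafe name found late in the archive aborts after earlier members are already on disk. A possible variant, offered only as a minimal sketch, checks every member name first and extracts only if all of them resolve inside the target directory. The helper names `is_within_directory`, `find_unsafe_members` and `safe_extractall` are hypothetical, not part of tarfile, and the sketch assumes a Python version where `tarfile.open` works in a with statement (2.7 or later). It looks at member names only and does not examine symbolic or hard links stored in the archive.

```python
import os
import tarfile


def is_within_directory(directory, target):
    """Return True if target resolves to directory itself or a path below it."""
    abs_directory = os.path.realpath(directory)
    abs_target = os.path.realpath(target)
    # Compare against the directory plus a trailing separator so that
    # "/target-other" is not mistaken for something inside "/target".
    prefix = abs_directory.rstrip(os.sep) + os.sep
    return abs_target == abs_directory or abs_target.startswith(prefix)


def find_unsafe_members(tarf, target_dir):
    """Return the names of members that would be written outside target_dir."""
    unsafe = []
    for m in tarf.getmembers():
        # os.path.join discards target_dir when m.name is absolute,
        # so absolute names are caught by the same check as '..' names.
        dest = os.path.join(target_dir, m.name)
        if not is_within_directory(target_dir, dest):
            unsafe.append(m.name)
    return unsafe


def safe_extractall(archive_path, target_dir):
    """Extract archive_path into target_dir only if every name is safe."""
    with tarfile.open(archive_path, 'r') as tarf:
        unsafe = find_unsafe_members(tarf, target_dir)
        if unsafe:
            # Nothing has been written to disk at this point.
            raise ValueError('unsafe member names: %r' % (unsafe,))
        tarf.extractall(target_dir)
```

With this layout, a call such as `safe_extractall('sample.tar', '/target/')` either extracts everything or raises before touching the file system.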