Check tar archive before extracting

The Python documentation recommends never extracting a tar archive without first inspecting it. What is the best way to make sure an archive is safe using Python's tarfile module? Should I just iterate over all the file names and check whether they contain absolute paths?

Is something like the following enough?

    import sys
    import tarfile

    with tarfile.open('sample.tar', 'r') as tarf:
        for n in tarf.getnames():
            # Reject names that start with '/' or '..'
            if n[0] == '/' or n[0:2] == '..':
                print('sample.tar contains unsafe filenames')
                sys.exit(1)
        tarf.extractall()

Edit

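This script is incompatible with Python versions prior to 2.7, where TarFile objects cannot be used in a with statement (cf. the documentation for the with statement and the tarfile module).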

Now I iterate over the elements:

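    import os
    import sys
    import tarfile
    from contextlib import closing

    target_dir = "/target/"
    # closing() is needed before 2.7, where TarFile is not a context manager
    with closing(tarfile.open('sample.tar', mode='r:gz')) as tarf:
        for m in tarf:
            pathn = os.path.abspath(os.path.join(target_dir, m.name))
            if not pathn.startswith(target_dir):
                print('The tar file contains unsafe filenames. Aborting.')
                sys.exit(1)
            tarf.extract(m, path=target_dir)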
1 answer

Almost, although it is still possible to have a path like foo/../../, which passes that check because the .. is not at the start.
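
To make the gap concrete, here is a small sketch. The helper name naive_is_safe and the sample member names are made up for illustration; the test itself mirrors the prefix check from the question.

```python
import posixpath

def naive_is_safe(name):
    """Prefix-only check: looks at the start of the name and nothing else."""
    return not (name.startswith('/') or name.startswith('..'))

# Hypothetical member names, as they might appear in an archive.
for name in ['docs/readme.txt', '/etc/passwd', '../evil', 'foo/../../evil']:
    # normpath shows where the name really points relative to the extraction dir
    print(name, naive_is_safe(name), posixpath.normpath(name))
```

The last name is accepted by the prefix check even though it normalizes to a path one level above the extraction directory.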

It would be better to use os.path.join and os.path.abspath, which together correctly handle a leading / and a .. anywhere in the path:

    import os
    import sys
    import tarfile

    target_dir = "/target/"  # trailing slash is important
    with tarfile.open(…) as tarf:
        for n in tarf.getnames():
            if not os.path.abspath(os.path.join(target_dir, n)).startswith(target_dir):
                print("unsafe filenames!")
                sys.exit(1)
        tarf.extractall(path=target_dir)
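
The same check can be pulled into a small helper so it can be tried on its own without an archive. This is only a sketch under a few assumptions: the function name is_within is hypothetical, and it compares against the directory plus a separator, so the result does not depend on whether target_dir was written with a trailing slash. Like the code above, it looks only at member names and says nothing about symlink or device members.

```python
import os

def is_within(target_dir, name):
    """Return True if name, joined onto target_dir, stays inside target_dir."""
    base = os.path.abspath(target_dir)
    dest = os.path.abspath(os.path.join(base, name))
    # Compare against base plus a separator so '/target-evil' does not match '/target'.
    return dest == base or dest.startswith(base.rstrip(os.sep) + os.sep)

# Hypothetical member names checked against a hypothetical target directory.
for name in ['ok/file.txt', 'foo/../../evil', '/etc/passwd', '../target-evil/x']:
    print(name, is_within('/target', name))
```

With an open archive, each name from tarf.getnames() could be passed through such a helper before calling tarf.extractall(path=target_dir).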

Source: https://habr.com/ru/post/1381006/

