How can I programmatically create a tarball of nested directories and files exclusively from Python strings and without temporary files?

I want to create a tarball with a hierarchical directory structure from Python, using strings for the contents of the files. I read this question, which shows a way to add strings as files, but not how to add directories. How can I add directories to the tar archive on the fly, without creating them on disk?

Something like:

    archive.tgz:
        file1.txt
        file2.txt
        dir1/
            file3.txt
            dir2/
                file4.txt
3 answers

Continuing with the example provided in the related question, you can do this as follows:

    # Python 2 (uses the StringIO module and the 0755 octal literal)
    import tarfile
    import StringIO
    import time

    tar = tarfile.TarFile("test.tar", "w")

    # File contents held in memory
    string = StringIO.StringIO()
    string.write("hello")
    string.seek(0)

    # Directory entry: metadata only, no file object
    info = tarfile.TarInfo(name='dir')
    info.type = tarfile.DIRTYPE
    info.mode = 0755
    info.mtime = time.time()
    tar.addfile(tarinfo=info)

    # File entry inside the directory
    info = tarfile.TarInfo(name='dir/foo')
    info.size = len(string.buf)
    info.mtime = time.time()
    tar.addfile(tarinfo=info, fileobj=string)

    tar.close()

Be careful with the mode attribute: the default value for a new TarInfo (0644, that is rw-r--r--) does not include execute permission for the directory owner, which is needed to change into the directory and list its contents.
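As a quick check of the point about mode, the following Python 3 sketch inspects a freshly created TarInfo and then marks it as a directory with owner execute permission. The entry name "dir" is only an illustration.

```python
import tarfile

# A fresh TarInfo starts out as a regular file entry with mode 0o644.
info = tarfile.TarInfo(name="dir")
default_mode = info.mode

# Mark it as a directory and set rwxr-xr-x so it can be entered and listed.
info.type = tarfile.DIRTYPE
info.mode = 0o755

print(oct(default_mode), oct(info.mode), info.isdir())
```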


Looking at the tar file format, this seems doable. Files inside a subdirectory get their relative path as the name (for example, dir1/file3.txt).

The only trick is that you must define each directory before the files that go into it (tar will not create the necessary subdirectories on the fly). There is a special type flag that marks a tar entry as a directory, but for legacy reasons tar also accepts file entries whose names end in / as directories, so you just need to add dir1/ as a file built from a zero-length string, using the same technique.
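One way to follow that ordering is sketched below in Python 3: every ancestor directory is written once, outermost first, before the files beneath it. The helper name build_tar, the in-memory BytesIO target, and the sample contents are assumptions made for illustration. The sketch uses the explicit directory type (tarfile.DIRTYPE) rather than the trailing-slash convention.

```python
import io
import tarfile
import time


def build_tar(files):
    """Return gzip-compressed tar bytes for a {path: text} mapping.

    Hypothetical helper: nothing is written to disk.
    """
    buffer = io.BytesIO()
    seen_dirs = set()
    now = int(time.time())
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for path in sorted(files):
            # Emit each ancestor directory once, outermost first.
            parts = path.split("/")[:-1]
            for depth in range(1, len(parts) + 1):
                dirname = "/".join(parts[:depth])
                if dirname not in seen_dirs:
                    dinfo = tarfile.TarInfo(name=dirname)
                    dinfo.type = tarfile.DIRTYPE
                    dinfo.mode = 0o755
                    dinfo.mtime = now
                    tar.addfile(dinfo)
                    seen_dirs.add(dirname)
            # Then the file itself, with its size taken from the encoded bytes.
            data = files[path].encode("utf-8")
            finfo = tarfile.TarInfo(name=path)
            finfo.size = len(data)
            finfo.mtime = now
            tar.addfile(finfo, io.BytesIO(data))
    return buffer.getvalue()


archive_bytes = build_tar({
    "file1.txt": "one",
    "file2.txt": "two",
    "dir1/file3.txt": "three",
    "dir1/dir2/file4.txt": "four",
})
```

The returned bytes can be written to archive.tgz or sent elsewhere as they are.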


A small modification of the useful accepted answer so that it works with both Python 3 and Python 2 (and is a little closer to the OP's example):

    from io import BytesIO
    import tarfile
    import time

    # create and open empty tar file
    tar = tarfile.open("test.tgz", "w:gz")

    # Add a file
    file1_contents = BytesIO("hello 1".encode())
    finfo1 = tarfile.TarInfo(name='file1.txt')
    finfo1.size = len(file1_contents.getvalue())
    finfo1.mtime = time.time()
    tar.addfile(tarinfo=finfo1, fileobj=file1_contents)

    # create directory in the tar file
    dinfo = tarfile.TarInfo(name='dir')
    dinfo.type = tarfile.DIRTYPE
    dinfo.mode = 0o755
    dinfo.mtime = time.time()
    tar.addfile(tarinfo=dinfo)

    # add a file to the new directory in the tar file
    file2_contents = BytesIO("hello 2".encode())
    finfo2 = tarfile.TarInfo(name='dir/file2.txt')
    finfo2.size = len(file2_contents.getvalue())
    finfo2.mtime = time.time()
    tar.addfile(tarinfo=finfo2, fileobj=file2_contents)

    tar.close()

In particular, I updated the octal syntax according to PEP 3127 (Integer Literal Support and Syntax), switched to BytesIO from io, used getvalue instead of buf, and used tarfile.open instead of TarFile to produce gzip-compressed output, as in the question's example. (Using a context manager (with ... as tar:) would also work in both Python 2 and Python 3, but cutting and pasting it did not work in my Python 2 REPL, so I did not switch to it.) Tested on Python 2.7 and Python 3.7.3.
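For reference, the context-manager form could look roughly like this in Python 3. Writing to an in-memory BytesIO instead of a filename is an assumption made here so that nothing touches the disk. Leaving the with block closes the archive, which also finishes the gzip stream.

```python
import io
import tarfile
import time

buffer = io.BytesIO()  # in-memory target (assumption); a filename works too

with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
    data = "hello 1".encode()
    finfo = tarfile.TarInfo(name="file1.txt")
    finfo.size = len(data)
    finfo.mtime = int(time.time())
    tar.addfile(tarinfo=finfo, fileobj=io.BytesIO(data))

# The archive is closed here, but the underlying buffer is still usable.
archive_closed = tar.closed

# Read the archive back from memory to list what was written.
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r:gz") as tar_in:
    names = tar_in.getnames()
```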


Source: https://habr.com/ru/post/1388196/

