Python gzipped fileinput returns binary string instead of text string

When I iterate over the lines of a set of gzipped files with the fileinput module as follows:

for line in fileinput.FileInput(files=gzipped_files, openhook=fileinput.hook_compressed):

the lines come back as byte strings, not text strings.

When using the gzip module, this can be prevented by opening files with "rt" instead of "rb": http://bugs.python.org/issue13989
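
To make the difference concrete, here is a minimal sketch of what "rb" versus "rt" does with gzip.open. The sample file and the helper name read_first_line are made up for illustration only:

```python
import gzip
import os
import tempfile

def read_first_line(path, mode):
    """Return the first line of a gzipped file opened in the given mode."""
    # An encoding may only be passed when the file is opened in text mode.
    kwargs = {"encoding": "utf-8"} if "t" in mode else {}
    with gzip.open(path, mode, **kwargs) as handle:
        return handle.readline()

# Build a small throwaway gzipped file to read back.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample.txt.gz")
with gzip.open(sample_path, "wb") as handle:
    handle.write(b"hello\nworld\n")

binary_line = read_first_line(sample_path, "rb")  # a bytes object
text_line = read_first_line(sample_path, "rt")    # a str object
```

With "rb" the line is a bytes object, with "rt" it is a decoded str.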

Is there a similar fix for the fileinput module, so that it returns text strings instead of byte strings? I tried adding mode='rt', but then I get this error:

ValueError: FileInput opening mode must be one of 'r', 'rU', 'U' and 'rb'

You can write your own openhook, for example:

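import os

def hook_compressed_text(filename, mode, encoding='utf8'):
    # fileinput passes its own mode here ('r' by default); appending 't'
    # makes gzip/bz2 decode to text. This assumes mode does not contain 'b'.
    ext = os.path.splitext(filename)[1]
    if ext == '.gz':
        import gzip
        return gzip.open(filename, mode + 't', encoding=encoding)
    elif ext == '.bz2':
        import bz2
        return bz2.open(filename, mode + 't', encoding=encoding)
    else:
        # Anything else is treated as an ordinary uncompressed text file.
        return open(filename, mode, encoding=encoding)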
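
A rough sketch of how such a hook could be plugged into fileinput. The hook is repeated so the snippet runs on its own, and the sample files are hypothetical, created only for the demonstration:

```python
import bz2
import fileinput
import gzip
import os
import tempfile

def hook_compressed_text(filename, mode, encoding="utf8"):
    # Same idea as the hook above: open .gz/.bz2 files in text mode.
    ext = os.path.splitext(filename)[1]
    if ext == ".gz":
        return gzip.open(filename, mode + "t", encoding=encoding)
    elif ext == ".bz2":
        return bz2.open(filename, mode + "t", encoding=encoding)
    return open(filename, mode, encoding=encoding)

# Hypothetical sample files, created only for this demonstration.
tmp_dir = tempfile.mkdtemp()
gzipped_files = []
for index, payload in enumerate([b"alpha\nbeta\n", b"gamma\n"]):
    path = os.path.join(tmp_dir, "part%d.txt.gz" % index)
    with gzip.open(path, "wb") as handle:
        handle.write(payload)
    gzipped_files.append(path)

# Leave mode at its default ('r'); the hook adds the 't' itself.
with fileinput.FileInput(files=gzipped_files, openhook=hook_compressed_text) as stream:
    lines = [line for line in stream]
```

Here lines ends up as a list of str objects, one per line across both files. Note that mode is left at its default, since FileInput rejects 'rt'. As far as I know, on Python 3.10 and later fileinput.hook_compressed itself returns text when mode is 'r', so a custom hook like this may only be needed on older versions.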

Source: https://habr.com/ru/post/1525060/

