A fast way to read from StringIO until some byte appears

Suppose I have a StringIO (from cStringIO). I want to read from its buffer until some character / byte is encountered, say "Z", like this:

    stringio = StringIO('ABCZ123')
    buf = read_until(stringio, 'Z')
    # buf is now 'ABCZ'
    # stringio.tell() is now 4, pointing after 'Z'

What is the fastest way to do this in Python? Thank you.

+6
2 answers

I am disappointed that this question got only one answer on Stack Overflow, because it is an interesting and relevant question. Since only ovgolovin gave a solution, and I suspected it might be slow, I came up with a faster one:

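    def foo(stringio):
        datalist = []
        while True:
            # Read a block at a time so the search runs in C code.
            chunk = stringio.read(256)
            i = chunk.find('Z')
            if i == -1:
                datalist.append(chunk)
            else:
                datalist.append(chunk[:i+1])
                # Step back so the position ends up just after 'Z'.
                stringio.seek(i + 1 - len(chunk), 1)
                break
            if len(chunk) < 256:
                # Short read: end of buffer reached without finding 'Z'.
                break
        return ''.join(datalist)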

This reads the buffer in chunks, since the end character may not be in the first chunk. It is very fast because no Python-level function is called for each character; the work is done almost entirely by methods implemented in C (read, find and join).
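The chunked approach above can be generalized into the read_until helper the question asks for. This is only a sketch under some assumptions that are not in the original: it targets Python 3 and io.BytesIO rather than cStringIO, the stream must be seekable, the delimiter is a single byte, and the chunk_size parameter is an illustrative name.

```python
import io


def read_until(stream, delimiter, chunk_size=256):
    """Read from a seekable binary stream up to and including `delimiter`.

    Assumes `delimiter` is a single byte; a longer delimiter could be
    split across two chunks and would then be missed.
    """
    parts = []
    while True:
        chunk = stream.read(chunk_size)
        i = chunk.find(delimiter)
        if i != -1:
            end = i + len(delimiter)
            parts.append(chunk[:end])
            # Rewind over the bytes read past the delimiter so that
            # tell() points just after it.
            stream.seek(end - len(chunk), io.SEEK_CUR)
            break
        parts.append(chunk)
        if len(chunk) < chunk_size:
            # Short read: end of stream, delimiter never found.
            break
    return b"".join(parts)


stream = io.BytesIO(b"ABCZ123")
buf = read_until(stream, b"Z")
```

If the delimiter never appears, this version returns everything that was left in the stream, which may or may not be the behaviour you want.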

It runs about 60x faster than ovgolovin's solution; I measured it with timeit.
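A comparison like this could be reproduced with a small timeit harness. The sketch below is not the benchmark that was actually run: it assumes Python 3 and io.BytesIO, and the payload DATA and the names per_byte, chunked and bench are made up for illustration. The ratio you get will depend on the payload size and the interpreter.

```python
import io
import timeit

# Hypothetical payload: the delimiter sits after 10000 other bytes.
DATA = b"A" * 10000 + b"Z" + b"tail"


def per_byte(stream):
    # One Python-level read(1) call per byte, stopping at the sentinel.
    return b"".join(iter(lambda: stream.read(1), b"Z")) + b"Z"


def chunked(stream, size=256):
    # One read() and one find() per 256-byte block.
    parts = []
    while True:
        chunk = stream.read(size)
        i = chunk.find(b"Z")
        if i != -1:
            parts.append(chunk[:i + 1])
            break
        parts.append(chunk)
        if len(chunk) < size:
            break
    return b"".join(parts)


def bench(func, number=20):
    # Build a fresh stream for every run so each call starts at offset 0.
    return timeit.timeit(lambda: func(io.BytesIO(DATA)), number=number)


if __name__ == "__main__":
    print("per byte:", bench(per_byte))
    print("chunked: ", bench(chunked))
```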

+4
    i = iter(lambda: stringio.read(1), 'Z')
    buf = ''.join(i) + 'Z'

Here iter is used in its two-argument form: iter(callable, sentinel) -> iterator. The callable is invoked repeatedly until it returns the sentinel.
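To make the two-argument form concrete, here is a small sketch, assuming Python 3 and io.StringIO instead of the cStringIO used in the question. The sentinel is consumed from the stream but never yielded, which is why 'Z' has to be added back afterwards. Note also that if the sentinel never appears, read(1) keeps returning an empty string at end of buffer and the iterator never stops.

```python
import io

stream = io.StringIO("ABCZ123")

# iter(callable, sentinel) calls stream.read(1) again and again and
# stops as soon as the call returns the sentinel "Z".
chars = list(iter(lambda: stream.read(1), "Z"))

# The "Z" was read from the stream even though it was not yielded.
position = stream.tell()
buf = "".join(chars) + "Z"
```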

''.join(...) is quite efficient. The final concatenation in ''.join(i) + 'Z' is less so, but it can be avoided by appending 'Z' to the iterator itself:

    import StringIO
    from itertools import chain, repeat

    stringio = StringIO.StringIO('ABCZ123')
    i = iter(lambda: stringio.read(1), 'Z')
    i = chain(i, repeat('Z', 1))
    buf = ''.join(i)

Another way to do this is to use a generator:

    def take_until_included(stringio):
        while True:
            s = stringio.read(1)
            if not s:
                # End of buffer: stop instead of yielding '' forever.
                return
            yield s
            if s == 'Z':
                return

    i = take_until_included(stringio)
    buf = ''.join(i)

I ran several performance tests. The methods described above perform approximately the same:

http://ideone.com/dQGe5

+2

Source: https://habr.com/ru/post/902452/

