The idiomatic way to struct.unpack from BytesIO?

I have some byte data that I want to parse as a stream, since bytes earlier in the sequence control how the bytes that follow are interpreted. So BytesIO looks like what I want. But I also want to use the facilities provided by the struct module, and struct's interfaces are not stream-based. Is there a clever/idiomatic way to marry the two?

As an example, here is a sample piece of data:

    b'\n\x00\x02\x90\x10\x00\n\x00\x02`\x10\x00\n\x00\x02\x80\x10\x00'

I want to pull the first 4 bytes (0x0A000290) as an unsigned big-endian int (for example, struct.unpack with the format '>I'). Because the next byte is 0x10, I know there should be one more byte, which turns out to be 0x00. And then it starts over again: read the following 4 (0x0A000260), lather, rinse, repeat. The byte(s) immediately following each 4-byte id trigger a variety of downstream reads (some bytes, some shorts).

I could do something like

    import struct

    stream = b'\n\x00\x02\x90\x10\x00\n\x00\x02`\x10\x00\n\x00\x02\x80\x10\x00'
    while stream:
        id = struct.unpack('>I', stream[:4])
        stream = stream[4:]
        ...

But it seems less elegant.

+4
2 answers

What I usually do:

    import struct

    def unpack(stream, fmt):
        # Read exactly as many bytes as the format needs, then decode them.
        size = struct.calcsize(fmt)
        buf = stream.read(size)
        return struct.unpack(fmt, buf)

For instance:

    >>> b = io.BytesIO(b'\n\x00\x02\x90\x10\x00\n\x00\x02`\x10\x00\n\x00\x02\x80\x10\x00')
    >>> print(unpack(b, '>I'))
    (167772816,)
    >>> print(unpack(b, '>I'))
    (268438016,)
    >>> print(unpack(b, '>I'))
    (39849984,)
    >>> print(unpack(b, '>I'))
    (167772800,)
    >>> print(unpack(b, '>H'))
    (4096,)
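Applied to the layout described in the question, the same helper can drive a data-dependent loop. The following is only a sketch: `parse_records` is a hypothetical name, and the rule that tag 0x10 is followed by exactly one byte is taken from the question's description. Any other tag is rejected because the question does not say how to handle it.

```python
import io
import struct

def unpack(stream, fmt):
    """Read struct.calcsize(fmt) bytes from the stream and unpack them."""
    size = struct.calcsize(fmt)
    buf = stream.read(size)
    return struct.unpack(fmt, buf)

def parse_records(data):
    """Yield (record_id, tag, payload) tuples (hypothetical helper)."""
    stream = io.BytesIO(data)
    total = len(data)
    while stream.tell() < total:
        (record_id,) = unpack(stream, '>I')  # 4-byte big-endian id
        (tag,) = unpack(stream, 'B')         # byte that selects what follows
        if tag == 0x10:
            # Assumption from the question: 0x10 means one more byte follows.
            (payload,) = unpack(stream, 'B')
        else:
            raise ValueError(f'unhandled tag 0x{tag:02x}')
        yield record_id, tag, payload

sample = b'\n\x00\x02\x90\x10\x00\n\x00\x02`\x10\x00\n\x00\x02\x80\x10\x00'
records = list(parse_records(sample))
for record_id, tag, payload in records:
    print(hex(record_id), hex(tag), payload)
```

Each branch of the `if` can issue whatever further reads that tag requires, which is the point of parsing from a stream rather than slicing a fixed layout.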

If you want to check whether you have consumed the entire stream, you can always do this:

    buf = stream.read(1)
    if buf:
        raise ValueError("Stream not consumed")

But it is probably easier to just call the same function that you are already using:

    >>> def ensure_finished(stream):
    ...     try:
    ...         unpack(stream, 'c')
    ...     except struct.error:
    ...         pass
    ...     else:
    ...         raise ValueError('Stream not consumed')
    ...
    >>> ensure_finished(b)

If you are using a stream whose read() can return fewer bytes than requested, you will want a while loop that keeps reading and appending until EOF or until you have enough bytes. Otherwise, that's all you need.
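A minimal sketch of such a loop, assuming a blocking stream whose read() returns an empty bytes object at EOF. The names `read_exact` and `OneByteAtATime` are hypothetical, and raising EOFError on a truncated stream is a design choice of this sketch rather than something the answer prescribes.

```python
import io
import struct

def read_exact(stream, size):
    """Keep reading until size bytes are collected; raise EOFError if the stream ends first."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError(f'expected {size} bytes, got {size - remaining}')
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)

def unpack(stream, fmt):
    """Same helper as before, but tolerant of short reads."""
    return struct.unpack(fmt, read_exact(stream, struct.calcsize(fmt)))

class OneByteAtATime:
    """Hypothetical test stream whose read() never returns more than one byte."""
    def __init__(self, data):
        self._inner = io.BytesIO(data)

    def read(self, size):
        return self._inner.read(min(size, 1))

slow = OneByteAtATime(b'\n\x00\x02\x90\x10\x00')
first = unpack(slow, '>I')
second = unpack(slow, '>H')
print(first, second)
```

With this variant a truncated stream surfaces as EOFError instead of struct.error, so a check like ensure_finished would have to catch EOFError.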

+5

Use the struct API:

    buf = b'\n\x00\x02…'
    offset = 0
    id = struct.unpack_from('>I', buf, offset); offset += 4
    ⋮
    x = struct.unpack_from('…', buf, offset)

If you do not want to specify the offset after each operation, you can write a small wrapper, for example:

    class unpacker(object):
        def __init__(self, buf):
            self._buf = buf
            self._offset = 0

        def __call__(self, fmt):
            result = struct.unpack_from(fmt, self._buf, self._offset)
            self._offset += struct.calcsize(fmt)
            return result

    ⋮
    unpack = unpacker(buf)
    id = unpack('>I')
    ⋮
    x = unpack('…')
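For a concrete run, here is a sketch of the wrapper applied to the sample bytes from the question. The `remaining` property is a hypothetical addition, not part of the wrapper as written; it is only there to show how much of the buffer is left after each call.

```python
import struct

class Unpacker:
    """Offset-tracking wrapper around struct.unpack_from (sketch)."""
    def __init__(self, buf):
        self._buf = buf
        self._offset = 0

    def __call__(self, fmt):
        result = struct.unpack_from(fmt, self._buf, self._offset)
        self._offset += struct.calcsize(fmt)
        return result

    @property
    def remaining(self):
        # Hypothetical helper: number of bytes not yet consumed.
        return len(self._buf) - self._offset

buf = b'\n\x00\x02\x90\x10\x00\n\x00\x02`\x10\x00\n\x00\x02\x80\x10\x00'
unpack = Unpacker(buf)
first_id = unpack('>I')   # (167772816,), i.e. 0x0A000290
trailer = unpack('>H')    # (4096,), the bytes 0x10 0x00 read as one short
print(first_id, trailer, unpack.remaining)
```

Because unpack_from reads directly from the buffer at an offset, no slices of the data are copied; the trade-off is that the whole buffer must already be in memory.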
+2

Source: https://habr.com/ru/post/1490382/

