Getting the raw binary representation of a file in Python

I would like to read the exact sequence of bits in a file into a string using Python 3. Several existing questions come close to this topic but do not quite answer it. So far I have this:

>>> data = open('file.bin', 'rb').read()
>>> data
'\xa1\xa7\xda4\x86G\xa0!e\xab7M\xce\xd4\xf9\x0e\x99\xce\xe94Y3\x1d\xb7\xa3d\xf9\x92\xd9\xa8\xca\x05\x0f$\xb3\xcd*\xbfT\xbb\x8d\x801\xfanX\x1e\xb4^\xa7l\xe3=\xaf\x89\x86\xaf\x0e8\xeeL\xcd|*5\xf16\xe4\xf6a\xf5\xc4\xf5\xb0\xfc;\xf3\xb5\xb3/\x9a5\xee+\xc5^\xf5\xfe\xaf]\xf7.X\x81\xf3\x14\xe9\x9fK\xf6d\xefK\x8e\xff\x00\x9a>\xe7\xea\xc8\x1b\xc1\x8c\xff\x00D>\xb8\xff\x00\x9c9...'
>>> bin(data[:][0])
'0b11111111'

OK, I can get the base-2 number, but I don't understand why I need data[:][x], and I still have the leading 0b. It also seems that I would have to iterate over the whole string and do casting and parsing to get the correct output. Is there an easier way to get just a sequence of 0s and 1s, without loops, parsing, and string concatenation?

Thanks in advance!

4 answers

First, precompute the string representation for all values 0..255:

 bytetable = [("00000000"+bin(x)[2:])[-8:] for x in range(256)] 

or, if you prefer the bits of each byte in LSB-to-MSB order:

 bytetable = [("00000000"+bin(x)[2:])[-1:-9:-1] for x in range(256)] 

then the whole file in binary format can be obtained using

 binrep = "".join(bytetable[x] for x in open("file", "rb").read()) 
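A table-free variant of the same idea, as a minimal sketch: format(byte, "08b") zero-pads each byte to eight binary digits, MSB first. The helper name to_bitstring is made up for illustration, and the sample bytes are the first two from the question rather than a real file.

```python
def to_bitstring(data: bytes) -> str:
    """Return the bits of data as a string of '0'/'1', MSB first within each byte."""
    # Iterating over a bytes object in Python 3 yields ints in 0..255.
    return "".join(format(byte, "08b") for byte in data)


sample = b"\xa1\xa7"
print(to_bitstring(sample))  # 1010000110100111
```

For a file, the same function can be applied to the result of open(..., "rb").read(). Whether this or the lookup table is faster depends on the file size, so it is worth measuring if speed matters.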

It is not clear what order the bits should be in. I think the most natural choice is to start at byte 0 with bit 0, but it really depends on what you want.

So, here is some code to access a sequence of bits starting with bit 0 in byte 0:

    def bits_from_char(c):
        # In Python 3, iterating over bytes yields ints, so no ord() is needed.
        i = c
        for dummy in range(8):
            yield i & 1   # lowest bit first
            i >>= 1

    def bits_from_data(data):
        for c in data:
            for bit in bits_from_char(c):
                yield bit

    for bit in bits_from_data(data):
        pass  # process bit

(Another note: you don't need data[:][0] in your code. Just data[0] does the same thing without first taking a slice of the entire string.)
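The point about data[0] can be checked directly in Python 3, where indexing a bytes object already yields an integer. A small sketch, using the first three bytes from the question instead of a real file:

```python
data = b"\xa1\xa7\xda"

first = data[0]          # an int in Python 3, no ord() needed
copy_first = data[:][0]  # same value; the [:] slice is redundant

print(first, copy_first)     # 161 161
print(format(first, "08b"))  # 10100001

# Bits of the first byte starting with bit 0 (LSB first)
lsb_first = [(first >> k) & 1 for k in range(8)]
print(lsb_first)             # [1, 0, 0, 0, 0, 1, 0, 1]
```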


If you are willing to use an external module, use bitstring:

>>> import bitstring
>>> bitstring.BitArray(filename='file.bin').bin[2:]
'110000101010000111000010101001111100...'

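And that is all. It takes the binary string representation of the entire file and strips the leading "0b". Check what .bin returns in your installed version of bitstring: if the string carries no "0b" prefix, drop the [2:], since it would otherwise cut off two real bits.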


To convert raw binary data, such as b'\xa1\xa7\xda4\x86', into a bit string that represents the data as a single number in binary (base 2) in Python 3:

>>> data = open('file.bin', 'rb').read()
>>> bin(int.from_bytes(data, 'big'))[2:]
'1010000110100111110110100011010010000110...'

See also: Convert Binary Files to ASCII and vice versa.
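One caveat with the integer route: bin() does not keep leading zero bits, so a file whose first byte is below 0x80 gives a string shorter than 8 bits per byte. A minimal sketch of a padded variant follows; the helper name bits_via_int is hypothetical.

```python
def bits_via_int(data: bytes) -> str:
    """Bit string of data via one big integer, padded to 8 bits per byte."""
    if not data:
        return ""
    # The 'b' format omits the '0b' prefix; zfill restores dropped leading zeros.
    return format(int.from_bytes(data, "big"), "b").zfill(len(data) * 8)


print(bits_via_int(b"\x01\x02"))  # 0000000100000010
```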


Source: https://habr.com/ru/post/958152/

