Convert string to binary in Python

I need a way to get the binary representation of a string in Python, e.g.:

    st = "hello world"
    toBinary(st)

Is there any neat way to do this?

+73
python string binary
Sep 15 '13 at 18:19
7 answers

Something like this?

    >>> st = "hello world"
    >>> ' '.join(format(ord(x), 'b') for x in st)
    '1101000 1100101 1101100 1101100 1101111 100000 1110111 1101111 1110010 1101100 1100100'
    # using `bytearray`
    >>> ' '.join(format(x, 'b') for x in bytearray(st))
    '1101000 1100101 1101100 1101100 1101111 100000 1110111 1101111 1110010 1101100 1100100'
+93
Sep 15 '13 at 18:24

A more Pythonic way is to first convert the string to a byte array and then apply the bin function with map:

    >>> st = "hello world"
    >>> map(bin, bytearray(st))
    ['0b1101000', '0b1100101', '0b1101100', '0b1101100', '0b1101111', '0b100000', '0b1110111', '0b1101111', '0b1110010', '0b1101100', '0b1100100']

Or you can join it:

    >>> ' '.join(map(bin, bytearray(st)))
    '0b1101000 0b1100101 0b1101100 0b1101100 0b1101111 0b100000 0b1110111 0b1101111 0b1110010 0b1101100 0b1100100'

Note that in Python 3 you need to specify the encoding for the bytearray function:

    >>> ' '.join(map(bin, bytearray(st, 'utf8')))
    '0b1101000 0b1100101 0b1101100 0b1101100 0b1101111 0b100000 0b1110111 0b1101111 0b1110010 0b1101100 0b1100100'

You can also use the binascii module in Python 2:

    >>> import binascii
    >>> bin(int(binascii.hexlify(st), 16))
    '0b110100001100101011011000110110001101111001000000111011101101111011100100110110001100100'

hexlify returns the hexadecimal representation of the binary data. You can convert that to an int by specifying 16 as the base, and then convert the int to binary with bin.
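In Python 3, hexlify expects bytes rather than a str, and bin drops any leading zero bits. A minimal sketch of the same idea for Python 3 follows. It assumes UTF-8 as the default encoding, and the helper name to_bit_string is hypothetical, not part of binascii:

```python
import binascii


def to_bit_string(text, encoding="utf-8"):
    """Return the bits of `text` as one string, 8 bits per encoded byte."""
    data = text.encode(encoding)  # hexlify needs bytes in Python 3
    if not data:
        return ""  # int() cannot parse an empty hex string
    number = int(binascii.hexlify(data), 16)
    # Pad to 8 bits per byte so the leading zeros that bin() drops are kept.
    return format(number, "0{}b".format(len(data) * 8))


print(to_bit_string("hi"))  # 0110100001101001
```

Padding to a multiple of 8 keeps byte boundaries recoverable, which the bare bin() output does not guarantee.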

+37
Jun 04 '15 at 10:58

You can access the code values for the characters in your string with the built-in ord() function. If you then need to format these in binary, the built-in format() function or the str.format() method will do the job.

    a = "test"
    print(' '.join(format(ord(x), 'b') for x in a))

(Thanks to Ashwini Chaudhary for posting this piece of code.)

Although the above code works in Python 3, the matter gets more complicated if you assume any encoding other than UTF-8. In Python 2, strings are byte sequences, and ASCII encoding is assumed by default. In Python 3, strings are Unicode, and there is a separate bytes type that behaves more like a Python 2 string. If you want to assume any encoding other than UTF-8, you need to specify it.

In Python 3, you can do something like this:

    a = "test"
    a_bytes = bytes(a, "ascii")
    print(' '.join(["{0:b}".format(x) for x in a_bytes]))

The differences between UTF-8 and ASCII will not be obvious for simple alphanumeric strings, but they become important if you are processing text that includes characters outside the ASCII character set.
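To make that difference concrete, here is a small sketch that encodes the same non-ASCII character under three encodings. The helper name to_binary is hypothetical, and zero-padding each byte to 8 digits is a choice made here for readability:

```python
def to_binary(text, encoding="utf-8"):
    """Encode `text`, then format each byte as 8 binary digits."""
    return " ".join(format(byte, "08b") for byte in text.encode(encoding))


print(to_binary("é"))             # 11000011 10101001  (two bytes in UTF-8)
print(to_binary("é", "latin-1"))  # 11101001  (one byte in Latin-1)

try:
    to_binary("é", "ascii")
except UnicodeEncodeError as exc:
    # ASCII has no representation for this character at all.
    print("not representable in ASCII:", exc.reason)
```

For plain ASCII text such as "test", all three encodings produce the same bytes, which is why the difference is easy to miss.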

+15
Sep 15 '13 at 18:23

We just need to encode it.

 'string'.encode('ascii') 
+14
Oct 11 '18 at 13:51

This is an update to the existing answers that use bytearray(), which no longer works like this in Python 3:

    >>> st = "hello world"
    >>> map(bin, bytearray(st))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: string argument without an encoding

This is because, as explained in the link above, if the source is a string, you must also specify the encoding:

    >>> map(bin, bytearray(st, encoding='utf-8'))
    <map object at 0x7f14dfb1ff28>
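In Python 3, map returns a lazy iterator, which is why the session above shows a map object rather than the values. A short sketch of two ways to get at the values (the variable names binary_values and joined are illustrative only):

```python
st = "hello world"

# map() is lazy in Python 3; list() consumes it to get the actual values.
binary_values = list(map(bin, bytearray(st, encoding="utf-8")))
print(binary_values[:3])  # ['0b1101000', '0b1100101', '0b1101100']

# str.join() also consumes the iterator, so no list() is needed here.
joined = " ".join(map(bin, bytearray(st, encoding="utf-8")))
print(joined)
```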
+1
May 11 '18 at 11:13
    def method_a(sample_string):
        binary = ' '.join(format(ord(x), 'b') for x in sample_string)

    def method_b(sample_string):
        binary = ' '.join(map(bin, bytearray(sample_string, encoding='utf-8')))

    if __name__ == '__main__':
        from timeit import timeit

        sample_string = 'Convert this ascii strong to binary.'

        print(
            timeit(f'method_a("{sample_string}")', setup='from __main__ import method_a'),
            timeit(f'method_b("{sample_string}")', setup='from __main__ import method_b')
        )

    # 9.564299999998184 2.943955828988692

method_b is significantly faster. Converting to a byte array relies on low-level function calls instead of manually converting each character to an integer and then converting that integer to its binary value.
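When comparing the two methods, note that they do not produce identical strings: bin keeps the 0b prefix, while format with 'b' does not. The sketch below assumes Python 3 and adds return statements (absent from the timed versions) so the outputs can be inspected. The repetition count is an arbitrary choice:

```python
from timeit import timeit


def method_a(sample_string):
    # One code point per character, formatted without a prefix.
    return " ".join(format(ord(x), "b") for x in sample_string)


def method_b(sample_string):
    # One value per UTF-8 byte; bin() keeps the '0b' prefix.
    return " ".join(map(bin, bytearray(sample_string, encoding="utf-8")))


sample = "hi"
print(method_a(sample))  # 1101000 1101001
print(method_b(sample))  # 0b1101000 0b1101001

# Timings vary by machine and Python version, so none are quoted here.
print(timeit(lambda: method_a(sample), number=10000))
print(timeit(lambda: method_b(sample), number=10000))
```

For non-ASCII input the two also diverge in content, since method_a works on code points and method_b on encoded bytes.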

0
Jul 31 '18 at 13:31

In Python 3.6 and later, you can use an f-string to format the result.

    st = "hello world"
    print(" ".join(f"{ord(i):08b}" for i in st))
    # 01101000 01100101 01101100 01101100 01101111 00100000 01110111 01101111 01110010 01101100 01100100
  • The left side of the colon, ord(i), is the actual object whose value will be formatted and inserted into the output. Using ord() gives the base-10 code point for a single str character.

  • The right side of the colon is the format specifier. 08 means width 8, padded with 0, and b is the presentation type that outputs the resulting number in base 2 (binary).
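The effect of each part of the format specifier can be seen in isolation. The following sketch uses illustrative variable names (code_point, plain, padded, wide) that are not from the answer above:

```python
code_point = ord("a")  # 97

plain = f"{code_point:b}"     # no width: only the digits that are needed
padded = f"{code_point:08b}"  # zero-padded to a minimum width of 8
print(plain)   # 1100001
print(padded)  # 01100001

# Width 8 is a minimum, not a limit: larger code points produce more digits.
wide = f"{ord('€'):08b}"
print(wide)    # 10000010101100
```

Because the width is only a minimum, characters beyond U+00FF yield more than 8 digits. Encode the string to bytes first if fixed 8-bit groups are required.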

0
Jun 20 '19 at 19:23