How to convert 8-bit Hebrew to utf-8 in python

I have Hebrew data such that \ xe0 is a Hebrew Aleph and want to convert it to utf-8

+3
source share
2 answers

In general, in Python, if you have a byte string, you need to use decoding first to convert it to an internal representation, after which you can encode it to UTF-8. Of course you need to know the encoding \xe0for this to work (I assume your character is encoded using ISO-8859-8):

'\xe0'.decode('iso-8859-8').encode('utf-8')

EDIT: Note:

Be sure to use the internal representation in your program as long as possible. In general: decoding first (at the input), encoding the latter (at the output).

+7

"", unicode

y = x.decode('iso8859-8')

x - 8- , y - utf-8 encode

z = y.encode('utf-8')
0

Source: https://habr.com/ru/post/1791795/


All Articles