Pass the Unicode character to ord() to get its code point, split that code point into individual bytes with int.to_bytes(), and then format the output however you want:
list(map(lambda b: hex(b)[2:], ord('\u4132').to_bytes(4, 'big')))
returns: ['0', '0', '41', '32']
list(map(lambda b: hex(b)[2:], ord('\N{PILE OF POO}').to_bytes(4, 'big')))
returns: ['0', '1', 'f4', 'a9']
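Note that hex(b)[2:] drops leading zeros, so a zero byte prints as '0' rather than '00'. If you want fixed-width output, here is a minimal sketch of the same approach wrapped in a helper. The name code_point_bytes and the width parameter are mine, purely for illustration:

```python
def code_point_bytes(ch, width=4):
    """Return the code point of `ch` as zero-padded hex bytes (big-endian).

    `code_point_bytes` and `width` are hypothetical names for this sketch.
    """
    # ord() gives the code point; to_bytes() splits it into `width` bytes.
    return [format(b, '02x') for b in ord(ch).to_bytes(width, 'big')]

print(code_point_bytes('\u4132'))           # ['00', '00', '41', '32']
print(code_point_bytes('\N{PILE OF POO}'))  # ['00', '01', 'f4', 'a9']
```

A width of 4 is more than enough, since the largest code point (U+10FFFF) fits in 3 bytes.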
As I mentioned in another comment, encoding to UTF-16 to get at the code point will not work properly for code points outside the BMP (the Basic Multilingual Plane), since UTF-16 requires a surrogate pair to encode those code points.
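To see the difference, here is a small sketch that prints the UTF-16 code units for the same two characters. The helper name utf16_units is just something I made up for this example:

```python
def utf16_units(ch):
    """Return the UTF-16 (big-endian) code units of `ch` as hex strings.

    `utf16_units` is a hypothetical helper for this sketch.
    """
    data = ch.encode('utf-16-be')  # the '-be' variant emits no BOM
    # Each UTF-16 code unit is 2 bytes wide.
    return [data[i:i + 2].hex() for i in range(0, len(data), 2)]

print(utf16_units('\u4132'))           # ['4132']
print(utf16_units('\N{PILE OF POO}'))  # ['d83d', 'dca9']
print(hex(ord('\N{PILE OF POO}')))     # 0x1f4a9
```

Inside the BMP the single code unit matches the code point, but for U+1F4A9 you get two surrogates instead, which is why ord() is the safer route.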
Danilo Souza Morães Jul 02 '18 at 19:43