You have binary data that is not ASCII encoded. The \xhh escape sequences indicate that your data is encoded using a different codec, and what you are seeing is the representation Python produces with the repr() function: a string that can be reused as a Python literal to recreate exactly the same value. This representation is very useful when debugging a program.
In other words, each \xhh escape sequence represents a single byte, where hh is the hexadecimal value of that byte. You have 4 bytes with the hexadecimal values C3, A7, C3, and B5 that do not map to printable ASCII characters, so Python uses the \xhh notation instead.
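As a rough sketch of that point, the snippet below inspects the same byte string. It is written to run unchanged on Python 2 and Python 3; the `b` prefix and the names `data` and `non_ascii` are my additions for illustration, not part of the question.

```python
import ast

# The byte string from the question. The b prefix keeps it a byte string
# on Python 3; on Python 2 the prefix is accepted and changes nothing.
data = b'Demais Subfun\xc3\xa7\xc3\xb5es 12'

# Each \xhh escape stands for exactly one byte: 18 printable ASCII
# characters plus 4 escaped bytes.
assert len(data) == 22

# bytearray yields integers on both Python 2 and Python 3.
non_ascii = [format(b, '02X') for b in bytearray(data) if b > 0x7F]
print(non_ascii)  # ['C3', 'A7', 'C3', 'B5']

# The repr() output is a valid literal that recreates the same value.
assert ast.literal_eval(repr(data)) == data
```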
You have UTF-8 data instead; decode it as such:
>>> 'Demais Subfun\xc3\xa7\xc3\xb5es 12'.decode('utf8')
u'Demais Subfun\xe7\xf5es 12'
>>> print 'Demais Subfun\xc3\xa7\xc3\xb5es 12'.decode('utf8')
Demais Subfunções 12
The C3 A7 bytes together encode U+00E7 LATIN SMALL LETTER C WITH CEDILLA, and the C3 B5 bytes encode U+00F5 LATIN SMALL LETTER O WITH TILDE.
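If you want to check that mapping yourself, something like the following should work. It uses the standard `unicodedata` module; the names `pairs` and `names` are only illustrative.

```python
import unicodedata

# The two 2-byte UTF-8 sequences found in the data.
pairs = [b'\xc3\xa7', b'\xc3\xb5']

names = []
for raw in pairs:
    char = raw.decode('utf8')  # two bytes decode to one code point
    names.append('U+%04X %s' % (ord(char), unicodedata.name(char)))

for line in names:
    print(line)
# U+00E7 LATIN SMALL LETTER C WITH CEDILLA
# U+00F5 LATIN SMALL LETTER O WITH TILDE
```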
ASCII is a subset of the UTF-8 codec, which is why all the other letters can be shown as they are in the Python repr() output.
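A minimal way to see the subset relationship, assuming the same byte string as above (the names `ascii_part` and `first_bad` are hypothetical):

```python
data = b'Demais Subfun\xc3\xa7\xc3\xb5es 12'
ascii_part = b'Demais Subfun'

# Pure ASCII bytes decode to the same text under either codec.
assert ascii_part.decode('ascii') == ascii_part.decode('utf8')

# The full data is not valid ASCII; decoding fails at the first
# byte above 0x7F.
first_bad = None
try:
    data.decode('ascii')
except UnicodeDecodeError as exc:
    first_bad = exc.start

print(first_bad)  # 13, the offset of the first \xc3 byte
```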