I would like to convert HTML entities back to their human-readable form, for example '&pound;' to '£', '&deg;' to '°', etc.
I read some posts on this subject:
Convert html source content to readable format using Python 2.x
Decode HTML entities in Python string?
Convert XML/HTML entities to Unicode string in Python
and based on them I decided to use the undocumented unescape() function, but it does not work for me...
My sample code looks like this:
    import HTMLParser

    htmlParser = HTMLParser.HTMLParser()
    decoded = htmlParser.unescape('&copy; 2013')
    print decoded
When I run this Python script, the output is still:
&copy; 2013
instead of
© 2013
I am using Python 2.x on Windows 7, running the script from the Cygwin console. I googled but did not find any similar problems. Can anyone help me with this?
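For reference, here is a minimal sketch of the behaviour I expect, written so that it should run on both Python 2.x and Python 3. The helper name decode_entities is hypothetical (it is not part of any library), and printing the repr() is only my assumption for sidestepping any console encoding differences between Windows and Cygwin.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

try:
    # Python 3.4+: documented function in the standard library
    from html import unescape
except ImportError:
    # Python 2.x: the undocumented HTMLParser method used in the question
    from HTMLParser import HTMLParser
    unescape = HTMLParser().unescape


def decode_entities(text):
    """Hypothetical helper: replace HTML entities in text with their characters."""
    return unescape(text)


# A unicode literal keeps the result type the same on both Python versions.
decoded = decode_entities(u'&copy; 2013')

# repr() shows the actual code points regardless of the console encoding.
print(repr(decoded))
```

If the repr shows the copyright character (code point U+00A9) followed by " 2013", then the decoding itself works and any remaining difference would be in how the console displays it.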