UnicodeEncodeError on Google App Engine

I am getting the very familiar:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xe8' in position 24: ordinal not in range(128)

I checked several posts on SO, and they recommend variable.encode('ascii', 'ignore').

However, this does not work. Even after that I get the same error.

Stack trace:

'ascii' codec can't encode character u'\x92' in position 18: ordinal not in range(128)
Traceback (most recent call last):
  File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/__init__.py", line 513, in __call__
    handler.post(*groups)
  File "/base/data/home/apps/autominer1/1.343038273644030157/siteinfo.py", line 2160, in post
    imageAltTags.append(str(image["alt"]))
UnicodeEncodeError: 'ascii' codec can't encode character u'\x92' in position 18: ordinal not in range(128)

The code responsible:

siteUrl = urlfetch.fetch("http://www." + domainName, headers={'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9b5) Gecko/2008032620 Firefox/3.0b5'})

webPage = siteUrl.content.decode('utf-8', 'replace').encode('ascii', 'replace')

htmlDom = BeautifulSoup(webPage)

imageTags = htmlDom.findAll('img', {'alt': True})

for image in imageTags:
    if len(image["alt"]) > 3:
        imageAltTags.append(str(image["alt"]))

Any help would be greatly appreciated. Thank you

1 answer

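In Python 2 there are two different string types: "str" and "unicode". They are related, but they are not the same thing. A str is a sequence of bytes, while a unicode object is a sequence of characters. You convert between them by encoding unicode to str with .encode() and decoding str to unicode with .decode().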

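Calling str() on a unicode object encodes it implicitly: Python uses the default encoding to turn the unicode into bytes. That default is ascii, which cannot represent any character with a code point of 128 or above.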
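A minimal sketch of the encode/decode round trip and of the failure, assuming Python 2.6+ or Python 3 (the u"" literals and method calls behave the same in both). The value `text` is hypothetical; u"\xe8" is the character from the first error message above.

```python
# Hypothetical text value containing a character outside the ascii range.
text = u"caf\xe8"

# Encoding turns text into bytes; decoding reverses it.
data = text.encode("utf-8")
round_trip = data.decode("utf-8")

# The ascii codec only covers code points 0-127, so it rejects u"\xe8".
# In Python 2, str(text) attempts this same ascii encoding implicitly.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```

The explicit text.encode("ascii") call stands in for what str() does behind the scenes in Python 2, which is why the traceback points at the str() line.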

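You have two options:

  • Keep 'imageAltTags' as a list of unicode strings and drop the str() call, which is then unnecessary.
  • Replace str(x) with x.encode() and an explicit encoding. Which encoding to use depends on what you do with the result, but utf-8 is the most likely choice: x.encode('utf-8').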
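Both options could look roughly like this. It is only a sketch: `alt_values` is a hypothetical stand-in for the image["alt"] values that BeautifulSoup returns in the question, and `image_alt_bytes` is an illustrative name that does not appear in the original code.

```python
# Hypothetical alt texts standing in for image["alt"]; the first one
# contains the u"\x92" character from the traceback.
alt_values = [u"Bob\x92s picture", u"logo", u"x"]

# Option 1: keep the values as unicode and do not call str() at all.
image_alt_tags = [alt for alt in alt_values if len(alt) > 3]

# Option 2: encode explicitly to UTF-8 when byte strings are required.
image_alt_bytes = [alt.encode("utf-8") for alt in alt_values if len(alt) > 3]
```

Option 1 defers any encoding decision to the point where the text is actually written out, while option 2 commits to UTF-8 bytes immediately.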

Source: https://habr.com/ru/post/1752497/

