Update:
I found the answer here: Python UnicodeDecodeError - Am I misunderstanding the encoding?
I needed to explicitly decode my input file to Unicode when I read it, because it contained characters that were neither ASCII nor already decoded into unicode objects. The encoding therefore failed as soon as it hit those characters.
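For reference, a minimal sketch of that fix under Python 2.7, assuming the input file is UTF-8 encoded ("page.txt" and my_list are hypothetical names, not from my actual script):

import codecs
import json

# codecs.open decodes the file as it is read, so each line comes back as a
# unicode object instead of a raw byte string.
with codecs.open("page.txt", "r", encoding="utf-8") as f:
    my_list = [line.rstrip(u"\n") for line in f]

# With genuine unicode input there are no stray bytes for the ascii codec to
# choke on, so this no longer raises UnicodeDecodeError.
dumped = json.dumps(my_list, ensure_ascii=False)  # returns a unicode object
print dumped.encode("utf-8")                      # encode explicitly for output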
Original question
So, I'm sure there's something simple here that I'm just not getting.
I have an array of unicode strings, some of which contain non-ASCII characters.
I want to encode this as JSON with
json.dumps(myList)
It gives an error
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb4 in position 13: ordinal not in range(128)
How am I supposed to do this? I tried setting the ensure_ascii parameter to both True and False, but neither fixes the problem.
I know I'm passing unicode strings to json.dumps, and I understand that JSON strings are meant to be unicode. Why doesn't it just handle this for me?
What am I doing wrong?
Update: Don Question reasonably suggested providing a stack trace. Here it is:
Traceback (most recent call last):
  File "importFiles.py", line 69, in <module>
    x = u"%s" % conv
  File "importFiles.py", line 62, in __str__
    return self.page.__str__()
  File "importFiles.py", line 37, in __str__
    return json.dumps(self.page(), ensure_ascii=False)
  File "/usr/lib/python2.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/usr/lib/python2.7/json/encoder.py", line 204, in encode
    return ''.join(chunks)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb4 in position 17: ordinal not in range(128)
Note that this is Python 2.7, and the error still occurs with ensure_ascii=False.
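For what it's worth, the failure at ''.join(chunks) looks consistent with the list containing a mix of real unicode objects and undecoded byte strings. A hedged reproduction sketch (the values here are invented, not my actual data):

import json

items = [u"caf\xe9", "abc\xb4"]  # first item is unicode, second is an undecoded str
json.dumps(items, ensure_ascii=False)
# Joining unicode and str chunks forces an implicit ascii decode of "abc\xb4":
# UnicodeDecodeError: 'ascii' codec can't decode byte 0xb4 ...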
Update 2: Andrew Walker's helpful link (in the comments) suggests I might be able to coerce my data into a convenient byte format before attempting to JSON-encode it, by doing something like:
data.encode("ascii","ignore")
Unfortunately, this causes the same error.
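My current (hedged) understanding of why: in Python 2, calling .encode() on a byte string implicitly decodes it with the ascii codec first, which is exactly the step that fails. Decoding explicitly before encoding avoids that; a sketch assuming the raw bytes happen to be latin-1:

data = "abc\xb4"                        # str (bytes) containing a non-ASCII byte

# data.encode("ascii", "ignore") behaves like
#     data.decode("ascii").encode("ascii", "ignore")
# and the implicit ascii decode is what raises the UnicodeDecodeError.

text = data.decode("latin-1")           # explicit decode -> unicode object
safe = text.encode("ascii", "ignore")   # non-ASCII characters dropped: 'abc'
print safe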