Unicode decoding error: how to skip invalid characters

Question

Is there a way to preprocess text files and skip these characters?

UnicodeDecodeError: 'utf8' codec can't decode byte 0xa1 in position 1395: invalid start byte

Maximus s Dec 12 '14 at 23:47

2 answers

I think your text file contains bytes that are not valid UTF-8, so it cannot be decoded with the "utf-8" codec.

Try using "ISO-8859-1" instead of "utf-8", e.g.:

   import sys

   # Python 2 only: reload(sys) restores setdefaultencoding(), which is removed at startup.
   reload(sys)
   sys.setdefaultencoding("ISO-8859-1")

   # put your code here
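
If you would rather not change the interpreter-wide default, the same idea can be applied to just the data you are decoding. This is only a minimal sketch, assuming Python 3 and assuming the data really is ISO-8859-1; the sample bytes are made up to reproduce the 0xa1 error from the question.

```python
# Made-up sample: 0xa1 cannot start a UTF-8 sequence, as in the error above.
raw = b"\xa1Hola!"

# Decoding as UTF-8 fails on this input.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# ISO-8859-1 maps every byte to a code point, so this decode cannot fail.
# Whether the result is the *intended* text depends on the file's real encoding.
text = raw.decode("iso-8859-1")
```

Because ISO-8859-1 never raises on any byte, a wrong guess about the encoding shows up as odd characters in the output rather than as an exception.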

Ve pham Dec 13 '14 at 7:20

Irshad bhat · Accepted Answer · 2014-12-13T00:00:43+0000

Try the following, where str stands for the byte string you read from the file:

str.decode('utf-8', errors='ignore')
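
For reference, here is roughly how that looks in Python 3, both on a byte string and when reading a file directly. It is a sketch under assumptions: the helper name read_text_skipping_invalid and the sample bytes are hypothetical, not from the question.

```python
import os
import tempfile


def read_text_skipping_invalid(path):
    """Hypothetical helper: read a file as UTF-8, silently dropping invalid bytes."""
    with open(path, encoding="utf-8", errors="ignore") as f:
        return f.read()


# Made-up sample containing one byte (0xa1) that is invalid in this position.
raw = b"caf\xa1 ok"

# errors="ignore" drops the offending byte.
decoded = raw.decode("utf-8", errors="ignore")

# errors="replace" keeps a visible marker (U+FFFD) instead of dropping it.
replaced = raw.decode("utf-8", errors="replace")

# Same behaviour when reading from a file: write the bytes, then read them back.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name
from_file = read_text_skipping_invalid(path)
os.remove(path)
```

Note that 'ignore' discards data without telling you; 'replace' may be the safer choice if you want to see where the bad bytes were.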