I'm trying to run a simple regular expression over a very large .txt file (a couple of million lines of text). The simplest code that reproduces the problem:
file = open("exampleFileName", "r")
for line in file:
    pass
Error message:
Traceback (most recent call last):
  File "example.py", line 34, in <module>
    example()
  File "example.py", line 16, in example
    for line in file:
  File "/usr/lib/python3.4/codecs.py", line 319, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 7332: invalid continuation byte
How can I fix this? Is UTF-8 the wrong encoding? And if so, how do I find out which one is right?
Thanks and best regards!
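One way to narrow this down is to try a few candidate encodings and see which one decodes the whole file without raising. The sketch below is only a starting point under assumptions not stated in the post: the helper name `first_working_encoding` and the candidate list are hypothetical, and the guess that the file uses a single-byte Western encoding rests only on the byte 0xed, which is "í" in Latin-1 and Windows-1252. Note that `latin-1` accepts every possible byte, so it must come last, and a clean decode does not prove the text is correct; it is worth inspecting a few decoded lines by eye.

```python
def first_working_encoding(path, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate encoding that decodes the whole file, or None.

    The candidate list is an assumption for illustration; adjust it to the
    encodings that are plausible for where the file came from.
    """
    for encoding in candidates:
        try:
            # Iterate line by line so a multi-million-line file is never
            # held in memory all at once.
            with open(path, "r", encoding=encoding) as handle:
                for _ in handle:
                    pass
        except UnicodeDecodeError:
            # This encoding cannot represent some byte sequence in the file.
            continue
        return encoding
    return None
```

The result can then be passed to the original loop, for example `open("exampleFileName", "r", encoding=first_working_encoding("exampleFileName"))`. If the exact encoding does not matter, `open("exampleFileName", "r", encoding="utf-8", errors="replace")` reads the file without raising and substitutes U+FFFD for every undecodable byte.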