I am trying to extract some data from a JSON file that contains tweets and write it to a CSV file. The file contains all kinds of characters, which I assume is why I get this error message:
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2026'
I think I need to convert the output to UTF-8 before writing the CSV file, but I could not get it to work. I found similar questions here on Stack Overflow, but could not adapt the solutions to my problem. (I must add that I am not very familiar with Python; I am a sociologist, not a programmer.)
import csv
import json

fieldnames = ['id', 'text']

with open('MY_SOURCE_FILE', 'r') as f, open('MY_OUTPUT', 'a') as out:
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, delimiter=',', quoting=csv.QUOTE_ALL)
    for line in f:
        tweet = json.loads(line)
        user = tweet['user']
        output = {
            'text': tweet['text'],
            'id': tweet['id'],
        }
        writer.writerow(output)
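
For illustration, here is a minimal sketch of the "convert the output to UTF-8" idea, assuming Python 3, where `open` accepts an `encoding` argument. The `u'\u2026'` in the error message suggests the script above runs under Python 2, where the `csv` module does not handle unicode text directly and each text value would instead have to be encoded with `.encode('utf-8')` before writing. The function name `tweets_to_csv` is hypothetical, and the source file is assumed to hold one JSON object per line, as the loop above implies.

```python
import csv
import json

FIELDNAMES = ['id', 'text']


def tweets_to_csv(source_path, output_path):
    """Hypothetical helper: copy each tweet's id and text into a CSV file."""
    # Assumption: Python 3. Stating encoding='utf-8' on both files means the
    # default (possibly ASCII) codec is never used; newline='' lets the csv
    # module write its own line endings.
    with open(source_path, 'r', encoding='utf-8') as f, \
            open(output_path, 'a', encoding='utf-8', newline='') as out:
        writer = csv.DictWriter(
            out, fieldnames=FIELDNAMES, delimiter=',', quoting=csv.QUOTE_ALL)
        for line in f:
            if not line.strip():
                continue  # skip blank lines between tweets
            tweet = json.loads(line)
            writer.writerow({'id': tweet['id'], 'text': tweet['text']})
```

It would be called as `tweets_to_csv('MY_SOURCE_FILE', 'MY_OUTPUT')`. Because the output file is opened in append mode, as in the original script, running it twice adds the rows twice.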