Python ASCII codec cannot encode character error while writing to CSV

I'm not sure how to handle this error. I suspect I need to add .encode('utf-8') somewhere, but I don't know where to apply it.

Error:

    line 40, in <module>
        writer.writerows(list_of_rows)
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 17: ordinal not in range(128)

This is the core of my Python script:

    import csv

    import requests
    from BeautifulSoup import BeautifulSoup

    url = 'https://dummysite'
    response = requests.get(url)
    html = response.content
    soup = BeautifulSoup(html)
    table = soup.find('table', {'class': 'table'})

    # Collect the text of every cell, skipping the header row
    list_of_rows = []
    for row in table.findAll('tr')[1:]:
        list_of_cells = []
        for cell in row.findAll('td'):
            text = cell.text.replace('[','').replace(']','')
            list_of_cells.append(text)
        list_of_rows.append(list_of_cells)

    outfile = open("./test.csv", "wb")
    writer = csv.writer(outfile)
    writer.writerow(["Name", "Location"])
    writer.writerows(list_of_rows)
2 answers

The Python 2.x csv module does not support Unicode. You have three options, in order of difficulty:

  • (Struck out; see the edit below.) Use the fixed library https://github.com/jdunck/python-unicodecsv (pip install unicodecsv). It works as a drop-in replacement. Example:

     import unicodecsv

     with open("myfile.csv", 'rb') as my_file:
         r = unicodecsv.DictReader(my_file, encoding='utf-8')

  1. Read the CSV manual regarding Unicode: https://docs.python.org/2/library/csv.html (see examples below)

  2. Manually encode each element as UTF-8:

     for cell in row.findAll('td'):
         text = cell.text.replace('[','').replace(']','')
         list_of_cells.append(text.encode("utf-8"))
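To illustrate option 2, here is a minimal sketch of why the write fails and what encoding each cell changes. The `encode_row` helper and the sample row are hypothetical and not part of the original script; on Python 2, the encoded rows are what you would pass to `writer.writerows`.

```python
def encode_row(cells, encoding="utf-8"):
    """Return a copy of `cells` with every text value encoded to bytes.

    Hypothetical helper for illustration only.
    """
    return [cell.encode(encoding) for cell in cells]


# Hypothetical sample row; the first cell contains an en dash (U+2013).
row = [u"2010\u20132012", u"Paris"]

# Encoding with the ASCII codec fails on the en dash,
# which is the same failure the traceback reports ...
try:
    row[0].encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# ... whereas UTF-8 can represent the character.
encoded = encode_row(row)
```

This only makes sense on Python 2, where the csv writer expects byte strings. On Python 3 the writer expects text, so the cells should be left unencoded.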

Edit: I found that python-unicodecsv is also broken when reading UTF-16; it complains about any 0x00 bytes.

Instead, use https://github.com/ryanhiebert/backports.csv, which more closely resembles the Python 3 implementation and uses the io module.

Installation:

 pip install backports.csv 

Usage:

    from backports import csv
    import io

    with io.open(filename, encoding='utf-8') as f:
        r = csv.reader(f)

Besides Alastair's great suggestions, the simplest option I found was to use Python 3 instead of Python 2. The only change my script needed was replacing wb with w in the open statement, since Python 3's csv module expects a file opened in text mode.
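As a rough sketch of the Python 3 approach, the write step might look like the following. The sample rows and the temporary file path are made up for illustration, and the explicit `encoding="utf-8"` and `newline=""` arguments are additions beyond the one-character change described above: the first avoids depending on the platform's default encoding, and the second is what the csv documentation recommends for files passed to the writer.

```python
import csv
import os
import tempfile

# Hypothetical sample rows; one cell contains an en dash (U+2013).
list_of_rows = [["Alice", "London \u2013 UK"], ["Bob", "Paris"]]

# Write to a temporary directory so the sketch leaves no stray files behind.
path = os.path.join(tempfile.mkdtemp(), "test.csv")

# Text mode ("w" rather than "wb"), with an explicit encoding.
with open(path, "w", newline="", encoding="utf-8") as outfile:
    writer = csv.writer(outfile)
    writer.writerow(["Name", "Location"])
    writer.writerows(list_of_rows)

# Read the file back to check that the en dash survived the round trip.
with open(path, newline="", encoding="utf-8") as infile:
    rows_read = list(csv.reader(infile))
```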


Source: https://habr.com/ru/post/1232920/

