b'Recode from ANSI 1252 to UTF-8 failed with the error: "Invalid argument".' when reading a shapefile in Python

I am trying to read a shapefile into a GeoDataFrame.

Usually I just do this and it works:

    import pandas as pd
    import geopandas as gpd
    from shapely.geometry import Point

    df = gpd.read_file("wild_fires/nbac_2016_r2_20170707_1114.shp")

But this time it throws an error: b'Recode from ANSI 1252 to UTF-8 failed with the error: "Invalid argument".'

Full error:

    ---------------------------------------------------------------------------
    CPLE_AppDefinedError                      Traceback (most recent call last)
    <ipython-input-14-adcad0275d30> in <module>()
    ----> 1 df_wildfires_2016 = gpd.read_file("wild_fires/nbac_2016_r2_20170707_1114.shp")

    /usr/local/lib/python3.6/site-packages/geopandas/io/file.py in read_file(filename, **kwargs)
         19     """
         20     bbox = kwargs.pop('bbox', None)
    ---> 21     with fiona.open(filename, **kwargs) as f:
         22         crs = f.crs
         23         if bbox is not None:

    /usr/local/lib/python3.6/site-packages/fiona/__init__.py in open(path, mode, driver, schema, crs, encoding, layer, vfs, enabled_drivers, crs_wkt)
        163             c = Collection(path, mode, driver=driver, encoding=encoding,
        164                            layer=layer, vsi=vsi, archive=archive,
    --> 165                            enabled_drivers=enabled_drivers)
        166         elif mode == 'w':
        167             if schema:

    /usr/local/lib/python3.6/site-packages/fiona/collection.py in __init__(self, path, mode, driver, schema, crs, encoding, layer, vsi, archive, enabled_drivers, crs_wkt, **kwargs)
        151             if self.mode == 'r':
        152                 self.session = Session()
    --> 153                 self.session.start(self)
        154             elif self.mode in ('a', 'w'):
        155                 self.session = WritingSession()

    fiona/ogrext.pyx in fiona.ogrext.Session.start (fiona/ogrext2.c:8432)()

    fiona/_err.pyx in fiona._err.GDALErrCtxManager.__exit__ (fiona/_err.c:1861)()

    CPLE_AppDefinedError: b'Recode from ANSI 1252 to UTF-8 failed with the error: "Invalid argument".'

I spent a while trying to find out why I get this error, but could not find an answer.

The data comes from this web page; I downloaded only the file behind the 2016 link: http://cwfis.cfs.nrcan.gc.ca/datamart/download/nbac?token=78e9bd6af67f71204e18cb6fa4e47515

Can anyone help me? Thanks.

+6
4 answers

It looks like your shapefile contains non-UTF-8 characters, which causes the fiona.open() call to fail (geopandas uses Fiona to open files).
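Before converting anything, it may be worth checking which encoding the shapefile itself declares. Shapefiles often ship with a .cpg sidecar file that names the encoding of the attribute table, and the "ANSI 1252" in the error message may come from there. The sketch below uses only the standard library; declared_encoding is a hypothetical helper name, and whether a .cpg file exists for this particular dataset is an assumption.

```python
from pathlib import Path
from typing import Optional


def declared_encoding(shp_path: str) -> Optional[str]:
    """Return the text of the .cpg sidecar next to a shapefile, or None."""
    base = Path(shp_path)
    # Try both spellings, since the suffix case matters on some file systems.
    for suffix in (".cpg", ".CPG"):
        cpg_path = base.with_suffix(suffix)
        if cpg_path.is_file():
            # The file is normally a single short ASCII line; errors="replace"
            # keeps this helper from crashing on unexpected bytes.
            return cpg_path.read_text(encoding="ascii", errors="replace").strip()
    return None


# Example call (path taken from the question):
# print(declared_encoding("wild_fires/nbac_2016_r2_20170707_1114.shp"))
```

If this prints a label that is not a conventional encoding name, passing an encoding explicitly when opening the file is a reasonable thing to try.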

What I did to solve this error was to open the shapefile (for example, with QGIS), then select "Save As" and set the Encoding option to "UTF-8".

After that, I did not get an error when calling df = gpd.read_file("convertedShape.shp").


Another way to do this without using QGIS or similar is to read the shapefile and save it again (effectively converting it to the correct format). With OGR, you can do something like this:

    from osgeo import ogr

    driver = ogr.GetDriverByName("ESRI Shapefile")
    ds = driver.Open("nbac_2016_r2_20170707_1114.shp", 0)  # open your shapefile

    # get its layer
    layer = ds.GetLayer()

    # create new shapefile to convert
    ds2 = driver.CreateDataSource('convertedShape.shp')
    # create a Polygon layer, as the one your Shapefile has
    layer2 = ds2.CreateLayer('', None, ogr.wkbPolygon)

    # iterate over all features of your original shapefile
    for feature in layer:
        # and create a new feature on your converted shapefile with those features
        layer2.CreateFeature(feature)

    # drop the references so the datasets are closed and written to disk
    ds = layer = ds2 = layer2 = None

After this conversion, the file also opened successfully with df = gpd.read_file("convertedShape.shp"). Hope this helps.

+6
    import fiona

    with fiona.open(file, encoding="UTF-8") as f:
        ...  # read features from f as usual

worked for me.

+4

Since you have GDAL installed, I recommend converting the file to UTF-8 using the CLI:

    ogr2ogr output.shp input.shp -lco ENCODING=UTF-8

Worked for me like a charm. It is much faster than QGIS or Python, and can be used in a clustered environment.
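For more than one file, the same command can be driven from a small script. This is only a sketch: it assumes ogr2ogr is on the PATH, and build_ogr2ogr_command and convert_directory are hypothetical helper names, not part of GDAL.

```python
import subprocess
from pathlib import Path
from typing import List


def build_ogr2ogr_command(src: str, dst: str) -> List[str]:
    """Build the ogr2ogr call shown above as an argument list."""
    return ["ogr2ogr", dst, src, "-lco", "ENCODING=UTF-8"]


def convert_directory(in_dir: str, out_dir: str) -> List[str]:
    """Convert every .shp in in_dir, writing same-named files into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for shp in sorted(Path(in_dir).glob("*.shp")):
        target = out / shp.name
        # check=True raises CalledProcessError if ogr2ogr exits non-zero.
        subprocess.run(build_ogr2ogr_command(str(shp), str(target)), check=True)
        written.append(str(target))
    return written
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.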

0

As a complement to this answer, you can pass Fiona arguments through geopandas' read_file:

    df = gpd.read_file("filename", encoding="utf-8")
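If it is not clear in advance which encoding will work, one option is to try a few candidates in turn. The helper below is a hypothetical sketch, not part of geopandas or Fiona: it takes any reader that accepts an encoding keyword (gpd.read_file, as used above, is assumed to be one) and returns the first result that does not raise. The candidate list is only an example.

```python
from typing import Any, Callable, Iterable, Optional


def read_with_fallback(
    reader: Callable[..., Any],
    path: str,
    encodings: Iterable[Optional[str]] = (None, "utf-8", "cp1252"),
) -> Any:
    """Call reader(path, encoding=...) per candidate until one succeeds."""
    last_error: Optional[Exception] = None
    for enc in encodings:
        # None means "call the reader without an explicit encoding".
        kwargs = {} if enc is None else {"encoding": enc}
        try:
            return reader(path, **kwargs)
        except Exception as exc:  # broad on purpose: no dependency on Fiona's error classes
            last_error = exc
    if last_error is None:
        raise ValueError("no encodings were given")
    raise last_error


# Example call (requires geopandas; path taken from the question):
# df = read_with_fallback(gpd.read_file, "wild_fires/nbac_2016_r2_20170707_1114.shp")
```

A read that succeeds does not prove the encoding is right, so it is worth checking a few attribute values that contain accented characters afterwards.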
0

Source: https://habr.com/ru/post/1273288/

