Pandas convert string to int

I have a large DataFrame with ID numbers:

ID.head()
Out[64]: 
0    4806105017087
1    4806105017087
2    4806105017087
3    4901295030089
4    4901295030089

These are all strings at the moment.

I want to convert them to int without using loops; for that I use ID.astype(int).

The problem is that some of my rows contain dirty data that cannot be converted to int, for example,

ID[154382]
Out[58]: 'CN414149'

How can I (without using loops) remove these kinds of occurrences so that I can use astype with peace of mind?

1 answer

You need to pass the parameter errors='coerce' to the function to_numeric:

ID = pd.to_numeric(ID, errors='coerce')

If ID is a column:

df.ID = pd.to_numeric(df.ID, errors='coerce')

Non-numeric values are converted to NaN, so the resulting values are all float.

An int column cannot hold NaN, so to get int you need to replace NaN with some value, e.g. 0, and then cast to int:

df.ID = pd.to_numeric(df.ID, errors='coerce').fillna(0).astype(np.int64)
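
If 0 could be mistaken for a real ID, one alternative is the pandas nullable integer dtype, which keeps missing values as <NA> instead of forcing a placeholder. This is only a minimal sketch, assuming a pandas version that supports the 'Int64' extension dtype (0.24 or later); the names ids and result are illustrative and not part of the question:

```python
import pandas as pd

# Illustrative data: two valid IDs and one dirty value.
ids = pd.Series(['4806105017087', '4806105017087', 'CN414149'])

# Coerce invalid strings to NaN, then cast to the nullable integer dtype.
# Note the capital "I": 'Int64' is the pandas extension dtype, not numpy's int64.
result = pd.to_numeric(ids, errors='coerce').astype('Int64')
print(result)
```

The values still pass through float64 on the way, so this is exact only for integers up to 2**53; the 13-digit IDs here are well within that range.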

Sample:

import numpy as np
import pandas as pd

df = pd.DataFrame({'ID':['4806105017087','4806105017087','CN414149']})
print (df)
              ID
0  4806105017087
1  4806105017087
2       CN414149

print (pd.to_numeric(df.ID, errors='coerce'))
0    4.806105e+12
1    4.806105e+12
2             NaN
Name: ID, dtype: float64

df.ID = pd.to_numeric(df.ID, errors='coerce').fillna(0).astype(np.int64)
print (df)
              ID
0  4806105017087
1  4806105017087
2              0
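
The question asks about removing the bad rows rather than replacing them. A possible variation on the same idea is to coerce first and then drop the rows that became NaN. This is a sketch under that assumption; numeric_id and clean are hypothetical names introduced for illustration:

```python
import pandas as pd

df = pd.DataFrame({'ID': ['4806105017087', '4806105017087', 'CN414149']})

# Invalid strings become NaN after coercion.
numeric_id = pd.to_numeric(df['ID'], errors='coerce')

# Swap in the numeric column, drop rows where conversion failed,
# then cast the remaining values to a 64-bit integer.
clean = (
    df.assign(ID=numeric_id)
      .dropna(subset=['ID'])
      .astype({'ID': 'int64'})
)
print(clean)
```

Dropping changes the row count but keeps the original index labels, so a reset_index(drop=True) may be wanted afterwards if later code relies on a contiguous index.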

Source: https://habr.com/ru/post/1015652/

