I think you need to_numeric, because a string that holds a float value (such as '17.44') cannot be cast directly to int:
data_df['grade'] = pd.to_numeric(data_df['grade']).astype(int)
Another solution is to cast to float first, and then to int:
data_df['grade'] = data_df['grade'].astype(float).astype(int)
Example:
import pandas as pd

data_df = pd.DataFrame({'grade':['10','20','17.44']})
print (data_df)
   grade
0     10
1     20
2  17.44

data_df['grade'] = pd.to_numeric(data_df['grade']).astype(int)
print (data_df)
   grade
0     10
1     20
2     17

data_df['grade'] = data_df['grade'].astype(float).astype(int)
print (data_df)
   grade
0     10
1     20
2     17
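To make the reason for the two-step conversion concrete, here is a minimal sketch that contrasts the direct cast with both approaches above. It uses a standalone Series named grades (an illustrative name, not from the question) holding the same sample values.

```python
import pandas as pd

# Illustrative data: the same strings as in the example above.
grades = pd.Series(['10', '20', '17.44'])

# A direct cast fails, because '17.44' is not a valid integer literal.
try:
    grades.astype(int)
    direct_cast_failed = False
except ValueError:
    direct_cast_failed = True

# Both two-step conversions succeed; astype(int) truncates 17.44 to 17.
via_to_numeric = pd.to_numeric(grades).astype(int)
via_float = grades.astype(float).astype(int)

print(direct_cast_failed)
print(via_to_numeric.tolist())
print(via_float.tolist())
```

Note that astype(int) truncates toward zero rather than rounding, so if rounding is what you want, call round() before the final cast.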
---
If some values cannot be converted, to_numeric raises an error:
ValueError: Unable to parse string
In that case you can add the parameter errors='coerce' to convert non-numeric values to NaN.
But if some values are NaN, the column cannot be cast to int, because NaN is a float value, see the docs:
data_df = pd.DataFrame({'grade':['10','20','17.44', 'aa']})
print (data_df)
   grade
0     10
1     20
2  17.44
3     aa

data_df['grade'] = pd.to_numeric(data_df['grade'], errors='coerce')
print (data_df)
   grade
0  10.00
1  20.00
2  17.44
3    NaN
If you want to replace NaN with some number, for example 0, use fillna:
# parentheses allow the method chain to span several lines
data_df['grade'] = (pd.to_numeric(data_df['grade'], errors='coerce')
                      .fillna(0)
                      .astype(int))
print (data_df)
   grade
0     10
1     20
2     17
3      0
A little tip:
Before using errors='coerce', check all rows that cannot be converted to numeric, using boolean indexing:
print (data_df[pd.to_numeric(data_df['grade'], errors='coerce').isnull()])
   grade
3     aa
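The check and the conversion can be combined into one step. The following is only a sketch: coerce_to_int, its parameters and the default fill_value=0 are hypothetical names chosen for illustration, not part of pandas.

```python
import pandas as pd

def coerce_to_int(df, column, fill_value=0):
    """Hypothetical helper: return (converted copy, rows that failed to parse)."""
    numeric = pd.to_numeric(df[column], errors='coerce')
    # Rows that became NaN; this also includes values that were already missing.
    bad_rows = df[numeric.isnull()]
    out = df.copy()
    out[column] = numeric.fillna(fill_value).astype(int)
    return out, bad_rows

data_df = pd.DataFrame({'grade': ['10', '20', '17.44', 'aa']})
converted, bad_rows = coerce_to_int(data_df, 'grade')

print(bad_rows)
print(converted)
```

Returning the unparsable rows next to the converted frame lets you see what was silently replaced by the fill value before you rely on the result.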