Pandas: Convert to Numeric, Create NaN if necessary

Say I have a column in a data frame that contains some numbers and some non-numeric values:

 >> df['foo']
 0      0.0
 1    103.8
 2    751.1
 3      0.0
 4      0.0
 5        -
 6        -
 7      0.0
 8        -
 9      0.0
 Name: foo, Length: 9, dtype: object

How can I convert this column to np.float, with everything that is not a float converted to NaN?

When I try:

 >> df['foo'].astype(np.float) 

or

 >> df['foo'].apply(np.float) 

I get ValueError: could not convert string to float: -

+34
python pandas
Aug 25 '13 at 22:04
4 answers

In pandas 0.17.0, convert_objects raises a warning:

FutureWarning: convert_objects is deprecated. Use data type specific converters pd.to_datetime, pd.to_timedelta and pd.to_numeric.

You can use the pd.to_numeric function and apply it to the data frame, passing 'coerce' as the errors argument.

 df1 = df.apply(pd.to_numeric, args=('coerce',)) 

or, perhaps more clearly, with a keyword argument:

 df1 = df.apply(pd.to_numeric, errors='coerce') 

EDIT

The above method is valid only for pandas versions >= 0.17.0. From the What's New in Pandas 0.17.0 document:

pd.to_numeric is a new function to coerce strings to numbers (possibly with coercion) (GH11133)
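
As an illustration of the data-frame-wide form above, here is a minimal sketch. The frame and its column names (foo, bar) are made up for the example, and it assumes the non-numeric entries are stored as strings such as '-':

```python
import pandas as pd

# Hypothetical frame: every column holds strings, some of them not numeric.
df = pd.DataFrame({
    "foo": ["0.0", "103.8", "-", "751.1"],
    "bar": ["1", "-", "3", "4"],
})

# Apply pd.to_numeric column by column; entries that cannot be parsed become NaN.
df1 = df.apply(pd.to_numeric, errors="coerce")

print(df1.dtypes)
print(df1)
```

Note that this converts every column, so any column that is genuinely textual would end up entirely NaN; in that case it is probably safer to convert only the columns you need.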

+52
Nov 19 '15 at 5:25

Use the convert_objects method with convert_numeric=True:

 In [11]: s
 Out[11]:
 0    103.8
 1    751.1
 2      0.0
 3      0.0
 4        -
 5        -
 6      0.0
 7        -
 8      0.0
 dtype: object

 In [12]: s.convert_objects(convert_numeric=True)
 Out[12]:
 0    103.8
 1    751.1
 2      0.0
 3      0.0
 4      NaN
 5      NaN
 6      0.0
 7      NaN
 8      0.0
 dtype: float64

Note: this is also available as a DataFrame method.

+31
Aug 25 '13 at 22:33

First replace the '-' placeholder strings with None to mark them as missing values, and then convert the column to float.

 # Use .loc rather than chained indexing so the assignment hits the original frame
 df.loc[df['foo'] == '-', 'foo'] = None
 df['foo'] = df['foo'].astype(float)
+8
Aug 25 '13 at 22:08

You can just call pd.to_numeric directly on the column with errors='coerce', without using apply:

 df['foo'] = pd.to_numeric(df['foo'], errors='coerce') 
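
For reference, a small self-contained sketch of this on data shaped like the question's column. It assumes the values are stored as strings, which the question does not state explicitly:

```python
import pandas as pd

# Sample data modelled on the question; storing the values as strings is an assumption.
df = pd.DataFrame({
    "foo": ["0.0", "103.8", "751.1", "0.0", "0.0", "-", "-", "0.0", "-", "0.0"],
})

# errors='coerce' turns anything that cannot be parsed as a number into NaN.
df["foo"] = pd.to_numeric(df["foo"], errors="coerce")

# Count how many entries were coerced to NaN (the three '-' placeholders here).
n_missing = int(df["foo"].isna().sum())

print(df["foo"])
print(n_missing)
```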
+6
Nov 28 '17 at 23:13


