Convert Pandas Dataframe Types

I have a pandas DataFrame created from a MySQL query that returns the data as object dtype.

The data is mostly numeric, with some values of "na".

How can I cast the DataFrame's types so that the numeric values are correctly typed (float) and the "na" values are represented as NumPy NaN values?

3 answers

Use the replace method on the DataFrame:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'k1': ['na'] * 3 + ['two'] * 4,
                       'k2': [1, 'na', 2, 'na', 3, 4, 4]})
    print(df)

    # replace returns a new DataFrame, so assign the result back
    df = df.replace('na', np.nan)
    print(df)

I think it's useful to point out that df.replace('na', np.nan) on its own will not work. You must assign the result back to the existing DataFrame.
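To illustrate the point about assigning the result back, here is a minimal sketch. The single-column frame and the variable names (still_has_na, replaced, n_missing) are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'k2': [1, 'na', 2, 'na', 3, 4, 4]})

# Calling replace without keeping the result leaves df untouched.
df.replace('na', np.nan)
still_has_na = (df['k2'] == 'na').sum()

# Assigning the result (back to df, or to a new name) keeps the change.
replaced = df.replace('na', np.nan)
n_missing = replaced['k2'].isna().sum()

print(still_has_na, n_missing)
```

Both counts come out as 2: the original frame still holds the two "na" strings, while the reassigned result holds two NaN values in their place.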


df = df.convert_objects(convert_numeric=True) will work in most cases.

It should be noted that this copies the data. It would be preferable to get a numeric type on the initial read. If you post your code and a small example, someone can help you with that.
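As a rough sketch of getting a numeric type on the initial read, the conversion can be pushed into the SQL query itself. Everything here is an assumption for illustration: an in-memory SQLite database stands in for MySQL, and the table name readings and its contents are hypothetical. The exact CAST target type depends on the database, so the query would likely need adjusting for MySQL.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for the MySQL source (assumption).
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE readings (k2 TEXT)")  # hypothetical table
conn.executemany("INSERT INTO readings (k2) VALUES (?)",
                 [('1',), ('na',), ('2.5',), ('na',)])

# NULLIF turns 'na' into NULL; CAST makes the remaining text numeric.
query = "SELECT CAST(NULLIF(k2, 'na') AS REAL) AS k2 FROM readings"
df = pd.read_sql_query(query, conn)
conn.close()

print(df.dtypes)
```

With this approach the NULLs arrive as NaN and the column should already be float64, so no second conversion pass over the DataFrame is needed.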


This is what Tom suggested, and it is correct.

    In [134]: s = pd.Series(['1','2.','na'])

    In [135]: s.convert_objects(convert_numeric=True)
    Out[135]:
    0     1
    1     2
    2   NaN
    dtype: float64

As Andy points out, this does not work directly when the Series mixes strings with non-string values (I think this is a bug), so first convert all elements to strings, and then convert:

    In [136]: s2 = pd.Series(['1','2.','na',5])

    In [138]: s2.astype(str).convert_objects(convert_numeric=True)
    Out[138]:
    0     1
    1     2
    2   NaN
    3     5
    dtype: float64
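Note that convert_objects was deprecated and later removed from pandas (it is not available in pandas 1.0 and later). On newer versions, a possible equivalent is pd.to_numeric with errors='coerce'. This is a sketch of that alternative, not what the answers above used. The names converted and df_numeric are made up for illustration.

```python
import pandas as pd

# Same mixed Series as above: strings plus a plain integer.
s2 = pd.Series(['1', '2.', 'na', 5])

# errors='coerce' turns anything that cannot be parsed as a number into NaN.
converted = pd.to_numeric(s2, errors='coerce')

# For a whole DataFrame, apply the same conversion column by column.
df = pd.DataFrame({'k2': [1, 'na', 2, 'na', 3, 4, 4]})
df_numeric = df.apply(pd.to_numeric, errors='coerce')

print(converted)
print(df_numeric.dtypes)
```

In this sketch the mixed Series converts without the astype(str) step, and both results should come out as float64 with NaN where "na" was.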

Source: https://habr.com/ru/post/1489580/

