Pandas Fillna multiple columns in each column mode

Question

Working with census data, I want to replace the NaNs in two columns ("workclass" and "native-country") with each column's own mode. I can easily get the modes:

mode = df.filter(["workclass", "native-country"]).mode()

which returns a dataframe:

    workclass native-country
  0   Private  United-States

but

 df.filter(["workclass", "native-country"]).fillna(mode)

does not replace the NaNs in either column with anything at all, let alone with that column's mode. Is there a clean way to do this?

+5

python numpy pandas data-science

Nick Mar 18 '17 at 4:42

2 answers

You can do it as follows:

 df[["workclass", "native-country"]]=df[["workclass", "native-country"]].fillna(value=mode.iloc[0])

For instance,

 import pandas as pd

 d = {
     'key3': [1,4,4,4,5],
     'key2': [6,6,4],
     'key1': [6,4,4],
 }
 df = pd.DataFrame.from_dict(d, orient='index').transpose()

Then df is

    key3 key2 key1
  0    1    6    6
  1    4    6    4
  2    4    4    4
  3    4  NaN  NaN
  4    5  NaN  NaN

Then by doing:

 l=df.filter(["key1", "key2"]).mode()
 df[["key1", "key2"]]=df[["key1", "key2"]].fillna(value=l.iloc[0])

we get that df is

    key3 key2 key1
  0    1    6    6
  1    4    6    4
  2    4    4    4
  3    4    6    4
  4    5    6    4
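
To see why the .iloc[0] matters, here is a minimal sketch on a small made-up frame (illustration data only, not the census set from the question). As far as I understand fillna, passing a DataFrame aligns on row labels as well as column names, and the mode frame only has row label 0, so only row 0 can ever be filled. Passing the first row as a Series aligns on column names alone.

```python
import numpy as np
import pandas as pd

# Made-up illustration data; the values are assumptions, not the census set.
df = pd.DataFrame({
    "workclass": ["Private", np.nan, "Private", np.nan],
    "native-country": [np.nan, "United-States", np.nan, "United-States"],
})
cols = ["workclass", "native-country"]

# mode() returns a DataFrame; here it has a single row with index label 0.
mode = df[cols].mode()

# DataFrame as fill value: aligned on row labels AND column names,
# so only a NaN sitting in row 0 gets replaced.
by_frame = df[cols].fillna(mode)

# Series as fill value: aligned on column names only,
# so every NaN in a column gets that column's mode.
by_series = df[cols].fillna(mode.iloc[0])

print(by_frame.isna().sum().sum())   # NaNs left when filling with the frame
print(by_series.isna().sum().sum())  # NaNs left when filling with the Series
```

In the question's data, row 0 presumably had no NaN in either column, which would explain why fillna(mode) appeared to do nothing.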

+2

Miriam farber Mar 18 '17 at 4:56

jezrael · Accepted Answer · 2017-03-18T06:26:52+0000

If you want to impute missing values with the mode in some columns of a dataframe df, you can simply call fillna with a Series, created by selecting the first row of the mode result by position with iloc:

 cols = ["workclass", "native-country"]
 df[cols]=df[cols].fillna(df.mode().iloc[0])

Or:

 df[cols]=df[cols].fillna(mode.iloc[0])

Or, with filter:

 df[cols]=df.filter(cols).fillna(mode.iloc[0])

Example:

 import numpy as np
 import pandas as pd

 df = pd.DataFrame({'workclass':['Private','Private',np.nan, 'another', np.nan],
                    'native-country':['United-States',np.nan,'Canada',np.nan,'United-States'],
                    'col':[2,3,7,8,9]})
 print (df)
    col native-country workclass
 0    2  United-States   Private
 1    3            NaN   Private
 2    7         Canada       NaN
 3    8            NaN   another
 4    9  United-States       NaN

 mode = df.filter(["workclass", "native-country"]).mode()
 print (mode)
   workclass native-country
 0   Private  United-States

 cols = ["workclass", "native-country"]
 df[cols]=df[cols].fillna(df.mode().iloc[0])
 print (df)
    col native-country workclass
 0    2  United-States   Private
 1    3  United-States   Private
 2    7         Canada   Private
 3    8  United-States   another
 4    9  United-States   Private
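
If this comes up repeatedly, the same idea can be wrapped in a small function. The following is only a sketch under a few assumptions: fill_with_mode is a hypothetical helper name (not a pandas API), the demo frame is made up, and the comments on ties and empty results reflect my understanding of how mode() behaves.

```python
import numpy as np
import pandas as pd


def fill_with_mode(frame, columns):
    """Hypothetical helper: return a copy of `frame` in which NaNs in
    `columns` are replaced by each column's mode."""
    out = frame.copy()
    modes = out[columns].mode()
    # If every selected column is entirely NaN, mode() yields no rows and
    # .iloc[0] would raise IndexError, so leave the copy unchanged.
    if modes.empty:
        return out
    # When a column has tied modes, mode() returns several sorted rows;
    # .iloc[0] then picks the first (smallest) one for that column.
    out[columns] = out[columns].fillna(modes.iloc[0])
    return out


# Made-up demo data: "a" has a tie between 1 and 2, "c" is not selected.
demo = pd.DataFrame({
    "a": [1, 1, 2, 2, np.nan],
    "b": ["x", np.nan, "x", "y", np.nan],
    "c": [np.nan, 1, 2, 3, 4],
})
filled = fill_with_mode(demo, ["a", "b"])
print(filled)
```

Returning a copy instead of assigning into df is a design choice for the sketch; the in-place assignment shown in the answer works just as well when mutating df is acceptable.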
