Remove NaN cells without dropping the entire row (Pandas, Python 3)

Question


I have a DataFrame like this:

  Word    Word2         Word3
  Hello   NaN           NaN
  My      My Name       NaN
  Yellow  Yellow Bee    Yellow Bee Hive
  Golden  Golden Gates  NaN
  Yellow  NaN           NaN

What I was hoping for was to remove all NaN cells from my DataFrame, so that the remaining values in each column move up. In the end it should look like this, with "Yellow Bee Hive" moved up to row 1 (similar to what happens when you delete cells from a column in Excel):

     Word    Word2         Word3
  1  Hello   My Name       Yellow Bee Hive
  2  My      Yellow Bee
  3  Yellow  Golden Gates
  4  Golden
  5  Yellow

Unfortunately, neither of the following attempts works, because both delete the entire row:

  df = df[pd.notnull(df['Word','Word2','Word3'])]

or

  df = df.dropna()
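To make the problem reproducible, here is a minimal sketch. The DataFrame construction is an assumption, rebuilt by hand from the table above. As far as I can tell, `df['Word','Word2','Word3']` is treated as a single tuple key and raises a KeyError on a frame like this, so the list form is used below instead.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the table in the question (an assumption).
df = pd.DataFrame({
    "Word": ["Hello", "My", "Yellow", "Golden", "Yellow"],
    "Word2": [np.nan, "My Name", "Yellow Bee", "Golden Gates", np.nan],
    "Word3": [np.nan, np.nan, "Yellow Bee Hive", np.nan, np.nan],
})

# dropna() removes every row that contains at least one NaN,
# so only the single fully populated row survives.
dropped = df.dropna()
print(dropped)

# Masking with a boolean DataFrame keeps the shape and leaves NaN in place,
# so nothing moves up either.
masked = df[df[["Word", "Word2", "Word3"]].notnull()]
print(masked.shape)
```

Neither operation moves values within a column, which is why a per-column approach is needed.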

Anyone have any suggestions? Should I reindex the table?


python python-3.x pandas

user3682157 Sep 19 '14 at 20:32


2 answers

unutbu · Answer 1 · 2014-09-19T20:40:14+0000

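  import functools

  import numpy as np
  import pandas as pd


  def drop_and_roll(col, na_position='last', fillvalue=np.nan):
      # Start from a column-length array filled with the placeholder value.
      result = np.full(len(col), fillvalue, dtype=col.dtype)
      mask = col.notnull()
      N = mask.sum()
      if na_position == 'last':
          # Pack the non-null values at the top of the column.
          result[:N] = col.loc[mask]
      elif na_position == 'first':
          # Pack the non-null values at the bottom; an explicit start index
          # also behaves correctly when N == 0 (result[-0:] is the whole array).
          result[len(col) - N:] = col.loc[mask]
      else:
          raise ValueError('na_position {!r} unrecognized'.format(na_position))
      return result


  # 'data' is a text file holding the table from the question, with columns
  # separated by two or more spaces.
  df = pd.read_table('data', sep=r'\s{2,}', engine='python')
  print(df.apply(functools.partial(drop_and_roll, fillvalue='')))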

gives

       Word         Word2            Word3
  0   Hello       My Name  Yellow Bee Hive
  1      My    Yellow Bee
  2  Yellow  Golden Gates
  3  Golden
  4  Yellow
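The same column-wise packing can also be sketched for the whole frame at once with NumPy. This is a variant, not the answer's own code: `justify_up` is a hypothetical helper name, and the sample frame is rebuilt by hand from the question.

```python
import numpy as np
import pandas as pd


def justify_up(frame, fill=""):
    """Push the non-null cells of every column to the top (hypothetical helper)."""
    values = frame.to_numpy(dtype=object)
    present = pd.notna(values)
    # Stable argsort on "is missing": present cells (False) sort first,
    # and their original top-to-bottom order is preserved.
    order = np.argsort(~present, axis=0, kind="stable")
    packed = np.take_along_axis(values, order, axis=0)
    # Whatever was sorted to the bottom is a missing cell; overwrite it.
    packed[~np.take_along_axis(present, order, axis=0)] = fill
    return pd.DataFrame(packed, columns=frame.columns)


# Sample data rebuilt by hand from the question (an assumption).
df = pd.DataFrame({
    "Word": ["Hello", "My", "Yellow", "Golden", "Yellow"],
    "Word2": [np.nan, "My Name", "Yellow Bee", "Golden Gates", np.nan],
    "Word3": [np.nan, np.nan, "Yellow Bee Hive", np.nan, np.nan],
})
result = justify_up(df)
print(result)
```

The stable sort is what keeps the surviving values in their original order; an unstable sort could shuffle them within a column.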

Shashank agarwal · Answer 2 · 2014-09-19T20:59:13+0000

Since you want the values to move up, you need to create a new DataFrame.

Let's start with -

       Word         Word2
  0   Hello           NaN
  1      My       My Name
  2  Yellow    Yellow Bee
  3  Golden  Golden Gates
  4  Yellow           NaN

The following method is used -

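  import numpy as np
  import pandas as pd


  def get_column_array(df, column):
      expected_length = len(df)
      # Keep only the non-null values, in their original order.
      current_array = df[column].dropna().values
      if len(current_array) < expected_length:
          # Pad with empty strings so the column keeps its original length.
          current_array = np.append(
              current_array, [''] * (expected_length - len(current_array)))
      return current_array


  pd.DataFrame({column: get_column_array(df, column) for column in df.columns})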

Gives -

       Word         Word2
  0   Hello       My Name
  1      My    Yellow Bee
  2  Yellow  Golden Gates
  3  Golden
  4  Yellow

You can also edit an existing df with the same function -

  for column in df.columns:
      df[column] = get_column_array(df, column)
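A more compact variant of the same idea is sketched below. Instead of padding by hand, it lets pandas align the shortened columns on a fresh index. `compact_columns` is a hypothetical name and the sample frame is rebuilt by hand from the question, so treat this as an illustration rather than part of the answer above.

```python
import numpy as np
import pandas as pd


def compact_columns(frame, fill=""):
    """Drop NaN per column, renumber from 0, and pad the gaps (hypothetical helper)."""
    compacted = pd.DataFrame({
        # Each column loses its NaN cells and gets a fresh 0..k-1 index,
        # so the DataFrame constructor lines the values up from the top.
        name: col.dropna().reset_index(drop=True)
        for name, col in frame.items()
    })
    # Restore the original row count, then replace the padding NaN with `fill`.
    return compacted.reindex(range(len(frame))).fillna(fill)


# Sample data rebuilt by hand from the question (an assumption).
df = pd.DataFrame({
    "Word": ["Hello", "My", "Yellow", "Golden", "Yellow"],
    "Word2": [np.nan, "My Name", "Yellow Bee", "Golden Gates", np.nan],
    "Word3": [np.nan, np.nan, "Yellow Bee Hive", np.nan, np.nan],
})
compacted = compact_columns(df)
print(compacted)
```

Note that the result carries a new 0-based index, so any meaning attached to the original row labels is lost, which is inherent to moving cells up independently per column.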
