How to remove NaN from a pandas Series whose elements are lists?

I have a pandas.Series in which each row holds a list object. For instance:

    >>> import numpy as np
    >>> import pandas as pd
    >>> x = pd.Series([[1,2,3], [2,np.nan], [3,4,5,np.nan], [np.nan]])
    >>> x
    0         [1, 2, 3]
    1          [2, nan]
    2    [3, 4, 5, nan]
    3             [nan]
    dtype: object

How can I remove the NaN values from the list in each row?

Desired Result:

    >>> x
    0    [1, 2, 3]
    1          [2]
    2    [3, 4, 5]
    3           []
    dtype: object

This works, but it turns the remaining integers into floats:

    >>> x.apply(lambda y: pd.Series(y).dropna().values.tolist())
    0          [1, 2, 3]
    1              [2.0]
    2    [3.0, 4.0, 5.0]
    3                 []
    dtype: object
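The floats in that output come from the intermediate Series: a list that contains NaN is stored as float64, so the surviving values come back as floats. A minimal sketch of that effect (the variable names are only illustrative):

```python
import numpy as np
import pandas as pd

# A list containing NaN is upcast to float64 when wrapped in a Series.
with_nan = pd.Series([2, np.nan])
# A list of plain integers keeps an integer dtype.
without_nan = pd.Series([1, 2, 3])

print(with_nan.dtype)
print(without_nan.dtype)

# Dropping NaN afterwards does not restore the integers.
cleaned = with_nan.dropna().values.tolist()
print(cleaned)
```

Approaches that filter the original list elements directly avoid this conversion.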

Is there a simpler method than using a lambda that converts each list to a Series, drops the NaN values, and converts the result back to a list?

2 answers

You can use a list comprehension with pandas.notnull to remove the NaN values:

    print (x.apply(lambda y: [a for a in y if pd.notnull(a)]))
    0    [1, 2, 3]
    1          [2]
    2    [3, 4, 5]
    3           []
    dtype: object

Another solution uses filter with the condition v == v, which is false only for NaN (NaN is the only value for which v != v):

    print (x.apply(lambda a: list(filter(lambda v: v==v, a))))
    0    [1, 2, 3]
    1          [2]
    2    [3, 4, 5]
    3           []
    dtype: object
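The self-comparison trick can be checked in isolation. The sketch below uses a hypothetical helper named drop_nan (not part of pandas or NumPy) to show the idea on a single list:

```python
nan = float("nan")

# NaN is not equal to itself, so v == v is False only for NaN.
print(nan == nan)
print(nan != nan)

def drop_nan(values):
    """Keep every element that is equal to itself (hypothetical helper)."""
    return [v for v in values if v == v]

result = drop_nan([3, 4, 5, nan])
print(result)
```

Note that None == None is True, so this test keeps None values, whereas pd.notnull treats None as missing and would drop them.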

Thanks to DYZ for another solution:

    print (x.apply(lambda y: list(filter(np.isfinite, y))))
    0    [1, 2, 3]
    1          [2]
    2    [3, 4, 5]
    3           []
    dtype: object
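One caveat with np.isfinite: it rejects infinities as well as NaN, so it is only equivalent to the other solutions when the lists contain no inf values. A small sketch with made-up sample data:

```python
import numpy as np

# Illustrative values, not taken from the question.
values = [1, np.inf, np.nan, -np.inf, 2]

# np.isfinite drops NaN and both infinities.
finite_only = list(filter(np.isfinite, values))

# np.isnan drops only NaN and keeps the infinities.
not_nan = [v for v in values if not np.isnan(v)]

print(finite_only)
print(not_nan)
```

Both functions expect numeric input; for lists that may hold strings or None, pd.notnull is the safer test.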

A simple NumPy solution with a list comprehension (note that each row becomes a NumPy array rather than a list):

    pd.Series([np.array(e)[~np.isnan(e)] for e in x.values])
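If plain lists are needed, as in the desired result, the arrays produced by this approach can be converted back with tolist(). A sketch assuming the same x as in the question (the names arrays and as_lists are only illustrative):

```python
import numpy as np
import pandas as pd

x = pd.Series([[1, 2, 3], [2, np.nan], [3, 4, 5, np.nan], [np.nan]])

# Boolean masking yields one NumPy array per row.
arrays = pd.Series([np.array(e)[~np.isnan(e)] for e in x.values])

# Convert each array back to a Python list.
as_lists = pd.Series([a.tolist() for a in arrays])
print(as_lists)
```

Rows that contained NaN were built as float arrays, so their remaining values stay floats after the conversion.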

Source: https://habr.com/ru/post/1262263/

