I would drop down to NumPy to make this a little faster:
In [11]: df = pd.DataFrame([[np.nan, 1], [0, np.nan], [1, 2]])
In [12]: df
Out[12]:
     0    1
0  NaN    1
1    0  NaN
2    1    2
In [13]: pd.isnull(df.values)
Out[13]:
array([[ True, False],
       [False,  True],
       [False, False]], dtype=bool)
In [14]: pd.isnull(df.values).any(1)
Out[14]: array([ True, True, False], dtype=bool)
In [15]: np.nonzero(pd.isnull(df.values).any(1))
Out[15]: (array([0, 1]),)
In [16]: df.index[np.nonzero(pd.isnull(df.values).any(1))]
Out[16]: Int64Index([0, 1], dtype='int64')
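The steps above can be wrapped in a small helper. This is only a minimal sketch: the function names `nan_row_positions` and `nan_row_labels` are my own and not part of pandas or NumPy, and I spell out `axis=1` as a keyword because newer pandas versions may not accept it positionally.

```python
import numpy as np
import pandas as pd


def nan_row_positions(df):
    """Return the integer positions of rows containing at least one NaN."""
    # Work on the underlying NumPy array to skip pandas overhead.
    mask = pd.isnull(df.values).any(axis=1)
    # np.nonzero returns a tuple with one array per dimension; take the first.
    return np.nonzero(mask)[0]


def nan_row_labels(df):
    """Return the index labels of rows containing at least one NaN."""
    return df.index[nan_row_positions(df)]


df = pd.DataFrame([[np.nan, 1], [0, np.nan], [1, 2]])
positions = nan_row_positions(df)
labels = nan_row_labels(df)
```

With a default integer index the positions and the labels coincide; they differ as soon as the frame has a custom index.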
To see some timings with a slightly larger df:
In [21]: df = pd.DataFrame([[np.nan, 1], [0, np.nan], [1, 2]] * 1000)
In [22]: %timeit np.nonzero(pd.isnull(df.values).any(1))
10000 loops, best of 3: 85.8 µs per loop
In [23]: %timeit df.index[df.isnull().any(1)]
1000 loops, best of 3: 629 µs per loop
And if you care about the index labels (not the positions):
In [24]: %timeit df.index[np.nonzero(pd.isnull(df.values).any(1))]
10000 loops, best of 3: 172 µs per loop
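If you want to repeat the comparison outside IPython, something like the following should work. It is a rough sketch using the standard `timeit` module; the function names `via_numpy` and `via_pandas` are just labels I picked, and the absolute numbers will depend on your machine and on your pandas and NumPy versions.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame([[np.nan, 1], [0, np.nan], [1, 2]] * 1000)


def via_numpy():
    # Build the row mask on the raw array, then look up the labels.
    return df.index[np.nonzero(pd.isnull(df.values).any(axis=1))[0]]


def via_pandas():
    # Same result using pandas operations only.
    return df.index[df.isnull().any(axis=1)]


# Both approaches should select the same rows.
same_result = via_numpy().equals(via_pandas())

for fn in (via_numpy, via_pandas):
    # Best of 3 runs of 100 calls each, reported per call.
    best = min(timeit.repeat(fn, number=100, repeat=3)) / 100
    print(f"{fn.__name__}: {best * 1e6:.1f} µs per loop")
```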