How to search all rows of a data frame for values outside a certain range of numbers?

I have a data frame with 50 columns and 400 rows, all of them numeric. I am trying to display only the columns that contain values falling outside a given range (for example, a range of -1 to +3).

So far, I have:

df[(df.T > 3).all()]

to display values greater than 3, and I can change the integer to any other number of interest. But how can I write something that displays numbers outside a range (i.e. displays all columns that have values outside the range of -1 to +3)?

2 answers

You can use pd.DataFrame.mask:

import numpy as np
import pandas as pd

np.random.seed([3,1415])
df = pd.DataFrame(np.random.randint(-2, 4, (5, 3)), columns=list('abc'))
print(df)

   a  b  c
0 -2  1  0
1  1  0  0
2  3  1  3
3  0  1 -2
4  0 -2 -2

mask replaces every cell where the condition is True with NaN. To keep only the values inside the range:

df.mask(df.ge(3) | df.le(-1))

     a    b    c
0  NaN  1.0  0.0
1  1.0  0.0  0.0
2  NaN  1.0  NaN
3  0.0  1.0  NaN
4  0.0  NaN  NaN
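
As a side note, the same result can be written with where, which is the mirror image of mask: it keeps the cells where the condition is True and turns the rest into NaN. The sketch below rebuilds the sample frame by hand (instead of relying on the random seed) and checks that the two spellings agree; the variable names masked and kept are only illustrative.

```python
import pandas as pd

# Same values as the sample frame printed above, typed in by hand.
df = pd.DataFrame({
    "a": [-2, 1, 3, 0, 0],
    "b": [1, 0, 1, 1, -2],
    "c": [0, 0, 3, -2, -2],
})

# mask: hide the cells where the condition is True.
masked = df.mask(df.ge(3) | df.le(-1))

# where: keep the cells where the condition is True (the negated condition).
kept = df.where(df.gt(-1) & df.lt(3))

print(masked.equals(kept))  # True
```

Which one reads better is mostly a matter of whether you prefer to state the condition for what to hide or for what to keep.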

To keep only the values outside the range instead:

df.mask(df.lt(3) & df.gt(-1))

     a    b    c
0 -2.0  NaN  NaN
1  NaN  NaN  NaN
2  3.0  NaN  3.0
3  NaN  NaN -2.0
4  NaN -2.0 -2.0
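
If the goal is to pick out whole columns (or rows) that contain at least one out-of-range value, as the question asks, a boolean frame combined with any() may be enough. This is only a sketch: the small frame, the bounds lo and hi, and the choice to treat the endpoints -1 and 3 as inside the range are my assumptions, not something the question settles.

```python
import pandas as pd

# Small made-up frame; column "a" stays entirely within [-1, 3].
df = pd.DataFrame({
    "a": [0, 1, 2],
    "b": [-5, 0, 3],
    "c": [1, 4, 2],
})

lo, hi = -1, 3  # assumed bounds, endpoints treated as inside the range

# True wherever a cell lies strictly outside [lo, hi].
outside = (df < lo) | (df > hi)

# Columns that contain at least one out-of-range value.
cols = df.loc[:, outside.any()]

# Rows that contain at least one out-of-range value.
rows = df[outside.any(axis=1)]

print(list(cols.columns))  # ['b', 'c']
print(list(rows.index))    # [0, 1]
```

Swap < and > for <= and >= if the endpoints should count as outside, which is how the mask examples above treat them.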

You can stack the frame into a Series, test the range with between, negate the result with ~, unstack it, and then call dropna(axis=1):

In [193]:
df = pd.DataFrame(np.random.randn(5,3), columns=list('abc'))
df

Out[193]:
          a         b         c
0  0.088639  0.275458  0.837952
1  1.395237 -0.582110  0.614160
2 -1.114384 -2.774358  2.119473
3  1.050008 -1.195167 -0.343875
4 -0.006156 -2.028601 -0.071448

In [198]:
df[~df.stack().between(0.1,1).unstack()].dropna(axis=1)

Out[198]:
          a
0  0.088639
1  1.395237
2 -1.114384
3  1.050008
4 -0.006156

So here only column "a" has no values between 0.1 and 1.

Without the dropna call, you can see that the values inside the range have been replaced with NaN:

In [199]:
df[~df.stack().between(0.1,1).unstack()]

Out[199]:
          a         b         c
0  0.088639       NaN       NaN
1  1.395237 -0.582110       NaN
2 -1.114384 -2.774358  2.119473
3  1.050008 -1.195167 -0.343875
4 -0.006156 -2.028601 -0.071448

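Note that between includes both endpoints by default; to exclude them, pass inclusive=False to between (in pandas 1.3 and later this is spelled inclusive="neither").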
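
To make the endpoint behaviour concrete, here is a small sketch. It assumes pandas 1.3 or newer, where inclusive takes the strings "both", "neither", "left" or "right"; the Series s and the tiny frame are made up for illustration.

```python
import pandas as pd

s = pd.Series([0.1, 0.5, 1.0, 1.5])

# Default: both endpoints count as inside the range.
print(s.between(0.1, 1).tolist())
# [True, True, True, False]

# Exclude both endpoints (assumes pandas >= 1.3).
print(s.between(0.1, 1, inclusive="neither").tolist())
# [False, True, False, False]

# The same option works in the stack/unstack pipeline on a frame.
df = pd.DataFrame({"a": [0.1, 2.0], "b": [0.5, 1.0]})
inside = df.stack().between(0.1, 1, inclusive="neither").unstack()
print(df[~inside])  # only b == 0.5 is strictly inside, so it becomes NaN
```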


Source: https://habr.com/ru/post/1666628/

