Remove rows with 0 in three or more columns

I have a dataframe that contains a lot of zeros, like the df example below. I would like to remove every row that has 0 in three or more columns, which should give the Resultdf example.

The line below deletes the rows in which every value is 0:

df = df[(df.T != 0).any()]

Is there a way to change it so that it deletes rows that are all 0 or that have 0 in three or more columns? Or is there another way to do this?

print df:

ind_key prtCnt fldCnt TmCnt bmCnt
1       0      0      0     0
2       2      0      0     3
3       0      1      0     0
4       0      1      1     0

print Resultdf:

ind_key prtCnt fldCnt TmCnt bmCnt
2       2      0      0     3
4       0      1      1     0
3 answers

You can use sum with axis=1:

df[df.eq(0).sum(axis=1) < 3]  # eq(0) means '== 0'
Out[523]: 
   ind_key  prtCnt  fldCnt  TmCnt  bmCnt
1        2       2       0      0      3
3        4       0       1      1      0
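
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and spells out the intermediate step. It assumes ind_key is an ordinary column rather than the index (the output above suggests so, since ind_key is still a column there). The names zero_count and result are mine, not from the answer.

```python
import pandas as pd

# Sample data from the question; ind_key is assumed to be a regular column.
df = pd.DataFrame({
    "ind_key": [1, 2, 3, 4],
    "prtCnt": [0, 2, 0, 0],
    "fldCnt": [0, 0, 1, 1],
    "TmCnt": [0, 0, 0, 1],
    "bmCnt": [0, 3, 0, 0],
})

# df.eq(0) is a boolean frame; summing across columns (axis=1)
# counts the zeros in each row.
zero_count = df.eq(0).sum(axis=1)

# Keep only the rows that have fewer than three zeros.
result = df[zero_count < 3]
print(result)
```

Because ind_key is never 0 in this sample, counting it along with the other columns does not change which rows survive.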

Use the idiomatic dropna with the thresh parameter set:

df[df != 0].dropna(thresh=len(df.columns) - 2, axis=0)

   ind_key  prtCnt  fldCnt  TmCnt  bmCnt
1        2     2.0     NaN    NaN    3.0
3        4     NaN     1.0    1.0    NaN
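
Note that the result above shows NaN where the zeros were, and the masked columns have become floats. If the original values are wanted, one possible workaround (my own variation, not part of the answer) is to use the dropna result only for its index and then select those rows from the untouched frame. The sample data and the name kept_index are assumptions for illustration.

```python
import pandas as pd

# Sample data from the question; ind_key is assumed to be a regular column.
df = pd.DataFrame({
    "ind_key": [1, 2, 3, 4],
    "prtCnt": [0, 2, 0, 0],
    "fldCnt": [0, 0, 1, 1],
    "TmCnt": [0, 0, 0, 1],
    "bmCnt": [0, 3, 0, 0],
})

# Mask zeros as NaN, then require at least len(columns) - 2 non-NaN
# values per row; only the surviving row labels are kept.
kept_index = df[df != 0].dropna(thresh=len(df.columns) - 2, axis=0).index

# Select those rows from the original frame, so the zeros and the
# integer dtypes are preserved.
result = df.loc[kept_index]
print(result)
```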

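Another option is numpy.argpartition (with numpy imported as np). Partition each row of the "not zero" mask so that its three smallest values come first. A row has three or more zeros exactly when all three of those values are False.

v = df.values != 0
df[np.take_along_axis(v, v.argpartition(2, 1)[:, :3], 1).any(1)]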

   ind_key  prtCnt  fldCnt  TmCnt  bmCnt
1        2       2       0      0      3
3        4       0       1      1      0

Source: https://habr.com/ru/post/1696204/

