How to count the longest continuous sequence of True values in pandas

Say I have a pd.Series as below

s = pd.Series([False, True, False, True, True, True, False, False])

0    False
1     True
2    False
3     True
4     True
5     True
6    False
7    False
dtype: bool

I want to know how long the longest consecutive run of True values is; in this example it is 3.

I tried this naive approach:

s_list = s.tolist()
count = 0
max_count = 0
for item in s_list:
    if item:
        count +=1
    else:
        if count>max_count:
            max_count = count
        count = 0
print(max_count)

It will print 3, but for a Series that is all True it will print 0.

+4
6 answers

Option 1
Use the Series as a boolean mask on the cumulative sum of its negation, then use value_counts:

(~s).cumsum()[s].value_counts().max()

3

Explanation

  • (~s).cumsum() is a fairly standard way to create separate group labels for runs of contiguous True/False values:

    0    1
    1    1
    2    2
    3    2
    4    2
    5    2
    6    3
    7    4
    dtype: int64
    
  • Positions that share the same label form one run: every False bumps the cumulative sum (a False becomes True in (~s)), while a run of Trues keeps the label it started with. Masking with [s] then keeps only the positions that were True:

    (~s).cumsum()[s]
    
    1    1
    3    2
    4    2
    5    2
    dtype: int64
    
  • All that remains of group 2 is the run of consecutive Trues. Take value_counts of these labels and then max.
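As a quick sketch, here is the same expression split into named steps so the intermediate values are visible (reusing s from the question):

groups = (~s).cumsum()[s]     # group label at each True position: 1, 2, 2, 2
groups.value_counts().max()   # group 2 occurs three times, so the result is 3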


Option 2
Use factorize and bincount:

a = s.values
b = pd.factorize((~a).cumsum())[0]
np.bincount(b[a]).max()

3


This is the same idea as option 1, but executed in numpy for speed. pd.factorize relabels the group values produced by (~a).cumsum() as consecutive integers starting from 0, and np.bincount then counts how many True positions fall into each group; the max of those counts is the answer.

pd.factorize and np.bincount together play the role that the boolean mask and value_counts played in option 1, only faster.
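A sketch of the intermediate arrays for the example series (reusing s, np and pd from above):

a = s.values
grp = (~a).cumsum()          # array([1, 1, 2, 2, 2, 2, 3, 4])
b = pd.factorize(grp)[0]     # relabelled from zero: array([0, 0, 1, 1, 1, 1, 2, 3])
b[a]                         # codes of the True positions: array([0, 1, 1, 1])
np.bincount(b[a])            # count per code: array([1, 3])
np.bincount(b[a]).max()      # 3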


Option 3
Simplify option 2: the factorize step isn't actually needed, since (~a).cumsum() already produces small non-negative integers that np.bincount can consume directly:

a = s.values
np.bincount((~a).cumsum()[a]).max()

3
+7

Or, working from the index positions of the False values:

pd.Series(s.index[~s].values).diff().max()-1
Out[57]: 3.0
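Step by step on the example series (reusing s from the question): the False values sit at positions 0, 2, 6 and 7, and the gaps between consecutive positions, minus 1, count the Trues in between. Note this assumes every run of Trues is followed by a False; a run at the very end of the series would be missed.

false_pos = pd.Series(s.index[~s].values)   # 0, 2, 6, 7
false_pos.diff()                            # NaN, 2.0, 4.0, 1.0
false_pos.diff().max() - 1                  # 4.0 - 1 = 3.0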

Or without pandas, in pure Python:

from itertools import groupby
max([len(list(group)) for key, group in groupby(s.tolist())])
Out[73]: 3

But that counts runs of both True and False; to keep only the True runs:

from itertools import compress
max(list(compress([len(list(group)) for key, group in groupby(s.tolist())],[key for key, group in groupby(s.tolist())])))
Out[84]: 3
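For the example series the two intermediate lists look like this (reusing s and the itertools imports above):

runs = [len(list(group)) for key, group in groupby(s.tolist())]   # [1, 1, 1, 3, 2]
keys = [key for key, group in groupby(s.tolist())]                # [False, True, False, True, False]
max(compress(runs, keys))                                         # keeps only the True runs, giving 3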
+4

Here are two more options inspired by piRSquared's answer. Both work from the positions of the False values rather than from group sizes.

(np.diff(np.flatnonzero(np.append(True, np.append(~s.values, True)))) - 1).max()

(np.diff(s.where(~s).dropna().index.values) - 1).max()

(The first option appends a True at both ends of the negated array, so every run of Trues is bracketed by a False position on both sides; that is why it also works when the series starts or ends with a run of True, unlike the second, shorter option.)

How the second option works:

Find the positions of the False values in s; everything strictly between two consecutive False positions is a run of True.

  • s.where(s == False).dropna().index.values gives the index positions of the False values:

    array([0, 2, 6, 7])
    

  • Between each pair of consecutive False positions there are only True values of s. Taking np.diff gives the spacing between those positions:

    array([2, 4, 1])
  • Subtracting 1 from each difference gives the number of True values in each run.

  • Take the max of those counts. (The intermediate arrays for the first, padded option are sketched below.)
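Here is the first (padded) option broken into steps. The appended True sentinels act as virtual False values around s, so every run of Trues is bracketed on both sides, which is why this version also copes with a series that starts or ends with True (reusing s and np from above):

padded = np.append(True, np.append(~s.values, True))   # sentinels at both ends
pos = np.flatnonzero(padded)   # array([0, 1, 3, 7, 8, 9]), the padded False positions
np.diff(pos) - 1               # array([0, 1, 3, 0, 0]), lengths of the True runs in between
(np.diff(pos) - 1).max()       # 3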

+2

Grouping by the cumulative sum (the same groups as in @piRSquared's answer) and summing:

s.groupby((~s).cumsum()).sum().max()
Out[513]: 3.0
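For the example series the per-group sums look like this; True counts as 1 when summed, so each group's sum is the length of its run of Trues (reusing s from the question):

s.groupby((~s).cumsum()).sum()         # group 1 -> 1, group 2 -> 3, groups 3 and 4 -> 0
s.groupby((~s).cumsum()).sum().max()   # 3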

Another option is to use a lambda function with apply:

s.to_frame().apply(lambda x: s.loc[x.name:].idxmin() - x.name, axis=1).max()
Out[429]: 3
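The lambda computes, for every index, the distance to the first False at or after that index, i.e. the length of the run of Trues starting there, and the max is taken over all rows. On the example series the per-row distances are (reusing s from the question):

dist = s.to_frame().apply(lambda x: s.loc[x.name:].idxmin() - x.name, axis=1)
dist.tolist()   # [0, 1, 0, 3, 2, 1, 0, 0]
dist.max()      # 3

Like the diff-based answers, this relies on the longest run being followed by a False; if the series ended with a run of True, idxmin would return the start of the slice and that run would be undercounted.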
+2

I'm not quite sure how to do this with pandas, but what about using itertools.groupby?

>>> import pandas as pd
>>> from itertools import groupby
>>> s = pd.Series([False, True, False, True, True, True, False, False])
>>> max(sum(1 for _ in g) for k, g in groupby(s) if k)
3
+1

Your code was actually very close. It works with a minor fix:

count = 0
maxCount = 0
for item in s:
    if item:
        count += 1
        if count > maxCount:
            maxCount = count
    else:
        count = 0
print(maxCount)
+1

Source: https://habr.com/ru/post/1693891/

