I have a dataframe that looks like this:
>>> a_df
state
1 A
2 B
3 A
4 B
5 C
What I would like to do is return all consecutive rows that match a specific sequence. For example, if the sequence is ['A', 'B'], then the rows where state A is immediately followed by state B must be returned. In the above example:
>>> cons_criteria(a_df, ['A', 'B'])
state
1 A
2 B
3 A
4 B
Or, if the selected sequence is ['A', 'B', 'C'], then the output should be:
>>> cons_criteria(a_df, ['A', 'B', 'C'])
state
3 A
4 B
5 C
My attempt was to keep the current state alongside the next state:
>>> df2 = a_df.copy()
>>> df2['state_0'] = a_df['state']
>>> df2['state_1'] = a_df['state'].shift(-1)
Now I can match on state_0 and state_1. But this only returns the first row of each match, not the rows that follow it:
>>> df2[(df2['state_0'] == 'A') & (df2['state_1'] == 'B')]
state
1 A
3 A
Is there a way to do this in pandas?
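For reference, one way the shift idea above might be generalized to a sequence of any length is sketched below. It is only a sketch under assumptions: the `col` parameter and the two-pass structure (first flag where a match starts, then spread that flag over the rows of the match) are illustrative choices, not an established pandas recipe.

```python
import pandas as pd


def cons_criteria(df, seq, col="state"):
    """Return the rows of df that are part of a consecutive run equal to seq.

    `col` is a hypothetical parameter naming the column to match on.
    """
    n = len(seq)

    # starts[i] is True when the n rows beginning at position i spell out seq.
    # shift(-k) pulls the value k rows ahead onto the current row; the NaN
    # values it leaves at the end compare as False.
    starts = pd.Series(True, index=df.index)
    for k, value in enumerate(seq):
        starts = starts & df[col].shift(-k).eq(value)

    # Spread each start flag forward so all n rows of a match are kept.
    keep = pd.Series(False, index=df.index)
    for k in range(n):
        keep = keep | starts.shift(k, fill_value=False)

    return df[keep]


a_df = pd.DataFrame({"state": list("ABABC")}, index=range(1, 6))
print(cons_criteria(a_df, ["A", "B"]))
print(cons_criteria(a_df, ["A", "B", "C"]))
```

With the example frame, the first call should keep rows 1 to 4 and the second rows 3 to 5. Overlapping matches are all kept, since a row is retained if it falls inside any match.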