Merge identical adjacent rows in a Pandas series

Question

Merge identical adjacent rows in a Pandas series

Basically, if the column of my pandas frame looks like this:

[1 1 1 2 2 2 3 3 3 1 1]

I would like it to be included in the following:

[1 2 3 1]

+4

python pandas

shane Jul 19 '16 at 4:51

source share

4 answers

dmlicht · Answer 1 · 2016-07-19T05:03:31+0000

You can write a simple function that traverses the elements of your series, keeping only the first element in the run.

As far as I know, for pandas there is no tool built into pandas. But it is not so much code to do it yourself.

import pandas
example_series = pandas.Series([1, 1, 1, 2, 2, 3])

def collapse(series):
    last = ""
    seen = []
    for element in series:
        if element != last:
            last = element
            seen.append(element)
    return seen

collapse(example_series)

In the above code, you will go through each element of the series and check if it matches the last element seen. If it is not, save it. If so, ignore the value.

, :

return pandas.Series(seen)

alex314159 · Answer 2 · 2016-07-19T06:10:26+0000

, :

x = pandas.Series([1 1 1 2 2 2 3 3 3 1 1])
y = x-x.shift(1)
y[0] = 1
result = x[y!=0]

frist · Answer 3 · 2016-07-19T06:57:30+0000

You can use DataFrame diff and indexing:

>>> df = pd.DataFrame([1,1,2,2,2,2,3,3,3,3,1])
>>> df[df[0].diff()!=0]
    0
0   1
2   2
6   3
10  1
>>> df[df[0].diff()!=0].values.ravel() # If you need an array
array([1, 2, 3, 1])

The same thing works for the series:

>>> df = pd.Series([1,1,2,2,2,2,3,3,3,3,1])
>>> df[df.diff()!=0].values
array([1, 2, 3, 1])

Edchum · Answer 4 · 2016-07-19T08:01:04+0000

You can use shiftto create a boolean mask to compare a string with a previous string:

In [67]:
s = pd.Series([1,1,2,2,2,2,3,3,3,3,4,4,5])
s[s!=s.shift()]

Out[67]:
0     1
2     2
6     3
10    4
12    5
dtype: int64

Merge identical adjacent rows in a Pandas series

You can write a simple function that traverses the elements of your series, keeping only the first element in the run.

More articles: