Removing rows based on column value in pandas

Question

I have a dataframe like this:

col.1 : a a a a a b b b c c c
col.2 : 0 0 1 0 0 0 1 0 0 0 0

Within each group of identical col.1 values, I want to delete every row that comes after the first 1 in col.2. The result should look like this:

col.1 : a a a b b c c c
col.2 : 0 0 1 0 1 0 0 0

Is there a fast way to do this in pandas? I am currently doing it in numpy, and it seems to be very slow.

Abhishek thakur Apr 23 '14 at 13:30

1 answer

Alvaro fuentes · Answer 1 · 2014-04-23T14:15:16+0000

Try the following:

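# First cumsum: within each col.1 group, col.2 is at least 1 from the first 1 onward.
df['col.2'] = df.groupby('col.1')['col.2'].cumsum()
# Second cumsum (intentional repeat): the first 1 stays 1, every later row in the group becomes >= 2.
df['col.2'] = df.groupby('col.1')['col.2'].cumsum()
# Keep only the rows up to and including the first 1 in each group.
df = df[df['col.2']<2]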
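
For anyone who wants to run this end to end, here is a minimal sketch of the same grouped-cumsum idea that builds the sample frame from the question and leaves col.2 unmodified. It is a variation, not the code above: it uses a single cumsum and subtracts the current value to count the 1s that occur strictly earlier in the group. The helper name drop_after_first_one is made up for illustration, and the sketch assumes col.2 only contains 0 and 1.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col.1": list("aaaaabbbccc"),
    "col.2": [0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0],
})


def drop_after_first_one(frame):
    """Hypothetical helper: keep rows up to and including the first 1 per col.1 group.

    Assumes col.2 holds only 0s and 1s.
    """
    # Running count of 1s within each group, including the current row.
    seen = frame.groupby("col.1")["col.2"].cumsum()
    # Subtracting the current value gives the number of 1s strictly before this row.
    ones_before = seen - frame["col.2"]
    # A row survives only if no 1 has occurred earlier in its group.
    return frame[ones_before == 0]


result = drop_after_first_one(df)
print(result)
```

Because the mask is computed in a separate Series, the original col.2 values are not overwritten, which may be preferable if the frame is used again afterwards.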