Run `groupby` on `'id_col'` and then `transform`, passing the function `'min'`. This returns a Series aligned to the original DataFrame, so you can assign it directly as a new column:
    In [13]: df = pd.DataFrame({"id_col": [0, 0, 0, 1, 1, 1, 2, 2, 2],
                                "val_col": [0.1, 0.2, 0.3, 0.6, 0.4, 0.5, 0.2, 0.1, 0.0]})

             df['offset'] = df.groupby('id_col')['val_col'].transform('min')
             df
    Out[13]:
       id_col  val_col  offset
    0       0      0.1     0.1
    1       0      0.2     0.1
    2       0      0.3     0.1
    3       1      0.6     0.4
    4       1      0.4     0.4
    5       1      0.5     0.4
    6       2      0.2     0.0
    7       2      0.1     0.0
    8       2      0.0     0.0
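Because the transformed Series carries the original index, it can be used directly in vectorized arithmetic against the source column. A minimal sketch (the `shifted` column is illustrative, not part of the question):

```python
import pandas as pd

df = pd.DataFrame({
    "id_col": [0, 0, 0, 1, 1, 1, 2, 2, 2],
    "val_col": [0.1, 0.2, 0.3, 0.6, 0.4, 0.5, 0.2, 0.1, 0.0],
})

# transform('min') broadcasts each group's minimum back to every row,
# keeping the original row order and index
df["offset"] = df.groupby("id_col")["val_col"].transform("min")

# index alignment means no merge/join is needed for row-wise arithmetic
df["shifted"] = df["val_col"] - df["offset"]
```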
<strong>Timings</strong>
    In [15]: def apply_by_id_value(df, id_col="id_col", val_col="val_col",
                                   offset_col="offset", f=min):
                 for rid in set(df[id_col].values):
                     df.loc[df[id_col] == rid, offset_col] = f(df[df[id_col] == rid][val_col])
                 return df

             %timeit apply_by_id_value(df)
             100 loops, best of 3: 8.12 ms per loop

             %timeit df.groupby('id_col').transform('min')
             100 loops, best of 3: 5.99 ms per loop
So `groupby` and `transform` are faster even on this tiny dataset. I expect the gap to be significantly larger on your real data, because a single hashed groupby pass scales much better than scanning the whole frame once per group.
For an 80000-row df, I get the following timings (same order as above):

    1 loops, best of 3: 611 ms per loop   # apply_by_id_value
    1 loops, best of 3: 438 ms per loop   # groupby + transform
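If you want to reproduce a benchmark at that scale, a sketch of how such a frame might be built (the group count and rows per group are assumptions, chosen to give roughly 80000 rows):

```python
import numpy as np
import pandas as pd

# illustrative sizes: 1000 groups x 80 rows = 80000 rows
n_groups, rows_per_group = 1000, 80

rng = np.random.default_rng(0)
big = pd.DataFrame({
    "id_col": np.repeat(np.arange(n_groups), rows_per_group),
    "val_col": rng.random(n_groups * rows_per_group),
})

# one hashed groupby pass, instead of one boolean mask per group
offsets = big.groupby("id_col")["val_col"].transform("min")
```

Timing this with `%timeit` in IPython should show the per-group loop falling further behind as the number of groups grows.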