Using values from the previous row in a pandas Series

Question


I have a CSV that looks like this (and when read into a pandas DataFrame with read_csv(), it looks the same).

I want to update the values in the ad_requests column according to the following logic:

For a given row, if ad_requests has a value, leave it alone. Otherwise, set it to the previous row's ad_requests minus the previous row's impressions. So, in the first example, we would like to:

I got partway there:

 df["ad_requests"] = [i if not pd.isnull(i) else ??? for i in df["ad_requests"]]

And here I am stuck. After the else I want to "go back" and access the previous row, although I know that is not how pandas is meant to be used. Another thing to note is that the rows will always come in groups of three, according to the ad_tag_name column. If I call df.groupby("ad_tag_name"), I can turn the result into a list and start slicing and indexing, but again, I think pandas should offer a better way to do this (as it does for many things).

Python: 2.7.10

Pandas: 0.18.0


python python-2.7 pandas dataframe

Pyderman Nov 22 '16 at 4:02


1 answer

Rojan · Accepted Answer · 2016-11-22T10:44:04+0000

You need to do something like this:

 pd.options.mode.chained_assignment = None  # suppresses "SettingWithCopyWarning"
 for index, elem in enumerate(df['ad_requests']):
     if pd.isnull(elem):
         df['ad_requests'][index] = df['ad_requests'][index-1] - df['impressions'][index-1]

The warning appears because the chained indexing assigns to what may be a view of the DataFrame, so the change can propagate to the original DataFrame. That is exactly what we want here, however, so the warning does not concern us.

(Tested with Python 2.7.12 and pandas 0.19.0)

EDIT:

Changing the last line of code from

 df['ad_requests'][index]=df['ad_requests'][index-1]-df['impressions'][index-1]

to

 df.at[index,'ad_requests']=df.at[index-1,'ad_requests']-df.at[index-1,'impressions']

eliminates the need to suppress any warnings:

 for index, elem in enumerate(df['ad_requests']):
     if pd.isnull(elem):
         df.at[index, 'ad_requests'] = df.at[index-1, 'ad_requests'] - df.at[index-1, 'impressions']
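As a possible alternative to the explicit loop, the same rule can be sketched without iterating, using shift() and fillna(). This is only a sketch under assumptions: the sample values below are made up (only the column names come from the question), and grouping by ad_tag_name is added so that a previous-row value is never taken from a different group.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: column names are from the question,
# the values are invented for illustration.
df = pd.DataFrame({
    "ad_tag_name": ["a", "a", "a", "b", "b", "b"],
    "ad_requests": [100.0, np.nan, 40.0, 200.0, np.nan, 90.0],
    "impressions": [30.0, 20.0, 10.0, 50.0, 25.0, 5.0],
})

# Previous row's ad_requests minus previous row's impressions,
# shifted within each ad_tag_name group.
grouped = df.groupby("ad_tag_name")
candidate = grouped["ad_requests"].shift() - grouped["impressions"].shift()

# Keep values that are present; fill only the missing ones.
df["ad_requests"] = df["ad_requests"].fillna(candidate)
```

Note that this is not strictly equivalent to the loop: shift() works on the original column, so when two consecutive rows are missing, the second one stays NaN, whereas the loop would reuse the value it just filled in. A missing value in the first row of a group also stays NaN, because there is no previous row to draw from.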
