I have the following data frame:
date_time value member
2013-10-09 09:00:00 664639 Jerome
2013-10-09 09:05:00 197290 Hence
2013-10-09 09:10:00 470186 Ann
2013-10-09 09:15:00 181314 Mikka
2013-10-09 09:20:00 969427 Cristy
2013-10-09 09:25:00 261473 James
2013-10-09 09:30:00 003698 Oliver
and a second data frame, bounds, which holds the start and end of each interval:
date_start date_end
2013-10-09 09:19:00 2013-10-09 09:25:00
2013-10-09 09:25:00 2013-10-09 09:40:00
I need to add a new column that holds, for each row, the index of the interval (a pair of datetime points) it belongs to.
Something like:
date_time value member session
2013-10-09 09:00:00 664639 Jerome 1
2013-10-09 09:05:00 197290 Hence 1
2013-10-09 09:10:00 470186 Ann 1
2013-10-09 09:15:00 181314 Mikka 2
2013-10-09 09:20:00 969427 Cristy 2
2013-10-09 09:25:00 261473 James 2
2013-10-09 09:30:00 003698 Oliver 2
The following code creates a column 'session', but it does not record the session index (i.e., the row index in the bounds data frame) in that column, so the original data frame is not split into intervals:
    def create_interval():
        df['session'] = ''
        for index, row in bounds.iterrows():
            s = row['date_start']
            e = row['date_end']
            mask = (df['date'] > s) & (df['date'] < e)
            df.loc[mask]['session'] = '[index]'
        return df
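A few things in the loop above look like likely causes: the mask reads df['date'] although the column shown is date_time, df.loc[mask]['session'] = ... is chained indexing and assigns to a temporary copy, and '[index]' is a string literal rather than the loop variable. Below is a minimal sketch of the same loop with those three points changed. The sample frames are rebuilt from the question, and using -1 for "no interval" is my own assumption, not something the question specifies.

```python
import pandas as pd

# Sample frames rebuilt from the question (only the columns needed here).
df = pd.DataFrame({
    "date_time": pd.date_range("2013-10-09 09:00:00", periods=7, freq="5min"),
    "member": ["Jerome", "Hence", "Ann", "Mikka", "Cristy", "James", "Oliver"],
})
bounds = pd.DataFrame({
    "date_start": pd.to_datetime(["2013-10-09 09:19:00", "2013-10-09 09:25:00"]),
    "date_end": pd.to_datetime(["2013-10-09 09:25:00", "2013-10-09 09:40:00"]),
})

def create_interval(df, bounds):
    df = df.copy()
    # Assumption: -1 marks rows that fall outside every interval.
    df["session"] = -1
    for index, row in bounds.iterrows():
        s = row["date_start"]
        e = row["date_end"]
        # Same strict comparisons as the original, on the 'date_time' column.
        mask = (df["date_time"] > s) & (df["date_time"] < e)
        # One .loc call with row mask and column label writes into df itself.
        df.loc[mask, "session"] = index
    return df

result = create_interval(df, bounds)
```

With the strict > and < comparisons kept from the original, a row that sits exactly on a boundary (09:25:00 here) matches neither interval. Changing one side to >= would be one way to include it, if that is what is wanted.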
UPDATE
bounds['date_start'].searchsorted(df['date_time'])
, , .. : df['Session']
= 1 , = 2 .. Session
, date_start
date_end
bounds
, df ['date_time'] ['start_date'], Session
, ,
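For reference, here is a rough sketch of how the searchsorted idea mentioned in the update could be turned into an interval index without a Python loop. It assumes that date_start is sorted, that the intervals do not overlap, and that each interval is treated as half-open [date_start, date_end). The -1 value for "no interval" is again my own convention, and the sample frames are rebuilt from the question.

```python
import numpy as np
import pandas as pd

# Sample frames rebuilt from the question (only the columns needed here).
df = pd.DataFrame({
    "date_time": pd.date_range("2013-10-09 09:00:00", periods=7, freq="5min"),
})
bounds = pd.DataFrame({
    "date_start": pd.to_datetime(["2013-10-09 09:19:00", "2013-10-09 09:25:00"]),
    "date_end": pd.to_datetime(["2013-10-09 09:25:00", "2013-10-09 09:40:00"]),
})

times = df["date_time"].to_numpy()
starts = bounds["date_start"].to_numpy()  # assumed sorted ascending
ends = bounds["date_end"].to_numpy()

# Index of the last interval whose start is <= the timestamp (-1 if none).
pos = np.searchsorted(starts, times, side="right") - 1

# Keep the candidate only if the timestamp is also before that interval's end.
inside = (pos >= 0) & (times < ends[np.clip(pos, 0, None)])
df["session"] = np.where(inside, pos, -1)
```

On its own, searchsorted only gives the insertion position relative to date_start, so the extra comparison against date_end is what turns it into an "inside this interval" test.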
user5421875