I have the following dataframe:
date_time            value   member
2013-10-09 09:00:00  664639  Jerome
2013-10-09 09:05:00  197290  Hence
2013-10-09 09:10:00  470186  Ann
2013-10-09 09:15:00  181314  Mikka
2013-10-09 09:20:00  969427  Cristy
2013-10-09 09:25:00  261473  James
2013-10-09 09:30:00  003698  Oliver
and a second dataframe, bounds, where I have the interval boundaries:
date_start           date_end
2013-10-09 09:19:00  2013-10-09 09:25:00
2013-10-09 09:25:00  2013-10-09 09:40:00
So I need to create a new column that holds, for each row, the index of the interval (between two datetime points) that the row falls into.
Something like:
date_time            value   member  session
2013-10-09 09:00:00  664639  Jerome  1
2013-10-09 09:05:00  197290  Hence   1
2013-10-09 09:10:00  470186  Ann     1
2013-10-09 09:15:00  181314  Mikka   2
2013-10-09 09:20:00  969427  Cristy  2
2013-10-09 09:25:00  261473  James   2
2013-10-09 09:30:00  003698  Oliver  2
The following code creates a 'session' column, but it does not write the session index (i.e., the row index in the bounds dataframe) into that column, so the original dataframe is not split by interval:
def create_interval():
    df['session'] = ''
    for index, row in bounds.iterrows():
        s = row['date_start']
        e = row['date_end']
        mask = (df['date'] > s) & (df['date'] < e)
        df.loc[mask]['session'] = '[index]'
    return df
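For reference, a minimal sketch of how the loop above might be made to write the index. It differs from the loop in three places: the assignment goes through a single df.loc[mask, 'session'] call (the chained df.loc[mask]['session'] = ... assigns to a temporary copy, so df never changes), the mask uses the date_time column shown in the sample data, and the integer index is stored instead of the string '[index]'. The half-open bounds (start inclusive, end exclusive), the -1 placeholder for rows outside every interval, and passing df and bounds as parameters are assumptions made for this sketch, not part of the original code.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'date_time': pd.to_datetime([
        '2013-10-09 09:00:00', '2013-10-09 09:05:00', '2013-10-09 09:10:00',
        '2013-10-09 09:15:00', '2013-10-09 09:20:00', '2013-10-09 09:25:00',
        '2013-10-09 09:30:00',
    ]),
    'value': ['664639', '197290', '470186', '181314', '969427', '261473', '003698'],
    'member': ['Jerome', 'Hence', 'Ann', 'Mikka', 'Cristy', 'James', 'Oliver'],
})
bounds = pd.DataFrame({
    'date_start': pd.to_datetime(['2013-10-09 09:19:00', '2013-10-09 09:25:00']),
    'date_end': pd.to_datetime(['2013-10-09 09:25:00', '2013-10-09 09:40:00']),
})


def create_interval(df, bounds):
    out = df.copy()
    out['session'] = -1  # assumption: -1 marks rows outside every interval
    for index, row in bounds.iterrows():
        # assumption: start is inclusive, end is exclusive
        mask = (out['date_time'] >= row['date_start']) & (out['date_time'] < row['date_end'])
        # one .loc call with row mask and column label, so the write reaches `out`
        out.loc[mask, 'session'] = index
    return out


result = create_interval(df, bounds)
print(result)
```

With the sample bounds, only the rows from 09:20 onward fall inside an interval, and the session values are the 0-based row labels of bounds.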
UPDATE
bounds['date_start'].searchsorted(df['date_time']) [...]: df['Session'] = 1, = 2, [...] Session [...] date_start, date_end, bounds
[...] df['date_time'] [...] ['start_date'] [...] Session [...]
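A searchsorted call on date_start alone only says how many intervals have started before each timestamp; it does not check date_end. Below is a rough sketch of one way the two could be combined, assuming bounds is sorted by date_start and its intervals do not overlap. The -1 placeholder for unmatched rows and the half-open bounds (start inclusive, end exclusive) are assumptions of this sketch.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (value/member columns omitted for brevity).
df = pd.DataFrame({
    'date_time': pd.to_datetime([
        '2013-10-09 09:00:00', '2013-10-09 09:05:00', '2013-10-09 09:10:00',
        '2013-10-09 09:15:00', '2013-10-09 09:20:00', '2013-10-09 09:25:00',
        '2013-10-09 09:30:00',
    ]),
})
bounds = pd.DataFrame({
    'date_start': pd.to_datetime(['2013-10-09 09:19:00', '2013-10-09 09:25:00']),
    'date_end': pd.to_datetime(['2013-10-09 09:25:00', '2013-10-09 09:40:00']),
})

starts = bounds['date_start'].to_numpy()
ends = bounds['date_end'].to_numpy()
times = df['date_time'].to_numpy()

# Position of the last interval whose start is <= each timestamp (-1 if none has started).
pos = np.searchsorted(starts, times, side='right') - 1

# A row is inside an interval only if one has started and its end has not been reached.
safe_pos = np.maximum(pos, 0)  # avoid indexing ends with -1
inside = (pos >= 0) & (times < ends[safe_pos])

# assumption: -1 marks rows that fall outside every interval
df['session'] = np.where(inside, pos, -1)
print(df)
```

This avoids the Python-level loop over bounds, at the cost of requiring sorted, non-overlapping intervals.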
user5421875