Python: create a new column and write an index corresponding to datetime intervals

I have the following DataFrame:

 date_time             value     member
2013-10-09 09:00:00  664639  Jerome
2013-10-09 09:05:00  197290  Hence
2013-10-09 09:10:00  470186  Ann
2013-10-09 09:15:00  181314  Mikka
2013-10-09 09:20:00  969427  Cristy
2013-10-09 09:25:00  261473  James
2013-10-09 09:30:00  003698  Oliver

and a second DataFrame, where I have the interval bounds:

     date_start             date_end
2013-10-09 09:19:00  2013-10-09 09:25:00
2013-10-09 09:25:00  2013-10-09 09:40:00

so I need to create a new column that holds, for each row, the index of the interval (between two datetime points) it falls into.

Something like:

date_time            value   member  session
2013-10-09 09:00:00  664639  Jerome  1
2013-10-09 09:05:00  197290  Hence   1
2013-10-09 09:10:00  470186  Ann     1
2013-10-09 09:15:00  181314  Mikka   2
2013-10-09 09:20:00  969427  Cristy  2
2013-10-09 09:25:00  261473  James   2
2013-10-09 09:30:00  003698  Oliver  2

The following code creates the column 'session', but it does not write the session index (i.e., the row index in the bounds DataFrame) into that column, so the original DataFrame is not split up by the intervals:

def create_interval():
    df['session']=''
    for index, row in bounds.iterrows():
        s = row['date_start']
        e = row['date_end']
        mask=(df['date'] > s) & (df['date'] < e)
        df.loc[mask]['session']='[index]'

    return df
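
For reference, three things in the loop above appear to keep the column empty: the mask reads df['date'] although the column is called 'date_time', df.loc[mask]['session'] = ... is chained indexing and assigns to a temporary copy, and '[index]' is a literal string rather than the loop variable. A minimal sketch of a corrected loop follows. The sample frames are rebuilt from the tables in the question, and the -1 sentinel for "no interval" and the function arguments are assumptions made for illustration, not part of the original code.

```python
import pandas as pd

# Sample data rebuilt from the tables in the question.
df = pd.DataFrame({
    "date_time": pd.date_range("2013-10-09 09:00", periods=7, freq="5min"),
    "value": [664639, 197290, 470186, 181314, 969427, 261473, 3698],
    "member": ["Jerome", "Hence", "Ann", "Mikka", "Cristy", "James", "Oliver"],
})
bounds = pd.DataFrame({
    "date_start": pd.to_datetime(["2013-10-09 09:19:00", "2013-10-09 09:25:00"]),
    "date_end": pd.to_datetime(["2013-10-09 09:25:00", "2013-10-09 09:40:00"]),
})

def create_interval(df, bounds):
    out = df.copy()
    out["session"] = -1  # assumption: -1 marks rows that fall in no interval
    for index, row in bounds.iterrows():
        # Same strict comparisons as in the question: both end points are excluded.
        mask = (out["date_time"] > row["date_start"]) & (out["date_time"] < row["date_end"])
        # A single .loc call with row mask and column label writes into `out` itself.
        out.loc[mask, "session"] = index
    return out

labelled = create_interval(df, bounds)
print(labelled)
```

With strict comparisons on both sides, a row that sits exactly on a boundary (09:25:00 here) matches neither interval and keeps the sentinel.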

UPDATE

bounds['date_start'].searchsorted(df['date_time']) , , .. : df['Session']= 1 , = 2 .. Session , date_start date_end bounds , df ['date_time'] ['start_date'], Session, ,


You can apply np.searchsorted to the "date_time" column, which gives the position in bounds for each row of df:

In [266]:
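# Recent pandas returns a scalar position for a scalar x; older releases returned a one-element array and needed a trailing [0]
df['Session'] = df['date_time'].apply(lambda x: np.searchsorted(bounds['date_start'], x))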
df

Out[266]:
            date_time   value  member  Session
0 2013-10-09 09:00:00  664639  Jerome        0
1 2013-10-09 09:05:00  197290   Hence        0
2 2013-10-09 09:10:00  470186     Ann        0
3 2013-10-09 09:15:00  181314   Mikka        0
4 2013-10-09 09:20:00  969427  Cristy        1
5 2013-10-09 09:25:00  261473   James        1
6 2013-10-09 09:30:00    3698  Oliver        2
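
In the output above, the 09:25:00 row gets position 1 even though it is equal to the second date_start. That follows from searchsorted's default side="left". The sketch below compares both settings on three of the timestamps; the variable names are made up for this illustration.

```python
import pandas as pd

starts = pd.Series(pd.to_datetime(["2013-10-09 09:19:00", "2013-10-09 09:25:00"]))
times = pd.Series(pd.to_datetime([
    "2013-10-09 09:20:00",
    "2013-10-09 09:25:00",  # exactly equal to the second start
    "2013-10-09 09:30:00",
]))

# side="left": a value equal to a start is placed before that start.
left = starts.searchsorted(times, side="left")
# side="right": a value equal to a start is placed after that start.
right = starts.searchsorted(times, side="right")

print(left, right)
```

Which side is appropriate depends on whether a timestamp equal to date_start should already count as belonging to the interval that starts there.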

As @Jeff pointed out, apply is not needed here; searchsorted can be called on the whole column at once:

In [293]:
df['session'] = bounds['date_start'].searchsorted(df['date_time']) 
df

Out[293]:
            date_time   value  member  session
0 2013-10-09 09:00:00  664639  Jerome        0
1 2013-10-09 09:05:00  197290   Hence        0
2 2013-10-09 09:10:00  470186     Ann        0
3 2013-10-09 09:15:00  181314   Mikka        0
4 2013-10-09 09:20:00  969427  Cristy        1
5 2013-10-09 09:25:00  261473   James        1
6 2013-10-09 09:30:00    3698  Oliver        2
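
Note that the result above only looks at date_start, so rows before the first interval still get a number. If date_end should be respected as well, one possible variant is sketched below. It assumes the intervals in bounds are sorted, non-overlapping and half-open (start included, end excluded), and it uses -1 as a made-up marker for "outside every interval"; none of this comes from the original answer.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the tables in the question.
df = pd.DataFrame({
    "date_time": pd.date_range("2013-10-09 09:00", periods=7, freq="5min"),
    "value": [664639, 197290, 470186, 181314, 969427, 261473, 3698],
    "member": ["Jerome", "Hence", "Ann", "Mikka", "Cristy", "James", "Oliver"],
})
bounds = pd.DataFrame({
    "date_start": pd.to_datetime(["2013-10-09 09:19:00", "2013-10-09 09:25:00"]),
    "date_end": pd.to_datetime(["2013-10-09 09:25:00", "2013-10-09 09:40:00"]),
})

# Row index of the last interval whose date_start <= date_time (-1 if there is none).
pos = bounds["date_start"].searchsorted(df["date_time"], side="right") - 1

# End of that candidate interval; clip so that -1 does not wrap around to the last row.
ends = bounds["date_end"].to_numpy()[np.clip(pos, 0, None)]

# Keep the candidate only if the timestamp lies before the interval's end.
inside = (pos >= 0) & (df["date_time"].to_numpy() < ends)
df["session"] = np.where(inside, pos, -1)

print(df)
```

Here the four rows before 09:19:00 get -1, the 09:20:00 row gets 0, and the 09:25:00 and 09:30:00 rows get 1.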

Source: https://habr.com/ru/post/1611573/

