Pandas Index-based Dataframe Mask

I have the following DataFrame:

    import pandas as pd

    index = pd.date_range('2013-1-1', periods=10, freq='15Min')
    data = pd.DataFrame(data=[1, 2, 3, 4, 5, 6, 7, 8, 9, 0], columns=['value'], index=index)

How can I create a mask based on the index values? I know I can do something like:

    data['value'] > 3
    Out[40]:
    2013-01-01 00:00:00    False
    2013-01-01 00:15:00    False
    2013-01-01 00:30:00    False
    2013-01-01 00:45:00     True
    2013-01-01 01:00:00     True
    2013-01-01 01:15:00     True
    2013-01-01 01:30:00     True
    2013-01-01 01:45:00     True
    2013-01-01 02:00:00     True
    2013-01-01 02:15:00    False
    Freq: 15T, Name: value, dtype: bool

I want to create a mask that selects only the rows whose index falls in a certain range. I was thinking of something like data['index'].time() > datetime.time(1,15), but of course data['index'] fails because the index is not a column. How can I refer to the index value of each row in a mask?

+4
2 answers

You can do this with indexer_between_time, which returns the integer positions of the rows whose time of day falls in the given range:

    In [11]: data.index.indexer_between_time(start='01:15', end='02:00')
    Out[11]: array([5, 6, 7, 8])

    In [12]: data.iloc[data.index.indexer_between_time(start='1:15', end='02:00')]
    Out[12]:
                         value
    2013-01-01 01:15:00      6
    2013-01-01 01:30:00      7
    2013-01-01 01:45:00      8
    2013-01-01 02:00:00      9

As you can see, the index is accessed through the .index attribute.

Note: by default, indexer_between_time sets both include_start and include_end to True; it also offers a tz argument for comparing the times in another time zone.
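
Since the question asks for a mask rather than positions, here is a minimal sketch of building a boolean mask directly from the index. It assumes the same data as in the question (with the frequency written as '15min'); the names times, mask and selected are only illustrative.

```python
import datetime

import pandas as pd

index = pd.date_range('2013-1-1', periods=10, freq='15min')
data = pd.DataFrame({'value': [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]}, index=index)

# DatetimeIndex.time is an array of datetime.time objects, so comparing it
# with a datetime.time value gives an element-wise boolean array.
times = data.index.time
mask = (times >= datetime.time(1, 15)) & (times <= datetime.time(2, 0))

# The boolean array can be used like any other mask, e.g. data['value'] > 3.
selected = data[mask]
print(selected)
```

Because the mask is an ordinary boolean array, it can be combined with column conditions, for example data[mask & (data['value'] > 6).values].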

+10

The "start" and "end" keywords are out of date. With pandas > 0.17.1, I had to use the following syntax instead:

    data.iloc[data.index.indexer_between_time('1:15', '02:00')]
    Out[90]:
                         value
    2013-01-01 01:15:00      6
    2013-01-01 01:30:00      7
    2013-01-01 01:45:00      8
    2013-01-01 02:00:00      9

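
For completeness, a small sketch of the positional-argument form, together with two related options: turning the returned positions into a boolean mask, and calling DataFrame.between_time on the frame itself. It assumes the same data as in the question; the variable names are only illustrative, and the boolean-mask step is just one possible way to do the conversion.

```python
import numpy as np
import pandas as pd

index = pd.date_range('2013-1-1', periods=10, freq='15min')
data = pd.DataFrame({'value': [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]}, index=index)

# Positional arguments: start time first, end time second.
positions = data.index.indexer_between_time('01:15', '02:00')

# Convert the integer positions into a boolean mask of the same length as data.
mask = np.zeros(len(data), dtype=bool)
mask[positions] = True

by_position = data.iloc[positions]
by_mask = data[mask]

# DataFrame.between_time selects the same rows without going through positions.
by_between_time = data.between_time('01:15', '02:00')
print(by_between_time)
```

All three selections should contain the same four rows; the boolean mask is the one to use if it needs to be combined with other conditions.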
+3

Source: https://habr.com/ru/post/1490596/

