Python pandas dataframe - any way to set frequency programmatically?

I am trying to process CSV files as follows:

df = pd.read_csv("raw_hl.csv", index_col='time', parse_dates=True)
df.head(2)
                    high        low 
time                
2014-01-01 17:00:00 1.376235    1.375945
2014-01-01 17:01:00 1.376005    1.375775
2014-01-01 17:02:00 1.375795    1.375445
2014-01-01 17:07:00 NaN         NaN 
...
2014-01-01 17:49:00 1.375645    1.375445

type(df.index)
pandas.tseries.index.DatetimeIndex

But they do not automatically have a frequency:

print(df.index.freq)
None

Since different files may have different frequencies, it would be convenient to set the frequency automatically. The simplest approach is to compare the first two rows:

tdelta = df.index[1] - df.index[0]
tdelta
datetime.timedelta(0, 60) 
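
Comparing only the first two rows gives the wrong step if one of them happens to sit next to a gap. As a rough sketch of a more tolerant variant, the most common spacing across the whole index can be taken instead. The timestamps below are made up for illustration and are not the contents of raw_hl.csv:

```python
import pandas as pd

# Hypothetical timestamps with one gap (17:02 -> 17:07), for illustration only.
idx = pd.to_datetime([
    "2014-01-01 17:00:00",
    "2014-01-01 17:01:00",
    "2014-01-01 17:02:00",
    "2014-01-01 17:07:00",
    "2014-01-01 17:08:00",
])

# Differences between consecutive timestamps; the first one is NaT and is dropped.
deltas = idx.to_series().diff().dropna()

# The most frequent spacing is taken as the nominal step of the series.
step = deltas.mode().iloc[0]
print(step)  # 0 days 00:01:00
```

This only estimates the spacing; it does not attach a frequency to the index.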

So far, so good, but setting the frequency directly to this timedelta fails:

df.index.freq = tdelta
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-25-3f24abacf9de> in <module>()
----> 1 df.index.freq = tdelta

AttributeError: can't set attribute

Is there a way (ideally relatively painless!) to do this?

ANSWER: pandas gives the index an inferred_freq attribute, perhaps so as not to overwrite a user-defined frequency. Here, df.index.inferred_freq is 'T'.

So it's just a matter of using this instead of df.index.freq. Thanks to Jeff, who also provides more details below :)
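
A minimal sketch of how the inferred frequency might be used, assuming a gap-free index. The small frame below is a hypothetical stand-in for the CSV data, and passing the inferred string to asfreq is one possible way to get an index that carries a frequency, not necessarily what the original poster did:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV data: a regular one-minute index built from
# explicit timestamps, so no frequency is attached to it up front.
stamps = pd.to_datetime([
    "2014-01-01 17:00:00",
    "2014-01-01 17:01:00",
    "2014-01-01 17:02:00",
    "2014-01-01 17:03:00",
])
df = pd.DataFrame({"high": np.arange(4.0)}, index=stamps)

print(df.index.freq)  # None

# inferred_freq is a string alias: 'T' on older pandas versions, 'min' on newer ones.
inferred = df.index.inferred_freq

# asfreq returns a frame whose index carries the requested frequency.
regular = df.asfreq(inferred)
print(regular.index.freq)
```

If the index has gaps, inferred_freq is None and this sketch does not apply as written.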


If the index has a regular frequency, it is reported by df.index.freq:

In [20]: df = pd.DataFrame({'A' : np.arange(5)},index=pd.date_range('20130101 09:00:00',freq='3T',periods=5))

In [21]: df
Out[21]: 
                     A
2013-01-01 09:00:00  0
2013-01-01 09:03:00  1
2013-01-01 09:06:00  2
2013-01-01 09:09:00  3
2013-01-01 09:12:00  4

In [22]: df.index.freq
Out[22]: <3 * Minutes>

If the index is irregular, df.index.freq is None:

In [23]: df.index = df.index[0:2].tolist() + [pd.Timestamp('20130101 09:05:00')] + df.index[-2:].tolist()

In [24]: df
Out[24]: 
                     A
2013-01-01 09:00:00  0
2013-01-01 09:03:00  1
2013-01-01 09:05:00  2
2013-01-01 09:09:00  3
2013-01-01 09:12:00  4

In [25]: df.index.freq

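A regular frequency can be recovered by resampling to a finer frequency, forward filling, and then reindexing to the desired frequency and endpoints: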

In [31]: df.resample('T').ffill().reindex(pd.date_range(df.index[0],df.index[-1],freq='3T'))
Out[31]: 
                     A
2013-01-01 09:00:00  0
2013-01-01 09:03:00  1
2013-01-01 09:06:00  2
2013-01-01 09:09:00  3
2013-01-01 09:12:00  4
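
The same steps as a self-contained sketch, with a check that the resulting index carries a frequency again. It assumes the 'min' alias in place of 'T', since newer pandas versions deprecate 'T'; the variable names are illustrative:

```python
import numpy as np
import pandas as pd

# Irregular index from the example above: the third stamp is 09:05, not 09:06.
irregular = pd.DatetimeIndex([
    "2013-01-01 09:00:00",
    "2013-01-01 09:03:00",
    "2013-01-01 09:05:00",
    "2013-01-01 09:09:00",
    "2013-01-01 09:12:00",
])
df = pd.DataFrame({"A": np.arange(5)}, index=irregular)

# Target index: regular 3-minute grid between the original endpoints.
target = pd.date_range(df.index[0], df.index[-1], freq="3min")

# Upsample to one-minute steps, forward fill, then keep only the target stamps.
fixed = df.resample("min").ffill().reindex(target)

print(fixed.index.freq)
print(fixed["A"].tolist())  # [0, 1, 2, 3, 4]
```

Note that the value originally stamped 09:05 ends up at 09:06, so this regularizes the index at the cost of shifting that observation.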

Source: https://habr.com/ru/post/1568654/

