Round pandas datetime index?

Question

I am reading several spreadsheets of time series into a pandas DataFrame and concatenating them on a common pandas datetime index. The data logger that recorded the time series is not 100% accurate, which makes resampling very annoying: depending on whether a timestamp is slightly above or below the interval being sampled, it creates NaNs and my series starts to look like a broken line. Here is my code:

    # Python 2.7, as in the question's tags
    import time
    import pandas as pd

    def loaddata(filepaths):
        t1 = time.clock()
        for i in range(len(filepaths)):
            xl = pd.ExcelFile(filepaths[i])
            df = xl.parse(xl.sheet_names[0], header=0, index_col=2,
                          skiprows=[0,2,3,4], parse_dates=True)
            df = df.dropna(axis=1, how='all')
            df = df.drop(['Decimal Year Day', 'Decimal Year Day.1', 'RECORD'], axis=1)
            if i == 0:
                dfs = df
            else:
                # align the frames column-wise on the shared datetime index
                dfs = pd.concat([dfs, df], axis=1)
        t2 = time.clock()
        print "Files loaded into dataframe in %s seconds" %(t2-t1)
        return dfs

    files = ["London Lysimeters corrected 5min.xlsx", "London Water Balance 5min.xlsx"]
    data = loaddata(files)

Here is what the index looks like:

    data.index
    <class 'pandas.tseries.index.DatetimeIndex'>
    [2012-08-27 12:05:00.000002, ..., 2013-07-12 15:10:00.000004]
    Length: 91910, Freq: None, Timezone: None

What would be the fastest and most general way to round the index to the nearest minute?

+4

python-2.7 pandas datetime resampling

pbreach Jul 22 '13 at 6:25

3 answers

Issue 4314, mentioned by Jeff, is now closed, and a round() method was added for DatetimeIndex, Timestamp, TimedeltaIndex and Timedelta in pandas 0.18.0. Now we can do the following:

    In [109]: index = pd.DatetimeIndex([pd.Timestamp('20120827 12:05:00.002'),
                                        pd.Timestamp('20130101 12:05:01'),
                                        pd.Timestamp('20130712 15:10:30'),
                                        pd.Timestamp('20130712 15:10:31')])

    In [110]: index.values
    Out[110]:
    array(['2012-08-27T12:05:00.002000000', '2013-01-01T12:05:01.000000000',
           '2013-07-12T15:10:30.000000000', '2013-07-12T15:10:31.000000000'],
          dtype='datetime64[ns]')

    In [111]: index.round('min')
    Out[111]:
    DatetimeIndex(['2012-08-27 12:05:00', '2013-01-01 12:05:00',
                   '2013-07-12 15:10:00', '2013-07-12 15:11:00'],
                  dtype='datetime64[ns]', freq=None)

round() takes a frequency parameter. The string aliases it accepts are listed under "offset aliases" in the pandas time series documentation.
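As a small sketch of the frequency parameter (the sample timestamps and the `df` frame are made up for illustration, and a reasonably recent pandas is assumed), the same call accepts other aliases, and the sibling methods floor() and ceil() take the same argument:

```python
import pandas as pd

# Hypothetical sample index with slightly-off logger timestamps.
index = pd.DatetimeIndex([
    "2012-08-27 12:05:00.000002",
    "2013-01-01 12:07:31",
    "2013-07-12 15:10:00.000004",
])

to_minute = index.round("min")   # nearest minute
to_5min = index.round("5min")    # nearest 5-minute mark
floored = index.floor("min")     # always round down
ceiled = index.ceil("min")       # always round up

# Assigning the rounded index back to a DataFrame (df is illustrative).
df = pd.DataFrame({"value": [1.0, 2.0, 3.0]}, index=index)
df.index = df.index.round("min")
print(df)
```

Rounding can map two nearby readings onto the same timestamp, so it may be worth checking `df.index.is_unique` before concatenating or resampling.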

+4

wombatonfire Sep 03 '16 at 18:42

For data columns. Usage: round_hour(df.Start_time)

    import numpy as np
    import pandas as pd

    def round_hour(x, tt=''):
        # x: datetime column; tt: 'M' = nearest minute, 'H' = nearest hour,
        # anything else = nearest second
        if tt == 'M':
            return pd.to_datetime(((x.astype('i8')/(1e9*60)).round()*1e9*60).astype(np.int64))
        elif tt == 'H':
            return pd.to_datetime(((x.astype('i8')/(1e9*60*60)).round()*1e9*60*60).astype(np.int64))
        else:
            return pd.to_datetime(((x.astype('i8')/(1e9)).round()*1e9).astype(np.int64))
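On pandas versions that provide round() (see the 0.18.0 answer above), a datetime column can probably be rounded more directly through the `.dt` accessor. A minimal sketch, assuming a hypothetical frame whose `Start_time` column mirrors the usage line above:

```python
import pandas as pd

# Hypothetical frame; only the column name comes from the usage line.
df = pd.DataFrame({
    "Start_time": pd.to_datetime([
        "2013-07-12 15:10:29",
        "2013-07-12 15:40:31",
    ])
})

nearest_second = df["Start_time"].dt.round("s")
nearest_minute = df["Start_time"].dt.round("min")
nearest_hour = df["Start_time"].dt.round("h")
print(nearest_hour)
```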

0

notilas Jan 21 '15 at 23:38

Jeff · Accepted Answer · 2013-07-22T11:49:07+0000

Here is a little trick. Datetimes are stored in nanoseconds (when viewed as np.int64), so round to minutes in nanoseconds.

    In [75]: index = pd.DatetimeIndex([ Timestamp('20120827 12:05:00.002'),
                                        Timestamp('20130101 12:05:01'),
                                        Timestamp('20130712 15:10:00'),
                                        Timestamp('20130712 15:10:00.000004') ])

    In [79]: index.values
    Out[79]:
    array(['2012-08-27T08:05:00.002000000-0400',
           '2013-01-01T07:05:01.000000000-0500',
           '2013-07-12T11:10:00.000000000-0400',
           '2013-07-12T11:10:00.000004000-0400'], dtype='datetime64[ns]')

    In [78]: pd.DatetimeIndex(((index.asi8/(1e9*60)).round()*1e9*60).astype(np.int64)).values
    Out[78]:
    array(['2012-08-27T08:05:00.000000000-0400',
           '2013-01-01T07:05:00.000000000-0500',
           '2013-07-12T11:10:00.000000000-0400',
           '2013-07-12T11:10:00.000000000-0400'], dtype='datetime64[ns]')
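The same idea can be sketched with pure integer arithmetic, which avoids the round trip through floating point. The helper name `round_index_ns` and the constant `NS_PER_MINUTE` are hypothetical, and the sketch assumes a tz-naive index:

```python
import numpy as np
import pandas as pd

NS_PER_MINUTE = 60 * 10**9  # nanoseconds in one minute


def round_index_ns(index, step_ns=NS_PER_MINUTE):
    """Round a tz-naive DatetimeIndex to the nearest multiple of step_ns."""
    # View the timestamps as integer nanoseconds since the epoch.
    ns = index.values.astype("datetime64[ns]").astype(np.int64)
    # Add half a step, then floor to a multiple of the step.
    # Exact ties round up here, unlike NumPy's round-half-to-even.
    rounded = (ns + step_ns // 2) // step_ns * step_ns
    return pd.DatetimeIndex(rounded.astype("datetime64[ns]"))


index = pd.DatetimeIndex([
    pd.Timestamp("20120827 12:05:00.002"),
    pd.Timestamp("20130101 12:05:01"),
    pd.Timestamp("20130712 15:10:00.000004"),
    pd.Timestamp("20130712 15:10:31"),
])
print(round_index_ns(index))
```

Passing a different `step_ns` (for example `5 * NS_PER_MINUTE`) rounds to other intervals with the same code.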
