I read several time series spreadsheets into pandas and combine them on a common datetime index. The data logger that recorded the timestamps is not 100% accurate, which makes resampling very annoying: depending on whether a timestamp falls slightly above or below the chosen interval, it creates NaNs and starts to make my series look like a broken line. Here is my code:
import time
import pandas as pd

def loaddata(filepaths):
    t1 = time.clock()
    for i, path in enumerate(filepaths):
        xl = pd.ExcelFile(path)
        df = xl.parse(xl.sheet_names[0], header=0, index_col=2,
                      skiprows=[0, 2, 3, 4], parse_dates=True)
        df = df.dropna(axis=1, how='all')
        df = df.drop(['Decimal Year Day', 'Decimal Year Day.1', 'RECORD'],
                     axis=1)
        if i == 0:
            dfs = df
        else:
            dfs = pd.concat([dfs, df], axis=1)
    t2 = time.clock()
    print "Files loaded into dataframe in %s seconds" % (t2 - t1)
    return dfs

files = ["London Lysimeters corrected 5min.xlsx",
         "London Water Balance 5min.xlsx"]
data = loaddata(files)
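To illustrate the problem, here is a minimal sketch (with made-up values, not my real data) of what happens when slightly-off timestamps are reindexed onto an exact 5-minute grid:

```python
import pandas as pd

# Timestamps are a few microseconds off the 5-minute grid, so reindexing
# onto exact 5-minute marks finds no matching labels and yields NaN.
idx = pd.to_datetime(
    ["2012-08-27 12:05:00.000002",
     "2012-08-27 12:10:00.000001",
     "2012-08-27 12:15:00.000003"]
)
s = pd.Series([1.0, 2.0, 3.0], index=idx)

exact = pd.date_range("2012-08-27 12:05:00", periods=3, freq="5min")
print(s.reindex(exact))  # every value comes back NaN
```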
Here's what the index looks like:
data.index
<class 'pandas.tseries.index.DatetimeIndex'>
[2012-08-27 12:05:00.000002, ..., 2013-07-12 15:10:00.000004]
Length: 91910, Freq: None, Timezone: None
What would be the fastest and most idiomatic way to round the index to the nearest minute?
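For context, the kind of operation I mean is sketched below. The function name is mine, and I'm rounding via the index's int64 nanosecond values since my pandas is old; newer pandas versions also offer `DatetimeIndex.round('min')` directly:

```python
import pandas as pd

def round_to_minute(index):
    # Round a DatetimeIndex to the nearest minute by integer arithmetic
    # on its nanosecond representation (asi8). Ties round up.
    ns_per_min = 60 * 10**9
    rounded = ((index.asi8 + ns_per_min // 2) // ns_per_min) * ns_per_min
    # to_datetime on int64 values interprets them as ns since the epoch
    return pd.to_datetime(rounded)

idx = pd.DatetimeIndex(["2012-08-27 12:05:00.000002",
                        "2013-07-12 15:09:59.999996"])
print(round_to_minute(idx))  # rounds to 12:05:00 and 15:10:00
```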