Pandas Moving Median for Repeating Time Series Data

Question

I see that Pandas does not yet allow duplicate time series indexes ( https://github.com/pydata/pandas/issues/643 ), but support will be added soon. I am wondering if there is a good way to apply a rolling window to a dataset with duplicate times, grouped by a multi-index tag / column.

Basically, I have a csv of unordered events, each consisting of an epoch time, hierarchical tags (tag1, tag2) and a time taken. A small sample:

    epochTimeMS,event,tag,timeTakenMS
    1331782842801,event1,tag1,16
    1331782841535,event1,tag2,1278
    1331782842801,event1,tag1,17
    1331782842381,event2,tag1,436

What I want to do is compute and graph the moving median over windows of various lengths (in ms), by event and by event + tag. It seems like this should be doable in Pandas, but I am not sure whether I have to wait until duplicate time series indexes are supported first. Any thoughts on hacking this now?

python matplotlib pandas

Aaron Mar 18 2018-12-18T00:

1 answer

Wes McKinney · Accepted Answer · 2012-03-18 21:43

There is nothing really to stop you right now:

    In [17]: idf = df.set_index(['tag', 'epochTimeMS'], verify_integrity=False).sort_index()

    In [18]: idf
    Out[18]:
                        event   timeTakenMS
    tag  epochTimeMS
    tag1 1331782842381  event2          436
         1331782842801  event1           16
         1331782842801  event1           17
    tag2 1331782841535  event1         1278

    In [20]: idf.ix['tag1']
    Out[20]:
                   event   timeTakenMS
    epochTimeMS
    1331782842381  event2          436
    1331782842801  event1           16
    1331782842801  event1           17
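For readers on a recent pandas release, where the `.ix` indexer is no longer available, the same session can be sketched with `.loc`. This is a minimal illustration rather than the original session: it assumes the four sample rows from the question, loaded from an in-memory string instead of a file.

```python
import io

import pandas as pd

# The four sample rows from the question, loaded from a string.
csv_text = """epochTimeMS,event,tag,timeTakenMS
1331782842801,event1,tag1,16
1331782841535,event1,tag2,1278
1331782842801,event1,tag1,17
1331782842381,event2,tag1,436
"""
df = pd.read_csv(io.StringIO(csv_text))

# Duplicate (tag, epochTimeMS) pairs are accepted because
# verify_integrity defaults to False.
idf = df.set_index(["tag", "epochTimeMS"]).sort_index()

# Select every row for one tag; .loc replaces the removed .ix indexer.
tag1 = idf.loc["tag1"]
print(tag1)
```

The result for `tag1` is indexed by `epochTimeMS` alone and still contains both rows that share the timestamp 1331782842801.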

Accessing certain values by timestamp will throw an exception (this will be improved, as you mention), but you can certainly work with the data. Now, if you want a window with a fixed length in time (rather than a fixed number of rows), that is not very well supported yet, but I created an issue for it here:

https://github.com/pydata/pandas/issues/936
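Later pandas releases accept a time offset such as "1s" as the rolling window on a sorted DatetimeIndex, and duplicate timestamps are allowed. The sketch below shows how the moving median from the question might be computed that way. It assumes a recent pandas version; the `ts` column name and the one-second window are choices made here for illustration, not part of the original question or answer.

```python
import io

import pandas as pd

# The four sample rows from the question, loaded from a string.
csv_text = """epochTimeMS,event,tag,timeTakenMS
1331782842801,event1,tag1,16
1331782841535,event1,tag2,1278
1331782842801,event1,tag1,17
1331782842381,event2,tag1,436
"""
df = pd.read_csv(io.StringIO(csv_text))

# Convert epoch milliseconds to timestamps and sort by time.
# "ts" is a hypothetical column name used only in this sketch.
df["ts"] = pd.to_datetime(df["epochTimeMS"], unit="ms")
df = df.set_index("ts").sort_index()

# Assumed window length; any offset string such as "500ms" works the same way.
window = "1s"

# Moving median of timeTakenMS within each tag.
by_tag = df.groupby("tag")["timeTakenMS"].rolling(window).median()

# Moving median within each (event, tag) pair.
by_event_tag = df.groupby(["event", "tag"])["timeTakenMS"].rolling(window).median()

print(by_tag)
print(by_event_tag)
```

Each result is a Series whose index combines the group keys with the timestamp, so one group can be pulled out with `.loc` and passed to a plotting call. With the default right-closed window, each row's median covers only rows at or before it in sorted order, so rows sharing a timestamp can get different values.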

If you could post on the mailing list about the API requirements of your application, that would be helpful for me and the guys, since we are actively working on the time series capabilities right now.
