This is my first attempt at using Pandas. I think I have a reasonable use case, but I am stumbling. I want to load a delimited file (the sample below is comma-separated) into a Pandas DataFrame, then group it by Symbol and plot it with the x-axis indexed by the TimeStamp column. Here is a subset of the data:
    Symbol,Price,M1,M2,Volume,TimeStamp
    TBET,2.19,3,8.05,1124179,9:59:14 AM
    FUEL,3.949,9,1.15,109674,9:59:11 AM
    SUNH,4.37,6,0.09,24394,9:59:09 AM
    FUEL,3.9099,8,1.11,105265,9:59:09 AM
    TBET,2.18,2,8.03,1121629,9:59:05 AM
    ORBC,3.4,2,0.22,10509,9:59:02 AM
    FUEL,3.8599,7,1.07,102116,9:58:47 AM
    FUEL,3.8544,6,1.05,100116,9:58:40 AM
    GBR,3.83,4,0.46,64251,9:58:24 AM
    GBR,3.8,3,0.45,63211,9:58:20 AM
    XRA,3.6167,3,0.12,42310,9:58:08 AM
    GBR,3.75,2,0.34,47521,9:57:52 AM
    MPET,1.42,3,0.26,44600,9:57:52 AM
Note two things about the TimeStamp column:
- it has duplicate values, and
- the intervals are irregular.
I thought I could do something like this ...
    from pandas import *
    import pylab as plt
    df = read_csv('data.txt',index_col=5)
    df.sort(ascending=False)
    df.plot()
    plt.show()
But the read_csv method throws an exception "I tried columns 1-X as an index, but found duplicates." Is there an option that allows me to specify an index column with duplicate values?
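For illustration, here is a minimal sketch of one way around the error, assuming a reasonably recent pandas version: read the file without `index_col`, parse the time strings, and only then call `set_index`, which accepts duplicate values. The sample rows are embedded as a string so the snippet runs on its own; the variable names (`raw`, `has_duplicates`, `fuel`) and the time format string are my assumptions, not anything pandas requires.

```python
import io

import pandas as pd

# Sample rows from the question, embedded so the snippet is self-contained.
raw = """Symbol,Price,M1,M2,Volume,TimeStamp
TBET,2.19,3,8.05,1124179,9:59:14 AM
FUEL,3.949,9,1.15,109674,9:59:11 AM
SUNH,4.37,6,0.09,24394,9:59:09 AM
FUEL,3.9099,8,1.11,105265,9:59:09 AM
TBET,2.18,2,8.03,1121629,9:59:05 AM
ORBC,3.4,2,0.22,10509,9:59:02 AM
FUEL,3.8599,7,1.07,102116,9:58:47 AM
FUEL,3.8544,6,1.05,100116,9:58:40 AM
GBR,3.83,4,0.46,64251,9:58:24 AM
GBR,3.8,3,0.45,63211,9:58:20 AM
XRA,3.6167,3,0.12,42310,9:58:08 AM
GBR,3.75,2,0.34,47521,9:57:52 AM
MPET,1.42,3,0.26,44600,9:57:52 AM
"""

# Read without an index first, then parse the 12-hour clock strings.
# With no date in the data, pandas fills in a placeholder date.
df = pd.read_csv(io.StringIO(raw))
df["TimeStamp"] = pd.to_datetime(df["TimeStamp"], format="%I:%M:%S %p")

# set_index does not require unique values, so duplicates are kept.
df = df.set_index("TimeStamp").sort_index()
has_duplicates = not df.index.is_unique

# Selecting one symbol gives a series of prices ordered by time.
fuel = df[df["Symbol"] == "FUEL"]
```

With the index in place, something like `fuel["Price"].plot()` should draw a single symbol against time, although I have not checked how the plot handles the repeated timestamps.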
I would also like to align my irregular timestamps to one-second resolution. I would still want to plot several events that fall within the same second, so perhaps I could introduce a unique index and then align my prices to it?
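To make the alignment idea concrete, here is a rough sketch under my own assumptions: pivot the data so each symbol becomes a column, which lets events for different symbols share a second without clashing, then resample onto a regular one-second grid and forward-fill the gaps. Keeping the last price per second and forward-filling are choices I made for the example, and the names `prices` and `per_second` are hypothetical.

```python
import io

import pandas as pd

# Same sample rows as in the question, embedded for a self-contained run.
raw = """Symbol,Price,M1,M2,Volume,TimeStamp
TBET,2.19,3,8.05,1124179,9:59:14 AM
FUEL,3.949,9,1.15,109674,9:59:11 AM
SUNH,4.37,6,0.09,24394,9:59:09 AM
FUEL,3.9099,8,1.11,105265,9:59:09 AM
TBET,2.18,2,8.03,1121629,9:59:05 AM
ORBC,3.4,2,0.22,10509,9:59:02 AM
FUEL,3.8599,7,1.07,102116,9:58:47 AM
FUEL,3.8544,6,1.05,100116,9:58:40 AM
GBR,3.83,4,0.46,64251,9:58:24 AM
GBR,3.8,3,0.45,63211,9:58:20 AM
XRA,3.6167,3,0.12,42310,9:58:08 AM
GBR,3.75,2,0.34,47521,9:57:52 AM
MPET,1.42,3,0.26,44600,9:57:52 AM
"""

df = pd.read_csv(io.StringIO(raw))
df["TimeStamp"] = pd.to_datetime(df["TimeStamp"], format="%I:%M:%S %p")

# One column per symbol, one row per distinct timestamp.
# aggfunc="last" is an assumption: it decides which price wins if a
# symbol ever has more than one row for the same timestamp.
prices = df.sort_values("TimeStamp").pivot_table(
    index="TimeStamp", columns="Symbol", values="Price", aggfunc="last"
)

# Regular one-second grid; seconds with no trade carry the previous price.
per_second = prices.resample(pd.Timedelta(seconds=1)).last().ffill()
```

The resulting index is unique and evenly spaced, so `per_second.plot()` should give one line per symbol. If the individual events within a second matter more than a regular grid, this approach would lose them for any symbol that trades twice in the same second.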