I have several files with time-series data for different stocks. Each file contains the following fields:
time, open, high, low, close, volume
The goal is to combine them all into a single data frame of this form:
field      open                              high                              ...
security   hk_1      hk_2      hk_3      ... hk_1      hk_2      hk_3      ... ...
time
t_1        open_1_1  open_2_1  open_3_1  ... high_1_1  high_2_1  high_3_1  ... ...
t_2        open_1_2  open_2_2  open_3_2  ... high_1_2  high_2_2  high_3_2  ... ...
...        ...       ...       ...       ... ...       ...       ...       ... ...
I created a MultiIndex:
fields = ['time','open','high','low','close','volume','numEvents','value']
midx = pd.MultiIndex.from_product([['security_name'], fields], names=['security', 'field'])
To begin with, I tried to apply this MultiIndex to the data I read from each CSV file, by creating a new data frame and passing the index and columns to it:
for c in eqty_names_list:
    midx = pd.MultiIndex.from_product([[c], fields], names=['security', 'field'])
    df_temp = pd.read_csv('{}{}.csv'.format(path, c))
    df_temp = pd.DataFrame(df_temp, columns=midx, index=df_temp['time'])
    df_temp.df_name = c
    all_dfs.append(df_temp)
However, the new data frame contains only NaN:
security 1_HK
field time open high low close volume
time
NaN NaN NaN NaN NaN NaN NaN
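A minimal sketch of what seems to be going on, assuming that passing an existing frame to pd.DataFrame together with columns= and index= reindexes it by label rather than relabelling it. The two-row frame and the reduced field list are hypothetical stand-ins for one CSV file:

```python
import pandas as pd

# Hypothetical stand-in for the contents of one CSV file.
raw = pd.DataFrame({
    "time": ["t_1", "t_2"],
    "open": [1.0, 2.0],
    "close": [1.5, 2.5],
})
midx = pd.MultiIndex.from_product([["1_HK"], ["open", "close"]],
                                  names=["security", "field"])

# columns=/index= select by label. None of the requested labels exist in
# `raw` (its columns are flat, its index is 0..n-1), so every cell is NaN.
reindexed = pd.DataFrame(raw, columns=midx, index=raw["time"])
all_nan = bool(reindexed.isna().all().all())

# Relabelling instead: move time into the index, then overwrite the
# column labels with the (security, field) MultiIndex.
relabelled = raw.set_index("time")
relabelled.columns = pd.MultiIndex.from_product(
    [["1_HK"], relabelled.columns], names=["security", "field"])
```

Here reindexed holds only NaN, while relabelled keeps the original values under the new two-level header and no longer has a time column.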
In addition, it still contains a time column, although I tried to make time the index (so that I can later join the data frames for the other stocks on the index to get one aggregated data frame).
That is, the header currently looks like this:
security 1_HK
field time open high low close volume
time
whereas I want it to look like this:
field time open high ...
security 1_HK 2_HK ... 1_HK 2_HK ... ...
time
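One possible way to get the aggregated layout, sketched under the assumption that every file has a time column and the same fields. The in-memory CSV strings and the prices in them are hypothetical stand-ins for the real files, and pd.concat is just one option, not necessarily the only one:

```python
import io
import pandas as pd

# Hypothetical in-memory stand-ins for the per-security CSV files.
csv_files = {
    "1_HK": "time,open,high\nt_1,10,11\nt_2,12,13\n",
    "2_HK": "time,open,high\nt_1,20,21\nt_2,22,23\n",
}

# Read each file with time as the index, so no time column is left over.
frames = {name: pd.read_csv(io.StringIO(text), index_col="time")
          for name, text in csv_files.items()}

# Concatenating a dict along axis=1 uses its keys as the outer column
# level, giving (security, field) columns aligned on the time index.
combined = pd.concat(frames, axis=1, names=["security", "field"])

# Move field to the outer level and group the columns by field.
# Note: sort_index orders the fields alphabetically (high before open).
combined = combined.swaplevel("security", "field", axis=1).sort_index(axis=1)
```

With real files, the io.StringIO(text) argument would be replaced by the file path, as in the loop above.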