Pandas applying multicolumnindex to dataframe

Question

I have several files with time series data for different stocks, each with several fields. Each file contains the columns

time, open, high, low, close, volume

The goal is to have it all in one data frame of this form:

field      open                              high                            ...
security    hk_1      hk_2      hk_3 ...      hk_1      hk_2      hk_3 ...  ...
time
t_1      open_1_1  open_2_1  open_3_1 ...  high_1_1  high_2_1  high_3_1 ...  ...            
t_2      open_1_2  open_2_2  open_3_2 ...  high_1_2  high_2_2  high_3_2 ...  ...
...        ...        ...       ... ...       ...       ...       ... ...  ...

I created a MultiIndex:

fields = ['time','open','high','low','close','volume','numEvents','value']
# 'security_name' is a placeholder for the actual security label
midx = pd.MultiIndex.from_product([['security_name'], fields], names=['security', 'field'])

To begin with, I tried to apply this MultiIndex to the data I read from each CSV file (by creating a new data frame and passing the new columns and index):

for c in eqty_names_list:

    midx = pd.MultiIndex.from_product([[c], fields], names=['security', 'field'])

    df_temp = pd.read_csv('{}{}.csv'.format(path, c))
    df_temp = pd.DataFrame(df_temp, columns=midx, index=df_temp['time'])
    df_temp.df_name = c
    all_dfs.append(df_temp)

However, the new data frame contains only NaN:

security    1_HK
field       time    open    high    low     close   volume
time                                
 NaN         NaN     NaN     NaN    NaN       NaN      NaN

In addition, it still contains a column for time, although I tried to make time the index (so that I can later join the data frames for the other stocks on the index to get one aggregated data frame).
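A minimal reproduction of the symptom, assuming a small made-up CSV in place of the real files (the values and the reduced field list are hypothetical). The likely cause is that passing an existing DataFrame to the constructor together with columns= and index= selects by label rather than renaming, so labels that do not exist in the source come back as NaN. The second half sketches one way to relabel instead.

```python
import io
import pandas as pd

# Hypothetical stand-in for one CSV file (made-up values, fewer fields).
csv_text = "time,open,high\nt_1,1.0,2.0\nt_2,1.5,2.5\n"
raw = pd.read_csv(io.StringIO(csv_text))

fields = ['time', 'open', 'high']
midx = pd.MultiIndex.from_product([['1_HK'], fields], names=['security', 'field'])

# Constructor with columns=/index= looks the labels up in `raw`.
# Neither the tuples in midx nor the time values exist as labels there,
# so the result is filled with NaN.
reindexed = pd.DataFrame(raw, columns=midx, index=raw['time'])
all_nan = bool(reindexed.isna().all().all())

# Relabelling instead: move time into the index first,
# then assign the two-level labels to the remaining columns.
relabelled = raw.set_index('time')
relabelled.columns = pd.MultiIndex.from_product(
    [['1_HK'], list(relabelled.columns)], names=['security', 'field'])
print(relabelled)
```

With this approach the time column no longer appears among the fields, because it has become the row index before the new labels are attached.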

So, at the moment, each per-file data frame looks like this:

security    1_HK
field       time    open    high    low     close   volume
time

and what I am looking for is this (with all securities combined):

field       time                open    high        ...
security    1_HK    2_HK ...    1_HK    2_HK ...    ...
time


python pandas indexing

chrise 09 . '16 11:44


jezrael · Accepted Answer · 2016-08-09T11:54:01+0000

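I think you can first collect all the file names in a list files, then read them into a list of DataFrames with a list comprehension and concat them by columns (axis=1). If you add the keys parameter, you get a MultiIndex in the columns:

Files: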

a.csv, b.csv, c.csv

import pandas as pd
import glob

# glob returns files in arbitrary order; sort so they line up with eqty_names_list
files = sorted(glob.glob('files/*.csv'))
dfs = [pd.read_csv(fp) for fp in files]

eqty_names_list = ['hk1','hk2','hk3']
df = pd.concat(dfs, keys=eqty_names_list, axis=1)

print (df)
  hk1       hk2       hk3      
    a  b  c   a  b  c   a  b  c
0   0  1  2   0  9  6   0  7  1
1   1  5  8   1  6  4   1  3  2

Then you may need swaplevel and sort_index:

df.columns = df.columns.swaplevel(0,1)
df = df.sort_index(axis=1)
print (df)
    a           b           c        
  hk1 hk2 hk3 hk1 hk2 hk3 hk1 hk2 hk3
0   0   0   0   1   9   7   2   6   1
1   1   1   1   5   6   3   8   4   2
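
Applied to the layout in the question, the same steps might look like the sketch below. It assumes each file has a time column and uses small in-memory CSV strings with made-up values instead of real files; the security names and the reduced field list are placeholders. Reading with index_col='time' makes time the row index, so the frames line up on time when concatenated.

```python
import io
import pandas as pd

# Hypothetical per-security CSV contents; real code would pass file paths to read_csv.
csv_by_security = {
    'hk_1': "time,open,high\nt_1,10,11\nt_2,12,13\n",
    'hk_2': "time,open,high\nt_1,20,21\nt_2,22,23\n",
}

# index_col='time' turns the time column into the row index while reading.
names = list(csv_by_security)
frames = [pd.read_csv(io.StringIO(csv_by_security[n]), index_col='time') for n in names]

# keys= adds the security name as the outer column level.
combined = pd.concat(frames, keys=names, axis=1)
combined.columns.names = ['security', 'field']

# Put 'field' on top, then sort so that columns are grouped by field.
combined = combined.swaplevel(0, 1, axis=1).sort_index(axis=1)
print(combined)
```

Because the rows are aligned on the time index, a timestamp missing from one file would show up as NaN for that security only.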
