MultiIndexing rows versus columns in pandas DataFrame

Question


I am working with a multi-indexed DataFrame in pandas and am wondering whether I should put the MultiIndex on the rows or on the columns.

My data looks something like this: DataTable

The code:

import numpy as np
import pandas as pd

# Column MultiIndex: Cartesian product of conditions, patients and measures.
colidxs = pd.MultiIndex.from_product([['condition1', 'condition2'],
                                      ['patient1', 'patient2'],
                                      ['measure1', 'measure2', 'measure3']],
                                     names=['condition', 'patient', 'measure'])
# Row index: one row per time point.
rowidxs = pd.Index([0, 1, 2, 3], name='time')
data = pd.DataFrame(np.random.randn(len(rowidxs), len(colidxs)),
                    index=rowidxs, columns=colidxs)

Here I chose to put the MultiIndex on the columns, with the rationale that a pandas DataFrame consists of Series, and my data is ultimately a bunch of time series (hence the rows are indexed by time).
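To illustrate that rationale, here is a minimal sketch (rebuilding the same frame with random values) showing that a full column key, one label per level, returns a single Series indexed by time. The variable name `series` is only for illustration.

```python
import numpy as np
import pandas as pd

cols = pd.MultiIndex.from_product(
    [["condition1", "condition2"],
     ["patient1", "patient2"],
     ["measure1", "measure2", "measure3"]],
    names=["condition", "patient", "measure"],
)
data = pd.DataFrame(
    np.random.randn(4, len(cols)),
    index=pd.Index([0, 1, 2, 3], name="time"),
    columns=cols,
)

# A full column key (one label per level) selects one time series.
series = data[("condition1", "patient1", "measure1")]
```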

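I ask because there seems to be some asymmetry between rows and columns when it comes to multiindexing. For example, the documentation shows how query works on a DataFrame with a row MultiIndex, but if the MultiIndex is on the columns, the command has to be replaced by something like df.T.query('color == "red"').T.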
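A small sketch of the query asymmetry, using a hypothetical frame with integer values (the names `wide`, `tall`, `via_query` and `direct` are illustrative, not from the question):

```python
import numpy as np
import pandas as pd

cols = pd.MultiIndex.from_product(
    [["condition1", "condition2"], ["patient1", "patient2"]],
    names=["condition", "patient"],
)
wide = pd.DataFrame(
    np.arange(12).reshape(3, 4),
    index=pd.Index([0, 1, 2], name="time"),
    columns=cols,
)

# query filters rows, so a column level is only reachable by
# transposing, querying, and transposing back.
via_query = wide.T.query('condition == "condition1"').T

# With the MultiIndex on the rows, query can use the level name directly.
tall = wide.T
direct = tall.query('condition == "condition1"')
```

Both results hold the same values; only the orientation differs.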

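My question may seem a bit silly, but I would like to know whether there is any difference in convenience between multiindexing the rows and multiindexing the columns of a DataFrame (such as the query case above).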



python numpy pandas multi-index

Lei 27 . '14 4:31


Lei · Answer 1 · 2014-02-28T02:46:17+0000

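A rough personal summary of the row/column propensity of some common DataFrame operations:

[]: column-first
get: column-only
attribute access as indexing: column-only
query: row-only
loc, iloc, ix: row-first
xs: row-first
sortlevel: row-first
groupby: row-first

"row-first" means the operation takes the row index as its first argument, and to operate on the column index one has to use [:, ] or specify axis=1; "row-only" means the operation works only on the row index, and one has to do something like transposing the DataFrame to operate on the column index.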

Based on this, it seems that multiindexing the rows is slightly more convenient.
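The difference between column-first and row-first operations can be sketched roughly as follows. This is an illustration on a hypothetical frame, not an exhaustive check of every operation listed; the variable names are made up for the example.

```python
import numpy as np
import pandas as pd

cols = pd.MultiIndex.from_product(
    [["condition1", "condition2"],
     ["patient1", "patient2"],
     ["measure1", "measure2"]],
    names=["condition", "patient", "measure"],
)
data = pd.DataFrame(
    np.arange(24).reshape(3, 8),
    index=pd.Index([0, 1, 2], name="time"),
    columns=cols,
)

# [] is column-first: a top-level key selects a block of columns.
by_brackets = data["condition1"]

# loc is row-first: the column key goes in the second position.
by_loc = data.loc[:, "condition1"]

# xs is row-first: axis=1 is needed to cut across a column level.
by_xs = data.xs("measure1", axis=1, level="measure")

# With the levels on the rows, the same selections need no extra arguments.
tall = data.T
rows_loc = tall.loc["condition1"]
rows_xs = tall.xs("measure1", level="measure")
```

Here `by_brackets` and `by_loc` select the same columns, and `rows_xs` is the transpose of `by_xs`.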

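A natural follow-up question: why don't the pandas developers unify the row/column propensity of DataFrame operations? For example, [] and loc/iloc/ix are the most common ways of indexing a DataFrame, yet one slices columns and the others slice rows, which seems a bit odd.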
