I am trying to apply a function to each DataFrame in a pandas Panel. I can write it as a loop, but the indexing seems to be time-consuming. I am hoping a built-in pandas feature would be faster.
I have DataFrames that look like this (the real ones have about 50 rows per column):
mydata = pd.DataFrame( { 'hits' : [ 123, 456,678 ], 'sqerr' : [ 253, 641, 3480] } )
They are located in a panel with a multi-index key:
mydict = { (0, 20 ) : mydata, (30, 40 ) : moredata }
mypanel = pd.Panel( mydict )
The panel is as follows:
<class 'pandas.core.panel.Panel'>
Dimensions: 1600 (items) x 48 (major_axis) x 2 (minor_axis)
Items axis: (-4000, -4000) to (3800, 3800)
Major_axis axis: 0 to 47
Minor_axis axis: hits to sqerr
I have a function that takes a DataFrame and returns a number:
def condenser( df ):
    return some_stuff( df['hits'], df['sqerr'] )
I want to reduce my panel to a Series indexed by my multi-index, with the results of my condenser function as its values.
I can do:
intermediate = []
for k, df in mypanel.iteritems():
    intermediate.append( condenser( df ) )
result = pd.Series( intermediate, index = mypanel.items )
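To make the desired output concrete, here is a minimal self-contained sketch of the same reduction. It makes several assumptions that are not part of my real code: a plain dict of DataFrames stands in for the Panel (so it also runs on pandas versions that no longer have Panel), the values in moredata are made up, and the body of condenser is a hypothetical placeholder for some_stuff.

```python
import pandas as pd

# Hypothetical stand-in for the real condenser (some_stuff is not shown here):
# square root of total squared error divided by total hits.
def condenser(df):
    return (df['sqerr'].sum() / df['hits'].sum()) ** 0.5

mydata = pd.DataFrame({'hits': [123, 456, 678], 'sqerr': [253, 641, 3480]})
moredata = pd.DataFrame({'hits': [10, 20, 30], 'sqerr': [40, 50, 60]})  # made-up values

# A plain dict keyed by tuples plays the role of the Panel's items axis.
frames = {(0, 20): mydata, (30, 40): moredata}

# One number per DataFrame, indexed by the tuple keys as a MultiIndex.
keys = sorted(frames)
result = pd.Series(
    [condenser(frames[k]) for k in keys],
    index=pd.MultiIndex.from_tuples(keys),
)
print(result)
```

The result has one row per key, which is the shape I am after; the open question is how to get there without paying the per-item indexing cost.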
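However, profiling shows that only about 4% of the time is spent in condenser. Most of the rest goes to iteritems and __getitem__, so the indexing is the bottleneck.
I tried mypanel.apply( condenser, axis = 'items' ), but it does not pass whole DataFrames to my function. Is there a built-in way to apply a function to each DataFrame?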
P.S. Python 2.7.9, pandas 0.15.2