Pandas: connecting with an external product

Question

Pandas: connecting with an external product

I want to multiply the demand table specified for several products (here: Water, Elec) and area types (Com, Ind, Res) with a DataFrame ( areas ), which is the area table for these area types.

 import pandas as pd areas = pd.DataFrame({'Com':[1,2,3], 'Ind':[4,5,6]}) demand = pd.DataFrame({'Water':[4,3], 'Elec':[8,9]}, index=['Com', 'Ind'])

Before:

 areas Com Ind 0 1 4 1 2 5 2 3 6 demand Elec Water Com 8 4 Ind 9 3

After:

 area_demands Com Ind Elec Water Elec Water 0 8 4 36 12 1 16 8 45 15 2 24 12 54 18

My attempt

Detailed and incomplete; does not work for an arbitrary amount of goods.

 areas = pd.DataFrame({'area': areas.stack()}) areas.index.names = ['Edge', 'Type'] both = areas.reset_index(1).join(demand, on='Type') both['Elec'] = both['Elec'] * both['area'] both['Water'] = both['Water'] * both['area'] del both['area'] # almost there; it must be late, I fail to make 'Type' a hierarchical column...

Almost there:

  Type Elec Water Edge 0 Com 8 4 0 Ind 36 12 1 Com 16 8 1 Ind 45 15 2 Com 24 12 2 Ind 54 18

In short

How to combine / multiply DataFrames areas and demand together in a worthy way?

+6

python pandas

ojdo Sep 2 '13 at 18:05

source share

1 answer

unutbu · Accepted Answer · 2013-09-02T18:42:17+0000

 import pandas as pd areas = pd.DataFrame({'Com':[1,2,3], 'Ind':[4,5,6]}) demand = pd.DataFrame({'Water':[4,3], 'Elec':[8,9]}, index=['Com', 'Ind']) def multiply_by_demand(series): return demand.ix[series.name].apply(lambda x: x*series).stack() df = areas.apply(multiply_by_demand).unstack(0) print(df)

gives

  Com Ind Elec Water Elec Water 0 8 4 36 12 1 16 8 45 15 2 24 12 54 18

How it works:

First, look what happens when we call areas.apply(foo) . foo gets the columns of areas one by one:

 def foo(series): print(series) In [226]: areas.apply(foo) 0 1 1 2 2 3 Name: Com, dtype: int64 0 4 1 5 2 6 Name: Ind, dtype: int64

So, suppose series is one of these columns:

 In [230]: series = areas['Com'] In [231]: series Out[231]: 0 1 1 2 2 3 Name: Com, dtype: int64

We can muliply request this series this way:

 In [229]: demand.ix['Com'].apply(lambda x: x*series) Out[229]: 0 1 2 Elec 8 16 24 Water 4 8 12

This is half the numbers we want, but not in the form in which we want. DataFrame now needs to return a series , not a DataFrame . One way to turn a DataFrame into a series is to use stack . See what happens if we stack this DataFrame. Columns become a new index level:

 In [232]: demand.ix['Com'].apply(lambda x: x*areas['Com']).stack() Out[232]: Elec 0 8 1 16 2 24 Water 0 4 1 8 2 12 dtype: int64

So, using this as the return value of multiply_by_demand , we get:

 In [235]: areas.apply(multiply_by_demand) Out[235]: Com Ind Elec 0 8 36 1 16 45 2 24 54 Water 0 4 12 1 8 15 2 12 18

Now we want the first level of the index to become a column. This can be done with unstack :

 In [236]: areas.apply(multiply_by_demand).unstack(0) Out[236]: Com Ind Elec Water Elec Water 0 8 4 36 12 1 16 8 45 15 2 24 12 54 18

In the request in the comments, here is the pivot_table solution:

 import pandas as pd areas = pd.DataFrame({'Com':[1,2,3], 'Ind':[4,5,6]}) demand = pd.DataFrame({'Water':[4,3], 'Elec':[8,9]}, index=['Com', 'Ind']) areas = pd.DataFrame({'area': areas.stack()}) areas.index.names = ['Edge', 'Type'] both = areas.reset_index(1).join(demand, on='Type') both['Elec'] = both['Elec'] * both['area'] both['Water'] = both['Water'] * both['area'] both.reset_index(inplace=True) both = both.pivot_table(values=['Elec', 'Water'], rows='Edge', cols='Type') both = both.reorder_levels([1,0], axis=1) both = both.reindex(columns=both.columns[[0,2,1,3]]) print(both)

Pandas: connecting with an external product

More articles: