I have a MultiIndexed DataFrame df containing the explanatory variables and a DataFrame df_Y containing the response variables:
import numpy as np
import pandas as pd

# Create DataFrame for explanatory variables
arrays = [['foo', 'foo', 'foo', 'bar', 'bar', 'bar'], [1, 2, 3, 1, 2, 3]]
df = pd.DataFrame(np.random.randn(6, 2),
                  index=pd.MultiIndex.from_tuples(list(zip(*arrays))),
                  columns=['X1', 'X2'])

# Create DataFrame for response variables
df_Y = pd.DataFrame([1, 2, 3], columns=['Y'])

So far I can only run the regression on a single level of the DataFrame, for example the rows with index foo:
from sklearn import linear_model

df_X = df.loc['foo']  # using only 'foo'
reg = linear_model.Ridge().fit(df_X, df_Y)
reg.coef_
Problem: since the Y values are the same for both the foo and bar levels, we could have twice as many regression samples if we also included bar.

What is the best way to reshape / collapse / expand the MultiIndexed DataFrame so that we can use all of the data in the regression? Other levels may have fewer rows than df_Y.
Sorry for the confusing wording; I'm not sure about the correct terms / phrases.
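To make the goal concrete, here is a rough sketch of the kind of result I am after. It is not necessarily the best approach. It assumes that row i of df_Y belongs to inner index label i + 1 (so Y is keyed by 1, 2, 3), and the level names group and obs are made up for illustration. Here bar deliberately has only two rows, to show a level that is shorter than df_Y.

```python
import numpy as np
import pandas as pd
from sklearn import linear_model

# Explanatory variables; 'bar' has fewer rows than df_Y on purpose.
# The level names 'group' and 'obs' are hypothetical.
arrays = [['foo', 'foo', 'foo', 'bar', 'bar'], [1, 2, 3, 1, 2]]
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((5, 2)),
                  index=pd.MultiIndex.from_arrays(arrays, names=['group', 'obs']),
                  columns=['X1', 'X2'])

# Assumption: Y is keyed by the same inner labels 1, 2, 3.
df_Y = pd.DataFrame({'Y': [1, 2, 3]}, index=pd.Index([1, 2, 3], name='obs'))

# Flatten both frames and merge on the inner label, so each Y value is
# repeated for every group that actually has that label.
stacked = df.reset_index().merge(df_Y.reset_index(), on='obs', how='inner')

# Fit one regression on the rows of all groups together.
reg = linear_model.Ridge().fit(stacked[['X1', 'X2']], stacked['Y'])
coef = reg.coef_
```

With the inner merge, a level that lacks some labels simply contributes fewer rows, so foo gives three samples and bar gives two.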