Merging two data frames with a multi-index

I have seen several posts about this, but I could not work out how merge, join, and concat handle this case. How can I combine two data frames so that their rows are matched on the corresponding index values?

In:

    import pandas as pd
    import numpy as np

    row_x1 = ['a1', 'b1', 'c1']
    row_x2 = ['a2', 'b2', 'c2']
    row_x3 = ['a3', 'b3', 'c3']
    row_x4 = ['a4', 'b4', 'c4']
    index_arrays = [np.array(['first', 'first', 'second', 'second']),
                    np.array(['one', 'two', 'one', 'two'])]
    df1 = pd.DataFrame([row_x1, row_x2, row_x3, row_x4],
                       columns=list('ABC'), index=index_arrays)
    print(df1)

Out:

                 A   B   C
    first  one  a1  b1  c1
           two  a2  b2  c2
    second one  a3  b3  c3
           two  a4  b4  c4

In:

    row_y1 = ['d1', 'e1', 'f1']
    row_y2 = ['d2', 'e2', 'f2']
    df2 = pd.DataFrame([row_y1, row_y2],
                       columns=list('DEF'), index=['first', 'second'])
    print(df2)

Out:

             D   E   F
    first   d1  e1  f1
    second  d2  e2  f2

In other words, how can I combine them to achieve df3 (as follows)?

In:

    row_x1 = ['a1', 'b1', 'c1']
    row_x2 = ['a2', 'b2', 'c2']
    row_x3 = ['a3', 'b3', 'c3']
    row_x4 = ['a4', 'b4', 'c4']
    row_y1 = ['d1', 'e1', 'f1']
    row_y2 = ['d2', 'e2', 'f2']
    row_z1 = row_x1 + row_y1
    row_z2 = row_x2 + row_y1
    row_z3 = row_x3 + row_y2
    row_z4 = row_x4 + row_y2
    df3 = pd.DataFrame([row_z1, row_z2, row_z3, row_z4],
                       columns=list('ABCDEF'), index=index_arrays)
    print(df3)

Out:

                 A   B   C   D   E   F
    first  one  a1  b1  c1  d1  e1  f1
           two  a2  b2  c2  d1  e1  f1
    second one  a3  b3  c3  d2  e2  f2
           two  a4  b4  c4  d2  e2  f2
1 answer

Option 1
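Use pd.DataFrame.reindex together with pd.DataFrame.join.
reindex has a convenient level parameter: it matches df2's index against one level of the target MultiIndex and broadcasts each row across the entries of that level, which supplies the index level that df2 lacks.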

    df1.join(df2.reindex(df1.index, level=0))

                 A   B   C   D   E   F
    first  one  a1  b1  c1  d1  e1  f1
           two  a2  b2  c2  d1  e1  f1
    second one  a3  b3  c3  d2  e2  f2
           two  a4  b4  c4  d2  e2  f2
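To see why Option 1 works, it can help to look at the intermediate frame that reindex produces before the join. The following is a minimal sketch that rebuilds df1 and df2 from the question so it runs on its own; the names `expanded` and `result` are only illustrative and do not appear in the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild the two frames from the question.
index_arrays = [np.array(['first', 'first', 'second', 'second']),
                np.array(['one', 'two', 'one', 'two'])]
df1 = pd.DataFrame([['a1', 'b1', 'c1'], ['a2', 'b2', 'c2'],
                    ['a3', 'b3', 'c3'], ['a4', 'b4', 'c4']],
                   columns=list('ABC'), index=index_arrays)
df2 = pd.DataFrame([['d1', 'e1', 'f1'], ['d2', 'e2', 'f2']],
                   columns=list('DEF'), index=['first', 'second'])

# Broadcast df2's rows across level 0 of df1's MultiIndex
# (hypothetical variable name, for illustration only).
expanded = df2.reindex(df1.index, level=0)
print(expanded)

# expanded now carries df1's index, so an ordinary index join
# lines the two frames up row by row.
result = df1.join(expanded)
print(result)
```

Once both frames share the same MultiIndex, the join itself needs no extra arguments.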

Option 2
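Alternatively, give the index levels names. When the name of df2's index matches the name of one of df1's index levels, join matches on that shared level.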

    df1.rename_axis(['a', 'b']).join(df2.rename_axis('a'))

                 A   B   C   D   E   F
    a      b
    first  one  a1  b1  c1  d1  e1  f1
           two  a2  b2  c2  d1  e1  f1
    second one  a3  b3  c3  d2  e2  f2
           two  a4  b4  c4  d2  e2  f2

You can follow this with another rename_axis to remove the level names again and get the desired result.

    df1.rename_axis(['a', 'b']).join(df2.rename_axis('a')).rename_axis([None, None])

                 A   B   C   D   E   F
    first  one  a1  b1  c1  d1  e1  f1
           two  a2  b2  c2  d1  e1  f1
    second one  a3  b3  c3  d2  e2  f2
           two  a4  b4  c4  d2  e2  f2
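For completeness, here is a self-contained sketch of Option 2 that rebuilds the frames from the question and splits the chain into steps. The level names 'a' and 'b' are arbitrary placeholders, as in the answer; the variable names `left`, `right`, and `result` are illustrative only.

```python
import numpy as np
import pandas as pd

# Rebuild the two frames from the question.
index_arrays = [np.array(['first', 'first', 'second', 'second']),
                np.array(['one', 'two', 'one', 'two'])]
df1 = pd.DataFrame([['a1', 'b1', 'c1'], ['a2', 'b2', 'c2'],
                    ['a3', 'b3', 'c3'], ['a4', 'b4', 'c4']],
                   columns=list('ABC'), index=index_arrays)
df2 = pd.DataFrame([['d1', 'e1', 'f1'], ['d2', 'e2', 'f2']],
                   columns=list('DEF'), index=['first', 'second'])

# Name the index levels so both frames share the level name 'a'.
left = df1.rename_axis(['a', 'b'])
right = df2.rename_axis('a')

# join matches right's index against the level of left named 'a'.
joined = left.join(right)

# Drop the temporary level names to get back an unnamed MultiIndex.
result = joined.rename_axis([None, None])
print(result)
```

Compared with Option 1, this leaves the alignment to join itself, at the cost of naming and then un-naming the index levels.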


Source: https://habr.com/ru/post/1272314/

