I found that the pandas merge method behaves strangely when the key columns on the left and right are different.
For instance, I define the left and right DataFrames as follows:
left_df
   0  1  2  3  4  5
0  1  2  1  2  3  4
1  2  3  2  3  4  5
2  1  2  3  4  5  6
3  2  2  4  5  6  7
4  2  3  5  6  7  8
right_df
   0  1  2  3  4  5
0  1  2  3  4  5  6
1  1  2  3  4  5  7
2  2  3  4  5  6  7
3  2  3  4  5  6  8
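To make the example reproducible, the two frames above can be built roughly like this. It is a minimal sketch that assumes the default integer column labels 0 to 5, which is what the printed headers suggest.

```python
import pandas as pd

# Column labels default to the integers 0..5, matching the printed headers.
left_df = pd.DataFrame([
    [1, 2, 1, 2, 3, 4],
    [2, 3, 2, 3, 4, 5],
    [1, 2, 3, 4, 5, 6],
    [2, 2, 4, 5, 6, 7],
    [2, 3, 5, 6, 7, 8],
])

right_df = pd.DataFrame([
    [1, 2, 3, 4, 5, 6],
    [1, 2, 3, 4, 5, 7],
    [2, 3, 4, 5, 6, 7],
    [2, 3, 4, 5, 6, 8],
])
```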
and merge them on the same key columns on both sides:
pd.merge(left_df, right_df, how="inner", left_on=[0, 1], right_on=[0, 1], indicator=False)
The result is as expected:
   0  1  2_x  3_x  4_x  5_x  2_y  3_y  4_y  5_y
0  1  2    1    2    3    4    3    4    5    6
1  1  2    1    2    3    4    3    4    5    7
2  1  2    3    4    5    6    3    4    5    6
3  1  2    3    4    5    6    3    4    5    7
4  2  3    2    3    4    5    4    5    6    7
5  2  3    2    3    4    5    4    5    6    8
6  2  3    5    6    7    8    4    5    6    7
7  2  3    5    6    7    8    4    5    6    8
But if I set the left_on and right_on parameters to different columns, the result becomes very strange, as shown below.
Merging with columns 1 and 2 as the left keys:
pd.merge(left_df, right_df, how="inner", left_on=[1, 2], right_on=[0, 1], indicator=False)
   1  2  0_x  1_x  2_x  3_x  4_x  5_x  0_y  1_y  2_y  3_y  4_y  5_y
0  2  3    1    2    3    4    5    6    2    3    4    5    6    7
1  2  3    1    2    3    4    5    6    2    3    4    5    6    8
                ^    ^                   ^    ^
The marked columns (1_x, 2_x, 0_y, 1_y) duplicate the key columns 1 and 2.
   0_x  1  2  3_x  4_x  5_x  2_y  3_y  4_y  5_y
0    1  2  3    4    5    6    4    5    6    7
1    1  2  3    4    5    6    4    5    6    8
This is what I expected (the duplicated key columns of each DataFrame are removed).
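For reference, one workaround that seems to give this shape is to rename the right-hand key columns to the left-hand key names before merging, so that a plain on= merge applies. This is only a sketch under my own assumptions: the rename-based approach and the names left_keys, right_keys, key_map and right_renamed are hypothetical helpers of mine, not a built-in merge option.

```python
import pandas as pd

left_df = pd.DataFrame([
    [1, 2, 1, 2, 3, 4],
    [2, 3, 2, 3, 4, 5],
    [1, 2, 3, 4, 5, 6],
    [2, 2, 4, 5, 6, 7],
    [2, 3, 5, 6, 7, 8],
])
right_df = pd.DataFrame([
    [1, 2, 3, 4, 5, 6],
    [1, 2, 3, 4, 5, 7],
    [2, 3, 4, 5, 6, 7],
    [2, 3, 4, 5, 6, 8],
])

left_keys = [1, 2]   # key columns in left_df
right_keys = [0, 1]  # key columns in right_df

# Give the right-hand key columns the left-hand key names, and suffix every
# other right-hand column by hand so the two frames share only the key names.
key_map = dict(zip(right_keys, left_keys))
right_renamed = right_df.rename(
    columns={col: key_map.get(col, f"{col}_y") for col in right_df.columns}
)

# The keys now have the same names on both sides, so merge keeps one copy.
merged = left_df.merge(right_renamed, on=left_keys, how="inner")
print(merged)
```

With this sketch the left-hand columns keep their original labels (0 rather than 0_x), because nothing collides after the rename.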
Is there any parameter of pd.merge, or another solution, that avoids this duplication of the key columns?