Pandas: join on 2 columns

I'm having trouble joining these two DataFrames the way I would like. The first, df1, has a hierarchical index, which I created using df1 = df3.groupby(["STATE_PROV_CODE", "COUNTY"]).size() to get a count for each county. The second, df2, has the same two keys as ordinary columns.

 STATE_PROV_CODE  COUNTY          COUNT
 AL               Autauga County  1
                  Baldwin County  1
                  Barbour County  1
                  Bibb County     1
                  Blount County   1

    STATE_PROV_CODE  COUNTY          ANSI Cl  FIPS
 0  AL               Autauga County  H1       01001
 1  AL               Baldwin County  H1       01003
 2  AL               Barbour County  H1       01005
 3  AL               Bibb County     H1       01007
 4  AL               Blount County   H1       01009

In SQL, I would like to do the following:

 SELECT df1.STATE_PROV_CODE, df1.COUNTY, df2.FIPS, df1.COUNT
 FROM df1
 JOIN df2
   ON df1.STATE_PROV_CODE = df2.STATE_PROV_CODE
  AND df1.COUNTY = df2.COUNTY

I would like the result to be as follows:

 STATE_PROV_CODE  COUNTY          COUNT  FIPS
 AL               Autauga County  1      01001
                  Baldwin County  1      01003
                  Barbour County  1      01005
                  Bibb County     1      01007
                  Blount County   1      01009
1 answer

Given your groupby result and the second data frame, I believe this merge will work:

 df = pd.merge(df1, df2, left_index=True, right_on=['STATE_PROV_CODE', 'COUNTY']) 

This will drop the MultiIndex; however, if you want it back, all you have to do is:

 df = df.set_index(['STATE_PROV_CODE', 'COUNTY']) 
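To make the two steps above easy to try, here is a minimal, self-contained sketch. The sample rows are copied from the question; the contents of df3 are an assumption (one row per record), since the question does not show it. One more assumption: groupby(...).size() returns an unnamed Series, and recent pandas versions refuse to merge an unnamed Series, so the sketch converts it to a one-column frame named COUNT with to_frame first.

```python
import pandas as pd

counties = ["Autauga County", "Baldwin County", "Barbour County",
            "Bibb County", "Blount County"]

# Hypothetical stand-in for df3: one row per record (not shown in the question)
df3 = pd.DataFrame({"STATE_PROV_CODE": ["AL"] * 5, "COUNTY": counties})

# Second frame from the question; FIPS is kept as a string to preserve leading zeros
df2 = pd.DataFrame({
    "STATE_PROV_CODE": ["AL"] * 5,
    "COUNTY": counties,
    "ANSI Cl": ["H1"] * 5,
    "FIPS": ["01001", "01003", "01005", "01007", "01009"],
})

# Count rows per (state, county); to_frame names the resulting column COUNT
df1 = df3.groupby(["STATE_PROV_CODE", "COUNTY"]).size().to_frame("COUNT")

# Join the MultiIndex of df1 against the two key columns of df2
df = pd.merge(df1, df2, left_index=True,
              right_on=["STATE_PROV_CODE", "COUNTY"])

# Restore the hierarchical index and keep only the columns asked for
df = df.set_index(["STATE_PROV_CODE", "COUNTY"])[["COUNT", "FIPS"]]
print(df)
```

If the MultiIndex is not needed at all, an alternative is to call df1.reset_index() and merge with on=['STATE_PROV_CODE', 'COUNTY'], which keeps both keys as ordinary columns.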

Source: https://habr.com/ru/post/973050/

