Create adjacency matrix for two columns in pandas dataframe

Question

I have a data frame in the form:

index  Name_A  Name_B
  0    Adam    Ben
  1    Chris   David
  2    Adam    Chris
  3    Ben     Chris

And I would like to get an adjacency matrix for Name_A and Name_B, that is:

      Adam Ben Chris David
Adam   0    1    1     0
Ben    0    0    1     0
Chris  0    0    0     1
David  0    0    0     0

What is the most Pythonic / scalable way to solve this problem?

EDIT: Also, I know that if the row Adam, Ben is in the data set, then at some other point the row Ben, Adam will also be in the data set.

python pandas dataframe

The ref Mar 15 '17 at 10:00

1 answer

jezrael · Accepted Answer · 2017-03-15T10:03:36+0000

You can use crosstab and then reindex by the union of the column and index values:

# Cross-tabulate the two columns; store in ct so the original df stays intact
ct = pd.crosstab(df.Name_A, df.Name_B)
print(ct)
Name_B  Ben  Chris  David
Name_A                   
Adam      1      1      0
Ben       0      1      0
Chris     0      0      1

# Build the crosstab, then reindex rows and columns to the union of all names
df = pd.crosstab(df.Name_A, df.Name_B)
idx = df.columns.union(df.index)
df = df.reindex(index=idx, columns=idx, fill_value=0)
print(df)
       Adam  Ben  Chris  David
Adam      0    1      1      0
Ben       0    0      1      0
Chris     0    0      0      1
David     0    0      0      0
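
For reference, here is a self-contained sketch of the same crosstab + reindex approach, using the sample data from the question. The helper name adjacency_matrix and its symmetric flag are hypothetical, added only for illustration; the flag is one possible way to handle the undirected case mentioned in the question's EDIT, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name_A": ["Adam", "Chris", "Adam", "Ben"],
    "Name_B": ["Ben", "David", "Chris", "Chris"],
})


def adjacency_matrix(frame, src="Name_A", dst="Name_B", symmetric=False):
    """Hypothetical helper: square matrix over every name found in src or dst."""
    # Count how often each (src, dst) pair occurs.
    ct = pd.crosstab(frame[src], frame[dst])
    # Use the union of row and column labels so the result is square.
    idx = ct.columns.union(ct.index)
    adj = ct.reindex(index=idx, columns=idx, fill_value=0)
    if symmetric:
        # Assumption: treat pairs as undirected, so A->B also marks B->A.
        # This also clips repeated pairs to 0/1.
        adj = ((adj + adj.T) > 0).astype(int)
    return adj


adj = adjacency_matrix(df)
print(adj)
```

Note that crosstab counts occurrences, so if a pair appears more than once in the data the corresponding cell will be greater than 1 unless you clip it, as the symmetric branch does.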
