Pandas: counting co-occurrences

I am using Python Pandas. I have a column in which each row holds one or more comma-separated names, and I would like to count how often each pair of names occurs together in the same row.

For example, I have the following input

    1: Andi
    2: Andi, Cindy
    3: Thomas, Cindy
    4: Cindy, Thomas

And I would like to have the following output:

            Andi  Thomas  Cindy
    Andi       1       0      1
    Thomas     0       1      2
    Cindy      1       2      1

So the combination of Andi and Thomas does not appear in the data, but Cindy and Thomas appear together twice.

Does anyone know how I can handle this? It would be great!

Thank you very much, and kind regards,

Andy

1 answer

First you can create dummy columns:

    df['A'].str.get_dummies(', ')
    Out:
       Andi  Cindy  Thomas
    0     1      0       0
    1     1      1       0
    2     0      1       1
    3     0      1       1
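For anyone who wants to reproduce this step, here is a minimal sketch. It assumes the data from the question is stored in a single column named 'A' (the name used in the answer); the way the DataFrame is built is an assumption, since the question does not show it.

```python
import pandas as pd

# Sample rows taken from the question; the column name 'A' follows the answer.
df = pd.DataFrame({"A": ["Andi", "Andi, Cindy", "Thomas, Cindy", "Cindy, Thomas"]})

# One 0/1 indicator column per name. The separator ', ' includes the space,
# so names are not left with a leading blank.
tab = df["A"].str.get_dummies(", ")
print(tab)
```

Note that the separator has to match the data exactly: with ',' alone, ' Cindy' and 'Cindy' would end up as two different columns.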

And use this in a dot product:

    tab = df['A'].str.get_dummies(', ')
    tab.T.dot(tab)
    Out:
            Andi  Cindy  Thomas
    Andi       2      1       0
    Cindy      1      3       2
    Thomas     0      2       2
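Why this works: entry (i, j) of the product is the sum over all rows of indicator_i * indicator_j, which is 1 only for rows that contain both names. As a sanity check, the sketch below compares the matrix against a plain pair count done with the standard library; the sample DataFrame is again an assumption based on the question's data.

```python
from collections import Counter
from itertools import combinations

import pandas as pd

# Sample rows taken from the question; the column name 'A' follows the answer.
df = pd.DataFrame({"A": ["Andi", "Andi, Cindy", "Thomas, Cindy", "Cindy, Thomas"]})
tab = df["A"].str.get_dummies(", ")
co_occurrence = tab.T.dot(tab)

# Independent cross-check: count unordered pairs of distinct names per row.
pair_counts = Counter()
for cell in df["A"]:
    names = sorted(set(cell.split(", ")))
    pair_counts.update(combinations(names, 2))

# Every counted pair should match the off-diagonal entries, in both directions.
for (a, b), n in pair_counts.items():
    assert co_occurrence.loc[a, b] == n
    assert co_occurrence.loc[b, a] == n
```

Pairs that never occur together, such as Andi and Thomas, simply do not appear in the counter and show up as 0 in the matrix.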

The diagonal entries give the number of rows in which each person appears. If you need to set the diagonal to 1, there are several alternatives. One of them is np.fill_diagonal from numpy.

    import numpy as np

    co_occurrence = tab.T.dot(tab)
    np.fill_diagonal(co_occurrence.values, 1)
    co_occurrence
    Out:
            Andi  Cindy  Thomas
    Andi       1      1       0
    Cindy      1      1       2
    Thomas     0      2       1
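One caveat with writing into .values in place: it relies on that attribute returning a writable view of the frame's data, which may not hold in every pandas version (for example when Copy-on-Write is enabled). A possible variant, sketched here under the same assumed sample data, works on an explicit copy and also reorders the result to match the layout asked for in the question; the names counts, result and order are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample rows taken from the question; the column name 'A' follows the answer.
df = pd.DataFrame({"A": ["Andi", "Andi, Cindy", "Thomas, Cindy", "Cindy, Thomas"]})
tab = df["A"].str.get_dummies(", ")
co_occurrence = tab.T.dot(tab)

# Work on an explicit, writable copy instead of mutating .values in place.
counts = co_occurrence.to_numpy(copy=True)
np.fill_diagonal(counts, 1)
result = pd.DataFrame(counts, index=co_occurrence.index, columns=co_occurrence.columns)

# Optional: use the row/column order shown in the question.
order = ["Andi", "Thomas", "Cindy"]
result = result.loc[order, order]
print(result)
```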

Source: https://habr.com/ru/post/1269674/

