Python Pandas: the number of times each unique value appears for multiple columns

Question

Python Pandas: the number of times each unique value appears for multiple columns

Suppose I have a DataFrame, for example,

In [7]: source = pd.DataFrame([['amazon.com', 'correct', 'correct'], ['amazon.com', 'incorrect', 'correct'], ['walmart.com', 'incorrect', 'correct'], ['walmart.com', 'incorrect', 'incorrect']], columns=['domain', 'price', 'product']) In [8]: source Out[8]: domain price product 0 amazon.com correct correct 1 amazon.com incorrect correct 2 walmart.com incorrect correct 3 walmart.com incorrect incorrect

I would like to count for each domain number of times price == 'correct' and price == 'incorrect' and the same for product . In other words, I would like to see a way out

  domain key value count 0 amazon.com price correct 1 1 amazon.com price incorrect 1 2 amazon.com product correct 2 3 walmart.com price incorrect 2 4 walmart.com product correct 1 5 walmart.com product incorrect 1

How to do it?

+4

pandas

duckworthd Jul 19 '13 at 23:25

source share

2 answers

 In [17]: %paste ( pd.melt(source, id_vars=['domain'], value_vars=['price', 'product']) .groupby(['domain', 'variable', 'value']) .size() .reset_index() .rename(columns={'variable': 'key', 0: 'count'}) ) ## -- End pasted text -- Out[17]: domain key value count 0 amazon.com price correct 1 1 amazon.com price incorrect 1 2 amazon.com product correct 2 3 walmart.com price incorrect 2 4 walmart.com product correct 1 5 walmart.com product incorrect 1

0

duckworthd Jul 19 '13 at 23:38

source share

Jeff · Accepted Answer · 2013-07-19T23:40:34+0000

The attached application will do this.

 In [24]: source.groupby('domain').apply(lambda x: x[['price','product']].apply(lambda y: y.value_counts())).fillna(0) Out[24]: price product domain amazon.com correct 1 2 incorrect 1 0 walmart.com correct 0 1 incorrect 2 1

Python Pandas: the number of times each unique value appears for multiple columns

More articles: