I have some monthly data that I am trying to summarize using pandas, and I need to count how many times each unique value appears in each month. Here is some sample code that shows what I'm trying to do:
import pandas as pd
mnths = ['JAN','FEB','MAR','APR']
custs = ['A','B','C']
testFrame = pd.DataFrame(index=custs, columns=mnths)
# Use .loc[row, column]; chained assignment such as testFrame['JAN']['A'] = ... may not modify testFrame
testFrame.loc['A', 'JAN'] = 'purchased Prod'
testFrame.loc['B', 'JAN'] = 'No Data'
testFrame.loc['C', 'JAN'] = 'Purchased Competitor'
testFrame.loc['A', 'FEB'] = 'purchased Prod'
testFrame.loc['B', 'FEB'] = 'purchased Prod'
testFrame.loc['C', 'FEB'] = 'purchased Prod'
testFrame.loc['A', 'MAR'] = 'No Data'
testFrame.loc['B', 'MAR'] = 'No Data'
testFrame.loc['C', 'MAR'] = 'Purchased Competitor'
testFrame.loc['A', 'APR'] = 'Purchased Competitor'
testFrame.loc['B', 'APR'] = 'purchased Prod'
testFrame.loc['C', 'APR'] = 'Purchased Competitor'
uniqueValues = pd.Series(testFrame.values.ravel()).unique()
#CODE TO GET COUNT OF ENTRIES IN testFrame BY UNIQUE VALUE
Required output:
                      JAN  FEB  MAR  APR
purchased Prod          ?    ?    ?    ?
Purchased Competitor    ?    ?    ?    ?
No Data                 ?    ?    ?    ?
I can get the unique values and create a new DataFrame with the correct axes and columns.
I started with these two questions:
Pandas: counting unique values in a data frame
Find unique values in a Pandas frame, regardless of the location of a row or column
but I still cannot get the output into the format I need. I'm not quite sure how to apply the df.groupby syntax or the df.apply syntax to the data I'm working with.
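For reference, here is a minimal sketch of one way the df.apply syntax might be applied to this kind of table. It is only an illustration under stated assumptions: the frame is rebuilt from a dict for brevity, the names `frame`, `counts`, and `row_order` are hypothetical, and it assumes that a value missing from a month should be reported as 0.

```python
import pandas as pd

months = ['JAN', 'FEB', 'MAR', 'APR']
customers = ['A', 'B', 'C']

# Same sample data as above, built in one step (hypothetical name: frame).
frame = pd.DataFrame(
    {
        'JAN': ['purchased Prod', 'No Data', 'Purchased Competitor'],
        'FEB': ['purchased Prod', 'purchased Prod', 'purchased Prod'],
        'MAR': ['No Data', 'No Data', 'Purchased Competitor'],
        'APR': ['Purchased Competitor', 'purchased Prod', 'Purchased Competitor'],
    },
    index=customers,
    columns=months,
)

# Count each value within every column (month). Values absent from a month
# come back as NaN, so they are filled with 0 (an assumption about the
# desired output) and cast back to integers.
counts = frame.apply(pd.Series.value_counts).fillna(0).astype(int)

# Optional: put the rows in the order shown in the required output.
row_order = ['purchased Prod', 'Purchased Competitor', 'No Data']
counts = counts.reindex(row_order)

print(counts)
```

Here `apply` passes each month column to `pd.Series.value_counts`, and pandas aligns the per-column results on the union of the values, which yields one row per unique value and one column per month.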