I am trying to show the relative percentage for each group as well as the total frequency in a seaborn barplot. The two groups that I'm comparing are very different in size, so the function below plots the percentage within each group rather than raw counts.
Here is the code for the sample data I created, which has similar relative group sizes ("groups") to my data across the target categorical variable ("item"). "rand" is just a helper column that I use to assign the groups when building the DataFrame.
    # import pandas and seaborn
    import pandas as pd
    import seaborn as sns
    import numpy as np

    # create dataframe
    foobar = pd.DataFrame(np.random.randn(100, 3), columns=('groups', 'item', 'rand'))

    # get relative groupsizes
    for row, val in enumerate(foobar.rand):
        if val > -1.2:
            foobar.loc[row, 'groups'] = 'A'
        else:
            foobar.loc[row, 'groups'] = 'B'

        # assign categories that I am comparing graphically
        if row < 20:
            foobar.loc[row, 'item'] = 'Z'
        elif row < 40:
            foobar.loc[row, 'item'] = 'Y'
        elif row < 60:
            foobar.loc[row, 'item'] = 'X'
        elif row < 80:
            foobar.loc[row, 'item'] = 'W'
        else:
            foobar.loc[row, 'item'] = 'V'
Here is a function I wrote that compares relative frequencies across groups. It has some default variables, but I reassigned them for this question.
    def percent_categorical(item, df=IA, grouper='Active Status'):
        # plot categorical responses to an item ('column name')
        # by percent by group ('diff column name w categorical data')
        # select a data frame (default is IA)
        # 'Active Status' is default grouper

        # create df of item grouped by status
        grouped = (df.groupby(grouper)[item]
            # convert to percentage by group rather than total count
            .value_counts(normalize=True)
            # rename column
            .rename('percentage')
            # multiply by 100 for easier interpretation
            .mul(100)
            # change order from value to name
            .reset_index()
            .sort_values(item))

        # create plot
        PercPlot = sns.barplot(x=item,
                               y='percentage',
                               hue=grouper,
                               data=grouped,
                               palette='RdBu'
                               ).set_xticklabels(
                                   labels=grouped[item].value_counts().index.tolist(),
                                   rotation=90)

        # show plot
        return PercPlot
The following are the function call and the resulting graph:

    percent_categorical('item', df=foobar, grouper='groups')

This is good because it allows me to show the relative percentage for the group. However, I also want to display absolute numbers for each group, preferably in a legend. In this case, I would like it to show that there are 89 members of group A and 11 members of group B.
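To make the goal concrete, here is a minimal sketch of the kind of legend labels described above, assuming the absolute group sizes can be taken from `value_counts()` on the grouping column. The helper name `group_size_labels` and the small `demo` frame are hypothetical and only stand in for the real data; this is one possible direction, not a confirmed solution.

```python
import pandas as pd


def group_size_labels(df, grouper):
    """Map each group value to a label that includes its absolute size.

    Hypothetical helper: builds strings such as 'A (n=3)' from the raw
    (un-normalized) counts of the grouping column.
    """
    counts = df[grouper].value_counts()
    return {group: f"{group} (n={n})" for group, n in counts.items()}


# Small stand-in frame (assumed data, not the foobar sample from the question)
demo = pd.DataFrame({
    "groups": ["A", "A", "A", "B"],
    "item": ["Z", "Y", "Z", "Z"],
})

labels = group_size_labels(demo, "groups")
print(labels)
```

Because seaborn builds the legend from the values in the `hue` column, mapping that column through such a dictionary before plotting (for example `grouped[grouper] = grouped[grouper].map(labels)`) should carry the counts into the legend entries.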
Thanks in advance for any help.