I am trying to find the probability of a given word within a data frame, but my current code raises AttributeError: 'Series' object has no attribute 'columns'.
I hope you can help me find where the error is.
I start with a data frame that looks like the one below and transform it to get a combined count for each individual word using the following function.
query          count
foo bar        10
super          8
foo            4
super foo bar  2
Function below:
def _words(df):
    return df['query'].str.get_dummies(sep=' ').T.dot(df['count'])
The result is the Series below (note that 'foo' is 16 because it appears 16 times in total across df):
bar 12
foo 16
super 10
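For anyone who wants to reproduce this, here is a minimal sketch of the setup. It assumes pandas is imported as pd and that the data frame is built directly from the four rows shown above; the variable name word_counts is only for illustration.

```python
import pandas as pd

# Sample data copied from the table above.
df = pd.DataFrame({
    "query": ["foo bar", "super", "foo", "super foo bar"],
    "count": [10, 8, 4, 2],
})


def _words(df):
    # One indicator column per word, then weight each row by its count.
    return df["query"].str.get_dummies(sep=" ").T.dot(df["count"])


# word_counts is a Series indexed by word (bar, foo, super).
word_counts = _words(df)
print(word_counts)
```

Note that the value returned by _words is a Series indexed by word, not a data frame with columns.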
The problem occurs when I try to find the probability of a given keyword in df; the result above does not currently carry a column name. Below is what I am working with right now, but it raises AttributeError: 'Series' object has no attribute 'columns'.
def _probability(df, query):
    return df[query] / df.groupby['count'].sum()
Ideally, _probability(df, 'foo') would return 0.421052632 (16/(12 + 16 + 10)).
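To make the target concrete, here is a minimal sketch of the arithmetic I am after. It assumes the calculation runs on the word totals (the Series shown earlier) rather than on the original data frame; the helper name word_probability is hypothetical and is not part of my current code.

```python
import pandas as pd

# Word totals copied from the result shown earlier.
word_counts = pd.Series({"bar": 12, "foo": 16, "super": 10})


def word_probability(word_counts, word):
    # Hypothetical helper: one word's total divided by the sum of all totals.
    return word_counts[word] / word_counts.sum()


p_foo = word_probability(word_counts, "foo")
print(round(p_foo, 9))  # 0.421052632, i.e. 16 / 38
```

This is only meant to show the expected number; it may not be the best way to structure the function.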