I want to filter a pandas DataFrame down to the rows whose name column entry is an element of a given list.
Here we have a DataFrame:
import pandas as pd
x = pd.DataFrame(
    [['sam', 328], ['ruby', 3213], ['jon', 121]],
    columns=['name', 'score'])
Now let's say that we have a list ['sam', 'ruby'], and we want to find all the rows whose name is in that list and then sum their scores.
The solution I have is the following:
total = 0
names = ['sam', 'ruby']
for name in names:
    identified = x[x['name'] == name]
    total = total + sum(identified['score'])
However, when the DataFrame and the list of names are both very large, this loop is very slow.
Is there a faster alternative?
thanks
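For reference, one possible vectorized alternative to the loop above is sketched here. It assumes the goal is simply the sum of score over every row whose name appears in the list, and it uses pandas' `Series.isin` to build a boolean mask in a single pass instead of scanning the DataFrame once per name. The variable `mask` is introduced only for illustration.

```python
import pandas as pd

x = pd.DataFrame(
    [['sam', 328], ['ruby', 3213], ['jon', 121]],
    columns=['name', 'score'])
names = ['sam', 'ruby']

# Boolean mask: True for rows whose name appears in the list.
mask = x['name'].isin(names)

# Sum the scores of the matching rows in one vectorized step.
total = x.loc[mask, 'score'].sum()
print(total)  # 3541
```

One difference to be aware of: if the list contains the same name more than once, the loop counts that name's rows once per occurrence, while `isin` counts each matching row only once.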