Conditional sum in Python based on string input

Question

I am trying to make a conditional sum-product in Python. A simplified idea is this:

A = [1, 1, 2, 3, 3, 3]
B = [0.50, 0.25, 0.99, 0.80, 0.70, 0.20]

I would like to get the following output:

Total1 = 0.50*1 + 0.25*1
Total2 = 0.99*2
Total3 = 0.80*3 + 0.70*3 + 0.20*3

I thought of using a for ... if ... structure to express that, for a given value in A, all corresponding values in B should be summed.

The real data set is huge, so the script has to iterate over all categories.

At this point, I'm struggling to translate the idea into the appropriate Python script. Can someone point me in the right direction?

+4

python python-3.x conditional sum

Sibren de preter Aug 11 '17 at 10:21

4 answers

One way is to accumulate the products in a dictionary keyed by the values of A:

In [1]: sums = {}
In [2]: A = [1, 1, 2, 3, 3, 3]
   ...: B = [0.50, 0.25, 0.99, 0.80, 0.70, 0.20]
In [3]: for count, item in zip(A, B):
    ...:     try:
    ...:         sums[count] += item * count
    ...:     except KeyError:
    ...:         sums[count] = item * count
    ...:         

In [4]: sums
Out[4]: {1: 0.75, 2: 1.98, 3: 5.1}

Edit:

You could also use defaultdict to get rid of the try-except block:

In [2]: from collections import defaultdict

In [3]: sums = defaultdict(lambda: 0)  # named sums to avoid shadowing the built-in sum

In [4]: sums[1]
Out[4]: 0

In [5]: sums
Out[5]: defaultdict(<function __main__.<lambda>>, {1: 0})

EDIT2:

And the full example using defaultdict:

In [6]: sums = defaultdict(int)

In [7]: A = [1, 1, 2, 3, 3, 3]
   ...: B = [0.50, 0.25, 0.99, 0.80, 0.70, 0.20]

In [8]: for count, item in zip(A, B):
   ...:     sums[count] += count * item
   ...:     

In [9]: sums
Out[9]: defaultdict(int, {1: 0.75, 2: 1.98, 3: 5.1})
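The defaultdict loop above can also be wrapped in a reusable function. This is only a minimal sketch: the function name conditional_sum_product and its parameter names are made up for illustration, and defaultdict(float) is used instead of defaultdict(int) simply because the totals are floats.

```python
from collections import defaultdict


def conditional_sum_product(categories, values):
    """Return {category: sum of category * value} (hypothetical helper)."""
    totals = defaultdict(float)  # missing keys start at 0.0
    for category, value in zip(categories, values):
        totals[category] += category * value
    return dict(totals)


A = [1, 1, 2, 3, 3, 3]
B = [0.50, 0.25, 0.99, 0.80, 0.70, 0.20]

totals = conditional_sum_product(A, B)
print(totals)
```

Because the totals are keyed by category, this makes a single pass over the data and does not require equal values in A to be adjacent.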

+2

gonczor Aug 11 '17 at 10:39

I think you can solve this using itertools.groupby:

import itertools
from operator import itemgetter

results = [group * sum(v[1] for v in values)
           for group, values in itertools.groupby(zip(A, B), itemgetter(0))]

This assumes that all equal numbers in A are adjacent to each other. If they may not be adjacent, you will either have to sort them first or use a different algorithm.
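If equal values in A are not adjacent, one option is to sort the pairs by their A value before grouping. The following is a sketch under that assumption; the shuffled A and B are made-up sample data holding the same pairs as in the question.

```python
from itertools import groupby
from operator import itemgetter

# Same pairs as in the question, but in shuffled order (sample data)
A = [3, 1, 2, 3, 1, 3]
B = [0.80, 0.50, 0.99, 0.70, 0.25, 0.20]

# Sort by the A value so that equal keys become adjacent for groupby
pairs = sorted(zip(A, B), key=itemgetter(0))

results = {
    group: group * sum(value for _, value in values)
    for group, values in groupby(pairs, key=itemgetter(0))
}
print(results)
```

Sorting costs O(n log n), so for very large inputs a single-pass dictionary accumulation may be the cheaper choice.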

0

Blckknght Aug 11 '17 at 10:37

If you don't mind using numpy for this and assuming the groups are ordered, you can do this:

import numpy as np

A = np.asarray([1, 1, 2, 3, 3, 3])
B = np.asarray([0.50, 0.25, 0.99, 0.80, 0.70, 0.20])

# True at the last element of each run of equal values in A
index = np.full(len(A), True)
index[:-1] = A[1:] != A[:-1]
prods = A * B

# Sum the products between consecutive group start positions
starts = np.append([0], (np.where(index)[0] + 1)[:-1])
res = np.add.reduceat(prods, starts)

Also, if you have large lists, this can really speed up operations.
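If the groups are not ordered, a different NumPy approach is to map every value of A to a group index with np.unique and then sum the products per group with np.bincount. This is a sketch of that alternative technique, not the method from the answer above; the shuffled arrays are made-up sample data.

```python
import numpy as np

# Same pairs as in the question, but in shuffled order (sample data)
A = np.asarray([3, 1, 2, 3, 1, 3])
B = np.asarray([0.80, 0.50, 0.99, 0.70, 0.25, 0.20])

# keys: sorted unique values of A; inverse: group index of each element
keys, inverse = np.unique(A, return_inverse=True)

# Sum A*B within each group; the weights are added per group index
totals = np.bincount(inverse, weights=A * B)

result = dict(zip(keys.tolist(), totals.tolist()))
print(result)
```

The totals come back in the order of the sorted unique keys, so keys[i] corresponds to totals[i].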

0

Clock slave Aug 11 '17 at 11:45

MSeifert · Accepted Answer · 2017-08-11T10:38:27+0000

This is a good fit for itertools.groupby (assuming equal values in A are adjacent; it would not group correctly for, e.g., A=[1,1,2,2,1]):

from itertools import groupby
A = [1, 1, 2, 3, 3, 3]
B = [0.50, 0.25, 0.99, 0.80, 0.70, 0.20]

for key, grp in groupby(zip(A, B), key=lambda x: x[0]):
    grp = [i[1] for i in grp]
    print(key, key * sum(grp))

Which prints:

1 0.75
2 1.98
3 5.1

Or, if you want to collect the results in a list:

res = []
for key, grp in groupby(zip(A, B), key=lambda x: x[0]):
    grp = [i[1] for i in grp]
    res.append(key*sum(grp))
print(res)
# [0.75, 1.98, 5.1]

Alternatively, you could use iteration_utilities.groupedby:

>>> from iteration_utilities import groupedby
>>> from operator import itemgetter, add

>>> {key: key*sum(value) for key, value in groupedby(zip(A, B), key=itemgetter(0), keep=itemgetter(1)).items()}
{1: 0.75, 2: 1.98, 3: 5.1}

Or pass a reduce function directly to groupedby:

>>> groupedby(zip(A, B), key=itemgetter(0), keep=lambda x: x[0]*x[1], reduce=add)
{1: 0.75, 2: 1.98, 3: 5.1}

Disclaimer: I am the author of the iteration_utilities library.
