I have two numpy arrays, looking like:

field = np.array([5,1,3,3,2,1,6])    
counts = np.array([100,210,300,150,20,90,170])

They are not sorted (and shouldn't change). I now want to compute a third array (of the same length and order) that contains, for each entry, the sum of all counts sharing the same field value. Here the result should be:

field_counts = np.array([100,300,450,450,20,300,170])

The arrays are very long, so iterating through them (and repeatedly searching for the matching field entries) is far too inefficient. Maybe I'm just not seeing the wood for the trees... I hope someone can help me out on this!

Aside: when you find yourself needing a groupby operation, that's often a sign you should be using pandas instead of numpy; your operation would be something like df.groupby("field")["counts"].transform(sum). – DSM Mar 26 '15 at 20:53
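For reference, a minimal sketch of the pandas approach DSM suggests, using the arrays from the question (the DataFrame construction is an assumption about how one would set this up; it is not part of the original comment):

```python
import numpy as np
import pandas as pd

field = np.array([5, 1, 3, 3, 2, 1, 6])
counts = np.array([100, 210, 300, 150, 20, 90, 170])

# transform broadcasts each group's sum back to the original row order
df = pd.DataFrame({"field": field, "counts": counts})
field_counts = df.groupby("field")["counts"].transform("sum").to_numpy()
# [100, 300, 450, 450, 20, 300, 170]
```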

I don't know if it will be efficient enough (since I do iterate over field), but here is a suggestion. I first build a dictionary mapping each field value to its total count; then I create the result array from it.

from collections import defaultdict

# accumulate the total count for each field value
dic = defaultdict(int)
for f, c in zip(field, counts):
    dic[f] += c

# map each original field entry back to its group total
field_counts = np.array([dic[f] for f in field])
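If the Python-level loop turns out to be too slow, a fully vectorized variant is possible with np.unique and np.bincount. This is a sketch, not part of the original answer:

```python
import numpy as np

field = np.array([5, 1, 3, 3, 2, 1, 6])
counts = np.array([100, 210, 300, 150, 20, 90, 170])

# inv[i] is the group index of field[i] among the sorted unique field values
_, inv = np.unique(field, return_inverse=True)

# sum the counts per group, then broadcast the sums back to the original order
sums = np.bincount(inv, weights=counts)
field_counts = sums[inv].astype(counts.dtype)
# [100, 300, 450, 450, 20, 300, 170]
```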

Use the following list comprehension:

>>> [np.sum(counts[np.where(field==i)]) for i in field]
[100, 300, 450, 450, 20, 300, 170]

You can get the indices of matching elements in field with np.where:

>>> [np.where(field==i) for i in field]
[(array([0]),), (array([1, 5]),), (array([2, 3]),), (array([2, 3]),), (array([4]),), (array([1, 5]),), (array([6]),)]

And then get the corresponding elements of counts with fancy indexing, and calculate the sum with np.sum.

This will be very slow if the arrays are long; you've made this an N^2 calculation. – DSM Mar 26 '15 at 20:39

This problem can be solved in a fully vectorized manner using the numpy_indexed package (disclaimer: I am its author):

import numpy_indexed as npi
g = npi.group_by(field)
field_counts = g.sum(counts)[1][g.inverse]

g.sum computes the sum for each group of unique field values, and g.inverse maps those group sums back onto the original array.

There is a reason I went through the hassle to package this functionality, since there are indeed many questions of this type. In my perception, all these questions stand to benefit from my answers, as does this one; it substantially improves upon the currently accepted answer in several respects. It is my understanding that the sections you refer to are directed at commercial purposes, whereas this is a free-as-in-beer open-source package; but correct me if I'm wrong. My only selfish motive here is getting it better tested :). – Eelco Hoogendoorn Apr 2 '16 at 18:38

Subjectively, it feels more like self-promotion to me if I do mention my authorship; but thank you for the heads-up. Do you happen to have a link to any resources that are a bit more explicit about the distinction between commercial and non-commercial purposes? – Eelco Hoogendoorn Apr 2 '16 at 18:46

Some of them are duplicates, I would say, yes. I will follow your suggestion to disclose authorship then, thanks. – Eelco Hoogendoorn Apr 2 '16 at 18:56

I do appreciate the feedback. – Eelco Hoogendoorn Apr 2 '16 at 19:02

Awesome @EelcoHoogendoorn, I see you added disclosure :). Please do the same for your other answers as well. As a side note, if some of them are duplicates, feel free to flag them as such! I will delete my previous comments to clean up. – Tunaki Apr 2 '16 at 19:05
