I have an NxM array in numpy that I would like to take the log of, and ignore entries that were negative prior to taking the log. When I take the log of negative entries, it returns -Inf, so I will have a matrix with some -Inf values as a result. I then want to sum over the columns of this matrix, but ignoring the -Inf values -- how can I do this?

For example,

mylogarray = numpy.log(myarray)
# take sum over columns, but ignore -Inf?
numpy.sum(mylogarray, axis=0)

I know there's nansum and I need the equivalent, something like infsum.

Thanks.


4 Answers

up vote 2 down vote accepted

Use masked arrays:

>>> a = numpy.array([2, 0, 1.5, -3])
>>> b = numpy.ma.log(a)
>>> b
masked_array(data = [0.69314718056 -- 0.405465108108 --],
             mask = [False  True False  True],
       fill_value = 1e+20)

>>> b.sum()
1.0986122886681096
Can you please expand on this? I don't understand the example. How did you initialize the masked array above? – user248237 Dec 20 '10 at 0:35
@user248237 - The numpy.ma.log, etc., functions automatically create a masked array in which anything that would result in an inf or nan is masked. This is a bit less explicit, though, so you can instead mask the non-positive inputs yourself with a = np.ma.masked_where(a <= 0, a), and then just do b = np.log(a) (or any other function). Alternatively, you can avoid masked arrays and just do np.log(a[a > 0]).sum(). (You can index by boolean arrays; it's much cleaner and faster than the filter-based answers.) – Joe Kington Dec 20 '10 at 3:19
@user248237 I didn't initialize the masked array explicitly. a is just a normal, non-masked array. ma.log masks all values where the (real) logarithm is undefined. Then the resulting masked array b is treated roughly as if the masked entries weren't there. – Philipp Dec 22 '10 at 1:10
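
For the NxM case in the question, the masked-array approach might look roughly like the sketch below. The input values are made up for illustration, and filled(0.0) is just one possible way to turn the masked result back into a plain array.

```python
import numpy as np

# Hypothetical 2x3 input containing a zero and a negative entry.
myarray = np.array([[1.0, 0.0, np.e],
                    [np.e, -3.0, np.e]])

# np.ma.log masks every entry whose real logarithm is undefined (x <= 0).
mylogarray = np.ma.log(myarray)

# Sum down each column; masked entries are skipped.
col_sums = mylogarray.sum(axis=0)

# A column whose entries are all masked stays masked in the result;
# filled(0.0) replaces it with 0 to get an ordinary ndarray.
plain_sums = col_sums.filled(0.0)
print(col_sums)
print(plain_sums)
```

Here the middle column contains only non-positive values, so its sum comes back masked rather than as a number.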

The easiest way to do this is to use numpy.ma.masked_invalid():

a = numpy.log(numpy.arange(15))
a.sum()
# -inf
numpy.ma.masked_invalid(a).sum()
# 25.19122118273868
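
Note that numpy.log gives -inf for zero but nan for negative inputs, and masked_invalid masks both. A small sketch of the column-wise sum, using made-up values and np.errstate only to silence the warnings:

```python
import numpy as np

# Hypothetical 3x2 input with one zero and one negative entry.
myarray = np.array([[1.0, 0.0],
                    [-2.0, 1.0],
                    [np.e, np.e]])

# Suppress the divide-by-zero and invalid-value warnings from np.log.
with np.errstate(divide="ignore", invalid="ignore"):
    mylogarray = np.log(myarray)  # log(0) -> -inf, log(-2) -> nan

# masked_invalid masks nan as well as +/-inf, so both are skipped.
col_sums = np.ma.masked_invalid(mylogarray).sum(axis=0)
print(col_sums)
```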

Use a filter():

>>> array
array([  1.,   2.,   3., -Inf])
>>> sum(filter(lambda x: x != float('-inf'), array))
6.0
Is this considered a vectorized operation? Is there a more efficient way? I need to do this many times in my code and wanted a vectorized approach – user248237 Dec 19 '10 at 23:47
Are you asking if this is done in-place with iterators? No. Is there a more efficient way? AFAIK, you'd have to loop through the array as there's no filter function that returns an iterator, unless you write one. – marcog Dec 19 '10 at 23:55
I don't think the filter code works for NxM arrays. It seems to only work for 1xM vectors. – user248237 Dec 20 '10 at 0:33
The "numpythonic" way to do filter(lambda x: x != float('-inf'), array) is just array[array != -np.inf]. Using list comprehensions, filter, etc., is much slower on numpy arrays than it is on lists. Because of that, numpy arrays have a number of facilities to avoid explicitly looping through and operating on each element. – Joe Kington Dec 20 '10 at 3:49
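
As a rough sketch of the vectorized alternative on an NxM array (values made up for illustration): boolean indexing flattens the result, so it only gives a grand total, whereas replacing -inf with 0 via np.where keeps the shape and allows a per-column sum. The np.where step is one possible approach, not something taken from the answers above.

```python
import numpy as np

# Hypothetical 2x3 array of logs that already contains -inf entries.
mylogarray = np.array([[1.0, -np.inf, 3.0],
                       [4.0, 5.0, -np.inf]])

# Boolean indexing returns a flat 1-D array: fine for a total,
# but the column structure is lost.
total = mylogarray[mylogarray != -np.inf].sum()

# Replacing -inf with 0 keeps the NxM shape, so axis=0 still works.
col_sums = np.where(np.isneginf(mylogarray), 0.0, mylogarray).sum(axis=0)
print(total)
print(col_sums)
```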

Maybe you can index your matrix and use:

import numpy as np

matrix = np.array([[1., 2., 3., np.inf], [4., 5., 6., np.inf], [7., 8., 9., np.inf]])
# second column, then its sum with inf entries filtered out
print(matrix[:, 1])
print(sum(filter(lambda x: x != np.inf, matrix[:, 1])))
# second row, then its sum with inf entries filtered out
print(matrix[1, :])
print(sum(filter(lambda x: x != np.inf, matrix[1, :])))