Ignoring -Inf values in arrays using numpy/scipy in Python

Question

I have an NxM array in numpy that I would like to take the log of, ignoring entries that were not positive prior to taking the log. The log of a zero entry is -Inf (and the log of a negative entry is NaN), so I will have a matrix with some -Inf values as a result. I then want to sum over the columns of this matrix while ignoring the -Inf values. How can I do this?

For example,

mylogarray = numpy.log(myarray)
# take sum, but ignore -Inf?
numpy.sum(mylogarray, 0)

I know there's nansum and I need the equivalent, something like infsum.

Thanks.

Philipp · Answer 1 · 2010-12-19 23:53:19Z

up vote 2 down vote accepted

Use masked arrays:

>>> a = numpy.array([2, 0, 1.5, -3])
>>> b = numpy.ma.log(a)
>>> b
masked_array(data = [0.69314718056 -- 0.405465108108 --],
             mask = [False  True False  True],
       fill_value = 1e+20)

>>> b.sum()
1.0986122886681096

can you please expand on this? I don't understand the example. How did you initialize the masked array above? – user248237 Dec 20 '10 at 0:35

@user248237 - The numpy.ma.log, etc., functions will automatically create a masked array where anything that results in an inf or nan is masked. This is a bit less explicit, though, so you can instead do this: a = np.ma.masked_where(a == np.inf, a), and then just do b = np.log(a) (or any other function). Alternatively, you can avoid masked arrays and just do np.log(a[a != np.inf]).sum(). (You can index by boolean arrays; it's much cleaner and faster than the filter-based answers.) – Joe Kington Dec 20 '10 at 3:19
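A minimal sketch of the boolean-indexing idea from the comment above, applied to the log values themselves so that the -inf entries are the ones dropped. The array contents and variable names are made up for illustration.

```python
import numpy as np

a = np.array([2.0, 0.0, 1.5, 4.0])

# log(0) is -inf; silence the divide-by-zero warning for this call only
with np.errstate(divide="ignore"):
    logs = np.log(a)

# keep only the entries that are not -inf, then sum them
total = logs[logs != -np.inf].sum()

# on a 2-D array the same indexing returns a flat 1-D array,
# so it yields an overall sum rather than per-column sums
m = np.array([[1.0, 0.0], [np.e, 1.0]])
with np.errstate(divide="ignore"):
    logs_2d = np.log(m)
kept = logs_2d[logs_2d != -np.inf]
```

Because boolean indexing flattens a 2-D array, this form suits a total over all entries; per-column sums need a mask that preserves the array's shape.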

@user248237 I didn't initialize the masked array explicitly. a is just a normal, non-masked array. ma.log masks all values where the (real) logarithm is undefined. Then the resulting masked array b is treated roughly as if the masked entries weren't there. – Philipp Dec 22 '10 at 1:10
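To connect this to the NxM case in the question, here is a small sketch that sums down the columns of a masked log array. The 3x2 input is hypothetical.

```python
import numpy as np

# hypothetical 3x2 input; zeros and negatives have no finite real log
myarray = np.array([[1.0, 0.0],
                    [np.e, -3.0],
                    [1.0, np.e]])

# np.ma.log masks the entries that fall outside the domain of log
mylogarray = np.ma.log(myarray)

# masked entries are skipped when summing down each column
col_sums = mylogarray.sum(axis=0)
```

As far as I know, a column in which every entry is masked comes back masked in the result rather than as 0, so that case may need separate handling.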

Sven Marnach · Answer 2 · 2010-12-23 23:41:32Z

up vote 2 down vote

The easiest way to do this is to use numpy.ma.masked_invalid():

a = numpy.log(numpy.arange(15))
a.sum()
# -inf
numpy.ma.masked_invalid(a).sum()
# 25.19122118273868
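One thing worth knowing is that masked_invalid masks NaN and +inf as well as -inf. A short sketch with made-up values:

```python
import numpy as np

values = np.array([-np.inf, np.nan, np.inf, 1.0, 2.0])

# NaN, +inf and -inf are all masked; only the finite entries remain
masked = np.ma.masked_invalid(values)

total = masked.sum()      # sum over the unmasked entries
n_kept = masked.count()   # number of unmasked entries
```

If only -inf should be ignored and NaN or +inf should still propagate, an explicit mask such as numpy.ma.masked_where(values == -numpy.inf, values) is probably the safer choice.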

marcog · Answer 3 · 2010-12-19 23:45:57Z

up vote 1 down vote

Use a filter():

>>> array
array([  1.,   2.,   3., -Inf])
>>> sum(filter(lambda x: x != float('-inf'), array))
6.0

Is this considered a vectorized operation? Is there a more efficient way? I need to do this many times in my code and wanted a vectorized approach – user248237 Dec 19 '10 at 23:47

Are you asking if this is done in-place with iterators? No. Is there a more efficient way? AFAIK, you'd have to loop through the array as there's no filter function that returns an iterator, unless you write one. – marcog Dec 19 '10 at 23:55

I don't think the filter code works for NxM arrays. It seems to only work for 1xM vectors. – user248237 Dec 20 '10 at 0:33

The "numpythonic" way to do filter(lambda x: x != float('-inf'), array) is just array[array != -np.inf]. Using list comprehensions, filter, etc., is much slower on numpy arrays than it is on lists. Because of that, numpy arrays have a number of facilities to avoid explicitly looping through and operating on each element. – Joe Kington Dec 20 '10 at 3:49
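For the vectorized, per-column sum on an NxM array asked about earlier in these comments, one possible sketch without masked arrays is below. The input is invented, and the second variant assumes a NumPy version in which np.sum accepts the where argument.

```python
import numpy as np

myarray = np.array([[1.0, 0.0],
                    [np.e, 1.0],
                    [0.0, np.e]])

# log(0) is -inf; silence the divide-by-zero warning for this call only
with np.errstate(divide="ignore"):
    mylogarray = np.log(myarray)

# replace -inf with 0 so those entries contribute nothing to the sum
col_sums = np.where(np.isneginf(mylogarray), 0.0, mylogarray).sum(axis=0)

# same result using the `where` argument of np.sum (newer NumPy only)
col_sums_alt = np.sum(mylogarray, axis=0, where=~np.isneginf(mylogarray))
```

Both variants keep the array's shape, so no Python-level loop over elements is involved.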

Rainyboy1987 · Answer 4 · 2010-12-20 03:06:13Z

Maybe you can index your matrix and use:

import numpy as np
matrix = np.array([[1., 2., 3., np.inf], [4., 5., 6., np.inf], [7., 8., 9., np.inf]])
# sum one column, skipping inf entries
print(matrix[:, 1])
print(sum(filter(lambda x: x != np.inf, matrix[:, 1])))
# sum one row, skipping inf entries
print(matrix[1, :])
print(sum(filter(lambda x: x != np.inf, matrix[1, :])))
