numpy 3d array - selecting maximum elements along given axis

Question

I have a numpy array that has dimensions (x, y, z) = (5, 50, 4). For every (x, y) pair, I want to find the index of the maximum value along the z axis. This index is in range(4). I want to select all of these "maximum" elements and set them to 1. Then, I want to select all other elements and set them to zero.

To explain it another way, I want to look at all vectors in the "z" direction (there are x*y of these vectors total). I want to set the maximum element to 1 and all other elements to 0. For example, the vector (0.25, 0.1, 0.5, 0.15) will become (0, 0, 1, 0).

I've tried many different things. The argmax function seems like it should help. But how do I use it to select elements correctly? I have tried things like...

x = data
i = x.argmax(axis = 2)
x[i] # shape = (5, 50, 50, 4)
x[:,:,i] # shape = (5, 50, 5, 50)
x[np.unravel_index(i), x.shape] # shape = (5, 50)

The last one, which uses np.unravel_index, has the correct shape, but the selected indices are NOT the maximum values along the z axis. So I'm having some trouble. If anyone could help at all, it would be really awesome. Thanks!

Edit: Here is a way I have found to do this. But if anyone has anything that is faster, please let me know!

def fix_vector(a):
    i = a.argmax()
    a = a*0
    a[i] = 1
    return a

y = np.apply_along_axis(fix_vector, axis=2, arr=x)

I would really like to optimize this if possible, since I call this function MANY times.

Edit: Thanks DSM for a nice solution. Here is a small example dataset, as requested in the comments.

data = np.random.random((3,5,4))
desired_output = np.apply_along_axis(fix_vector, axis=2, arr=data)

This uses the fix_vector function I posted above, but DSM's solution is faster. Thanks again!

You should add a, small, example of the input array and the desired result after processing. — wwii, Aug 13 '15 at 3:40

DSM · Answer 1 · 2015-08-13 03:41:01Z

This isn't particularly elegant, but:

def faster(x):
    d = x.reshape(-1, x.shape[-1])
    d2 = np.zeros_like(d)
    d2[np.arange(len(d2)), d.argmax(1)] = 1
    d2 = d2.reshape(x.shape)
    return d2

seems a little faster than the fix_vector approach. For example:

>>> x,y,z = 5,50,4
>>> data = np.random.random((x,y,z))
>>> np.allclose(orig(data), faster(data))
True
>>> %timeit -n 1000 orig(data)
1000 loops, best of 3: 5.77 ms per loop
>>> %timeit -n 1000 faster(data)
1000 loops, best of 3: 36.6 µs per loop

wwii · Answer 2 · 2015-08-13 04:41:31Z

up vote 1 down vote

This can be done with numpy.where

import numpy as np
a = np.array([[[ 0.25,  0.10 ,  0.50 ,  0.15],
               [ 0.50,  0.60 ,  0.40 ,  0.30]],
              [[ 0.25,  0.50 ,  0.20 ,  0.70],
               [ 0.80,  0.10 ,  0.50 ,  0.15]]])

Find the max value along the last axis

b = a.max(-1)    #shape is (2,2)

Add an axis to b so that it will broadcast across a and create a boolean array.

condition = a == b[..., np.newaxis]

Use numpy.where for the substitution.

c = np.where(condition, 1, 0)

>>> c
array([[[0, 0, 1, 0],
        [0, 1, 0, 0]],

       [[0, 0, 0, 1],
        [1, 0, 0, 0]]])

def f(a):
    b = a.max(-1)
    condition = a == b[..., np.newaxis]
    return np.where(condition, 1, 0)

So I wasn't satisfied - I've seen cases where reshape is faster than adding a new axis so I played around a bit. Seems numpy.where itself is a tad slow. Using boolean indexing with an assignment gives pretty much the same performance as @DSM's.

def h(a):
    b = a.max(-1)
    condition = a == b[..., np.newaxis]
    a[condition] = 1
    a[np.logical_not(condition)] = 0
    return a

edited Aug 13 '15 at 4:41

answered Aug 13 '15 at 4:17

wwii

6,1312927

Comparable but a tad (50%) slower than @DSM's - looks like they both scale the same with the size of the array. – wwii Aug 13 '15 at 4:26

Note that conceivably this could result in multiple 1s in a row if the maximum isn't unique. But for all we know, this might be what the OP wants to happen in that case (the question is ambiguous on the point). – DSM Aug 13 '15 at 4:26

@DSM - yours chooses the lowest index for multiple maximums in a row? – wwii Aug 13 '15 at 4:47

thanks for the solution, wwii. this is exactly what i was looking for. just fyi, for my application, i never encounter two identical values, so both solutions should work. – adn bps Aug 13 '15 at 14:53

add a comment |

hpaulj · Answer 3 · 2015-08-13 16:52:44Z

Sample x, (2,3,4) in shape:

In [934]: x
Out[934]: 
array([[[ 5,  7,  4, 15],
        [ 9, 23, 14,  9],
        [17, 13,  2,  1]],

       [[ 2, 18,  3, 18],
        [10, 12, 23, 14],
        [17, 23, 19, 13]]])

The max values along the last axis - clear enough, a (2,3) array:

In [935]: x.max(2)
Out[935]: 
array([[15, 23, 17],
       [18, 23, 23]])

Cooresponding indexes along that axis

In [936]: I=x.argmax(2)

In [937]: I
Out[937]: 
array([[3, 1, 0],
       [1, 2, 1]], dtype=int32)

One way of indexing x with those indexes. At this point I'm constructing the 0 and 1 axis indexes manually. It needs to be automated

In [938]: x[np.arange(2)[:,None], np.arange(3)[None,:], I]
Out[938]: 
array([[15, 23, 17],
       [18, 23, 23]])

The same indexes can be used to set values in another array:

In [939]: y=np.zeros_like(x)   
In [940]: y[np.arange(2)[:,None],np.arange(3)[None,:],I]=x.max(2)
In [941]: y
Out[941]: 
array([[[ 0,  0,  0, 15],
        [ 0, 23,  0,  0],
        [17,  0,  0,  0]],

       [[ 0, 18,  0,  0],
        [ 0,  0, 23,  0],
        [ 0, 23,  0,  0]]])

The trick is to come up with a more streamlined way of generating the indexing tuple:

(np.arange(2)[:,None], np.arange(3)[None,:], I)

(array([[0],
    [1]]), 
 array([[0, 1, 2]]), 
 array([[3, 1, 0],
    [1, 2, 1]], dtype=int32))

np.ix_ helps, though concatenating 2 tuples is a little ugly:

J = np.ix_(range(x.shape[0]), range(x.shape[1])) + (I,)

Indexing on a flattened array can be faster:

In [998]: J1 = np.ravel_multi_index(J, x.shape)
In [999]: J1
Out[999]: 
array([[ 3,  5,  8],
     [13, 18, 21]], dtype=int32)

In [1000]: x.flat[J1]
Out[1000]: 
array([[15, 23, 17],
   [18, 23, 23]])

In [1001]: y.flat[J1]=2

In [1002]: y
Out[1002]: 
array([[[0, 0, 0, 2],
        [0, 2, 0, 0],
        [2, 0, 0, 0]],

       [[0, 2, 0, 0],
        [0, 0, 2, 0],
        [0, 2, 0, 0]]])

But I still need to construct the J indexing tuple.

This is similar to DSMs solution, but sticks with the 3 dimensions. Flattening 2 of the dimensions eliminates the need for the xi_ construction.

def max_index(x, n, fn=np.argmax):
    # return multidimensional indexing tuple
    # applying fn along axis n
    I = fn(x, axis=n)
    alist = list(np.ix_(*[range(i) for i in I.shape]))
    alist.insert(n, I)
    return tuple(alist)

is a general purpose function to generate the J like indexing tuple

e.g.

y = np.zeros_like(x)
I = max_index(x, 1, np.argmin)
y[I] = x[I]
I = max_index(x, 1, np.argmax)
y[I] = x[I]

inserts both the minimum and maximum values of x (along axis 1) into y. Speed is about half of DSMs faster.

thanks for the solution, hpaulj. i think DSM and wwii provided more elegant solutions, but it was interesting to see how this works too. — adn bps, Aug 13 '15 at 14:57

asked	1 year ago
viewed	451 times
active	1 year ago

numpy 3d array - selecting maximum elements along given axis

3 Answers 3

Your Answer

Not the answer you're looking for? Browse other questions tagged python arrays numpy multidimensional-array or ask your own question.

Hot Network Questions

numpy 3d array - selecting maximum elements along given axis

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged python arrays numpy multidimensional-array or ask your own question.

Related

Hot Network Questions