I have a 2-dimensional array of integers, we'll call it "A".

I want to create a 3-dimensional array "B" of all 1s and 0s such that:

  • for any fixed (i,j), sum(B[i,j,:])==A[i,j]; that is, B[i,j,:] contains exactly A[i,j] 1s
  • the 1s are randomly placed in the 3rd dimension.

I know how to do this with standard Python indexing, but that turns out to be very slow.

I am looking for a way to do this that takes advantage of the features that make NumPy fast.

Here is how I would do it using standard indexing:

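import numpy as np

# A has shape (X, Y); Z must be at least A.max()
B = np.zeros((X, Y, Z))
indexoptions = range(Z)

for i in range(X):
    for j in range(Y):
        # pick A[i,j] distinct positions along the third axis and set them to 1
        replacedindices = np.random.choice(indexoptions, size=A[i,j], replace=False)
        B[i,j,replacedindices] = 1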

Can someone please explain how I can do this in a faster way?

Edit: Here is an example "A":

A=np.array([[0,1,2,3,4],[0,1,2,3,4],[0,1,2,3,4],[0,1,2,3,4],[0,1,2,3,4]])

in this case X=Y=5 and Z>=5

Trying to make progress on this, I asked a simpler question: stackoverflow.com/questions/26310897/… - but then I realized that my planned np.random.shuffle(np.rollaxis(B, 2)) doesn't shuffle all the rows independently, so this is not quite an answer yet. Building blocks, maybe. :) – John Zwinck Oct 11 '14 at 4:22

1 Answer

Accepted answer (score 3)

Essentially the same idea as @JohnZwinck and @DSM, but with a shuffle function for shuffling a given axis:

import numpy as np

def shuffle(a, axis=-1):
    """
    Shuffle `a` in-place along the given axis.

    Apply numpy.random.shuffle to the given axis of `a`.
    Each one-dimensional slice is shuffled independently.
    """
    b = a.swapaxes(axis,-1)
    # Shuffle `b` in-place along the last axis.  `b` is a view of `a`,
    # so `a` is shuffled in place, too.
    shp = b.shape[:-1]
    for ndx in np.ndindex(shp):
        np.random.shuffle(b[ndx])
    return


def random_bits(a, n):
    b = (a[..., np.newaxis] > np.arange(n)).astype(int)
    shuffle(b)
    return b


if __name__ == "__main__":
    np.random.seed(12345)

    A = np.random.randint(0, 5, size=(3,4))
    Z = 6

    B = random_bits(A, Z)

    # print(...) with a single argument works on both Python 2 and 3
    print("A:")
    print(A)
    print("B:")
    print(B)

Output:

A:
[[2 1 4 1]
 [2 1 1 3]
 [1 3 0 2]]
B:
[[[1 0 0 0 0 1]
  [0 1 0 0 0 0]
  [0 1 1 1 1 0]
  [0 0 0 1 0 0]]

 [[0 1 0 1 0 0]
  [0 0 0 1 0 0]
  [0 0 1 0 0 0]
  [1 0 1 0 1 0]]

 [[0 0 0 0 0 1]
  [0 0 1 1 1 0]
  [0 0 0 0 0 0]
  [0 0 1 0 1 0]]]
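
To see why `random_bits` ends up with exactly `A[i,j]` ones in each slice, it may help to look at the broadcasting step on its own, before any shuffling. The following is only a small sketch; the array `A_demo` and the name `unshuffled` are made up for illustration and are not part of the answer above.

```python
import numpy as np

A_demo = np.array([[0, 1], [2, 3]])   # hypothetical example input
n = 4                                  # length of the third axis

# A_demo[..., np.newaxis] has shape (2, 2, 1) and np.arange(n) has shape (4,).
# Broadcasting compares every A_demo[i, j] against 0, 1, ..., n-1, so
# position k holds a 1 exactly when k < A_demo[i, j].
unshuffled = (A_demo[..., np.newaxis] > np.arange(n)).astype(int)

print(unshuffled[1, 0])   # [1 1 0 0]

# Shuffling along the last axis only permutes each slice, so these sums
# are unchanged by the shuffle step.
assert (unshuffled.sum(axis=-1) == A_demo).all()
```

The ones always start out packed at the front of each slice; the per-slice shuffle is what makes their positions random.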
    
Hmmph. I'm annoyed that shuffle doesn't work like I thought it did. Could the Python-level loop be avoided by reshaping to a lower-D object and shuffling that? – DSM Oct 11 '14 at 4:54

@DSM: I share your annoyance! I couldn't find a way to make this work with a single call to np.random.shuffle. (My first version of shuffle, not shown here, is a vectorized Fisher-Yates algorithm, but it is not as clear as this one, and probably a lot slower than this one when the non-axis dimensions are small.) – Warren Weckesser Oct 11 '14 at 5:01
    
Thanks! For large arrays this method is more than 100 times faster than the way I was originally doing it. – user3927843 Oct 11 '14 at 8:53
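
On the question raised in the comments of whether the Python-level loop can be avoided: one possible approach, which is not the one used in the answer, is to draw an independent random key for every element and argsort the keys along the last axis, which yields an independent permutation per slice. The sketch below assumes a NumPy version that provides `np.random.default_rng` and `np.take_along_axis` (both newer than this thread); the function name `random_bits_noloop` and its `rng` parameter are hypothetical.

```python
import numpy as np

def random_bits_noloop(a, n, rng=None):
    """Hypothetical loop-free variant of random_bits (not from the answer)."""
    rng = np.random.default_rng() if rng is None else rng
    a = np.asarray(a)
    # Unshuffled bits: a[i, j] ones followed by zeros along the last axis.
    bits = (a[..., np.newaxis] > np.arange(n)).astype(int)
    # Argsorting independent uniform keys gives an independent random
    # permutation of 0..n-1 for every slice along the last axis.
    order = rng.random(bits.shape).argsort(axis=-1)
    # Reorder each slice of `bits` by its own permutation.
    return np.take_along_axis(bits, order, axis=-1)

# Usage with the example A from the question and Z = 6.
A = np.array([[0, 1, 2, 3, 4]] * 5)
B = random_bits_noloop(A, 6)
assert (B.sum(axis=-1) == A).all()
```

Whether this is actually faster than the looped `shuffle` depends on the array shape: it removes the Python loop but does a sort of each slice rather than a linear-time shuffle.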
