Fast replace in numpy array

Question

I have been trying to implement some modification to speed up this pseudo code:

>>> A=np.array([1,1,1,2,2,2,3,3,3])
>>> B=np.array([np.power(A,n) for n in [3,4,5]])
>>> B
array([[  1,   1,   1,   8,   8,   8,  27,  27,  27],
       [  1,   1,   1,  16,  16,  16,  81,  81,  81],
       [  1,   1,   1,  32,  32,  32, 243, 243, 243]])

Where elements of A are often repeated 10-20 times and the shape of B needs to be retained because it is multiplied by another array of the same shape later.

My first idea was to use the following code:

uA=np.unique(A)
uB=np.array([np.power(uA,n) for n in [3,4,5]])
B=[]
for num in range(uB.shape[0]):
    Temp=np.copy(A)
    for k,v in zip(uA,uB[num]): Temp[A==k] = v
    B.append(Temp)
B=np.array(B)
### Also any better way to create the numpy array B?

This seems fairly terrible and there is likely a better way. Any idea on how to speed this up would be much appreciated.

Thank you for your time.

Here is an update. I realized that my function was poorly coded. A thank you to everyone for the suggestions. I will try to rephrase my questions better in the future so that they show everything required.

Normal='''
import numpy as np
import scipy
def func(value,n):
    if n==0: return 1
    else: return np.power(value,n)/scipy.factorial(n,exact=0)+func(value,n-1)
A=np.random.randint(10,size=250)
A=np.unique(A)
B=np.array([func(A,n) for n in [6,8,10]])
'''

Me='''
import numpy as np
import scipy
def func(value,n):
    if n==0: return 1
    else: return np.power(value,n)/scipy.factorial(n,exact=0)+func(value,n-1)
A=np.random.randint(10,size=250)
uA=np.unique(A)
uB=np.array([func(A,n) for n in [6,8,10]])
B=[]
for num in range(uB.shape[0]):
    Temp=np.copy(A)
    for k,v in zip(uA,uB[num]): Temp[A==k] = v
    B.append(Temp)
B=np.array(B)
'''


Alex='''
import numpy as np
import scipy
A=np.random.randint(10,size=250)
power=np.arange(11)
fact=scipy.factorial(np.arange(11),exact=0).reshape(-1,1)
power=np.power(A,np.arange(11).reshape(-1,1))
value=power/fact
six=np.sum(value[:6],axis=0)
eight=six+np.sum(value[6:8],axis=0)
ten=eight+np.sum(value[8:],axis=0)
B=np.vstack((six,eight,ten))
'''
Alex='''
import numpy as np
import scipy
A=np.random.randint(10,size=250)
power=np.arange(11)
fact=scipy.factorial(np.arange(11),exact=0).reshape(-1,1)
power=np.power(A,np.arange(11).reshape(-1,1))
value=power/fact
six=np.sum(value[:6],axis=0)
eight=six+np.sum(value[6:8],axis=0)
ten=eight+np.sum(value[8:],axis=0)
B=np.vstack((six,eight,ten))
'''

Alex2='''
import numpy as np
import scipy
def find_count(the_list):
    count = list(the_list).count
    result = [count(item) for item in set(the_list)]
    return result
A=np.random.randint(10,size=250)
A_unique=np.unique(A)
A_counts = np.array(find_count(A_unique))
fact=scipy.factorial(np.arange(11),exact=0).reshape(-1,1)
power=np.power(A_unique,np.arange(11).reshape(-1,1))
value=power/fact
six=np.sum(value[:6],axis=0)
eight=six+np.sum(value[6:8],axis=0)
ten=eight+np.sum(value[8:],axis=0)
B_nodup=np.vstack((six,eight,ten))
B_list = [ np.transpose( np.tile( B_nodup[:,i], (A_counts[i], 1) ) ) for i in range(A_unique.shape[0]) ]
B = np.hstack( B_list )
'''


print timeit.timeit(Normal, number=10000)
print timeit.timeit(Me, number=10000)
print timeit.timeit(Alex, number=10000)
print timeit.timeit(Alex2, number=10000)

Normal: 10.7544178963
Me:     23.2039361
Alex:    4.85648703575
Alex2:   4.18024992943

Append is slow, so is a python list compared to ndarray. Don't you already know the size of B? Declare it before the loop and just insert values.
Can you expand on this some. Do you mean np.empty(shape) then fill in the values?
Yes. If you are taking the power of each element to 3 different powers (3,4,5) in example then you know you get lengh of A * 3 elements.

Alex I · Accepted Answer · 2012-11-27 23:12:23Z

Use a combination of numpy.tile() and numpy.hstack(), as follows:

A = np.array([1,2,3])
A_counts = np.array([3,3,3])
A_powers = np.array([[3],[4],[5]])
B_nodup = np.power(A, A_powers)
B_list = [ np.transpose( np.tile( B_nodup[:,i], (A_counts[i], 1) ) ) for i in range(A.shape[0]) ]
B = np.hstack( B_list )

The transpose and stack may be reversed, this may be faster:

B_list = [ np.tile( B_nodup[:,i], (A_counts[i], 1) ) for i in range(A.shape[0]) ]
B = np.transpose( np.vstack( B_list ) )

This is likely only worth doing if the function you are calculating is quite expensive, or it is duplicated many, many times (more than 10); doing a tile and stack to prevent calculating the power function an extra 10 times is likely not worth it. Please benchmark and let us know.

EDIT: Or, you could just use broadcasting to get rid of the list comprehension:

>>> A=np.array([1,1,1,2,2,2,3,3,3])
>>> B = np.power(A,[[3],[4],[5]])
>>> B
array([[  1,   1,   1,   8,   8,   8,  27,  27,  27],
       [  1,   1,   1,  16,  16,  16,  81,  81,  81],
       [  1,   1,   1,  32,  32,  32, 243, 243, 243]])

This is probably pretty fast, but doesn't actually do what you asked.

Im not sure why people are so against list comprehension. In this scenario its about the same/slightly better then these other options.
@Ophion: are you sure? Please compare with the broadcasting method above, not the one suggested by larsmans below (which requires an extra reshape and transpose). I suspect broadcasting wins by a lot for large arrays (try 1k by 1k or so).
@AlexI: your version effectively reshapes [3,4,5]. When I run it on a larger example, it's a tad slower than my version, though probably not significantly so. reshape and transpose are very cheap.
@Alex: Ah yes, when there are many rows the broadcasting method above is much faster. This typically only has about three rows so the differences are fairly negligible. Its certainly interesting and I will use it in the future. For the np.tile I will give it a go; however, the order is important and not always sorted.
@Ophion: the np.tile is used to make the desired number of copies of each column in the result (given by A_counts); it doesn't require anything to be sorted.

user948652 · Answer 2 · 2012-11-27 23:09:21Z

My go at it with 200k iterations, the first method is mine.

import numpy as np
import time

N = 200000
start = time.time()
for j in range(N):

    x = np.array([1,1,1,2,2,2,3,3,3])
    powers = np.array([3,4,5])
    result = np.zeros((powers.size,x.size)).astype(np.int32)
    for i in range(powers.size):
        result[i,:] = x**powers[i]
print time.time()-start, "seconds"

start = time.time()
for j in range(N):
    A=np.array([1,1,1,2,2,2,3,3,3])
    B = np.power(A,[[3],[4],[5]])
print time.time()-start, "seconds"

start = time.time()
for j in range(N):
    np.power(A.reshape(-1,1), [3,4,5]).T
print time.time()-start, "seconds"

start = time.time()
for j in range(N):
    A=np.array([1,1,1,2,2,2,3,3,3])
    B=np.array([np.power(x,n) for n in [3,4,5]])
print time.time()-start, "seconds"

Produces

8.88000011444 seconds
9.25099992752 seconds
3.95399999619 seconds
7.43799996376 seconds

larsmans method is clearly fastest.

(ps how do you link to an answer or user here without explicit url @larsman doesnt work)

You forgot to initialize A for every loop of larsmans answer. Updating for this, using timeit, and A=np.random.randint(20, size=500) initialize at the very start you will find they are very close together. Thanks for posting this.

larsmans · Answer 3 · 2012-11-27 22:21:27Z

up vote 3 down vote

You can broadcast np.power across A if you change its shape to that of a column vector.

>>> np.power(A.reshape(-1,1), [3,4,5]).T
array([[  1,   1,   1,   8,   8,   8,  27,  27,  27],
       [  1,   1,   1,  16,  16,  16,  81,  81,  81],
       [  1,   1,   1,  32,  32,  32, 243, 243, 243]])

answered 21 hours ago

larsmans
101k894204

Any intuitive reason why (-1,1) is a coulm and (1,-1) is a row? Just trying to understand the logic. Or is it "just because"? – user948652 21 hours ago

1

@user948652: try printing them. Or, if you're looking for a reason why the first index is the row index: that's just a convention. – larsmans 21 hours ago

Ah should have googled first, thanks anyway :) newshape : int or tuple of ints The new shape should be compatible with the original shape. If an integer, then the result will be a 1-D array of that length. One shape dimension can be -1. In this case, the value is inferred from the length of the array and remaining dimensions. – user948652 21 hours ago

This actually takes more time then list comprehension by about 5% according to timeit. – Ophion 21 hours ago

@Ophion: on my box, it's about 20% faster for your example data. When A.size == 21 (7 distinct values), it's about twice as fast. – larsmans 20 hours ago

feedback

asked	today
viewed	58 times
active	today

Fast replace in numpy array

3 Answers

Your Answer

Not the answer you're looking for? Browse other questions tagged python arrays performance numpy replace or ask your own question.

Hello World!

Community Bulletin

Fast replace in numpy array

3 Answers

Your Answer

Not the answer you're looking for? Browse other questions tagged python arrays performance numpy replace or ask your own question.

Hello World!

Community Bulletin

Related