Take the 2-minute tour ×
Stack Overflow is a question and answer site for professional and enthusiast programmers. It's 100% free, no registration required.

I have been trying to implement some modification to speed up this pseudo code:

>>> A=np.array([1,1,1,2,2,2,3,3,3])
>>> B=np.array([np.power(A,n) for n in [3,4,5]])
>>> B
array([[  1,   1,   1,   8,   8,   8,  27,  27,  27],
       [  1,   1,   1,  16,  16,  16,  81,  81,  81],
       [  1,   1,   1,  32,  32,  32, 243, 243, 243]])

Where elements of A are often repeated 10-20 times and the shape of B needs to be retained because it is multiplied by another array of the same shape later.

My first idea was to use the following code:

uA=np.unique(A)
uB=np.array([np.power(uA,n) for n in [3,4,5]])
B=[]
for num in range(uB.shape[0]):
    Temp=np.copy(A)
    for k,v in zip(uA,uB[num]): Temp[A==k] = v
    B.append(Temp)
B=np.array(B)
### Also any better way to create the numpy array B?

This seems fairly terrible and there is likely a better way. Any idea on how to speed this up would be much appreciated.

Thank you for your time.

Here is an update. I realized that my function was poorly coded. A thank you to everyone for the suggestions. I will try to rephrase my questions better in the future so that they show everything required.

Normal='''
import numpy as np
import scipy
def func(value,n):
    if n==0: return 1
    else: return np.power(value,n)/scipy.factorial(n,exact=0)+func(value,n-1)
A=np.random.randint(10,size=250)
A=np.unique(A)
B=np.array([func(A,n) for n in [6,8,10]])
'''

Me='''
import numpy as np
import scipy
def func(value,n):
    if n==0: return 1
    else: return np.power(value,n)/scipy.factorial(n,exact=0)+func(value,n-1)
A=np.random.randint(10,size=250)
uA=np.unique(A)
uB=np.array([func(A,n) for n in [6,8,10]])
B=[]
for num in range(uB.shape[0]):
    Temp=np.copy(A)
    for k,v in zip(uA,uB[num]): Temp[A==k] = v
    B.append(Temp)
B=np.array(B)
'''


Alex='''
import numpy as np
import scipy
A=np.random.randint(10,size=250)
power=np.arange(11)
fact=scipy.factorial(np.arange(11),exact=0).reshape(-1,1)
power=np.power(A,np.arange(11).reshape(-1,1))
value=power/fact
six=np.sum(value[:6],axis=0)
eight=six+np.sum(value[6:8],axis=0)
ten=eight+np.sum(value[8:],axis=0)
B=np.vstack((six,eight,ten))
'''
Alex='''
import numpy as np
import scipy
A=np.random.randint(10,size=250)
power=np.arange(11)
fact=scipy.factorial(np.arange(11),exact=0).reshape(-1,1)
power=np.power(A,np.arange(11).reshape(-1,1))
value=power/fact
six=np.sum(value[:6],axis=0)
eight=six+np.sum(value[6:8],axis=0)
ten=eight+np.sum(value[8:],axis=0)
B=np.vstack((six,eight,ten))
'''

Alex2='''
import numpy as np
import scipy
def find_count(the_list):
    count = list(the_list).count
    result = [count(item) for item in set(the_list)]
    return result
A=np.random.randint(10,size=250)
A_unique=np.unique(A)
A_counts = np.array(find_count(A_unique))
fact=scipy.factorial(np.arange(11),exact=0).reshape(-1,1)
power=np.power(A_unique,np.arange(11).reshape(-1,1))
value=power/fact
six=np.sum(value[:6],axis=0)
eight=six+np.sum(value[6:8],axis=0)
ten=eight+np.sum(value[8:],axis=0)
B_nodup=np.vstack((six,eight,ten))
B_list = [ np.transpose( np.tile( B_nodup[:,i], (A_counts[i], 1) ) ) for i in range(A_unique.shape[0]) ]
B = np.hstack( B_list )
'''


print timeit.timeit(Normal, number=10000)
print timeit.timeit(Me, number=10000)
print timeit.timeit(Alex, number=10000)
print timeit.timeit(Alex2, number=10000)

Normal: 10.7544178963
Me:     23.2039361
Alex:    4.85648703575
Alex2:   4.18024992943
share|improve this question
    
Append is slow, so is a python list compared to ndarray. Don't you already know the size of B? Declare it before the loop and just insert values. –  arynaq Nov 27 '12 at 22:14
    
Can you expand on this some. Do you mean np.empty(shape) then fill in the values? –  Ophion Nov 27 '12 at 22:17
    
Yes. If you are taking the power of each element to 3 different powers (3,4,5) in example then you know you get lengh of A * 3 elements. –  arynaq Nov 27 '12 at 22:18
    
Updated the pseudo code so that it works. –  Ophion Nov 27 '12 at 22:35

3 Answers 3

up vote 1 down vote accepted

Use a combination of numpy.tile() and numpy.hstack(), as follows:

A = np.array([1,2,3])
A_counts = np.array([3,3,3])
A_powers = np.array([[3],[4],[5]])
B_nodup = np.power(A, A_powers)
B_list = [ np.transpose( np.tile( B_nodup[:,i], (A_counts[i], 1) ) ) for i in range(A.shape[0]) ]
B = np.hstack( B_list )

The transpose and stack may be reversed, this may be faster:

B_list = [ np.tile( B_nodup[:,i], (A_counts[i], 1) ) for i in range(A.shape[0]) ]
B = np.transpose( np.vstack( B_list ) )

This is likely only worth doing if the function you are calculating is quite expensive, or it is duplicated many, many times (more than 10); doing a tile and stack to prevent calculating the power function an extra 10 times is likely not worth it. Please benchmark and let us know.

EDIT: Or, you could just use broadcasting to get rid of the list comprehension:

>>> A=np.array([1,1,1,2,2,2,3,3,3])
>>> B = np.power(A,[[3],[4],[5]])
>>> B
array([[  1,   1,   1,   8,   8,   8,  27,  27,  27],
       [  1,   1,   1,  16,  16,  16,  81,  81,  81],
       [  1,   1,   1,  32,  32,  32, 243, 243, 243]])

This is probably pretty fast, but doesn't actually do what you asked.

share|improve this answer
    
Im not sure why people are so against list comprehension. In this scenario its about the same/slightly better then these other options. –  Ophion Nov 27 '12 at 22:53
2  
@Ophion: are you sure? Please compare with the broadcasting method above, not the one suggested by larsmans below (which requires an extra reshape and transpose). I suspect broadcasting wins by a lot for large arrays (try 1k by 1k or so). –  Alex I Nov 27 '12 at 23:25
    
@AlexI: your version effectively reshapes [3,4,5]. When I run it on a larger example, it's a tad slower than my version, though probably not significantly so. reshape and transpose are very cheap. –  larsmans Nov 27 '12 at 23:33
    
@Alex: Ah yes, when there are many rows the broadcasting method above is much faster. This typically only has about three rows so the differences are fairly negligible. Its certainly interesting and I will use it in the future. For the np.tile I will give it a go; however, the order is important and not always sorted. –  Ophion Nov 27 '12 at 23:37
    
@Ophion: the np.tile is used to make the desired number of copies of each column in the result (given by A_counts); it doesn't require anything to be sorted. –  Alex I Nov 28 '12 at 0:16

You can broadcast np.power across A if you change its shape to that of a column vector.

>>> np.power(A.reshape(-1,1), [3,4,5]).T
array([[  1,   1,   1,   8,   8,   8,  27,  27,  27],
       [  1,   1,   1,  16,  16,  16,  81,  81,  81],
       [  1,   1,   1,  32,  32,  32, 243, 243, 243]])
share|improve this answer
    
Any intuitive reason why (-1,1) is a coulm and (1,-1) is a row? Just trying to understand the logic. Or is it "just because"? –  arynaq Nov 27 '12 at 22:27
1  
@user948652: try printing them. Or, if you're looking for a reason why the first index is the row index: that's just a convention. –  larsmans Nov 27 '12 at 22:30
    
Ah should have googled first, thanks anyway :) newshape : int or tuple of ints The new shape should be compatible with the original shape. If an integer, then the result will be a 1-D array of that length. One shape dimension can be -1. In this case, the value is inferred from the length of the array and remaining dimensions. –  arynaq Nov 27 '12 at 22:33
    
This actually takes more time then list comprehension by about 5% according to timeit. –  Ophion Nov 27 '12 at 22:34
    
@Ophion: on my box, it's about 20% faster for your example data. When A.size == 21 (7 distinct values), it's about twice as fast. –  larsmans Nov 27 '12 at 23:30

My go at it with 200k iterations, the first method is mine.

import numpy as np
import time

N = 200000
start = time.time()
for j in range(N):

    x = np.array([1,1,1,2,2,2,3,3,3])
    powers = np.array([3,4,5])
    result = np.zeros((powers.size,x.size)).astype(np.int32)
    for i in range(powers.size):
        result[i,:] = x**powers[i]
print time.time()-start, "seconds"

start = time.time()
for j in range(N):
    A=np.array([1,1,1,2,2,2,3,3,3])
    B = np.power(A,[[3],[4],[5]])
print time.time()-start, "seconds"

start = time.time()
for j in range(N):
    np.power(A.reshape(-1,1), [3,4,5]).T
print time.time()-start, "seconds"

start = time.time()
for j in range(N):
    A=np.array([1,1,1,2,2,2,3,3,3])
    B=np.array([np.power(x,n) for n in [3,4,5]])
print time.time()-start, "seconds"

Produces

8.88000011444 seconds
9.25099992752 seconds
3.95399999619 seconds
7.43799996376 seconds

larsmans method is clearly fastest.

(ps how do you link to an answer or user here without explicit url @larsman doesnt work)

share|improve this answer
1  
You forgot to initialize A for every loop of larsmans answer. Updating for this, using timeit, and A=np.random.randint(20, size=500) initialize at the very start you will find they are very close together. Thanks for posting this. –  Ophion Nov 27 '12 at 23:26
    
That's @larsmans, not larsman :) –  larsmans Nov 27 '12 at 23:34

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.