Rolling loop with an array of data points

Question

I have a NumPy array of about 2500 data points. This function is called on a rolling basis, with 363 data points passed at a time.

def fcn(data):
    a = [data[i]/np.mean(data[i-2:i+1])-1 for i in range(len(data)-1, len(data)-362, -1)]
    return a

This takes about 5 seconds to run. I think the bottleneck is the list slicing. Any thoughts on how to speed this up?

Mike Samuel · Accepted Answer · 2011-08-02 23:52:57Z

In Python 2, range returns a list. You could use xrange instead. (In Python 3, range is already lazy and xrange no longer exists.)

range([start,] stop[, step]) -> list of integers
Return a list containing an arithmetic progression of integers.

vs

xrange([start,] stop[, step]) -> xrange object
Like range(), but instead of returning a list, returns an object that generates the numbers in the range on demand. For looping, this is slightly faster than range() and more memory efficient.

The other thing that strikes me is the slice in the argument to np.mean. The slice is always of length 3. Assuming this is an arithmetic mean, you could turn the division into

(3.0 * data[i] / (data[i - 2] + data[i - 1] + data[i]))

So putting it together

def fcn(data):
    return [(3.0 * data[i] / (data[i - 2] + data[i - 1] + data[i])) - 1
            for i in xrange(len(data) - 1, len(data) - 362, -1)]

and you could further optimize the sum of last three values by recognizing that if

x = a[n] + a[n+1] + a[n+2]

and you have already computed

y = a[n - 1] + a[n] + a[n + 1]

then

x = y + (a[n + 2] - a[n - 1])

which helps whenever a local variable access and assignment is faster than accessing an element in a series.
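As a rough sketch of the running-sum idea, here is one way it might look in Python 3. The names fcn_reference, fcn_running_sum and the count parameter are made up for illustration and are not part of the original question. Because the loop in the question walks backward from the end of the array, each update drops the newest element of the window and adds the next older one. The sketch assumes the input has at least count + 2 elements.

```python
import numpy as np


def fcn_reference(data, count=361):
    """The loop from the question, kept only for comparison."""
    n = len(data)
    return [data[i] / np.mean(data[i - 2:i + 1]) - 1
            for i in range(n - 1, n - 1 - count, -1)]


def fcn_running_sum(data, count=361):
    """Same values, but the 3-element sum is updated instead of recomputed.

    Assumes len(data) >= count + 2 (363 points for the default count).
    """
    n = len(data)
    i = n - 1
    # Sum of the last three values, computed once.
    window_sum = data[i - 2] + data[i - 1] + data[i]
    out = []
    for i in range(n - 1, n - 1 - count, -1):
        out.append(3.0 * data[i] / window_sum - 1)
        if i >= 3:
            # Slide the window one step toward the start of the array:
            # data[i] leaves the window and data[i - 3] enters it.
            window_sum += data[i - 3] - data[i]
    return out
```

One caveat: a running sum accumulates floating-point rounding error over many updates, so the results may differ from the np.mean version in the last few digits. Whether it is actually faster than the three-term sum depends on the interpreter, so it is worth timing both.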

This is great feedback. Your version of fcn runs in half the time as using np.mean! — strimp099, Aug 3 '11 at 0:46

Winston Ewert · Answer 2 · 2011-08-03 00:09:22Z


When using numpy, you should avoid writing loops. Instead you should do operations on the array.

Slicing in numpy is really cheap, because it returns a view of the array instead of copying the data.

The tricky part in eliminating the loop is the rolling np.mean(), but see this web page for code to help eliminate that: http://www.rigtorp.se/2011/01/01/rolling-statistics-numpy.html
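For a window as small as three, one possible loop-free version uses np.convolve to get the rolling sums. This is not the stride-based approach from the linked page, just a simpler substitute that gives the same rolling mean. The function name fcn_vectorized and the count and window parameters are illustrative assumptions, and the sketch assumes the input has at least count + window - 1 elements.

```python
import numpy as np


def fcn_vectorized(data, count=361, window=3):
    """Ratio of each value to its trailing rolling mean, minus 1, newest first.

    Assumes len(data) >= count + window - 1 (363 points for the defaults).
    """
    data = np.asarray(data, dtype=float)
    # Only the last count + window - 1 points contribute to the output.
    tail = data[-(count + window - 1):]
    # Rolling sums over each full window; mode="valid" yields `count` values.
    sums = np.convolve(tail, np.ones(window), mode="valid")
    # Each window is paired with its last (most recent) element.
    ratios = tail[window - 1:] / (sums / window) - 1
    # The loop in the question runs from the end backward, so reverse.
    return ratios[::-1]
```

This returns a NumPy array instead of a list. Call .tolist() on the result if a list is required.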


Thanks for the feedback. I'm going to check out np.strides. – strimp099 Aug 3 '11 at 0:47
