
I wrote a simple function that calculates and returns the maximum drawdown of a set of returns. I am trying to squeeze as much efficiency out of the code as possible for speed. I've got it down to about as fast as I can go. Does anyone have suggestions on how to write this function more efficiently perhaps through list comprehensions etc.?

import numpy as np

def max_drawdown(returns):

    draw_series = np.array(np.ones(np.size(returns)))
    max_return = 0; max_draw = 1; draw = 1
    returns = returns + 1

    for r in range(returns.count()-1, 0, -1):

        if returns[r] > max_return:
            max_return = returns[r]

        else:
            draw = returns[r] / max_return
            if draw < max_draw:
                max_draw = draw

        draw_series[r-1] = -(1 - max_draw)

    return draw_series
What type of object is returns? returns = returns + 1 suggests it might be an integer, or a numpy array, but neither of those has a count method. – unutbu Jul 15 '11 at 21:54
It is actually a Pandas TimeSeries object which acts like a numpy array. returns.count() is the same as len(returns). – strimp099 Jul 15 '11 at 21:57
This is minor and more aesthetic than performance-related, but note that numpy.ones returns an array, so passing it to numpy.array is redundant. – senderle Jul 15 '11 at 22:11
Thanks @senderle. I found some optimization stuff on loops here, python.org/doc/essays/list2str.html. – strimp099 Jul 15 '11 at 22:19

migrated from stackoverflow.com Jul 17 '11 at 11:52

5 Answers

You are correct to point out that your implementation is terribly inefficient compared to most built-in Numpy operations of similar complexity. 100X speedup would be reasonable for large arrays once you eliminate the python loop. It would be trivial to replace your python loop with some Numpy indexing or broadcasting if it weren't for the pesky draw_series[r-1] = -(1 - max_draw) line which operates on the next-to-be-computed item in the array. This is analogous to Numpy's accumulate but obviously there's no implementation of it for your particular algorithm. You have three options as I see it:

  1. Study your problem hard and see if you can decompose it into numpy-only accumulate and regular operations.

  2. See if your algorithm can be expressed as a compiled numexpr expression.

  3. Compile this function using Cython, f2py, or ctypes.

One minor improvement is to replace returns = returns + 1 with returns += 1 which will operate in-place and avoid re-allocating the returns array.
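A minimal sketch of the difference (variable names are mine): `+=` reuses the existing buffer, while `+` allocates a new array.

```python
import numpy as np

returns = np.array([0.01, -0.02, 0.03])
alias = returns            # second reference to the same buffer

returns += 1               # in-place: the existing buffer is modified
assert alias is returns    # still the same object
assert np.allclose(returns, [1.01, 0.98, 1.03])

other = np.array([0.01, -0.02, 0.03])
shifted = other + 1        # allocates a fresh array
assert shifted is not other
assert np.allclose(other, [0.01, -0.02, 0.03])  # original untouched
```

The flip side is that `+=` mutates the caller's array, which matters if the caller still needs the original values.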

Hope this helps. Good luck.


Don't just optimize this or optimize that by educated guessing.

Find out which lines of code are responsible for a large fraction of time, as shown in this answer, and focus your attention there.

Then when you've optimized that, do it all again, until you can't improve it any more.
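The answer above describes manual stack-sampling; as a scripted alternative (my own substitution, not the sampling technique itself), Python's built-in cProfile can also show where the time goes. Here it is applied to the loop version from the question:

```python
import cProfile
import io
import pstats

import numpy as np

def max_drawdown(returns):
    # loop-based version from the question, reproduced here to profile
    draw_series = np.ones(returns.size)
    max_return = 0.0
    max_draw = 1.0
    returns = returns + 1
    for r in range(returns.size - 1, 0, -1):
        if returns[r] > max_return:
            max_return = returns[r]
        else:
            draw = returns[r] / max_return
            if draw < max_draw:
                max_draw = draw
        draw_series[r - 1] = -(1 - max_draw)
    return draw_series

data = np.random.randn(100_000) * 0.01

profiler = cProfile.Profile()
profiler.enable()
result = max_drawdown(data)
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

Note that a deterministic profiler like this adds per-call overhead and only attributes time at function granularity, which is exactly the limitation the sampling approach above avoids.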

That's good advice, thanks. However, I'm not exactly sure what you are doing in your other post. I get the idea of identifying which lines of code are taking the most time, but I do not know how are you getting there. – strimp099 Jul 15 '11 at 22:03
@strimp099: I thought I made it pretty clear, but I'll admit not everybody gets it right away. You give the program enough data, or you simply loop it enough times so it takes at least several seconds. During that time, you hit Ctrl-C to halt it, and capture the call stack. Do that a few times. Whether a line of code is a function call or not, the fraction of time it costs is the fraction of samples that show it. If something shows up on >1 stack, if you can optimize it, you win. If something never shows up, you can be sure it's too small to worry about. – Mike Dunlavey Jul 15 '11 at 23:18
@strimp099: Continued (forgive me)... If you look at the other answers to that question, people say things like "your bottleneck is this ...", which is a plausible guess and it's wrong. However, sampling the stack does locate the big time-drain. If that is fixed so it takes less time, then it's a new game, and what was not the big time-taker before may now become a big time-taker, percentage-wise, because the pond is smaller. – Mike Dunlavey Jul 15 '11 at 23:24

If you want high-performance code, Python probably isn't the right language. You don't seem to be doing anything that's much more intensive than what is necessary to achieve your intended computation, so it is unlikely you can increase performance much more.

If you must use Python for whatever reason (such as your data structures coming from a Python environment), you could use SWIG or a similar tool to write a C extension that does what you need and call it from Python. This won't be worth it unless you're working on a very large dataset. Another possibility is to dump your data to a file, have a C program process it, and write an output file that your Python program then reads back in. Of course, you run the risk of spending more time in I/O, which could well outweigh any performance gains of this approach.

A less radical proposal: Do you expect that the if statement here:

    if returns[r] > max_return:
        max_return = returns[r]

will be true only rarely? If so, try the following. I doubt it will improve performance substantially, but it's easy to give it a try. It may also make performance worse (it all depends on your general type of dataset):

Change the if-else:

    if returns[r] > max_return:
        max_return = returns[r]
        max_return_reciprocal = 1.0 / max_return
    else:
        draw = returns[r] * max_return_reciprocal
        # ...then the `if draw < max_draw` check as before

This could spare you a lot of floating-point divisions, which are quite slow compared to multiplications. It probably won't substantially improve performance, though, because I expect most of the slowness comes from the overhead of interpreting Python code. You might also want to look at what exactly this line does:

draw_series = np.array(np.ones(np.size(returns)))

Can you time it and see if it is causing the performance problem?


Untested, and probably not quite correct. I think it may actually apply operations backwards, but you should easily be able to flip that.

import numpy as np

def max_drawdown(returns):
    returns += 1
    max_returns = np.maximum.accumulate(returns)
    draw = returns / max_returns
    max_draw = np.minimum.accumulate(draw)
    draw_series = -(1 - max_draw)

    return draw_series
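If the `accumulate` ufunc methods are unfamiliar, a small worked example (data values are mine) shows what they do: each output element is the running maximum or minimum of everything up to that position.

```python
import numpy as np

values = np.array([1.0, 3.0, 2.0, 5.0, 4.0])

# running maximum seen so far at each position
running_max = np.maximum.accumulate(values)
assert running_max.tolist() == [1.0, 3.0, 3.0, 5.0, 5.0]

# running minimum, on a second small array
running_min = np.minimum.accumulate(np.array([3.0, 1.0, 2.0]))
assert running_min.tolist() == [3.0, 1.0, 1.0]
```

This is exactly the "accumulate" pattern the first answer said the loop would need to be expressed in.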

Comments on your code:

import numpy as np

def max_drawdown(returns):

    draw_series = np.array(np.ones(np.size(returns)))

np.ones returns an array; there is no reason to pass it through np.array afterwards. If you aren't going to use the ones stored in the array, use np.empty, which skips the initialization step.

    max_return = 0; max_draw = 1; draw = 1

You declare draw far away from where it is used. Just assign to it in the scope it's used in. Multiple assignments on one line are also frowned upon in Python.

    returns = returns + 1

Use an augmented assignment (returns += 1) instead.

    for r in range(returns.count()-1, 0, -1):

I have to recommend against r, as it's not a common abbreviation and I think it makes the code hard to read.

        if returns[r] > max_return:
            max_return = returns[r]

        else:
            draw = returns[r] / max_return
            if draw < max_draw:
                max_draw = draw

Your math seems inscrutable, but perhaps it makes sense in context. Consider adding some comments to explain the reasoning.

        draw_series[r-1] = -(1 - max_draw)

    return draw_series
+1 I was writing up the exact same thing earlier, but got busy and never posted it. This is definitely the way to go! (It's also ~3 orders of magnitude faster for large-ish arrays.) For the OP, note that you can return a reversed view of the array with return draw_series[::-1]. – Joe Kington Jul 18 '11 at 5:08

The fastest I could get this using pure Python is a bit less than twice the original speed. My best attempt was:

def max_drawdown(returns):

    draw_series = np.empty(returns.size)

    max_return = 0.0
    min_draw = 1.0
    returns += 1.0

    for r in range(returns.size, 0, -1):
        ret = returns[r - 1]
        if ret > max_return:
            max_return = ret
        else:
            draw = ret / max_return
            if draw < min_draw:
                min_draw = draw

        draw_series[r - 1] = min_draw

    returns -= 1.0      # undo the in-place change, otherwise side effects
    draw_series -= 1.0  # shift so drawdowns come out as negative fractions

    return draw_series

I should note though that this attempts to fix some things that looked weird in your function such as the draw_series / max-return index offset. But these can be fixed relatively easily.

np.empty allocates the array but doesn't bother to initialize its contents, so you save the extra pass over the array that np.ones costs you.
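A tiny sketch of that distinction (the values are illustrative):

```python
import numpy as np

filled = np.ones(4)    # allocates and writes 1.0 into every slot
raw = np.empty(4)      # allocates only; initial contents are arbitrary
raw[:] = 0.0           # so every element must be written before it is read

assert filled.tolist() == [1.0, 1.0, 1.0, 1.0]
assert raw.tolist() == [0.0, 0.0, 0.0, 0.0]
```

This is safe in the function above only because the loop writes every element of draw_series before it is returned.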

returns += 1 (and the later -= 1) change the value of returns in place, so the function should not be considered thread-safe with this change. I've negated the change so that there are no side effects after execution has completed, but this still represents a problem if you plan to thread this.

draw_series -= 1.0 computes the same thing as storing min_draw - 1 inside the loop (or, as you have it, -(1 - max_draw)), but somehow seems to make Python happier.

I tried both keeping a separate array of max_return values and applying the division element-wise at the end, and storing the 1.0 / max_return value and multiplying through, but each seemed to slow down the execution for some reason.

It didn't seem like the iterator enumerate(reversed(returns)) helped at all with the loop even though it simplified the logic.
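For completeness, here is a rough way one might time the loop against the accumulate-based version from the earlier answer (function names and sizes are mine; a sketch, not a rigorous benchmark). Note the loop below scans backwards while the vectorized version accumulates forwards, so the outputs differ in orientation and only the timings are comparable.

```python
import timeit

import numpy as np

def max_drawdown_loop(returns):
    # loop version, paraphrased from this answer (without the in-place +=)
    draw_series = np.empty(returns.size)
    max_return = 0.0
    min_draw = 1.0
    shifted = returns + 1.0
    for r in range(shifted.size, 0, -1):
        ret = shifted[r - 1]
        if ret > max_return:
            max_return = ret
        else:
            draw = ret / max_return
            if draw < min_draw:
                min_draw = draw
        draw_series[r - 1] = min_draw
    return draw_series - 1.0

def max_drawdown_vec(returns):
    # accumulate version from the earlier answer, made side-effect free
    shifted = returns + 1.0
    max_returns = np.maximum.accumulate(shifted)
    return np.minimum.accumulate(shifted / max_returns) - 1.0

data = np.random.randn(10_000) * 0.01
t_loop = timeit.timeit(lambda: max_drawdown_loop(data), number=10)
t_vec = timeit.timeit(lambda: max_drawdown_vec(data), number=10)
print(f"loop: {t_loop:.4f}s  vectorized: {t_vec:.4f}s")
```

On arrays of this size the vectorized version should come out far ahead, consistent with the ~3-orders-of-magnitude comment above.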

