Code Review Stack Exchange is a question and answer site for peer programmer code reviews. Join them; it only takes a minute:

Sign up
Here's how it works:
  1. Anybody can ask a question
  2. Anybody can answer
  3. The best answers are voted up and rise to the top

I wrote a Python module to learn about memoization, and I have some questions on what I did.

  • How pythonic is this?
  • Does it make a difference if I used a class attribute instead of a function attribute for my lookup_table?
  • How can I modify multi_run_chrono to use it like Python's decorators (with @)?
  • Should I keep the try... except KeyError or rather do if key in dictionary...?
  • Are there some nice tricks I could do to improve this code's elegance and/or performance?

Here is the code:

from __future__ import print_function
from random import randint
from time import clock


def naive_factorial(n):
    """calculates n! with a simple recursive algorithm"""
    if n == 0:
        return 1
    else:
        return n * naive_factorial(n-1)


def memo_factorial(n):
    """calculates n! using memoization in a recursive algorithm"""
    if not hasattr(memo_factorial, 'lookup_table'):
        memo_factorial.lookup_table = {0: 1}
    try:
        return memo_factorial.lookup_table[n]
    except KeyError:
        val = n * memo_factorial(n - 1)
        memo_factorial.lookup_table[n] = val
        return val


def check_correctness():
    print('Checking both functions do the same things')

    for i in [0, 1, 4, 10, 22, 63]:
        n = naive_factorial(i)
        m = memo_factorial(i)
        print('n = %d -- Naive : %d, Memo : %d' % (i, n, m))
        if n != m:
            print('Error in functions implementation')
            exit()


def multi_run_chrono(func, nb_runs, min_val, max_val):
    print('Running %s %d times, with n from %d to %d'
          % (func.__name__, nb_runs, min_val, max_val))
    tests = [randint(min_val, max_val) for _ in xrange(nb_runs)]
    starting_time = clock()
    for n in tests:
        func(n)
    ending_time = clock()
    print('Duration : %fs' % (ending_time - starting_time))


def main():
    # check_correctness()  # comment/uncomment to check or not
    for func in [naive_factorial, memo_factorial]:
        multi_run_chrono(func, 20000, 10, 200)


if __name__ == '__main__':
    main()

Benchmarking on my computer :

Running naive_factorial 20000 times, with n from 10 to 200
Duration : 0.596933s
Running memo_factorial 20000 times, with n from 10 to 200
Duration : 0.006060s

All remarks are welcome, thank you very much!

share|improve this question
1  
That's a nice piece of code regarding PEP8 style guide. Good job. Nothing to comment on this part. – Dex' ter Aug 5 at 9:20
4  
Consider updating to Python 3 where memoization is built-in functools.lru_cache, also consider separating the algoritmh from the memoization logic. – Caridorc Aug 5 at 9:28
    
@Dex'ter thanks, this is not the first time I post here, glad to see I learned my lessons well. – BusyAnt Aug 5 at 9:31
    
@Caridorc I'll look it up thanks. By separating the algoritmh from the memoization logic, do you mean like decorating the naive_factorial to make it use memoization instead of creating a whole new function with memoization integrated? – BusyAnt Aug 5 at 9:32
    
@BusyAnt Exactly, so you can reuse the memoization capability with different functions. – Caridorc Aug 5 at 9:35
up vote 7 down vote accepted

Regarding the memoization, here are two different ways to do this (for single-valued functions):

import functools

def memoize(func):
    cache = func.cache = {}

    @functools.wraps(func)
    def wrapper(n):
        if n not in cache:
            cache[n] = func(n)
        return cache[n]
    return wrapper

def memodict(f):
    """ Memoization decorator for a function taking a single argument """
    class memodict(dict):
        def __missing__(self, key):
            ret = self[key] = f(key)
            return ret
    return memodict().__getitem__

@memodict
# @memoize
def factorial(n):
    """calculates n! with a simple recursive algorithm"""
    if n == 0:
        return 1
    else:
        return n * factorial(n-1)

The latter is faster because it saves one additional lookup, but only works for single-valued functions, whereas the first can be extended to also pass along multiple arguments.

The memoize is also nice, because it keeps the documentation of the function because of functools.wraps (try help(factorial) in both cases to see the difference).

Which implementation is faster depends on how often your try block misses and succeeds.

As for timings: Using @memoize to calculate all factorials from 1 to 100000000 takes 2.66s on my machine, while doing the same with @memodict takes 2.51s, so not a big difference.

If you want to use a decorator to time the execution time of a function, this will not work very nicely here, since if you decorate factorial, every execution of the function will be timed (because of recursion this will be very often).

You can however use something like this:

def timeit(func, *args, **kwargs):
    starting_time = clock()
    result = func(*args, **kwargs)
    ending_time = clock()
    print 'Duration: {}'.format(ending_time - starting_time)
    return result

timeit(factorial, 100)

If you do want to use this as decorator, just insert a wrapper:

def timeit(func):
    def wrapper(*args, **kwargs):
        starting_time = clock()
        result = func(*args, **kwargs)
        ending_time = clock()
        print 'Duration: {}'.format(ending_time - starting_time)
        return result
    return wrapper

@timeit
def f():
    pass
share|improve this answer
    
I'm a big fan of your memodict implementation, but it should still be possible to use it with multiple-valued functions. You just need an extra level of indirection that takes the args & kwargs and turns them into a tuple (or something hashable) and uses that as a key. – Dannnno Aug 5 at 16:01
    
Yes, indeed. I took the implementation from code.activestate.com/recipes/… where different ways to get it to work with multiple arguments are discussed. All take a small performance hit when applied to the special case of one argument, though. – Graipher 2 days ago

check_correctness

def check_correctness():

Functions should better take arguments to allow reuse, I would give the two functions and the inputs as arguments.

print('Checking both functions do the same things')

Unavoidable printing inside a function makes it impossible to reuse and violates the single responsability principle (computation + information)

for i in [0, 1, 4, 10, 22, 63]:

i is not descriptive you should use testcase to convey more information.

n = naive_factorial(i) 
m = memo_factorial(i)

After removing print there is no need for these temporary variables.

 print('n = %d -- Naive : %d, Memo : %d' % (i, n, m))

Remove as above.

if n != m:
    print('Error in functions implementation') exit() 

Raise an exception instead to allow the caller to catch it if he wants (further increasing re-usability)

share|improve this answer

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.