Take the 2-minute tour ×
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

I'm learning about algorithms and have been recently experimenting with insertion sort via recursion in Python.

My code works fine, but I was wondering if there are any possible improvements that can be made. I am especially interested in ways to reduce the actual content of the code (i.e. make it shorter) or improve the speed.

def insertionsort(alist):
    i = len(alist)
    if i > 2:
        temp = alist[:i-1]
        insertionsort(temp)
        alist[:i-1] = temp[:i-1]

    k = len(alist) - 1
    m = len(alist)

    while k >= 1 and alist[k-1] > alist[k]:
        front = alist[k]
        back = alist[k - 1]
        alist[k] = back
        alist[k-1] = front

        k = k - 1


A = [4, 1, 6, 3, 9, 10]
insertionsort(A)

print(A)
share|improve this question

migrated from stackoverflow.com Jul 24 at 6:53

This question came from our site for professional and enthusiast programmers.

    
You don't use m so that can go, you can also use alist[k], alist[k-1] = alist[k-1], alist[k] and forget front and back. pastebin.com/Vip7uzXL –  Padraic Cunningham Jul 23 at 22:38

1 Answer 1

Starting with the low-hanging fruit: A variable called 'temp' is a big sign that you can do better. If you were to set up your code so that it returned a new list rather than modifying it in place, you could do this:

alist[:i-1] = insertionsort(alist[:i-1])

The rest of your code assumes that we will be working on the original, but the easiest fix for that is to make a copy at the earliest opportunity. If you do that early (and with an appropriate comment), and you don't need the original list anymore (you don't), then you can reuse the name alist without losing clarity. Unfortunately, copying the list is bad for performance, but readability needs to come first. But way down the bottom, I will point out a bigger improvement that will mean we don't actually have to choose between performance and readability here.


You can also eliminate that i: Python allows negative indices to all builtin sequences, which are defined to count from the back. So the above line is just:

alist[:-1] = insertionsort(alist[:-1])

and the condition above it can test against len(alist) explicitly.


In your second loop, you use four lines to swap two list elements. This can be more idiomatically done in one line using tuple assignment:

alist[k], alist[k-1] = alist[k-1], alist[k]

But we can do better even then this - we don't need to do all these swaps at all. Instead, find where the last element should go, and put it directly there. This is exactly the type of job the bisect module is good for:

candidate = alist.pop()
bisect.insort(alist, candidate)

And this replaces the whole second while loop.


So so far we have:

def insertionsort(alist):
    # work on a copy instead of the original
    alist = alist[:]

    if len(alist) > 2:            
        alist[:-1] = insertionsort(alist[:-1])

    candidate = alist.pop()
    bisect.insort(alist, candidate)

    return alist

A = [4, 1, 6, 3, 9, 10]
A = insertionsort(A)

print(A)

I said before that copying is potentially bad for performance (each copy takes time). And we're doing a lot of it. This line:

alist[:-1] = insertionsort(alist[:-1])

makes a new list containing all but the last element, which the recursive call will promptly clone (in its entirety). So that's two copies for each element after the second. It would be better if we could tell the recursive call to only treat up to a certain point in the list, and everything after that. To do this, we put the bulk of the code into a helper function:

def insertionsort(alist):
    def sort_helper(alist, hi):
        if hi > 1:            
            sort_helper(alist, hi-1)

        candidate = alist.pop(hi)
        bisect.insort(alist, candidate, hi=hi)

    alist = alist[:]
    sort_helper(alist, len(alist)-1)
    return alist

A = [4, 1, 6, 3, 9, 10]
A = insertionsort(A)

print(A)

Note that the first condition changed from testing a length to testing a position, and that the helper function works completely in-place. This makes one copy, in the outer function. You can also change it back to an in-place sort like your original code by deleting two lines, and it will make zero copies. This is probably the best a recursive insertion sort can be.

share|improve this answer
    
Thank you. I'm still unsure about the role of this helper function - could you please perhaps explain it more? –  genesis Jul 24 at 23:24
    
The helper function exists so that it can take an extra argument that it doesn't make sense to expose to the main function (because it's an implementation detail). The extra argument lets you partition the list into 'already sorted' and 'still unsorted' without copying it. It plays the same role as a loop variable in an iterative version. –  lvc Jul 25 at 0:27
    
Sure the code works? I tried it in Python with the helper function, doesn't seem to be sorting? –  genesis Jul 25 at 16:50
    
Because I forgot to update the test to reflect that it now returns the sorted list rather than sorting in-place. Fixed. –  lvc Jul 26 at 0:35

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.