I've made the following implementation of counting sort. I don't think that I understand the algorithm fully but this implementation seems to be working correctly.

I've used the Counterdictionary from the collections module. It seems to be the best way to handle the counting part. But I haven't seen anyone using it to implement counting sort.

Is there any way to improve the code?

from random import randrange
from collections import Counter
from copy import deepcopy

def counting_sort(aList, length):
    """
    counting sort does not work for negative numbers
    counting sort assumes that each element is SMALL INTEGER

    running time is O(maxVal - minVal) ---> Linear

    (Useful only when the difference between maxVal and minVal is not large)
    """
    list_copy = deepcopy(aList)  # a copy so that I can return it and check it against the original list applied to sorted function
    c = Counter()

    for num in list_copy:
        c[num] += 1

    index = 0

    # for i in range(len(c)):
        # while 0 < c[i]:
            # list_copy[index] = i
            # c[i] -= 1
            # index += 1
    for key in c:
        while c[key] > 0:
            list_copy[index] = key
            c[key] -= 1
            index += 1

    return list_copy


aList = [randrange(100) for _ in range(1000)]
after = counting_sort(aList, len(aList))
print(after == sorted(aList))
share|improve this question

closed as off-topic by Jaime, Pimgd, janos Apr 19 '16 at 12:19

This question appears to be off-topic. The users who voted to close gave this specific reason:

  • "Questions containing broken code or asking for advice about code not yet written are off-topic, as the code is not ready for review. After the question has been edited to contain working code, we will consider reopening it." – Jaime, Pimgd, janos
If this question can be reworded to fit the rules in the help center, please edit the question.

    
This code is broken, as discussed in the accepted solution comments, which is also broken. – Jaime Apr 19 '16 at 11:48
up vote 1 down vote accepted

Your code is not reliable because Counter is unordered. In many cases, it comes up with the right result, but do not depend on it. I will not give an alternate solution because Code Review is for improvements on existing code, not for writing new code. Although your code should not be used for sorting, I have a couple comments on what you did.


You aren't using length, so why require it for an argument?

Instead of creating a copy of a list and then changing each index, create an empty list and add things to it.

Your implementation of Counter does not take advantage of it. You use it very similarly to how you would use a dictionary. Just use Counter(aList)

You can simplify your for key in c: loop by using itertools.repeat. Note: don't use it for mutable objects.

Your naming is inconsistent. PEP 8 recommends snake_case, but you use both snake_case and pascalCase. You aren't required to comply with PEP 8's recommendation, but you should at least be consistent. You should also use more descriptive names than c, for example.

The new code:

from collections import Counter
from itertools import repeat

def counting_sort(number_list):
    """
    counting sort does not work for negative numbers
    counting sort assumes that each element is SMALL INTEGER

    running time is O(maxVal - minVal) ---> Linear

    (Useful only when the difference between maxVal and minVal is not large)
    """

    sorted_list = []
    for value, amount in Counter(number_list).items():
        sorted_list.extend(repeat(value, amount))

    return sorted_list
share|improve this answer
    
Counter is a dicionary at heart, and therefore has no ordering giarantee. Your code will, just like the OP's, cluster the input list, but there is no guarantee about it being sorted. – Jaime Apr 19 '16 at 5:15
    
Also, your approach works if the objects in the list are plain numbers, where the comparison value is all the state of the object. But if they include other metadata, e.g. if you want to sort nodes in a tree by their value but keeping their links to subtrees, you need to be much more careful about what you place where, see here for the standard implementation. – Jaime Apr 19 '16 at 5:23
    
@Jaime so is zondo's absolutely gorgeous implementation wrong? I was a bit nervous about the lack of order in Counter at first, but my code and zondo's work perfectly so far in my testing (I've tested it on 1 million ints. it takes 2 seconds). Do you an equally elegant solution that's better? – MAA Apr 19 '16 at 11:20
    
The minimal example that shows this is broken is counting_sort([7, 8]), which gives the wrong [8, 7]result, at least in CPython 2.7.6. In the CPython implementation of dicts, a dictionary with less than 5 entries will be stored in an 8 item linear array. The hash function of integers in Python is their value, so integer j gets stored in position j % 8. This leads to even funnier results, like counting_sort([8, 0]) returning a wrong result, but counting_sort([0, 8]) returning the correct one. – Jaime Apr 19 '16 at 11:47
    
@Jaime: I haven't deleted my answer because I believe I mentioned some helpful things and I did not suggest an alternate solution, but I added a notice at the top that his code is not reliable. – zondo Apr 19 '16 at 22:54

Not the answer you're looking for? Browse other questions tagged or ask your own question.