Take the 2-minute tour ×
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

I'm trying to create a program that creates a record of how many times letters appear in a given string (ignoring blank spaces) using recursion.

s = "Hello there this is the string"

new_s = ""
for i in s:
    if i != " ":
        new_s += i
new_s = new_s.lower()
com_dict = {}
def merge_count(s):
    global com_dict
    if len(s) == 1:
        if s[0] in com_dict:
            com_dict[s[0]] += 1
        else:
            com_dict[s[0]] = 1
    else:
        merge_count(s[:int(len(s)/2)])
        merge_count(s[int(len(s)/2):])

merge_count(new_s)
for i in com_dict:
    print i, com_dict[i]

Is the global variable necessary? Should I be using a different data structure instead of a dictionary? I will need to rank the results based on the most frequently appearing letters. Is using recursion in this scenerio generally a bad idea? Would iterating be a better choice?

share|improve this question
4  
Any reason you're not using Counter? –  mjolka 21 hours ago
    
I would say use a dict instead. –  Ryan 21 hours ago
    
Is this an exercise where you use only the bit of Python you already know, or are you interested in getting the absolute best way to do it using the standard library? (as that would be Counter) –  RemcoGerlich 16 hours ago

2 Answers 2

Is the global variable necessary?

No. Global variables are almost never necessary, and rarely the best solution. Consider this:

def merge_count(s, com_dict):
    if len(s) == 1:
        if s[0] in com_dict:
            com_dict[s[0]] += 1
        else:
            com_dict[s[0]] = 1
    else:
        merge_count(s[:int(len(s)/2)], com_dict)
        merge_count(s[int(len(s)/2):], com_dict)

def main():
    s = "Hello there this is the string"

    new_s = ""
    for i in s:
        if i != " ":
            new_s += i
    new_s = new_s.lower()

    com_dict = {}
    merge_count(new_s, com_dict)
    for i in com_dict:
        print i, com_dict[i]

main()

In this, you pass a dictionary to merge into into to the function.

You do

new_s = ""
for i in s:
    if i != " ":
        new_s += i

Addition of strings in loops is always almost bad, because it may reallocate the string on every addition. Instead, one should do a join:

new_s = "".join(i for i in s if i != " ")

However, in this case you might as well just do

new_s = s.replace(" ", "")

Recursively splitting in merge_count is pointless. Just do

def merge_count(s, com_dict):
    for i in s:
        if s in com_dict:
            com_dict[s[0]] += 1
        else:
            com_dict[s[0]] = 1

which affords the better interface of

def merge_count(s):
    com_dict = {}

    for i in s:
        if s in com_dict:
            com_dict[i] += 1
        else:
            com_dict[i] = 1

    return com_dict

One can simplify things with defaultdict:

from collections import defaultdict

def merge_count(s):
    com_dict = defaultdict(int)

    for i in s:
        com_dict[i] += 1

    return com_dict

Or even further with Counter:

from collections import Counter

def merge_count(s):
    return Counter(s)

Instead of

for i in com_dict:
    print i, com_dict[i]

One should do

for i, v in com_dict.iteritems():
    print i, v

Then one fixes up the naming

from collections import Counter

def main():
    string = "Hello there this is the string"

    lower_chars = string.replace(" ", "").lower()
    char_counts = Counter(lower_chars)

    for char, count in char_counts.items():
        print(char, count)

main()
share|improve this answer
1  
Use print char_counts.most_common() instead to print the frequencies in descending order. –  user1016274 12 hours ago

Is the global variable necessary?

No. You don't need the global keyword there, because you don't reassign the variable, you modify its content. On the other hand, right now the dictionary is too exposed. It might be modified outside of merge_counts. Another problem is that if I don't see the implementation of the function, it's not clear that it expects this dictionary to be defined. It would be better to rewrite in a way that eliminates these issues, for example:

def merge_counts(s):
    com_dict = {}
    def merge_counts_recurse(s):
        # ...
    return com_dict

In this version the dictionary is no longer exposed in global scope, it cannot be modified outside of merge_counts, and the function no longer depends on the dictionary defined outside.

Should I be using a different data structure instead of a dictionary?

The data structure is fine for the purpose, but there's a higher level utility class that you could use instead of reinventing the wheel, Counter, as @mjolka pointed out.

I will need to rank the results based on the most frequently appearing letters. Is using recursion in this scenerio generally a bad idea? Would iterating be a better choice?

Recursion is not bad, it's just a bit unusual for this scenario. Iteration is the obvious choice, and a lot easier to implement. In the first loop where you excluded the space characters, you could at the same time build the counts. That would have been the obvious, easiest implementation.

share|improve this answer

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.