Take the 2-minute tour ×
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

So, I have a function that takes a column title, and a response.body from a urllib GET (I already know the body contains text/csv), and iterates through the data to build a list of values to be returned. My question to the gurus here: have I written this in the cleanest, most efficient way possible? Can you suggest any improvements?

 
    def _get_values_from_csv(self, column_title, response_body):
        """retrieves specified values found in the csv body returned from GET
        @requires: csv
        @param column_title: the name of the column for which we'll build a list of return values.
        @param response_body: the raw GET output, which should contain the csv data
        @return: list of elements from the column specified.
        @note: the return values have duplicates removed. This could pose a problem, if you are looking for duplicates.
        I'm not sure how to deal with that issue."""
        dicts = [row for row in csv.DictReader(response_body.split("\r\n"))]
        results = {}
        for dic in dicts:
            for k, v in dic.iteritems():
                try:
                    results[k] = results[k] + [v] #adds elements as list+list
                except: #first time through the iteritems loop.
                    results[k] = [v]

        #one potential problem with this technique: handling duplicate rows
        #not sure what to do about it.
        return_list = list(set(results[column_title]))
        return_list.sort()
        return return_list

Thanks!

share|improve this question

migrated from stackoverflow.com Feb 16 '11 at 19:19

This question came from our site for professional and enthusiast programmers.

    
Probably better off on codereview.stackexchange.com –  Daniel DiPaolo Feb 16 '11 at 19:10
1  
Tip: Don't use a blanket except, you will catch ALL exceptions rather than the one you want. –  ash Feb 16 '11 at 19:13

2 Answers 2

up vote 7 down vote accepted

Here's a shorter function that does the same thing. It doesn't create lists for the columns you're not interested in.

def _get_values_from_csv(self, column_title, response_body):
    dicts = csv.DictReader(response_body.split("\r\n"))
    return sorted(set(d[column_title] for d in dicts))
share|improve this answer
    
Thanks! This is brilliant! I'm still trying to wrap my head around how to use dictionaries in a comprehension context like this. So this is exactly what I was looking for. :) –  Greg Gauthier Feb 16 '11 at 19:39
    
sorted() will return a list, so the list() bit isn't needed. –  Lennart Regebro Feb 16 '11 at 20:54
    
Good point, Lennart -- Edited out. –  Martin Stone Feb 17 '11 at 8:12

My suggestions:

def _get_values_from_csv(self, column_title, response_body):
    # collect results in a set to eliminate duplicates
    results = set()

    # iterate the DictReader directly
    for dic in csv.DictReader(response_body.split("\r\n")):
        # only add the single column we are interested in
        results.add(dic[column_title])

    # turn the set into a list and sort it
    return_list = list(results)
    return_list.sort()
    return return_list
share|improve this answer

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.