Parse a csv file and create a dictionary of partial results

Question

I have a bunch of .csv files which I have to read and search for data. Each .csv file has the format:

A row of data I will ignore
State,County,City
WA,king,seattle
WA,pierce,tacoma

The order of columns is not consistent across the csv files. For example, in csv1 the order can be State,County,City, while in csv2 it can be City,County,State. What I am interested in is the State and County. Given a county, I want to find out what State it is in. I am ignoring the fact that the same county name can exist in multiple States. This is how I am approaching it:

with open(‘file.csv’) as f:
    data = f.read()

# convert the data to iterable, skip the first line
reader = csv.DictReader(data.splitlines(1)[1:])
lines = list(reader)
counties = {k: v for (k, v) in ((line[‘county’], line[‘State’]) for line in lines)}

Is there a better approach to this?

200_success · Accepted Answer · 2014-11-26 02:55:55Z


You're on the right track, using a with block to open the file and csv.DictReader() to parse it.

Your list handling is a bit clumsy, though. To skip a line, use next(f). Avoid making a list of the entire file's data, if you can process the file line by line. The dict comprehension has an unnecessary complication as well.

import csv

with open('file.csv') as f:
    _ = next(f)  # skip the first line, which precedes the header row
    reader = csv.DictReader(f)
    counties = {line['County']: line['State'] for line in reader}

Your sample file had County as the header, whereas your code looked for line[‘county’]. I assume that the curly quotes are an artifact of copy-pasting, but you should pay attention to the capitalization.
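As a quick check that this approach copes with the varying column order, here is a minimal sketch that runs the same logic on two in-memory samples. It assumes Python 3; the helper name county_to_state and the use of io.StringIO as a stand-in for real files are illustrative assumptions, not part of the original code.

```python
import csv
import io


def county_to_state(f):
    """Map each county to its state, skipping the first line of f.

    Hypothetical helper: f is any iterable of text lines, such as an
    open file or an io.StringIO object.
    """
    next(f)                     # discard the row that precedes the header
    reader = csv.DictReader(f)  # the header row supplies the keys, in any order
    return {row['County']: row['State'] for row in reader}


# Two in-memory samples that mirror the question's layout, with the
# columns in different orders.
csv1 = io.StringIO(
    "A row of data I will ignore\n"
    "State,County,City\n"
    "WA,king,seattle\n"
    "WA,pierce,tacoma\n"
)
csv2 = io.StringIO(
    "A row of data I will ignore\n"
    "City,County,State\n"
    "seattle,king,WA\n"
    "tacoma,pierce,WA\n"
)

result1 = county_to_state(csv1)
result2 = county_to_state(csv2)
```

Because DictReader keys each row by the header names, both samples should produce the same mapping regardless of column order.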


I am really getting the data from an S3 bucket, but I didn't want to make the code more complicated in my example. I get the key from the bucket and then call data = key.get_contents_as_string(), so I am not really reading from a file. Instead, the contents of the key are the string representation of the csv file. I like the way you eliminated the list and cleaned up the dict comprehension. Is there a way to avoid doing data.splitlines(1)[1:] when I create the reader, since I already have the data in a string (and I need to ignore the first row)? – Mark Nov 26 '14 at 3:39


With all due respect, if you choose to strip out key relevant details in the code you submit for review, you should be prepared to accept an answer that addresses the code you submitted, not the code you had in mind. We would have been quite happy to review the code you actually wrote, had you submitted that instead. – 200_success♦ Nov 26 '14 at 4:15

I apologize, I should have stated that earlier. Thank you for your help, I do appreciate it – Mark Nov 26 '14 at 14:10
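For the string-based case raised in the comments, one possible sketch is to wrap the string in a file-like object so that the same next() and csv.DictReader() pattern still applies. This assumes Python 3 and that data is already a decoded text string; the function name counties_from_string and the sample data are hypothetical, and the S3 retrieval step is deliberately left out.

```python
import csv
import io


def counties_from_string(data):
    """Build a county -> state dict from CSV text held in a string.

    Hypothetical helper: assumes `data` is a str whose first line
    should be ignored and whose second line is the header row.
    """
    buf = io.StringIO(data, newline='')  # file-like view of the string
    next(buf)                            # skip the first line
    reader = csv.DictReader(buf)
    return {row['County']: row['State'] for row in reader}


# Made-up sample mirroring the layout shown in the question.
sample = (
    "A row of data I will ignore\r\n"
    "State,County,City\r\n"
    "WA,king,seattle\r\n"
    "WA,pierce,tacoma\r\n"
)
counties = counties_from_string(sample)
```

This avoids building an intermediate list of lines, since the reader pulls rows from the buffer one at a time.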


