
I have a file with indexes. For example:

1    Fruit
2    Meat
3    Fish        
4    Salmon
5    Pork
6    Apple

I also have a dictionary that maps the entries I want to match to each other, for example:

similar = {'1' : '6', '2' : '5'}

Since both entries are in the same file, my program scans the file for the '1', and then re-scans from the beginning looking for the '6'. It does the same for every number, re-scanning each time.

It's a very large file with a lot of numbers.

Here is some of the code:

similar = { '4719' : '4720c01' }

for aline in invFormatted:
    lines = aline.split("\t") #Splits the CSV file on every tab.
    lotID = lines[0]
    partNO = lines[1] #Splits and gets the part that would be the "index"
    if similar.has_key(partNO):
        for alines in invFormatted:
            liness = alines.split("\t")
            lotIDmatch = liness[0]
            partNOmatch = liness[1]
            if liness[1] == similar[partNO]:

Would there be a way to make it so it only scans it once?

Or are there any other ideas for improving this program?

Cheers!


For fear of falling into the TL;DR category or the "you want me to write this code for you" category, I didn't post the full thing, but since some people asked, here it is (don't kill me).

There is a text file formatted this way.

51689299    4719    Medium Azure    Riding Cycle
51689345    4720c01 Trans-Clear Wheel & Tire Assembly

In real life, it would have thousands of entries.

Then, my Python file 'knows' that the part number 4719 matches the part number 4720c01.

similar = { '4719' : '4720c01' }

Now, here is what it does (I think!):

invFormatted = open('export.txt', 'r') #Opens the file with the part numbers

with open ('upload.txt', 'a') as upload:

    upload.write("<INVENTORY>") #Something that would be on the export

    for aline in invFormatted:
        lines = aline.split("\t") #Splits the text file on every tab
        lotID = lines[0] #The lot ID is the first word
        partNO = lines[1] #Part number the second one
        if partNO in similar: #Sees if the part number in question has a match, if YES
            for alines in invFormatted: #Searches my inventory file again to see if the match is in my inventory
                liness = alines.split("\t")
                lotIDmatch = liness[0]
                partNOmatch = liness[1]
                if liness[1] == similar[partNO]: #if YES, prints out what I want it to print, and goes back to the first step
                    upload.write("blabla")

    invFormatted.close()

    upload.write("\n</INVENTORY>")

There it is, thank you!


This is how far I have gotten. I think it's close to as optimized as I can make it. Maybe someone can think of something else?

parts = {}

infile = open("export.txt")
for line in infile:
    line = line.strip()
    xml = [p.strip() for p in line.split("\t")]
    parts[xml[1]] = (xml[0], xml[2], xml[3])

for aline in invFormatted:
    lines = aline.split("\t") #Splits the CSV file on every ","
    lotID = lines[0]
    partNO = lines[1]
    if partNO in similar:
        similarPart = similar[partNO]
        if partNO in parts:
                upload.write("\n <ITEM>\n  <LOTID>" + lotID + "</LOTID>\n  <DESCRIPTION>To be used with &amp;lt;a href=\"/storeDetail.asp?b=-16205034&amp;h=314137&amp;q=" + similarPart + "\"&amp;gt;" + similarPart + "&amp;lt;/a&amp;gt;</DESCRIPTION>\n" + " </ITEM>")
    
#Splits the CSV file on every "," - no it doesn't. It splits the lines on every tab. Also, if this is supposed to be CSV, there's a module for that. –  user2357112 Jan 6 at 9:31
1  
Your example code doesn't demonstrate the double-scanning your question is about. Could you provide a different example? –  user2357112 Jan 6 at 9:32
    
Of course, that comment was actually from another file I had, where it split them on the commas. Always a module for that. :-) –  Brick Top Jan 6 at 9:32
1  
The simple thing to do is to read the whole file in memory to a dict, and do your lookup there. If the file is too big to fit in memory, you can use pytables with an indexed column. Or even simpler; if the numbers are always continuously counting from 1, you can simply seek for the correct line in the file. –  Eelco Hoogendoorn Jan 6 at 9:39
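A minimal sketch of the in-memory approach described in this comment, assuming the file fits in memory, fields are tab-separated, and each part number appears only once (a repeated part number would overwrite the earlier lot ID). The helper names `load_parts` and `find_pairs` are made up for illustration and are not from the original code.

```python
def load_parts(lines):
    """Read tab-separated lines once and map part number -> lot ID."""
    parts = {}
    for line in lines:
        fields = [f.strip() for f in line.split("\t")]
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        lot_id, part_no = fields[0], fields[1]
        parts[part_no] = lot_id  # assumption: part numbers are unique
    return parts


def find_pairs(parts, similar):
    """Return (lot_id, part_no, partner) where both parts are in the inventory."""
    pairs = []
    for part_no, partner in similar.items():
        # Dictionary lookups replace the second scan of the file.
        if part_no in parts and partner in parts:
            pairs.append((parts[part_no], part_no, partner))
    return pairs


# Sample data modeled on the two lines shown in the question.
sample = [
    "51689299\t4719\tMedium Azure\tRiding Cycle\n",
    "51689345\t4720c01\tTrans-Clear\tWheel & Tire Assembly\n",
]
similar = {'4719': '4720c01'}

parts = load_parts(sample)
pairs = find_pairs(parts, similar)
```

With a real file, something like `with open('export.txt') as f: parts = load_parts(f)` would read it exactly once; every later check is a dictionary lookup instead of another scan.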
    
@EelcoHoogendoorn, so, something like this? stackoverflow.com/questions/4803999/python-file-to-dictionary PS the file actually has 3 variables, can you do a 3 variable dict? –  Brick Top Jan 6 at 9:44

migrated from stackoverflow.com Jan 15 at 14:14

This question came from our site for professional and enthusiast programmers.

1 Answer

up vote 0 down vote accepted

If you rebuild your similar dictionary to be two-way:

complement = {v:k for k, v in similar.iteritems()}
similar.update(complement)

You can skip the second pass (by the way, drop has_key; it is deprecated in favor of the in operator):

if part_no in similar:
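
Put together, the idea might look roughly like the sketch below. It assumes tab-separated fields, and the generator name `matched_lines` is hypothetical. It uses `.items()` rather than `.iteritems()` so that it runs on both Python 2 and Python 3. Note that it only checks each line against the two-way mapping; it does not verify that the partner part is actually present in the file.

```python
similar = {'4719': '4720c01'}

# Build a two-way mapping without modifying the original dictionary.
two_way = dict(similar)
two_way.update({v: k for k, v in similar.items()})


def matched_lines(lines, mapping):
    """Yield (lot_id, part_no, partner) for each line whose part is in mapping."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        lot_id, part_no = fields[0], fields[1]
        if part_no in mapping:  # single pass: one lookup per line
            yield lot_id, part_no, mapping[part_no]


# Sample data modeled on the two lines shown in the question.
sample = [
    "51689299\t4719\tMedium Azure\tRiding Cycle\n",
    "51689345\t4720c01\tTrans-Clear\tWheel & Tire Assembly\n",
]
result = list(matched_lines(sample, two_way))
```

Because the mapping is two-way, both lots are reported, each with its partner part number, after reading the file only once.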
