Tell me more ×
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

I am trying to learn some re factoring techniques on my own and wanted a few exmaples. I am given the following code:

import csv
from xml.etree.ElementTree import Element, SubElement, Comment, tostring
from xml.etree.ElementTree import ElementTree
import xml.etree.ElementTree as etree

def main():
    reader = read_csv()
    generate_xml(reader)

def read_csv():
    with open ('1250_12.csv', 'r') as data:
        return list(csv.reader(data))

def generate_xml(reader):
    root = Element('Solution')
    root.set('version','1.0')
    tree = ElementTree(root)

    head = SubElement(root, 'DrillHoles')
    head.set('total_holes', '238')

    description = SubElement(head,'description')
    current_group = None
    i = 0
    for row in reader:
        if i > 0:
            x1,y1,z1,x2,y2,z2,cost = row
            if current_group is None or i != current_group.text:
                current_group = SubElement(description, 'hole',{'hole_id':"%s"%i})

                collar = SubElement (current_group, 'collar',{'':', '.join((x1,y1,z1))}),
                toe = SubElement (current_group, 'toe',{'':', '.join((x2,y2,z2))})                                       
                cost = SubElement(current_group, 'cost',{'':cost})
        i+=1

    def indent(elem, level=0):
        i = "\n" + level*"  "
        if len(elem):
            if not elem.text or not elem.text.strip():
                elem.text = i + "  "
            if not elem.tail or not elem.tail.strip():
                elem.tail = i
            for elem in elem:
                indent(elem, level+1)
            if not elem.tail or not elem.tail.strip():
                elem.tail = i
        else:
            if level and (not elem.tail or not elem.tail.strip()):
                elem.tail = i
    indent(root)
    tree.write(open('holes1.xml','w'))

As you can see this code reads in data from csv file and generates an xml file with respects to the csv data. Can you guys spot any parts where re-factoring can make this code more efficient?

share|improve this question
1  
one note : you have a for elem in elem that reassigns elem to the last element of elem, and then you access elem again. Not sure if that's what you want to do or not, but anyway that's confusing. – njzk2 Jun 6 at 13:29
add comment (requires an account with 50 reputation)

2 Answers

Minor stuff:

  • Remove the last import - etree is not used anywhere.
  • Merge the two first imports

Possibly speed-improving stuff:

  • Avoid converting the csv.reader output before returning unless absolutely necessary.
  • Skip indent unless the output must be readable by a human with a non-formatting editor.
  • If you need to indent the output, existing solutions are probably very efficient.
  • Use reader.next() to skip the header line in generate_xml, then you don't need to keep checking the value of i.
share|improve this answer
add comment (requires an account with 50 reputation)

Don't use something like for elem in elem at some point with larger for loops you will miss that elem is different variable before and in/after the for loop:

for subelem in elem:
    indent(subelem, level+1)

if not subelem.tail or not elem.tail.strip():
    subelem.tail = i

Since indent(subelem...) already sets the tail, you probably do not need to do that again.

share|improve this answer
add comment (requires an account with 50 reputation)

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.