I am writing a few Python functions to parse an XML schema so that I can later check and create XML documents that follow the same pattern. Below are two functions I wrote to extract data from simpleContent and simpleType elements. After writing them, the code looked pretty messy to me, and I'm sure there is a better (more Pythonic) way to write these functions, so I'm looking for any assistance. I am using lxml for access to the etree library.

def get_simple_type(element):
    simple_type = {}
    ename = element.get("name")
    simple_type[ename] = {}
    simple_type[ename]["restriction"] = element.getchildren()[0].attrib
    elements = element.getchildren()[0].getchildren()
    simple_type[ename]["elements"] = []
    for elem in elements:
        simple_type[ename]["elements"].append(elem.get("value"))
    return simple_type  

def get_simple_content(element):
    simple_content = {}
    simple_content["simpleContent"] = {}
    simple_content["simpleContent"]["extension"] = element.getchildren()[0].attrib
    simple_content["attributes"] = []
    attributes = element.getchildren()[0].getchildren()
    for attribute in attributes:
        simple_content["attributes"].append(attribute.attrib)
    return simple_content

Examples of simpleContent and simpleType elements in the schema (they will be consistently formatted, so there is no need to make the code handle the variety of ways these elements could be represented in a schema):

<xs:simpleContent>
    <xs:extension base="xs:integer">
        <xs:attribute name="sort_order" type="xs:integer" />
    </xs:extension>
</xs:simpleContent>

<xs:simpleType name="yesNoOption">
    <xs:restriction base="xs:string">
      <xs:enumeration value="yes"/>
      <xs:enumeration value="no"/>
      <xs:enumeration value="Yes"/>
      <xs:enumeration value="No"/>
    </xs:restriction>
</xs:simpleType>

The code currently creates dictionaries like those shown below, and I would like to keep that consistent:

{'attributes': [{'type': 'xs:integer', 'name': 'sort_order'}], 'simpleContent': {'extension': {'base': 'xs:integer'}}}


{'yesNoOption': {'restriction': {'base': 'xs:string'}, 'elements': ['yes', 'no', 'Yes', 'No']}}

1 Answer

Score: 3 (accepted answer)

Do more with literals: build each dictionary in a single expression and use list comprehensions instead of append loops:

def get_simple_type(element):
    return {
        element.get("name"): {
            "restriction": element.getchildren()[0].attrib,
            "elements": [e.get("value") for e in element.getchildren()[0].getchildren()]
        }
    }

def get_simple_content(element):
    return {
        "simpleContent": {
            "extension": element.getchildren()[0].attrib,
            "attributes": [a.attrib for a in element.getchildren()[0].getchildren()]
        }
    }
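One thing to watch: as written above, get_simple_content nests "attributes" inside "simpleContent", whereas the dictionaries shown in the question keep "attributes" at the top level. The following is a minimal sketch of a variant that preserves the question's layout, not a definitive implementation. It indexes and iterates the element directly rather than calling getchildren(), which lxml marks as deprecated and which the standard library's ElementTree removed in Python 3.9. It uses the standard library parser so it runs without extra installs; lxml.etree should accept the same calls. The wrapping xs:schema root, the complexType named "example", and the dict(...) conversion of attrib are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET  # lxml.etree supports the same calls used below

NS = {"xs": "http://www.w3.org/2001/XMLSchema"}

# Hypothetical wrapper document: the xs:schema root and the complexType
# named "example" are assumptions so that the question's fragments parse.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="example">
    <xs:simpleContent>
      <xs:extension base="xs:integer">
        <xs:attribute name="sort_order" type="xs:integer" />
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
  <xs:simpleType name="yesNoOption">
    <xs:restriction base="xs:string">
      <xs:enumeration value="yes"/>
      <xs:enumeration value="no"/>
      <xs:enumeration value="Yes"/>
      <xs:enumeration value="No"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
"""


def get_simple_type(element):
    # The first child is assumed to be the xs:restriction element.
    restriction = element[0]
    return {
        element.get("name"): {
            "restriction": dict(restriction.attrib),
            # Iterating an element yields its direct children.
            "elements": [e.get("value") for e in restriction],
        }
    }


def get_simple_content(element):
    # The first child is assumed to be the xs:extension element.
    extension = element[0]
    return {
        "simpleContent": {"extension": dict(extension.attrib)},
        # "attributes" stays at the top level, as in the question's output.
        "attributes": [dict(a.attrib) for a in extension],
    }


root = ET.fromstring(SCHEMA)
simple_type = get_simple_type(root.find("xs:simpleType", NS))
simple_content = get_simple_content(root.find(".//xs:simpleContent", NS))
print(simple_type)
print(simple_content)
```

Binding the first child to a name (restriction, extension) also avoids looking it up twice per function. The dict(...) calls are optional: they turn the attribute mapping into a plain dictionary, which is what the sample output in the question looks like.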
Thanks, I used this solution. A few quick syntax fixes: some of the getchildren calls are missing parentheses at the end (the edits were too short for Code Review to let me make them). –  Mike J Apr 17 '12 at 16:55
 
fixed up the missing parens :) –  pjz Apr 17 '12 at 19:24
 
if you get a chance, could you also review my expansion on this here codereview.stackexchange.com/questions/10960/…? –  Mike J Apr 17 '12 at 21:15