Take the 2-minute tour ×
Stack Overflow is a question and answer site for professional and enthusiast programmers. It's 100% free.

I'm very new to python and I need to modify the

<test name="test02"></xmpp> to <test name="test03"></xmpp> 

<temp-config>QA</temp-config> to <temp-config>Prod</temp-config> 

for all 5 occurrences using python. Not sure what lib to use. Any help in this is really appreciated.

<config>
<logging></logging>
<test-mode>false</test-mode>
<test name="test02"></xmpp>
<mail></mail>
<test-system>0</test-system>
<system id="0" name="suite1" type="regression">
    <temp-config>QA</temp-config>
    <rpm>0.5</rpm>
    <cycles>3</cycles>
</system>
<system id="1" name="suite2" type="regression">
    <temp-config>QA</temp-config>
    <rpm>0.5</rpm>
    <cycles>3</cycles>
</system>
<system id="2" name="suite3" type="regression">
    <temp-config>QA</temp-config>
    <rpm>0.5</rpm>
    <cycles>3</cycles>
</system>
<system id="3" name="suite4" type="regression">
    <temp-config>QA</temp-config>
    <rpm>0.5</rpm>
    <cycles>3</cycles>
</system>
<system id="4" name="suite5" type="regression">
    <temp-config>QA</temp-config>
    <rpm>0.5</rpm>
    <cycles>3</cycles>
</system>
</config>
share|improve this question

closed as not a real question by Lev Levitsky, Andy Hayden, ElYusubov, Igy, Sudarshan Jan 29 '13 at 5:32

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center. If this question can be reworded to fit the rules in the help center, please edit the question.

4 Answers 4

up vote 1 down vote accepted

Use lxml. This example uses lxml.etree and would actually fail on your example xml, as it has some unclosed tags in it. If you have the same problems with the real-world data that you are going to parse, use lxml.html, that can handle broken xml (instructions added to the code as comments).

In [14]: import lxml.etree as et  # for broken xml add an import:
                                  # import lxml.html as lh

In [15]: doc = et.fromstring(xmlstr)  # for broken xml replace this line with:
                                      # doc = lh.fromstring(xmlstr)

                                      # if you read xml from a file:
                                      # doc = et.parse('file_path')

In [16]: for elem in doc.xpath('.//temp-config'):
    ...:     elem.text = 'Prod'
    ...:     

In [17]: print et.tostring(doc,pretty_print=True)
<config>
  <logging/>
  <test-mode>false</test-mode>
  <test name="test02">
    <mail/>
    <test-system>0</test-system>
    <system id="0" name="suite1" type="regression">
      <temp-config>Prod</temp-config>
      <rpm>0.5</rpm>
      <cycles>3</cycles>
    </system>
    <system id="1" name="suite2" type="regression">
      <temp-config>Prod</temp-config>
      <rpm>0.5</rpm>
      <cycles>3</cycles>
    </system>
    <system id="2" name="suite3" type="regression">
      <temp-config>Prod</temp-config>
      <rpm>0.5</rpm>
      <cycles>3</cycles>
    </system>
    <system id="3" name="suite4" type="regression">
      <temp-config>Prod</temp-config>
      <rpm>0.5</rpm>
      <cycles>3</cycles>
    </system>
    <system id="4" name="suite5" type="regression">
      <temp-config>Prod</temp-config>
      <rpm>0.5</rpm>
      <cycles>3</cycles>
    </system>
  </test>
</config>

Note: As pointed out by others, you have some less powerful alternatives in the standard library. For simple tasks as this they may be well suited, however, if your xml files are broken, parsing them with standard library tools equals wasting your time.

share|improve this answer

ElementTree is a great choice -- pure Python and included in the standard library, so it's the most portable option. However, I always go straight to lxml -- it has the same API, it's just faster and it can do a lot more (since it is effectively a wrapper around libxml2).

from lxml import etree
tree = etree.parse(path_to_my_xml)

for elem in tree.findall('.//test'):
    assert elem.attrib['name'] == 'test02'
    elem.attrib['name'] == 'test03'

for elem in tree.findall('.//temp-config'):
    assert elem.text == 'QA'
    elem.text = 'Prod'

with open(path_to_output_file, 'w') as file_handle:
    file_handle.write(etree.tostring(tree, pretty_print=True, encoding='utf8'))
share|improve this answer
    
Sorry I get the following error: Traceback (most recent call last): File "testrun.py", line 13, in <module> with open(tree, 'w') as file_handle: TypeError: coercing to Unicode: need string or buffer, lxml.etree._ElementTree found –  DigitalDyn Jan 29 '13 at 14:53
    
@DigitalDyn - with open(tree, 'w') as file_handle: is not part of what I posted. i think you may have made a copy-and-paste error :) –  simon Jan 29 '13 at 16:53

i suggest to use ElementTree: http://docs.python.org/2/library/xml.etree.elementtree.html

example:

for atype in e.findall('type')
 print(atype.get('foobar'))

or see this thread: easiest way to parse xml in python

share|improve this answer

To complete the answers above, using lxml, here's how you'd change the 'name' attribute value:

from lxml import etree
tree = etree.parse(path_to_my_xml)
for elem in tree.xpath('.//temp-config'):
    elem.text = 'Prod'
for elem in tree.xpath(".//test[@name='test02']"):
    elem.attrib['name'] = 'test03'
share|improve this answer

Not the answer you're looking for? Browse other questions tagged or ask your own question.