0

I'm very new to Python, only read the Learn Python the Hard Way. But I think this is still way out of my scope. My skills are in XML/XSL, not Python. I need a little help to get started.

Overview: I need to add missing XML data (addition.xml) into a existing XML file (original.xml).

XML file (with the data that is missing): (addition.xml)

<profile>
    <dog-list>
        <dog>
            <name>sally</dog>
            <age>1</age>
        </dog>
        <dog>
            <name>susie</dog>
            <age>12</age>
        </dog>
    </dog-list>
    <people-list>
        <person>
            <name>ue</name>
            <age>25</age>
            <gender>female</gender>
        </person>
    </people-list>
</profile>

XML data above adds to this XML file: (original.xml)

<profile>
    <cat-list>
        <cat>
            <name>foo></name>
        </cat>
        <cat>
            <name>bar</name>
            <age>3</age>
        </cat>
    </cat-list>
    <bird-list>
        <bird>
            <name>cricket</name>
            <age>2</age>
        </bird>
    </bird-list>
    <people-list>
        <person>
            <name>tyler</name>
            <age>26</age>
        </person>
    </people-list>
    <car-list>
        <car>
            <make>mitsubishi</make>
            <model>evo x</model>
            <year>2013</year>
        </car>
    </car-list>
</profile>

My expected output should be: --> the new (original.xml)

<profile>
    <cat-list>
        <cat>
            <name>foo></name>
        </cat>
        <cat>
            <name>bar</name>
            <age>3</age>
        </cat>
    </cat-list>
    <dog-list>
        <dog>
            <name>sally</dog>
            <age>1</age>
        </dog>
        <dog>
            <name>susie</dog>
            <age>12</age>
        </dog>
    </dog-list>
    <bird-list>
        <bird>
            <name>cricket</name>
            <age>2</age>
        </bird>
    </bird-list>
    <people-list>
        <person>
            <name>tyler</name>
            <age>26</age>
        </person>
        <person>
            <name>ue</name>
            <age>25</age>
            <gender>female</gender>
        </person>
    </people-list>
    <car-list>
        <car>
            <make>mitsubishi</make>
            <model>evo x</model>
            <year>2013</year>
        </car>
    </car-list>
</profile>

What happens here is that data from the addition.xml is missing from the original.xml file. How do I go about adding data from addition.xml into the original.xml instead of creating a new file, overwriting it.

I look all over google and stackoverflow. I know that I could use ElementTree but I have the foggiest idea how to create this result.

Any help is greatly appreciated!

6
  • Your xml data is not valid: watch mismatched opening and closing tags.
    – alecxe
    Jan 15, 2014 at 22:21
  • Fixed. Sorry, I typed that out, and missed a closing tag </bird-list>
    – misterbear
    Jan 15, 2014 at 22:23
  • Is the order of elements relevant? Does <dog-list> have to come after <cat-list>?
    – Robᵩ
    Jan 15, 2014 at 22:58
  • Yes it does, I finally found something that was really close here. But it's not breaking into a new line, but rather clustering the line side by side.
    – misterbear
    Jan 15, 2014 at 23:05
  • And it doesn't overwrite to the file or create a new file. It just prints it in terminal.
    – misterbear
    Jan 15, 2014 at 23:13

1 Answer 1

1

Your requirements don't allow a general-purpose merge program (like the one you link to), but here is a program that might work for you.

Usage: ./program.py original.xml addition.xml

#! /usr/bin/python2

import sys
from lxml import etree

result = etree.Element('root')
parser = etree.XMLParser(remove_blank_text=True)

# Add each file to the tree
for xmlfile in sys.argv[1:]:
  with open(xmlfile) as xmlfile:
    btree = etree.parse(xmlfile, parser)
  # Ensure that the resulting tree has the right root
  result.tag = btree.getroot().tag
  # Consider each 2nd-level item
  for bchild in btree.xpath("/*/*"):
    tags = result.xpath("./%s"%bchild.tag)
    if len(tags) == 0:
      # Add <dog-list>, for example
      #print "adding %s to %s"%(bchild.tag, result.tag)
      result.append(bchild)
    else:
      for bgrandchild in bchild:
        # add <dog>, for example
        #print "adding %s to %s"%(bgrandchild.tag, tags[0].tag)
        tags[0].append(bgrandchild)

with open("output.xml", "w") as output:
  output.write(etree.tostring(result, pretty_print = True))

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Not the answer you're looking for? Browse other questions tagged or ask your own question.