xml1.xml

<app>
    <bbb>
        <jjj>test1</jjj>
     </bbb>
     <bbb>   
        <jjj>test2</jjj>
    </bbb>
</app>

xml2.xml

<app>
    <bbb>   
       <jjj>test2</jjj>
    </bbb>
    <bbb>
        <jjj>test3</jjj>
    </bbb>
    <bbb>
        <jjj>test4</jjj>
    </bbb>
</app>

Can I combine the two files into one file, as below?

<app>
     <bbb>
        <jjj>test1</jjj>
     </bbb>
    <bbb>   
       <jjj>test2</jjj>
    </bbb>
    <bbb>
        <jjj>test3</jjj>
    </bbb>
    <bbb>
        <jjj>test4</jjj>
    </bbb>
</app>
    
Possible duplicate of stackoverflow.com/questions/10163675/merge-xml-files-in-php – David King Dec 7 '15 at 15:56
    
@DavidKing questions/answers on other sites don't count as duplicates. If the answer there is helpful, copy it and post it here. – terdon Dec 7 '15 at 16:01

Adapted from http://stackoverflow.com/questions/10163675/merge-xml-files-in-php

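<?php
$doc1 = new DOMDocument();
$doc1->load('xml1.xml');

$doc2 = new DOMDocument();
$doc2->load('xml2.xml');

// get the 'app' element of document 1
$app1 = $doc1->getElementsByTagName('app')->item(0);

// record the <jjj> values document 1 already has, so that shared
// entries (test2 in the question) are not appended a second time
$seen = [];
foreach ($doc1->getElementsByTagName('jjj') as $jjj) {
    $seen[$jjj->textContent] = true;
}

// iterate over the 'bbb' elements of document 2
foreach ($doc2->getElementsByTagName('bbb') as $item2) {
    $jjj = $item2->getElementsByTagName('jjj')->item(0);
    if ($jjj !== null && isset($seen[$jjj->textContent])) {
        continue; // already present in document 1
    }

    // import/copy the item from document 2 into document 1
    $item1 = $doc1->importNode($item2, true);

    // append the imported item to the 'app' element of document 1
    $app1->appendChild($item1);
}
$doc1->save('merged.xml');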

It looks like you can merge the files with sort -m and then prune the result. With -m, sort does not actually sort anything: it assumes each input is already in order and makes a single pass over two or more inputs, interleaving their lines wherever the lexicographic order of the current lines allows.
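To make the "single pass, no re-sorting" point concrete, here is a rough Python sketch of that kind of merge. It is only an illustration: the helper name merge_presorted is made up, and Python compares strings by code point, whereas GNU sort compares according to the current locale, so the exact interleaving for the XML files can differ.

```python
import heapq

def merge_presorted(*inputs):
    """Single-pass merge: repeatedly emit the smallest line currently at the
    head of any input. Nothing is re-sorted, so out-of-order input stays so."""
    return list(heapq.merge(*inputs))

# Both inputs sorted: the result is fully sorted.
print(merge_presorted(["a", "c"], ["b", "d"]))   # ['a', 'b', 'c', 'd']

# First input NOT sorted: the merge only ever compares the current heads.
print(merge_presorted(["b", "a"], ["c"]))        # ['b', 'a', 'c']
```

The second call is the situation with the XML files: the inputs are not sorted, so the lines of each file keep their relative order and are merely interleaved.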

Here's what GNU sort -m (--merge) prints for your example:


<app>
<app>
    <bbb>
    <bbb>
        <jjj>test1</jjj>
     </bbb>
     <bbb>
       <jjj>test2</jjj>
    </bbb>
    <bbb>
        <jjj>test2</jjj>
    </bbb>
</app>
        <jjj>test3</jjj>
    </bbb>
    <bbb>
        <jjj>test4</jjj>
    </bbb>
</app>

So at least it's all folded in now but, like I said, you still have to prune it. This sed script will do it for your examples:

sort    -m      /tmp/xml[12]                    |
sed     -ne:n   -e'$!s|/a..> *$|bbb>|;$p'       \
                -e'\|^[^>]*b.*\n|{N;P;D;}'      \
-eN     -e's|\(.*\)\n\(.*\n\)* *\1 *$|\1|'      \
        -e's|\n|&|3;tD' -ebn -e:D -eP\;D

It just ensures it has at least three lines stacked as it works through the input, and it compares the first line in the stack against the last when the first line isn't a <bbb> tag.


<app>
    <bbb>
        <jjj>test1</jjj>
     </bbb>
     <bbb>   
       <jjj>test2</jjj>
    </bbb>
<bbb>
        <jjj>test3</jjj>
    </bbb>
    <bbb>
        <jjj>test4</jjj>
    </bbb>
</app>

You can't do this reliably with plain shell tools on Linux: to handle XML, you really need an XML parser.

However, plenty of scripting languages have one available. My personal favourite is Perl with the XML::Twig library. (It is very commonly available in Unix package managers, despite not being part of 'core'.)

#!/usr/bin/env perl

use strict;
use warnings;

use XML::Twig;

#load both
my $first  = XML::Twig->new->parsefile('xml1.xml');
my $second = XML::Twig->new->parsefile('xml2.xml');

#iterate bbb elements in second file
foreach my $bbb ( $second->get_xpath('//bbb') ) {

    #extract 'text' of jjj element (of this bbb element)
    my $jjj = $bbb->first_child_text('jjj');

    #use xpath query to check it doesn't exist first.
    if ( not $first->get_xpath("//bbb/jjj[string()='$jjj']") ) {
        print $jjj, " not in first, splicing\n";

        #cut/paste (note -  done in memory, so original file isn't altered)
        $bbb->move( 'last_child', $first->root );
    }
}

#set output formatting - can do some odd things with particularly strange XML.
$first->set_pretty_print('indented_a');
$first->print;

## if you want to save it:
open( my $output, '>', "combined.xml" ) or die $!;
print {$output} $first->sprint;
close($output);
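If Perl isn't an option, the same idea can be sketched with nothing but Python's standard library (xml.etree.ElementTree). This is a minimal sketch, not a drop-in replacement: the function name merge_bbb is my own, it takes the XML as strings, and it assumes the layout from the question, i.e. <bbb> elements directly under the root, each identified by the text of its <jjj> child.

```python
import xml.etree.ElementTree as ET

def merge_bbb(first_xml, second_xml):
    """Append each top-level <bbb> of second_xml to the root of first_xml,
    skipping any whose <jjj> text already exists. Returns the merged root."""
    root1 = ET.fromstring(first_xml)
    root2 = ET.fromstring(second_xml)

    # <jjj> values already present in the first document
    seen = {bbb.findtext("jjj") for bbb in root1.findall("bbb")}

    for bbb in root2.findall("bbb"):
        key = bbb.findtext("jjj")      # None if this <bbb> has no <jjj>
        if key is None or key not in seen:
            seen.add(key)
            root1.append(bbb)
    return root1

if __name__ == "__main__":
    xml1 = "<app><bbb><jjj>test1</jjj></bbb><bbb><jjj>test2</jjj></bbb></app>"
    xml2 = ("<app><bbb><jjj>test2</jjj></bbb><bbb><jjj>test3</jjj></bbb>"
            "<bbb><jjj>test4</jjj></bbb></app>")
    merged = merge_bbb(xml1, xml2)
    if hasattr(ET, "indent"):          # ET.indent exists from Python 3.9
        ET.indent(merged)
    print(ET.tostring(merged, encoding="unicode"))
```

To work on the actual files, read them with ET.parse('xml1.xml').getroot() instead of ET.fromstring, and write the result with ET.ElementTree(merged).write('combined.xml').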