I have an XML file containing multiple g:gtin nodes, and I would like to empty each of them, i.e. replace everything between the opening and closing tags with nothing.

Is this possible from the command line, using sed or something similar?

<g:gtin>31806831001</g:gtin>

Is the whole tag always in a single line? Is there always only one such tag per line? For xml, which often spans across multiple lines, xmlstarlet is often a better alternative. – user unknown Apr 20 at 15:16

2 Answers

A simple solution for simple cases - see my comment:

echo "<g:gtin>31806831001</g:gtin>" | sed 's|<g:gtin>.*</g:gtin>|<g:gtin></g:gtin>|'

Result:

<g:gtin></g:gtin>

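It relies on the assumption that the start and end tags are on the same line, and that there is no more than one g:gtin element on that line. Because .* is greedy, two elements on one line would be matched as a single span and collapsed into one.

Since XML files are often generated the same way, over and over again, the assumption might hold.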
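
If several g:gtin elements can share a line, a variant along the following lines may be safer. This is only a sketch, assuming the element contains plain text with no nested tags: [^<]* cannot match past the next tag, and the g flag repeats the substitution for every match on the line. The two-element sample input is made up for illustration.

```shell
# Sample line with two elements (made-up input).
# [^<]* stops at the next "<", so each element is matched separately;
# the trailing g applies the substitution to every match on the line.
result=$(printf '%s\n' '<g:gtin>31806831001</g:gtin><g:gtin>42</g:gtin>' |
  sed 's|<g:gtin>[^<]*</g:gtin>|<g:gtin></g:gtin>|g')
printf '%s\n' "$result"
```

With the original .* pattern, the same input would be reduced to a single empty element.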

Don't I have to pass the filename to sed though? Say I'm in a directory called /tests and I have a file called feed.xml; surely I need to say replace all instances of <g:gtin>(.*?)</g:gtin> with <g:gtin></g:gtin>, or something to that effect? – crmpicco Apr 20 at 15:21
Yes. You either pipe the output of a command through sed, or specify a file to work on. sed 'sedcommands' feed.xml would be your option. If you have multiple sed commands, you can put them into a file too: sed -f commands.sed feed.xml. There are many options. An -i flag: sed -i 'commands' feed.xml would change your file in place. – user unknown Apr 20 at 15:36
@crmpicco: The question mark after .* is superfluous. .* already matches zero or more characters, so making it optional is meaningless. – user unknown Apr 20 at 15:39
@crmpicco may be trying to use a lazy quantifier, which the portable POSIX sed specification does not provide. In any case, sed and regular expressions are not the most robust way to handle XML. – jw013 Apr 20 at 16:52
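
To make the file-based usage from these comments concrete, here is a small sketch. feed.xml is the file name mentioned above; its three-line contents, the temporary directory, and the output name feed.cleaned.xml are made up for illustration. [^<]* stands in for the lazy .*? that POSIX sed lacks, and the result is written to a new file rather than with -i, since the in-place flag is not part of POSIX sed and behaves differently between implementations.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a tiny sample feed (hypothetical contents, for illustration only).
printf '%s\n' '<item>' '<g:gtin>31806831001</g:gtin>' '</item>' > feed.xml

# Read feed.xml, empty each g:gtin element, and write the result to a new file.
# [^<]* matches up to the next tag, which is what the lazy .*? was meant to do.
sed 's|<g:gtin>[^<]*</g:gtin>|<g:gtin></g:gtin>|g' feed.xml > feed.cleaned.xml

cat feed.cleaned.xml
```

Once the new file looks right, it can be moved over the original with mv.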

One way using perl:

Content of script.pl:

use warnings;
use strict;
use XML::Twig;

die qq[Usage: perl $0 <xml-file>\n] unless @ARGV == 1;

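my $twig = XML::Twig->new(
    # Only build element trees for g:gtin and hand each one to the handler.
    twig_roots => {
        q[g:gtin] => \&handle_gtin,
    },
    # Print everything outside the g:gtin elements unchanged.
    twig_print_outside_roots => 1,
);

# Parse the file named on the command line; the result goes to standard output.
$twig->parsefile( shift );

# Called for each g:gtin element: empty its text, then print the element.
sub handle_gtin {
    my ($t, $gtin) = @_;
    $gtin->set_text( q[] );
    $gtin->print;
}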

Run it like:

perl script.pl file.xml