
I have an XML file (say input.xml) with the following structure:

<?xml version="1.0"?>
  <TagA>
    <TagB>
      <File Folder="FOLDER1M\1" File="R1.txt" />
    </TagB>
    <TagB>
      <File Folder="FOLDER1M\2" File="R2.txt" />
    </TagB>
    <TagB>
      <File Folder="FOLDER2M\1" File="R3.txt" />
    </TagB>
  </TagA>

I need to parse this file and write the output to another file. The required output should be of the following form:

www.xyz.com\FOLDER1M\1\R1.txt
www.xyz.com\FOLDER1M\2\R2.txt
www.xyz.com\FOLDER2M\1\R3.txt

What I have got so far is:

echo 'cat /TagA/TagB/File/@*[name()="Folder" or name()="File"]' | xmllint --shell input.xml | grep '=' > xml_parsed

This gives me output of the form:

/ > cat /TagA/TagB/File/@*[name()="Folder" or name()="File"]
Folder="FOLDER1M\1"
File="R1.txt"
Folder="FOLDER1M\2"
File="R2.txt"
Folder="FOLDER2M\3"
File="R3.txt"

How can I get the required output instead of this current output?

What programming languages are you familiar with? Are you trying to solve this using only bash or some other shell? – slm Apr 16 at 21:48
I'm trying to get it done in bash. It's part of the overall automation I'm trying to achieve using bash scripting. – NGambit Apr 16 at 21:49

1 Answer


Here's one way to do it. I put your output into a file called sample.txt to make it easier to test. In your case you can simply append my commands to the end of your echo command instead of reading from a file:

sample.txt

Folder="FOLDER1M\1"
File="R1.txt"
Folder="FOLDER1M\2"
File="R2.txt"
Folder="FOLDER2M\3"
File="R3.txt"

command

% cat sample.txt | sed 'h;s/.*//;G;N;s/\n//g' | sed 's/Folder=\|"//g' | sed 's/File=/\\/' | sed 's/^/www.xyz.com\\/'

Breakdown of the command

join every 2 lines together

# sed 'h;s/.*//;G;N;s/\n//g'
Folder="FOLDER1M\1"File="R1.txt"
Folder="FOLDER1M\2"File="R2.txt"
Folder="FOLDER2M\3"File="R3.txt"
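
As far as I can tell, the same pairing can be had with a shorter sed program: N appends the next input line to the pattern space, and s/\n// then removes the newline between the two. A minimal sketch, assuming the input always has an even number of lines (join_pairs is just a name I made up for illustration):

```shell
# join_pairs: hypothetical helper, not part of the original command.
# N pulls the next line into the pattern space, separated by a newline;
# s/\n// deletes that newline, so lines 1+2, 3+4, ... become single lines.
join_pairs() {
    sed 'N;s/\n//' "$@"
}

# Usage: join_pairs sample.txt
```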

strip out Folder= and the double quotes

# sed 's/Folder=\|"//g'
FOLDER1M\1File=R1.txt
FOLDER1M\2File=R2.txt
FOLDER2M\3File=R3.txt

Replace File= with a '\'

# sed 's/File=/\\/'
FOLDER1M\1\R1.txt
FOLDER1M\2\R2.txt
FOLDER2M\3\R3.txt

insert www.xyz.com

# sed 's/^/www.xyz.com\\/'
www.xyz.com\FOLDER1M\1\R1.txt
www.xyz.com\FOLDER1M\2\R2.txt
www.xyz.com\FOLDER2M\3\R3.txt
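
The four steps above can probably be folded into a single sed invocation. This is only a sketch under the same assumptions as before (Folder= and File= lines strictly alternate). The function name to_urls is made up, and I split the Folder=/quote removal into two substitutions to avoid \|, which is a GNU sed extension:

```shell
# to_urls: hypothetical helper combining the four steps in one sed call.
to_urls() {
    # 1. join each pair of lines into one
    # 2. drop 'Folder=' and every double quote
    # 3. turn 'File=' into a backslash separator
    # 4. prefix the host name and a backslash
    sed -e 'N;s/\n//' \
        -e 's/Folder=//;s/"//g' \
        -e 's/File=/\\/' \
        -e 's/^/www.xyz.com\\/' "$@"
}

# Usage: to_urls sample.txt
```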

EDIT #1

The OP updated his question asking how to modify my answer to delete the first line of output, for example:

/ > cat /TagA/TagB/File/@*[name()="Folder" or name()="File"]
...
...

I mentioned to him that you can use grep -v ... to filter out lines that aren't relevant like so:

% cat sample.txt | grep -v "/ >" | sed 'h;s/.*//;G;N;s/\n//g' | sed 's/Folder=\|"//g' | sed 's/File=/\\/' | sed 's/^/www.xyz.com\\/'
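
Instead of filtering out what you don't want, you could also keep only the lines you do want. A rough sketch, assuming the attribute lines start at column one exactly as shown in the question (keep_attr_lines is a name I invented for this example):

```shell
# keep_attr_lines: hypothetical helper that keeps only lines beginning
# with 'Folder=' or 'File=' and drops everything else, including the
# '/ > cat ...' prompt line echoed by the xmllint shell.
keep_attr_lines() {
    grep -E '^(Folder|File)=' "$@"
}

# Usage: keep_attr_lines sample.txt
```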

Additionally, to write the whole result out to a file, redirect the output like so:

% cat sample.txt | grep -v "/ >" | sed 'h;s/.*//;G;N;s/\n//g' | sed 's/Folder=\|"//g' | sed 's/File=/\\/' | sed 's/^/www.xyz.com\\/' > /path/to/some/file.txt
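
If the XML is always laid out like the sample in the question, you might be able to skip xmllint and pull both attributes straight from input.xml with one sed substitution. This is not real XML parsing, just a sketch: it assumes each File element sits on a single line with Folder before File, and xml_to_urls is a made-up name.

```shell
# xml_to_urls: hypothetical helper; regex-based, so it only works when
# every <File .../> element is on one line with Folder="..." first.
xml_to_urls() {
    # \1 captures the Folder value, \2 the File value; each \\ in the
    # replacement emits one literal backslash. -n with the p flag prints
    # only the lines where the substitution matched.
    sed -n 's/.*<File Folder="\([^"]*\)" File="\([^"]*\)".*/www.xyz.com\\\1\\\2/p' "$@"
}

# Usage: xml_to_urls input.xml > /path/to/some/file.txt
```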
Why the down vote? – slm Apr 16 at 23:04
My question exactly. The person who voted down should at least give a reason. – NGambit Apr 16 at 23:08
