This is my sample file:

<?xml version="1.0" encoding="UTF-8" ?>
 <testjar>
 <testable>
  <trigger>Trigger1</trigger>
  <message>2012-06-14T00:03.54</message>
 <sales-info>
  <san-a>no</san-a>
  <san-b>no</san-b>
  <san-c>no</san-c>
  </sales-info>
  </testable>
  </testjar>

I need to extract the XML tag names from this file.

For example, the output for the file above should be:

testjar
testable
trigger
message
sales-info
....

2 Answers

$> cat ./text
<?xml version="1.0" encoding="UTF-8" ?>
 <testjar>
 <testable>
  <trigger>Trigger1</trigger>
  <message>2012-06-14T00:03.54</message>
 <sales-info>
  <san-a>no</san-a>
  <san-b>no</san-b>
  <san-c>no</san-c>
  </sales-info>
  </testable>
  </testjar>

And

$> grep -P -o "(?<=\<)[^>?/]*(?=\>)" ./text
testjar
testable
trigger
message
sales-info
san-a
san-b
san-c

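The regular expression (?<=\<)[^>?/]*(?=\>) consists of three parts:

  • (?<=\<): (?<=...) is a lookbehind assertion, so this means "preceded by <";

  • [^>?/]*: zero or more characters other than >, ? and /, which is what skips closing tags and the <?xml ... ?> declaration;

  • (?=\>): (?=...) is a lookahead assertion, so this means "followed by >".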
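One limitation of the pattern above is that it prints everything between < and >, so a tag such as <a id="1"> comes out as a id="1", and a self-closing tag such as <b-c/> is skipped because it contains /. The following is a minimal sketch of a variant that keeps only the lookbehind and then matches just the name. The function name xml_tag_names is hypothetical (it is not from the answer), and the sketch assumes GNU grep built with PCRE support (-P) and tag names that start with an ASCII letter or underscore.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the original answer.
# Prints the name of every opening (or self-closing) tag, one per line.
# Assumes GNU grep with -P (PCRE) support.
xml_tag_names() {
    # (?<=<)            lookbehind: the match must directly follow "<"
    # [A-Za-z_][\w.-]*  a name start character, then name characters;
    #                   "/" and "?" cannot start a name, so closing tags
    #                   and the <?xml ... ?> declaration never match
    grep -oP '(?<=<)[A-Za-z_][\w.-]*' "$@"
}

# Example: attributes and a self-closing tag, all on one line.
printf '%s\n' '<a id="1"><b-c/><d>text</d></a>' | xml_tag_names
# prints:
# a
# b-c
# d
```

With no arguments the function reads standard input; with a file name, for example xml_tag_names ./text, it reads that file. Piping the result through sort -u would give each tag name only once, if that is what is wanted.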

Can you explain what the '>' at the start of the command means, i.e. in $> cat? Please. And why do you use ./, which means execution, afaik? – caligula Jul 25 '12 at 8:39
$> is my bash PS1 (the prompt). It indicates that the following line is a shell command. ./ means "current directory", so ./text is the file named text in the current directory. – ДМИТРИЙ МАЛИКОВ Jul 25 '12 at 9:07
working fine for me. thanks :) – Pravin Satav Jul 25 '12 at 9:12
Thanks, understood – caligula Jul 25 '12 at 9:16
I have also checked the case where the file comes as a single line; it works perfectly fine – Pravin Satav Jul 25 '12 at 9:18
awk -F">" '{print $1}' xmlfile | sed -e '/<\//d' -e '/<?/d' -e 's/<//g'
If the file comes as a single line, then this does not produce any output. – Pravin Satav Jul 25 '12 at 9:17
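The awk/sed pipeline is line-oriented: with everything on one line, the first field is the <?xml ... ?> declaration, which sed then deletes, so nothing is printed. One possible way around this, sketched below, is to let awk split records on < instead of on newlines. The function name xml_tag_names_awk is hypothetical, and the sketch assumes well-formed input in which only whitespace precedes the first <.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the original answer.
# Splits the input into records at every "<", so line breaks do not matter.
xml_tag_names_awk() {
    awk 'BEGIN { RS = "<" }
         # Keep records that begin with a name start character; this skips
         # closing tags ("/..."), declarations ("?...") and comments ("!...").
         /^[A-Za-z_]/ {
             # The tag name ends at the first space, tab, newline, "/" or ">".
             split($0, parts, "[ \t\n/>]")
             print parts[1]
         }' "$@"
}

# Example: the whole document on a single line.
printf '%s' '<?xml version="1.0" ?><testjar><testable><trigger>Trigger1</trigger></testable></testjar>' | xml_tag_names_awk
# prints:
# testjar
# testable
# trigger
```

Because the record separator is a single character, this should behave the same in gawk, mawk and other POSIX awk implementations.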
