$ echo '<a href="mailto:NA?Subject=AB42525216 - FOOBAR bla bla - bla">NA</a>'
<a href="mailto:NA?Subject=AB42525216 - FOOBAR bla bla - bla">NA</a>
$ echo '<a href="mailto:NA?Subject=AB42525216 - FOOBAR bla bla - bla">NA</a>' | sed 's/SOMEMAGIC/NA/g'
NA

My question: what should replace SOMEMAGIC so that sed strips everything from the echoed string except the link text (NA here)? I think > could serve as the delimiter.

    
What part of string do you want to remove? –  Gnouc May 30 at 8:36
    
Whoops :) I updated it! Thanks. –  evachristine May 30 at 8:52
    
Usually it's better to use a parser to process HTML. –  choroba May 30 at 8:56
    
sed 's/.*>\(.*\)<.*/\1/' would work on that particular input. But so would cut -d \> -f2- | cut -d \< -f1 –  Stéphane Chazelas May 30 at 9:12
    
@evachristine: Do you mean replace the whole line by NA? –  Gnouc May 30 at 9:16

1 Answer

I think you are trying to extract the text inside the <a href>xx</a> tag. If so, your command should look like this:

GNU sed:

sed -r 's/^<a [^>]*>([^<]*)<.*$/\1/g' file

Traditional sed:

sed 's/^<a [^>]*>\([^<]*\)<.*$/\1/g' file

Example:

$ echo '<a href="mailto:NA?Subject=AB42525216 - FOOBAR bla bla - bla">NA</a>' | sed -r 's/^<a [^>]*>([^<]*)<.*$/\1/g'
NA

$ echo '<a href="mailto:NA?Subject=AB42525216 - FOOBAR bla bla - bla">NA</a>' | sed 's/^<a [^>]*>\([^<]*\)<.*$/\1/g'
NA

$ echo '<a href="mailto:NA?Subject=AB42525216 - FOOBAR bla bla - bla">fooooooooooooooobaaaaaaaaaar</a>' | sed 's/^<a [^>]*>\([^<]*\)<.*$/\1/g'
fooooooooooooooobaaaaaaaaaar
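
If the input may also contain lines without a link, or links preceded by whitespace, a variation along these lines might help. This is only a sketch: the function name link_text is made up for illustration, the leading ^ anchor is dropped so the tag may start anywhere on the line, and sed -n with the p flag prints only lines where the substitution succeeded. Because the leading .* is greedy, a line with several links yields the text of the last one.

```shell
# link_text: hypothetical helper name, not part of the original answer.
# Prints the text between an <a ...> tag and the next '<' on each line.
# -n plus the trailing p flag suppresses lines where nothing matched.
# With several links on one line, the greedy leading .* selects the last.
link_text() {
    sed -n 's/.*<a [^>]*>\([^<]*\)<.*/\1/p' "$@"
}

printf '%s\n' \
    '<a href="mailto:NA?Subject=AB42525216 - FOOBAR bla bla - bla">NA</a>' \
    'a line without a link' \
    '  <a href="x">indented</a>' | link_text
```

With the three sample lines above this should print NA and indented, skipping the middle line. As noted in the comments, a real HTML parser is usually the safer choice for anything beyond simple one-tag-per-line input.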