I'm sorry, I can't believe this question hasn't already been answered on Stack Overflow, but I've searched a lot and can't find a solution.

I want to change HTML code with regular expressions in this way:

testing <a href="url">anchor</a>

to

testing anchor

I only want to unlink the text, without using DOM functions. The code is in a string, not in the document, and I don't want to remove any tags other than the a ones.

2  
There's a reason why it's not solved... –  Juhana May 24 '13 at 11:17
    
Also, just because the HTML is not in the DOM doesn't mean you couldn't parse it. –  Juhana May 24 '13 at 11:18
    
The HTML code is in a string, it's not visible in the window. I just want to parse it with a regular expression. –  Oscardrbcn May 24 '13 at 11:25
    
Right. Having the HTML in a string doesn't prevent you from using DOM methods. –  Juhana May 24 '13 at 12:20

4 Answers

up vote 5 down vote accepted

If you really don't want to use DOM functions (why?), you could do

str = str.replace(/<[^>]*>/g, '')

You can use it if you're fairly confident your HTML isn't more complex than that, but it will fail in many cases, for example with some nested tags or a > inside an attribute. You might fix some of these problems with more complex regular expressions, but they aren't the right tool for this job in the general case.
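To make the second failure mode concrete, here is a small sketch of what the strip-everything regex does when an attribute value contains a >. The helper name and the sample strings are made up for illustration:

```javascript
// Strip every tag with the regex above (illustrative helper name).
function stripAllTags(str) {
  return str.replace(/<[^>]*>/g, '');
}

// Simple markup: works as intended.
var simple = stripAllTags('testing <a href="url">anchor</a>');
// simple === 'testing anchor'

// A '>' inside an attribute ends the match too early,
// so part of the attribute leaks into the output.
var broken = stripAllTags('<a title="a > b">x</a>');
// broken === ' b">x'
```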

If you don't want to remove tags other than a, do this:

str = str.replace(/<\/?a( [^>]*)?>/g, '')

This changes

<a>testing</a> <a href="url"><b>a</b>nchor</a><div>test</div><aaa>E</aaa>

to

testing <b>a</b>nchor<div>test</div><aaa>E</aaa>
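For readers who want the regex spelled out, here is the same replacement wrapped in a function with each part commented. The function name and the extra test strings are only illustrative, and the last variation is an assumption of mine rather than part of the answer:

```javascript
// Hypothetical helper wrapping the regex from this answer.
function unlink(str) {
  //   <          literal opening angle bracket
  //   \/?        optional slash, so closing tags </a> match too
  //   a          the tag name
  //   ( [^>]*)?  optionally: a space, then anything up to the next '>'
  //   >          closing angle bracket
  return str.replace(/<\/?a( [^>]*)?>/g, '');
}

var r1 = unlink('testing <a href="url">anchor</a>');
// r1 === 'testing anchor'

// Other tags are kept; only the link tags are removed.
var r2 = unlink('<div><a href="url">anchor</a></div>');
// r2 === '<div>anchor</div>'

// Tag names that merely start with "a" are left alone, because the
// name must be followed directly by a space or by '>'.
var r3 = unlink('<abbr title="t">x</abbr>');
// r3 === '<abbr title="t">x</abbr>'

// Variation (not from the answer): \s also accepts a tab or newline
// after the tag name, and the i flag accepts uppercase tags.
var r4 = '<A\thref="url">x</A>'.replace(/<\/?a(\s[^>]*)?>/gi, '');
// r4 === 'x'
```

Both forms still break on a > inside an attribute value, so this is only reasonable for markup you control.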
2  
+1 Works beautifully for OP's simple use case, I think this is the simplest regex solution. OP, if you're doing anything more complicated avoid this. –  Benjamin Gruenbaum May 24 '13 at 11:23
    
Thank you very much, that's all I need. I definitely have to study a regular expressions tutorial, I don't know anything about them. It's enough, although it fails with nested tags. I can't use DOM functions (I suppose) because the code is in a string, it's not shown in the document object. –  Oscardrbcn May 24 '13 at 11:26
    
@user1901219 Is this regex clear or do you want me to explain it ? –  dystroy May 24 '13 at 11:27
    
Now that I think about it, it doesn't work for me, because I only want to remove the link tags: if I have <div><a href="url">anchor</a></div> I want the result <div>anchor</div> –  Oscardrbcn May 24 '13 at 11:34
1  
You can create a DOM object from the string and use DOM methods to parse it, without ever appending that DOM object to the document. –  andy magoon May 24 '13 at 12:06

I know you only want regex, but for future viewers, here is a trivial solution using DOM methods.

var a = document.createElement("div");
a.innerHTML = 'testing <a href="url">anchor</a>';
var wordsOnly = a.textContent || a.innerText; 

This will not fail on complicated inputs, it copes with nested tags, and it's perfectly clear what's happening:

  • Hey browser! Create an element
  • Put that HTML in it
  • Give me back just the text, that's what I want now.

NOTE:

The element we create is never added to the actual document, since we don't append it anywhere, so it stays invisible. Here is a fiddle to illustrate how this works.
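Keep in mind that textContent drops every tag, not only the links. If the goal is to unwrap just the a elements and keep the rest of the markup, the same detached-element idea can be extended. A minimal sketch, assuming a browser environment where document is available (the function name is made up for illustration):

```javascript
// Sketch: remove only <a> elements, keeping their children,
// using an element that is never attached to the page.
function unlinkWithDom(html) {
  var container = document.createElement('div');
  container.innerHTML = html;

  // querySelectorAll returns a static list, so removing nodes
  // while looping over it is safe.
  var links = container.querySelectorAll('a');
  for (var i = 0; i < links.length; i++) {
    var link = links[i];
    // Move each child of the link in front of it...
    while (link.firstChild) {
      link.parentNode.insertBefore(link.firstChild, link);
    }
    // ...then drop the now-empty link.
    link.parentNode.removeChild(link);
  }
  return container.innerHTML;
}

// Only runs where a DOM exists (for example in a browser console).
if (typeof document !== 'undefined') {
  console.log(unlinkWithDom('<div><a href="url">anchor</a></div>'));
}
```

In a browser this should log the div with the link removed and its text kept, which is the result asked for in the comments on the regex answer.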

    
Note to future readers: this is also possible if you're on Node.js or another JavaScript framework. No need to reinvent wheels most of the time. –  Benjamin Gruenbaum May 24 '13 at 11:24
1  
+1 because even while it wasn't what OP asks, it's generally a better solution. Shouldn't that be a little more complex for compatibility with IE8, like a.textContent||a.innerText ? –  dystroy May 24 '13 at 11:26

As has been mentioned, you cannot parse HTML with regular expressions. The principal reason is that HTML elements nest and regular expressions cannot handle that.

That said, with a few restrictions which I will mention, you can do the following :

string.replace(/(\b\w+\s*)<a\s+href="([^"]*)">(.*?)<\/a>/g, '$1$3')

This requires a word before the tag (spacing between the word and the tag is optional) and no attributes other than href in the <a> tag, and it accepts anything between the <a> and the </a>.
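One thing to watch for with patterns like this is whether the text between the tags is matched greedily (.*) or lazily (.*?). A small sketch with a made-up input containing two links on one line shows the difference:

```javascript
var html = 'see <a href="u1">one</a> and <a href="u2">two</a>';

// Greedy: (.*) runs to the LAST </a> on the line,
// swallowing the markup in between.
var greedy = html.replace(/<a\s+href="[^"]*">(.*)<\/a>/g, '$1');
// greedy === 'see one</a> and <a href="u2">two'

// Lazy: (.*?) stops at the FIRST </a>, so each link is handled separately.
var lazy = html.replace(/<a\s+href="[^"]*">(.*?)<\/a>/g, '$1');
// lazy === 'see one and two'
```

Neither form copes with elements of the same name nested inside each other, and . does not match line breaks, which is part of why regular expressions are a poor fit for general HTML.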

1  
This gives me "testing url" and not "testing anchor" like OP asked for –  Benjamin Gruenbaum May 24 '13 at 11:34
    
My bad, now fixed. Thanks Benjamin –  HBP May 24 '13 at 11:40
    
It didn't work for my simple code. I don't know if I understood "This requires there to be a word before the tag" correctly; I tried it with a word before. But anyway, @dystroy's expression is enough for me. Thank you! –  Oscardrbcn May 24 '13 at 11:48

You can create a DOM object from the string and use DOM methods to parse it, without ever appending that DOM object to the document.

1  
Hey andy, did you mean to post it as a comment and not an answer perhaps? –  Benjamin Gruenbaum May 24 '13 at 12:11
    
Yes, it's true, but I thought it was quicker and more elegant to do it with regular expressions. Now I see the Mat answer link and maybe I was wrong. –  Oscardrbcn May 24 '13 at 12:20
