To validate a URL in JavaScript, I used the following code from an SO answer:

function validateURL(textval) {
          var urlregex = new RegExp(
                "^(http|https|ftp)\://([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&%\$\-]+)*@)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.(com|edu|gov|int|mil|net|org|biz|arpa|info|name|pro|aero|coop|museum|[a-zA-Z]{2}))(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\?\'\\\+&%\$#\=~_\-]+))*$");
          return urlregex.test(textval);
        }

This function works fine for most URLs I tried, but it hangs on the following Amazon URL for (ironically) the book Creating Mobile Apps with jQuery Mobile. Chrome dev tools show no errors, but the tab becomes unresponsive: clicking anywhere inside it does nothing.

http://www.amazon.com/gp/product/178216006X/ref=s9_simh_gw_p351_d3_i4?pf_rd_m=ATVPDKIKX0DER&pf_rd_s=center-2&pf_rd_r=0SDC8SED1N96XPK44VD2&pf_rd_t=101&pf_rd_p=1389517282&pf_rd_i=507846

The URL is pretty long, but there's nothing special in it that I can tell. In fact, it passes the JavaScript validation on this Scott's Playground page.

My question is NOT how to do URL validation in JavaScript. My question is this: if I use a JavaScript regular expression and it hangs on a piece of text, what makes a regular expression freeze the browser like this? How can I catch the cases that do that?

Is this something that only happens with new RegExp(...) vs /regex/ as mentioned in this answer?

For the actual validation, I switched to a different /regex/, but I still wanted to post this question because it led to a pretty painful debugging process. (Then again, anything that tries to validate URLs or emails with regular expressions will probably be painful.)

No offense, but that regular expression is completely ridiculous. It's unreadable, unmaintainable and quite simply too long. It's so bad you don't even know what's going on, and you wrote it! – Frits van Campen May 21 at 19:46
I didn't write it, as I said in the first sentence, I took it exactly from the SO answer that I linked in the first sentence. That's not the point of this question. We often use regular expressions that other people write. The question is: what can make it hang and how to catch such cases. – Lex May 21 at 19:49
"We often use regular expressions that other people write." - I don't know about you but I never use a regular expression that I haven't scrutinized myself first. This doesn't look like a generic URL matcher, it seems to be trying to match some specific set of numbers, and the non-exhaustive tld list is a joke. – Frits van Campen May 21 at 19:52
Thanks for your constructive criticism of the regular expression, and for completely ignoring my question. – Lex May 21 at 19:56

2 Answers

This seems to be something that happens with new RegExp(...) and not with /regex/ for this regular expression. So for URL validation and other types of regex matching, use:

function validFoo(value) {
    return /foo/i.test(value);
}

Where foo is the regular expression.
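The likely reason the two forms behave differently is string escaping: inside a quoted string, a single backslash before a character such as . is consumed by the string literal before RegExp ever sees it. A minimal sketch of that effect (the variable names and the small sample pattern are only for illustration, they are not taken from the question):

```javascript
// In a string literal, "\." is just ".", so the pattern loses its backslash.
const fromString = new RegExp("^a\.b$");          // pattern seen by the engine: ^a.b$
// Doubling the backslash keeps it in the pattern.
const fromEscapedString = new RegExp("^a\\.b$");  // pattern seen by the engine: ^a\.b$
// A regex literal needs no doubling.
const fromLiteral = /^a\.b$/;

console.log(fromString.source);         // ^a.b$
console.log(fromEscapedString.source);  // ^a\.b$
console.log(fromLiteral.source);        // ^a\.b$

// With the backslash lost, "." matches any character, not only a literal dot.
console.log(fromString.test("axb"));    // true
console.log(fromLiteral.test("axb"));   // false
```

Printing the .source of a regex built with new RegExp(...) is a quick way to check what the engine actually received.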

That is due to the quoting. When you write a regular expression as a quoted string, you have to double the backslashes, so /\w+\./ would be quoted as "\\w+\\.". And that expression is a joke. You can most likely use something a lot simpler and better. – Qtax May 21 at 20:14
It depends what you want/need. The real regex for URL validation would probably need to be a lot longer, but just better structured. Please don't use the word "joke" to refer to any legitimate attempt by someone. SO shouldn't be an elitist club of hecklers. Otherwise, you risk driving away a lot of smart people. – Lex May 21 at 20:27

It looks like there's a problem with the regular expression and how Chrome interprets it. In Firefox it works.

The issue seems to arise at the second query parameter in your URL (the &), where the Chrome JavaScript engine gets stuck in a loop.
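A "loop" like this is often not an infinite loop but catastrophic backtracking: when a repeated group can split the same text in many different ways and the overall match fails, a backtracking engine may try every split. The sketch below is an assumption about the mechanism, not a diagnosis of the exact regex in the question. It uses a much smaller pattern, and timeMatch is a hypothetical helper.

```javascript
// Hypothetical helper: runs a regex once and reports the result and elapsed time.
function timeMatch(regex, input) {
    const start = Date.now();
    const matched = regex.test(input);
    return { matched: matched, ms: Date.now() - start };
}

// The trailing "\n" forces the match to fail, because "." does not match
// a newline and "$" (without the m flag) only matches at the very end.
const input = "a".repeat(26) + "\n";

// Unescaped dot: "." can match a letter too, so the run of letters can be
// divided between the "+" and the "." in many different ways.
const ambiguous = /^([a-z0-9-]+.)*$/;
// Escaped dot: only a literal "." can end a group, so there is one way to try.
const unambiguous = /^([a-z0-9-]+\.)*$/;

const slow = timeMatch(ambiguous, input);
const fast = timeMatch(unambiguous, input);

console.log("ambiguous:", slow.matched, slow.ms + " ms");
console.log("unambiguous:", fast.matched, fast.ms + " ms");
```

The input is kept short on purpose so that the script finishes quickly. In a backtracking engine the work for the ambiguous pattern grows roughly exponentially with the length of the run of letters, so adding characters to the input should be done a few at a time.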

If you don't need to evaluate the port in the URL, use something like this: /^(https?:\/\/)?([\da-z.-]+)\.([a-z.]{2,6})([\/\w .-]*)\/?$/
