How to replace plain URLs with links, in javascript? [closed]

Question

I've nearly got this working. I wanted to know if there is a much better way. One problem is that no matter what there will be cases where a URL is incorrectly identified and as such the end result will be dependent on requirements. My requirements are lackadaisy, I just want something that works most of the time. However some of the URLs in my fiddle are not being detected correctly and I'd like advice on the regex part of this for overcome these. The main issue here is that the overall design is in question, the should be a simpler method.

Root problem

Fiddle

function replaceURLWithHTMLLinks(text) {
    text = text.replace(/a/g, "--ucsps--");
    text = text.replace(/b/g, "--uspds--");
    var arrRegex = [
        /(\([^)]*\b)((?:https?|ftp|file):\/\/[-A-Za-z0-9+&@#\/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#\/%=~_()|])(.?\))/ig,
        /()(\b(?:https?|ftp|file):\/\/[-a-z0-9+&@#\/%?=~_()|!:,.;]*[-a-z0-9+&@#\/%=~_()|])(.?\b)/ig];
    for (i = 0; i < arrRegex.length; i++) {
        text = text.replace(arrRegex[i], "$1a$2b$3");
    }
    text = text.replace(/a([^b]*)b/g, "<a href='$1'>$1</a>");
    text = text.replace(/--ucsps--/g, "a");
    text = text.replace(/--uspds--/g, "b");
    return text;
}
var elm = document.getElementById('trythis');
elm.innerHTML = replaceURLWithHTMLLinks(elm.innerHTML);

Any thoughts?

what browser are you using? all of the links show up for me in chrome. and the code seems to be working. I am not very good at JavaScript, but there doesn't seem to be anything that really jumps out at me. — Malachi, Oct 28 '13 at 20:27
I think that this contains code that does what the OP wants it too, but the OP wants a Better way to Handle what it does. the OP doesn't need the Javascript to catch all of the URLS just Most of them, right?? — Malachi, Oct 29 '13 at 1:43
I'm using Chromium. Some of the URLs have extra 'junk' at the end. Specifically the last few have "." and ".)". — Mike Mestnik, Oct 29 '13 at 16:49
I'd like as many browsers to catch as many URLs as possible. This should be common code that everyone can use for any project, I'm amazed that there are so few good examples. — Mike Mestnik, Oct 29 '13 at 16:51
@Malachi “some of the URLs […] are not being detected correctly and I'd like advice […] to overcome these” That sounds to me like it doesn't work the way Mike wants. — svick, Oct 30 '13 at 1:22

200_success · Accepted Answer · 2013-10-29 00:50:22Z

I think that it's nutty to replace all characters 'a' and 'b' with junk, then un-replace them later, just so that you can use 'a' and 'b' as landmarks to indicate which points your two regular expressions have in common. If you're going to do that, then why not just use "--ucsps--" and "--uspds--" as your landmarks in the first place? Of course, using any special strings as landmarks is a bad idea.

The fundamental problem is that you want to exclude the same number of closing parentheses as there are open parentheses. You can't use a regular expression to match left and right parentheses, because such an expression would not be regular. Well, some flavours of "regular" expressions support backreferences within the expression itself, which could do the job. RegExps in JavaScript don't.

What JavaScript can do, though, is let you specify your own code to generate the replacement text. Here's what I came up with:

function replaceURLWithHTMLLinks(text) {
    var re = /(\(.*?)?\b((?:https?|ftp|file):\/\/[-a-z0-9+&@#\/%?=~_()|!:,.;]*[-a-z0-9+&@#\/%=~_()|])/ig;
    return text.replace(re, function(match, lParens, url) {
        var rParens = '';
        lParens = lParens || '';

        // Try to strip the same number of right parens from url
        // as there are left parens.  Here, lParenCounter must be
        // a RegExp object.  You cannot use a literal
        //     while (/\(/g.exec(lParens)) { ... }
        // because an object is needed to store the lastIndex state.
        var lParenCounter = /\(/g;
        while (lParenCounter.exec(lParens)) {
            var m;
            // We want m[1] to be greedy, unless a period precedes the
            // right parenthesis.  These tests cannot be simplified as
            //     /(.*)(\.?\).*)/.exec(url)
            // because if (.*) is greedy then \.? never gets a chance.
            if (m = /(.*)(\.\).*)/.exec(url) ||
                    /(.*)(\).*)/.exec(url)) {
                url = m[1];
                rParens = m[2] + rParens;
            }
        }
        return lParens + "<a href='" + url + "'>" + url + "</a>" + rParens;
    });
}

This works very well, thank you! I plan on extending this to support other brackets. — Mike Mestnik, Oct 29 '13 at 16:53

asked	4 months ago
viewed	216 times
active	4 months ago

current community

your communities

more stack exchange communities

How to replace plain URLs with links, in javascript? [closed]

closed as off-topic by Stuart, Flambino, Dave Jarvis, Jamal♦, Mat's Mug Oct 29 '13 at 2:57

1 Answer

Not the answer you're looking for? Browse other questions tagged javascript regex or ask your own question.

Hot Network Questions

current community

your communities

more stack exchange communities

How to replace plain URLs with links, in javascript? [closed]

closed as off-topic by Stuart, Flambino, Dave Jarvis, Jamal♦, Mat's Mug Oct 29 '13 at 2:57

1 Answer

Not the answer you're looking for? Browse other questions tagged javascript regex or ask your own question.

Related

Hot Network Questions