Using JavaScript Regular Expressions to Parse a Query String

Question

The goal is to parse the arguments embedded in the hash portion of a URL. Someone in the Stack Overflow community provided the following code in response to a similar question. The community accepted it as a suitable answer, so I integrated it into my own code as follows:

Utils.parseQueryStringArgs = function (queryString) {
    var result = {},
        re = /([^&=]+)=([^&]*)/g, // one name=value pair; pairs are separated by "&"
        m;

    // Drop the leading "#" (or "?").
    queryString = queryString.substring(1);

    while ((m = re.exec(queryString)) !== null) {
        result[decodeURIComponent(m[1])] = decodeURIComponent(m[2]);
    }

    return result;
};

Having zero knowledge of regular expressions, I find the 're' variable above meaningless, yet the structure of the code makes sense. It worked flawlessly until it choked on the following expression:

friendlySeriesName=Test%3A%20Fund.Feeder%20[class%20A]%20%26gt%3B%26gt%3B%20Morgan%2C%20Harry%20%40%201

The expected behavior is to parse out a property name of "friendlySeriesName" and a property value of "Test: Fund.Feeder [class A] >> Morgan, Harry @ 1". What happens instead is that the property name is parsed correctly, but the parsed value is "Test: Fund.Feeder [class A]". Everything from the greater-than signs (">>") onward is dropped by the parsing function.

QUESTION: Can you modify the value of the variable 're' so that the function above works properly on the sample expression provided or, alternatively, provide a foolproof way to parse key-value pairs from the hash of an encoded URL string?

I strongly recommend against using that code to parse query strings. A query string of ?foo=bar&foo=baz should parse foo to ['bar', 'baz'], but this code keeps only the last value. There are existing JS libraries that correctly parse URIs, and therefore query strings.
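
To illustrate the repeated-key point, here is a minimal sketch that assumes the runtime provides the standard URLSearchParams class (modern browsers and Node.js do). The helper name parseArgs is hypothetical and is not the library the comment had in mind. Note that URLSearchParams also decodes "+" as a space, which decodeURIComponent does not.

```javascript
// Hypothetical helper: collects every value for a key into an array,
// so repeated keys such as foo=bar&foo=baz are not overwritten.
function parseArgs(queryString) {
    // Drop a leading "?" or "#" if present.
    var params = new URLSearchParams(queryString.replace(/^[?#]/, ""));
    var result = {};

    params.forEach(function (value, key) {
        if (!Object.prototype.hasOwnProperty.call(result, key)) {
            result[key] = [];
        }
        result[key].push(value);
    });

    return result;
}

var parsed = parseArgs("?foo=bar&foo=baz&x=1");
// parsed.foo is ["bar", "baz"]; parsed.x is ["1"]
```

Every value is returned as an array here, which keeps the result shape uniform whether a key appears once or several times.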

score 5 · Accepted Answer · 2012-12-01 19:03:12Z

The problem is not the JavaScript code. The problem is the value that is supplied in the query string.

Some code somewhere is first encoding > as &gt; (i.e. as an HTML character entity) and then URI-escaping the ampersand and semicolon, leaving %26gt%3b. Doh!
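
The double encoding can be reproduced in a few lines. This is only a sketch of how such a value could arise; escapeHtml is a hypothetical stand-in for whatever code did the HTML escaping upstream.

```javascript
// Hypothetical helper: escapes a few HTML-significant characters as entities.
function escapeHtml(text) {
    return text
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;");
}

var raw = ">>";

// URI-escaping alone gives the value the parser expects.
var encodedOnce = encodeURIComponent(raw);               // "%3E%3E"

// HTML-escaping first and then URI-escaping gives the problematic value.
var encodedTwice = encodeURIComponent(escapeHtml(raw));  // "%26gt%3B%26gt%3B"

// decodeURIComponent only undoes the URI layer, so the entity remains.
var decoded = decodeURIComponent(encodedTwice);          // "&gt;&gt;"
```

Note that encodeURIComponent emits uppercase hex digits (%3B), which matters for any case-sensitive search-and-replace on the encoded string.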

FWIW, one quick "hack" is to first convert %26gt%3b to %3e (or >):

// original stuff
var result = {}, queryString = queryString.substring(1),
    re = /([^&=]+)=([^&]*)/g, m;
// hack: the "i" flag also matches uppercase hex digits (%3B), as in the sample input
queryString = queryString.replace(/%26gt%3b/gi, "%3e");
// rest as normal

This may need to be done for other problematic initial encodings (e.g. < as &lt; and & as &amp;) as well.
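
A sketch of how the same hack might be extended to several entities at once. The helper name undoEntityEncoding and the particular set of entities are assumptions chosen for illustration. The approach also rewrites any value that was genuinely meant to contain the literal text "&gt;", so it remains a workaround and not a substitute for fixing the upstream encoding.

```javascript
// Maps an entity name to the percent-encoded form of the character it stands for.
var ENTITY_TO_PERCENT = { gt: "%3E", lt: "%3C", amp: "%26", quot: "%22" };

// Hypothetical helper: rewrites URI-escaped HTML entities (e.g. %26gt%3B)
// into the percent-encoding of the single character they represent.
function undoEntityEncoding(queryString) {
    return queryString.replace(/%26(gt|lt|amp|quot)%3b/gi, function (match, name) {
        return ENTITY_TO_PERCENT[name.toLowerCase()];
    });
}

var sample = "friendlySeriesName=Test%3A%20Fund.Feeder%20[class%20A]%20%26gt%3B%26gt%3B%20Morgan%2C%20Harry%20%40%201";
var fixed = undoEntityEncoding(sample);
var value = decodeURIComponent(fixed.split("=")[1]);
// value is "Test: Fund.Feeder [class A] >> Morgan, Harry @ 1"
```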

Your diagnosis is correct. The inputs were being encoded twice.

asked	6 months ago
viewed	199 times
active	6 months ago
