I'm fetching a string that is JSONP, and looking to convert it to JSON.

I'm using the following regular expression to match (and then remove) the padding around the JSON.

([a-zA-Z_0-9\.]*\()|(\);?$)

In Python, here's the line I'm using to convert the JSONP to JSON:

apijson = re.sub(r'([a-zA-Z_0-9\.]*\()|(\);?$)','',jsonp)

Is there anything I should worry about when using this regular expression? Is there any chance I could end up mangling the JSON?


The problem with the code is that it doesn't really express what you are trying to do:

apijson = re.sub(r'([a-zA-Z_0-9\.]*\()|(\);?$)','',jsonp)

The code would seem to indicate that you are trying to find and replace many instances of some regular expressions inside the string. However, you are really trying to strip parts off of the beginning and end. I'm also not a huge fan of regular expressions because they are usually pretty dense and hard to read. Sometimes they are awesome, but this is not one of those times.

Additionally, you aren't anchoring the first half of the regular expression to the beginning of the string, even though the prefix is the only thing it should strip. As written, it matches any opening parenthesis (together with any identifier characters directly before it) anywhere in the input, so it becomes a problem if there are strings inside the JSON which match the regular expression. It's best to be sure.
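To make that risk concrete, here is a small sketch using the regular expression from the question. The callback name cb and the payload are made up for illustration; the only assumption is that a string value in the JSON contains a parenthesis.

```python
import json
import re

PADDING_RE = r'([a-zA-Z_0-9\.]*\()|(\);?$)'  # pattern from the question

# Hypothetical response whose JSON has a parenthesis inside a string value.
jsonp = 'cb({"note": "call f(x) now"});'

stripped = re.sub(PADDING_RE, '', jsonp)
print(stripped)  # {"note": "call x) now"}

# The result still parses, so the damage goes unnoticed.
data = json.loads(stripped)
print(data["note"])  # call x) now
```

The "f(" inside the string is removed along with the real padding, and because the result is still valid JSON, nothing raises an error.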

Also, I think JSON-P allows callbacks like alpha["beta"], which doesn't fit your regular expression. And what about additional whitespace or comments?

I would suggest doing something like:

   apijson = jsonp[ jsonp.index("(") + 1 : jsonp.rindex(")") ]

That way you are more clearly stripping everything outside of the first opening parenthesis and the last closing parenthesis.
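As a rough sketch, the slicing approach can be wrapped in a small helper that also decodes the result. The function name jsonp_to_json and the sample input are hypothetical, and the helper assumes the callback expression itself contains no parentheses.

```python
import json

def jsonp_to_json(jsonp):
    """Decode the payload between the first '(' and the last ')'."""
    start = jsonp.index("(") + 1   # raises ValueError if there is no '('
    end = jsonp.rindex(")")        # raises ValueError if there is no ')'
    return json.loads(jsonp[start:end])

# Parentheses inside the payload and a bracketed callback are both left alone.
print(jsonp_to_json('alpha["beta"]({"note": "call f(x) now"});'))
# {'note': 'call f(x) now'}
```

Because index and rindex raise ValueError when a parenthesis is missing, input that is not JSONP fails loudly instead of being silently passed through.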


You can simply slice the jsonp text to remove the initial padding and ending bracket by doing something like this:

jsonp_data = 'callbackfunc({"count": 2345, "url": "http://stackoverflow.com/"})'

jsonp_data[len('callbackfunc('):-1]

This removes the padding easily. In most cases you are calling a single API that always returns the same callback name, so this method may well be the best fit. If the padding in your JSONP responses varies from call to call, you'd be better off writing a regex.
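For the case where the padding varies, one possible shape for such a regex is sketched below. It is an illustration rather than a full JSONP parser: the names JSONP_RE and strip_padding are made up, and it assumes the callback expression contains no parentheses and that nothing but an optional semicolon and whitespace follows the closing parenthesis.

```python
import json
import re

# Whole-string match: callback, '(', payload, ')', optional ';' and whitespace.
JSONP_RE = re.compile(r'\s*[^(]*\((.*)\)\s*;?\s*', re.DOTALL)

def strip_padding(jsonp):
    """Return the text between the first '(' and the last ')' of a JSONP string."""
    match = JSONP_RE.fullmatch(jsonp)
    if match is None:
        raise ValueError("input does not look like JSONP")
    return match.group(1)

payload = strip_padding('callbackfunc({"count": 2345, "url": "http://stackoverflow.com/"});\n')
print(json.loads(payload)["count"])  # 2345
```

Using fullmatch ties the pattern to both ends of the string, so parentheses inside the payload are captured as part of the group rather than stripped.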

