Take the 2-minute tour ×
Stack Overflow is a question and answer site for professional and enthusiast programmers. It's 100% free, no registration required.

I have not found anything of this sort on Google and would like to know if there is a quicker way of doing the following:

I need to parse build scripts for Java programs which are written in Python. More specifically, I want to parse the dictionaries which are hard-coded into these build scripts.

For example, these scripts contain entries like:

config = {}

config["Project"] = \
    {
        "Name"                          : "ProjName",
        "Version"                       : "v2",
        "MinimumPreviousVersion"        : "v1",
    }   

def actualCode ():
# Some code that actually compiles the relevant files

(The actual compiling is done via a call to another program, this script just sets the required options which I want to extract). For example, I want to extract, "Name"="ProjName" and so on.

I am aware of the ConfigParser library which is part of Python, but that was designed for .ini files and hence has problems (throws exception and crashes) with actual python code which may appear in the build scripts which I am talking about. So using this library would mean that I would first have to read the file in and remove lines of the file which ConfigParser would object to.

Is there a quicker way than reading the config file in as a normal file and parsing it? I am looking for libraries which can do this. I don't mind too much which languages this libraries is in.

share|improve this question
    
Do you have any restrictions on the values and keys of the dictionary? If all key/values are strings then you could parse the contents with regexes quite easily. –  Bakuriu Jun 14 '13 at 13:43
    
@Bakuriu The build script has more than just dictionaries in it (normal code as well), I want to extract some of the dictionaries from the build script –  user929404 Jun 15 '13 at 10:22

2 Answers 2

up vote 2 down vote accepted

I was trying to solve the similar problem. I converted the directory into a JSON object so that I can query keys using JSON object in simplest way possible. This solution worked for multi-level key values pairs for me. I

Here is the algorithm.

  1. Locate the config["key_name"] using a regular expression from string or file. Use the following regular expression

    config(.*?)\\[(.*?)\\]

  2. Get the data within curly brackets into a string. Use some stack based code since there could be nested brackets of type {} or [] in complex directories.
  3. Replace the circular bracket, if any, "()" with square brackets "[]" and backslash "\" with blank character " " as follows

    expression.replace('(', '[') .replace(')', ']') .replace('\\', ' ')

  4. JSONObject json = (JSONObject) parser.parse(expression)

Here is your JSON object. You can use it the way you want.

share|improve this answer

Try Parboiled. It is written in Java and you write your grammars in... Java too.

Use the stack to store elements etc; its parser class is generic and you can get the final result out of it.

share|improve this answer

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.