I've got the following code for parsing a string into an array of options:

$options = 'myvalue, test=123, myarray=[1,2]';

function parse_options($options)
{
    $split = '/,(?=[^\]]*(?:\[|$))/';

    if($options && is_string($options)) 
    {
        $temp_options = preg_split($split, $options);
        $options = array();
        foreach($temp_options as $option) 
        {
            $option = trim($option);
            if(strpos($option,'=')) 
            {
                //Is an option with a value
                // Limit to 2 parts so a value that itself contains '=' is kept whole
                list($key, $value) = explode('=', $option, 2);
                if(strpos($value,'[') !== FALSE) 
                {
                    //Is an array of values
                    $value = explode(',', substr($value, 1,-1));
                }
                $options[$key] = $value;
            }
            else 
            {
                $options[] = $option;
            }
        }
    }
    else 
    { 
        //Return empty array if not a string or is false
        if(!is_array($options)) { $options = array(); }
    }

    return $options;
}

Basically it splits by comma unless surrounded by brackets. Then it checks for the = for key->value pairs and then tries to figure out if the value is an array.
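
To make the split step concrete, here is a minimal sketch that runs only the regular expression from the function above. The second input string is a made-up example (not one I actually use) and shows where the lookahead stops being enough once brackets are nested.

```php
<?php
// The split pattern from parse_options(): a comma counts as a separator only
// if the next bracket after it is '[' or there is no bracket left at all.
$split = '/,(?=[^\]]*(?:\[|$))/';

// Flat case: the comma inside [1,2] is followed by ']' first, so it is kept.
$flat = preg_split($split, 'myvalue, test=123, myarray=[1,2]');
print_r($flat); // three pieces: 'myvalue', ' test=123', ' myarray=[1,2]'

// Hypothetical nested case: the comma between the two inner arrays is
// followed by '[' first, so the pattern splits there even though we are
// still inside the outer brackets.
$nested = preg_split($split, 'a=[b=[1,2], c=[3,4]]');
print_r($nested); // two pieces with unbalanced brackets: 'a=[b=[1,2]', ' c=[3,4]]'
```

The pattern only looks at the nearest bracket, so it has no notion of depth.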

This works fine, but I would like to improve it so that it can create nested arrays for something like

$options = 'myvalue, test=123, bigwhopper=[ myarray=[1,2], test2=something ]';

Which would output

Array(
    [0] => myvalue,
    [test] => 123,
    [bigwhopper] => Array(
        [myarray] => Array(
            [0] => 1,
            [1] => 2
        ),
        [test2] => something
    )
)

I'm certainly no RegExp guru so can someone help me make the function understand nested [] separators? Also anything to improve the function's performance is highly appreciated as I use this a lot to easily pass options to my controllers.

  • Well, you're trying to write a custom unserializer, and it is not an easy task. Why do you need to parse such results? Commented Jul 11, 2012 at 12:33

2 Answers

Why are you inventing your own format instead of using something that's already widely established?

Some options:

  • URL encoding (query strings) does this
  • JSON
  • YAML
  • PHP's serialize()
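
As a rough illustration of two of these options, the sketch below expresses a nested option set as JSON and as a query string, using only PHP's built-in json_decode() and parse_str(). The option names are borrowed from the question; the variable names are just placeholders.

```php
<?php
// JSON: nesting and value types come for free.
$json = '{"test": 123, "bigwhopper": {"myarray": [1, 2], "test2": "something"}}';
$fromJson = json_decode($json, true); // true => associative arrays instead of objects

// Query string: PHP's bracket syntax builds nested arrays,
// but every leaf value arrives as a string.
$query = 'test=123&bigwhopper[myarray][]=1&bigwhopper[myarray][]=2&bigwhopper[test2]=something';
parse_str($query, $fromQuery);

print_r($fromJson);
print_r($fromQuery);
```

Both produce the same array shape; the JSON version keeps 123, 1 and 2 as integers, while parse_str() returns them as strings.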

You're inventing a whole new syntax, and standard regular expressions can't handle it because the syntax is recursive (brackets can nest inside brackets). You basically need to write a parser, so if you insist on your own syntax, the best place to start is to look at parser generators.

http://wezfurlong.org/blog/2006/nov/parser-and-lexer-generators-for-php/
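
If a parser generator feels like overkill, a hand-written recursive-descent parser is another way to go. The following is only a sketch under some assumptions: parse_nested_options() and parse_option_list() are hypothetical names, there is no quoting or escaping, all scalar values stay strings (as in the original function), and unbalanced brackets are not reported as errors.

```php
<?php
// Entry point: mirrors the original's handling of non-string input.
function parse_nested_options($input)
{
    if (!is_string($input)) {
        return is_array($input) ? $input : array();
    }
    $pos = 0;
    return parse_option_list($input, $pos);
}

// Reads items until the ']' that closes the current level, or the end of input.
// $pos is passed by reference so the caller resumes after the consumed text.
function parse_option_list($s, &$pos)
{
    $result = array();
    $key = null;    // key of the current item, set once an '=' is seen
    $buffer = '';   // raw text of the current key or scalar value
    $nested = null; // array parsed from a '[...]' value, if any
    $len = strlen($s);

    while (true) {
        $ch = $pos < $len ? $s[$pos++] : null; // null marks end of input

        if ($ch === '[') {
            // Recurse; the call consumes everything through the matching ']'.
            $nested = parse_option_list($s, $pos);
        } elseif ($ch === '=' && $key === null) {
            $key = trim($buffer);
            $buffer = '';
        } elseif ($ch === ',' || $ch === ']' || $ch === null) {
            // End of an item: store it, then reset the per-item state.
            $value = $nested !== null ? $nested : trim($buffer);
            if ($key !== null) {
                $result[$key] = $value;
            } elseif ($value !== '') {
                $result[] = $value;
            }
            $key = null;
            $buffer = '';
            $nested = null;
            if ($ch !== ',') {
                return $result; // ']' or end of input closes this level
            }
        } else {
            $buffer .= $ch;
        }
    }
}

print_r(parse_nested_options('myvalue, test=123, bigwhopper=[ myarray=[1,2], test2=something ]'));
```

Each '[' starts a recursive call and each ']' returns from one, so the call stack tracks the nesting depth that a single regex cannot.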

  • I wish I could up vote to 0,5 :) +0,5 to note there are several better solutions, -0,5 because he may really need to do this if the data are given by a third party. (though +1 anyway) Commented Jul 11, 2012 at 12:34
  • well I did give him a starting point for writing a parser too ;) thanks for the vote though :P Commented Jul 11, 2012 at 12:38
  • The idea was to have a simple and quick way to send some options to a function. A typical use would be something like get_stuff(123, 'limit=5, render=grid, something=[1,5]') which is easy to understand and very quick to write. The options are parsed into an array which can be checked for specific keys and then shape what the get_stuff function (usually getting something from the database) will do. It's mostly a tool to make development easier, so instead of array(123, 'limit' => 5, 'render' => 'grid', 'something' => array(1,5)) I can just pass a simple string and let the parser handle the rest. Commented Jul 11, 2012 at 13:11
  • So yeah, my humble opinion is that this is a bad idea. How, for example, are you going to deal with escaping (for instance, if I want my string to contain a quote or a comma)? You are inventing a new format, and you need a parser unless you are OK with 'weird bugs'. Commented Jul 11, 2012 at 13:56
  • To be honest I'm not too concerned about that. It's only used to pass simple, typically one word options or an array of options. It's not for inputting anything complex or any real data and is not used by the end user. The code I posted has been quite sufficient until now. Commented Jul 12, 2012 at 7:21

Don't reinvent the wheel by creating a new (incomplete, buggy and slow) parser for your options. Use preexisting solutions:
