I am trying to use regex in PHP to parse a string and make an array. Here is my attempt:

function parseItemName($itemname) {
    // Given item name like "2 of #222001." or "3 of #222001; 5 of #222002."
    // Return array like (222001 => 2) or (222001 => 3, 222002 => 5)
    preg_match('/([0-9]?) of #([0-9]?)(.|;\s)/', $itemname, $matches);
    return $matches;
}

Calling the function with

print_r(parseItemName("3 of #222001; 5 of #222002."));

returns

Array ( [0] => 3 of #22 [1] => 3 [2] => 2 [3] => 2 )

Does anyone know how to make this work? I assume that preg_match() is not the best way to do this, but I'm not sure what else to try. I appreciate any ideas. Thank you!

  • you can't match things into the array key with preg_match. Commented Jan 23, 2012 at 16:38
  • have you tried preg_match_all() rather than just preg_match() ... should give you a multidimensional array which might be a little closer to what you're after? Commented Jan 23, 2012 at 16:47
  • What you want to do is not possible. You can rename the key for every match with (?P<myKey>\d+). But you cannot use dynamic text Commented Jan 23, 2012 at 16:48
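
To illustrate the named-group point, here is a minimal sketch. The group names qty and id are made up for the example and are not from the question. The names are fixed in the pattern, so the item number still has to be turned into an array key in a loop:

```php
<?php
$input = "3 of #222001; 5 of #222002.";

// (?P<qty>...) and (?P<id>...) are illustrative names chosen for this sketch.
preg_match_all('/(?P<qty>\d+) of #(?P<id>\d+)/', $input, $matches, PREG_SET_ORDER);

$result = array();
foreach ($matches as $match) {
    // Named groups only give fixed labels; the dynamic key is built by hand.
    $result[$match['id']] = (int) $match['qty'];
}

print_r($result);
```

Each $match also still contains the numeric indexes 1 and 2 alongside the named ones.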

4 Answers

2

Aside from the adjustment to your regex pattern, you want to be using preg_match_all with the PREG_SET_ORDER flag set to make things simpler.

This will return a $matches array arranged like so:

array
  0 => 
    array
      0 => string '3 of #222001' (length=12)
      1 => string '3' (length=1)
      2 => string '222001' (length=6)
  1 => 
    array
      0 => string '5 of #222002' (length=12)
      1 => string '5' (length=1)
      2 => string '222002' (length=6)
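
For comparison, a small sketch of what the flag changes, using the same pattern and input (the variable names $byGroup and $byMatch are only for illustration). Without a flag, preg_match_all groups the results by capture group rather than by match:

```php
<?php
$input   = "3 of #222001; 5 of #222002.";
$pattern = '/(\d+)\sof\s#(\d+)/';

// Default (PREG_PATTERN_ORDER): one sub-array per capture group.
preg_match_all($pattern, $input, $byGroup);
// $byGroup[1] holds the quantities, $byGroup[2] holds the item numbers.

// PREG_SET_ORDER: one sub-array per match, which is easier to loop over.
preg_match_all($pattern, $input, $byMatch, PREG_SET_ORDER);
// $byMatch[0] holds everything captured for the first "N of #M" pair.

var_dump($byGroup[2]);
var_dump($byMatch[0]);
```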

The example function below loops through all of the matches and builds a new array, using the second capture group (the item number) as the key and the first capture group (the quantity) as the value.

function parseItemName($itemname) {
    // Given item name like "2 of #222001." or "3 of #222001; 5 of #222002."
    // Return array like (222001 => 2) or (222001 => 3, 222002 => 5)
    preg_match_all('/(\d+)\sof\s#(\d+)/', $itemname, $matches, PREG_SET_ORDER);

    $newArray = array();

    foreach($matches as $match) {
        $newArray[$match[2]] = intval($match[1]);
    }

    return $newArray;
}

var_dump(parseItemName("3 of #222001; 5 of #222002."));

The output that is dumped will look like this:

array
  222001 => int 3
  222002 => int 5
  • @Alex, I just noticed something. In your comment you have "2 of #222001. or 3 of #222001". If the 222001 occurs more than once, the array value will be overwritten, since 222001 is being used as a key. You can turn it into a nested array structure inside the loop to get around this ($newArray[$match[2]][] = intval($match[1]); - note the extra [] before the =) Commented Jan 23, 2012 at 19:40
  • Leigh - Thanks for catching that. Commented Jan 24, 2012 at 19:40
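
A minimal sketch of the nested-array variant suggested in the comment above. The function name parseItemNameGrouped is hypothetical, and it assumes you want to keep every quantity for a repeated item number rather than overwrite or sum them:

```php
<?php
// Hypothetical variant of parseItemName that keeps duplicates.
function parseItemNameGrouped($itemname) {
    preg_match_all('/(\d+)\sof\s#(\d+)/', $itemname, $matches, PREG_SET_ORDER);

    $grouped = array();

    foreach ($matches as $match) {
        // The extra [] appends to a list instead of overwriting the value.
        $grouped[$match[2]][] = intval($match[1]);
    }

    return $grouped;
}

print_r(parseItemNameGrouped("2 of #222001; 3 of #222001; 5 of #222002."));
```

Every value is then a list, even for item numbers that occur only once, so calling code has to expect an array in each slot.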
1
([0-9]?)
      ^--- "0 or 1 of ..."

What you want is + instead, which is "1 or more", and would capture all of the digits, not just the first one.
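
A quick sketch of the difference, using the question's input with the trailing (.|;\s) group left out to keep it short (the variable names are only for illustration):

```php
<?php
$input = "3 of #222001; 5 of #222002.";

// '?' allows at most one digit, so the second group stops after the first "2".
preg_match('/([0-9]?) of #([0-9]?)/', $input, $withQuestionMark);

// '+' takes one or more digits, so the whole item number is captured.
preg_match('/([0-9]+) of #([0-9]+)/', $input, $withPlus);

print_r($withQuestionMark);
print_r($withPlus);
```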

  • It is now returning Array ( [0] => 3 of #222001; [1] => 3 [2] => 222001 [3] => ; )... Any ideas on how to make the array look like (222001 => 3, 222002 => 5)? Commented Jan 23, 2012 at 16:41
  • That's normal. preg_match returns element 0 as the whole string that matched, element #1 the first capture group, element #2 the 2nd capture group, etc... You'll have to post-process it into that format yourself. Commented Jan 23, 2012 at 19:27
0

Use something like:

/(\d*?)\sof\s#(\d*)\b/

EDIT: Removed the lazy match on the second group, as suggested in the comments on this post.

  • Are you sure you want a lazy match on those digits? Will that not only match 1 if it's 143 of #222001 ? Commented Jan 23, 2012 at 16:33
  • Thanks for your fast response! How do you think I should use the new regex? I tried updating the fourth line above with preg_match('/(\d*?)\sof\s#(\d*?)\b/', $itemname, $matches); but now the return is Array ( [0] => 3 of # [1] => 3 [2] => ).... Commented Jan 23, 2012 at 16:34
  • @WouterJ - except that link doesn't point to the RegExp you've written, you've dropped the lazy match on the second digit in that linked version, so it works, /(\d*?)\sof\s#(\d*)\b/ rather than /(\d*?)\sof\s#(\d*?)\b/ - though tbh just /(\d*)\sof\s#(\d*)\b/ should be fine... Commented Jan 23, 2012 at 16:41
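
To make the lazy-versus-greedy point in these comments concrete, here is a small sketch using the "143 of #222001" case raised above (the variable names are only for illustration). The lazy first group is harmless because it has to keep growing until the \s can match, while a lazy second group can stop at zero digits because \b already matches between # and the first digit:

```php
<?php
$input = "143 of #222001.";

// Both groups lazy: the second group is satisfied with zero digits.
preg_match('/(\d*?)\sof\s#(\d*?)\b/', $input, $lazyBoth);

// Only the first group lazy: the second group greedily takes all the digits.
preg_match('/(\d*?)\sof\s#(\d*)\b/', $input, $lazyFirst);

print_r($lazyBoth);
print_r($lazyFirst);
```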
0
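function parseItemName( $itemname ) {

  // Collect every "<quantity> of #<item number>" pair, one sub-array per match.
  preg_match_all( '/([0-9]+) of #([0-9]+)/', $itemname, $matches, PREG_SET_ORDER );

  $items = array();

  foreach ( $matches as $match ) {

    // The item number becomes the key; the quantity is kept as a string.
    $items[ $match[2] ] = $match[1];

  }
  // foreach

  return $items;

}
// parseItemName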
  • I read them before writing my own. I'm not going to write one then check for more recent ones. Commented Jan 23, 2012 at 17:29
