Today, while working on a text analysis tool for blogs, I ran into PHP behavior that seemed very strange to me, and I couldn't wrap my head around it. While normalizing text, I was trying to remove words shorter than a minimum length, so I wrote this in my normalization method:

if ($this->minimumLength > 1) {
    foreach ($string as &$word)
    {
        if (strlen($word) < $this->minimumLength) {
            unset($word);
        }
    }
}

Strangely, this would leave some words below the allowed length in my array. After searching my whole class for mistakes, I tried this instead:

if ($this->minimumLength > 1) {
    foreach ($string as $key => $word)
    {
        if (strlen($word) < $this->minimumLength) {
            unset($string[$key]);
        }
    }
}

And voilà! This worked perfectly. Now, why would this happen? I checked the PHP documentation, and it states:

If a variable that is PASSED BY REFERENCE is unset() inside of a function, only the local variable is destroyed. The variable in the calling environment will retain the same value as before unset() was called.

Does foreach act as a calling environment here because it has its own scope?

  • Never modify something you're iterating over, because of this sort of unexpected behavior. Commented Jan 12, 2013 at 19:02
  • Not answering your question about references, but the easiest and cleanest way to do what you want is using the array_filter() function: php.net/manual/en/function.array-filter.php Commented Jan 12, 2013 at 19:09
  • Thanks for this. I have used array_filter before, but for different purposes. I wouldn't have thought of it for such a "simple" action as removing elements from an array, but it seems that setting them to false and running array_filter over the array really is the clearest way to go. Commented Jan 12, 2013 at 20:55
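
As a rough illustration of the array_filter() suggestion in the comments above, here is a minimal sketch. It uses a callback instead of first setting elements to false, and the sample words and the $minimumLength variable are assumptions made up for this example, not code from the question.

```php
<?php
// Hypothetical sample input and threshold, for illustration only.
$words = ['a', 'blog', 'of', 'text', 'analysis'];
$minimumLength = 3;

// Keep only the words that are at least $minimumLength characters long.
$filtered = array_filter($words, function ($word) use ($minimumLength) {
    return strlen($word) >= $minimumLength;
});

// array_filter() preserves the original keys, so reindex if gaps matter.
$filtered = array_values($filtered);
```

Because no element is removed while the array is being iterated, this sidesteps the question of references entirely.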

2 Answers

No, there is no function call here and no variable is being passed by reference (you are simply capturing by reference during the iteration).

When you iterate by reference, the iteration variable is an alias to the original element. When you use this alias to modify the value, the change remains visible in the array being iterated.

However, when you unset the alias, the original variable is not "destroyed"; the alias is simply removed from the symbol table.

foreach ($string as $key => &$word)
{
    // This does not mean that the word is removed from $string
    unset($word);

    // It simply means that you cannot refer to the iteration variable using
    // $word from this point on. If you have captured the key then you can
    // still refer to it with $string[$key]; otherwise, you have lost all handles
    // to it for the remainder of the loop body
}
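
To make the difference concrete, the following sketch compares three operations on the same array: assigning through the alias, unsetting the alias, and unsetting the element by key. The sample words and the $minimumLength variable are assumptions chosen for this illustration only.

```php
<?php
// Hypothetical sample data and threshold, not taken from the question.
$words = ['a', 'blog', 'of', 'text'];
$minimumLength = 3;

// 1) Assigning through the alias changes the array element itself.
foreach ($words as &$word) {
    if (strlen($word) < $minimumLength) {
        $word = strtoupper($word);
    }
}
unset($word); // drop the alias once the loop is done
$afterAssign = $words; // ['A', 'blog', 'OF', 'text']

// 2) Unsetting the alias only removes the name $word; the element stays.
foreach ($words as &$word) {
    if (strlen($word) < $minimumLength) {
        unset($word);
    }
}
unset($word);
$afterUnsetAlias = $words; // still four elements

// 3) Unsetting by key removes the element from the array.
foreach ($words as $key => $word) {
    if (strlen($word) < $minimumLength) {
        unset($words[$key]);
    }
}
$afterUnsetElement = $words; // [1 => 'blog', 3 => 'text']
```

Only the third loop shrinks the array, and the surviving elements keep their original keys.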
  • Oh... somehow I always thought that every action against a value's reference would just be mirrored onto the value itself. Guess I was wrong; thanks for clearing that up. Commented Jan 12, 2013 at 20:51
  • @igorpan: A related tip is that when you iterate by reference you may want to unset the loop variable immediately after the loop; otherwise you "risk" assigning to that variable, which will overwrite the last value from the iteration. Commented Jan 12, 2013 at 20:55
  • I knew about that: the loop variable stays defined even after the loop itself is over. Nevertheless, it can be useful if somebody stumbles upon this question :) Commented Jan 12, 2013 at 22:44
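
The tip in the comments above about unsetting the loop variable can be sketched as follows. The arrays and the 'oops' value are invented for this example; the point is only to show what a leftover alias does after a by-reference loop.

```php
<?php
// Hypothetical sample data, for illustration only.
$words = ['one', 'two', 'three'];

foreach ($words as &$word) {
    $word = strtoupper($word);
}
// $word is still an alias of the last element here, so this
// assignment silently overwrites $words[2].
$word = 'oops';
// $words is now ['ONE', 'TWO', 'oops']

$safe = ['one', 'two', 'three'];
foreach ($safe as &$item) {
    $item = strtoupper($item);
}
unset($item);   // break the alias right after the loop
$item = 'oops'; // now just an ordinary new variable
// $safe is still ['ONE', 'TWO', 'THREE']
```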

When you were calling unset($word) inside your if statement, you were removing the $word variable itself, without making any changes to the array $string.

  • But why is that so when I passed it by reference? My foreach states foreach ($string as &$word) Commented Jan 12, 2013 at 19:01
  • Because it is a reference to the original variable, not the original variable itself; you're unsetting the reference, but the original remains. Commented Jan 12, 2013 at 19:02
