I have a small function which I regularly use to check if a variable contains only [a-z][0-9] and the special chars '-' and '_'. Currently I'm using the following:

function is_clean($string){
    $pattern = "/([a-z]|[A-Z]|[0-9]|-|_)*/";
    preg_match($pattern, $string, $return);
    $pass = (($string == $return[0]) ? TRUE : FALSE);
    return $pass;
}

1 - Can anyone tell me if this is the best/most efficient way to do this and if not try to explain why?

2 - When I view this function in my IDE I get a warning that $return is uninitialized. Should I be initializing variables in advance in PHP? If so, how and why?


2 Answers

No need for all that. Just use one bracket group, negate it (by prepending a ^), and use the return value directly:

function is_clean ($string) {
    return ! preg_match("/[^a-z\d_-]/i", $string);
}

Here's a quote from the PHP docs:

Return Values
preg_match() returns 1 if the pattern matches given subject, 0 if it does not, or FALSE if an error occurred.

In the regex above, we're looking for any characters in the string that are not in the bracket group. If none are found, preg_match will return 0 (which when negated will result in true). If any of those characters are found, 1 will be returned and negated to false.
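
A short usage sketch of the function above, to make the return values concrete. The sample strings are made up for illustration; note that an empty string passes, since it contains no disallowed character.

```php
<?php
// Same function as in the answer above.
function is_clean($string) {
    return !preg_match("/[^a-z\d_-]/i", $string);
}

var_dump(is_clean("abc-DEF_123")); // bool(true): every character is in the group
var_dump(is_clean("abc def"));     // bool(false): the space is outside the group
var_dump(is_clean(""));            // bool(true): nothing to reject
```

One caveat, assuming the error case matters to you: preg_match() returns FALSE on error, and negating FALSE also yields true, so a stricter version might compare the result with `=== 0`.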

Certainly cleaner looking code with this approach - thanks. – WebweaverD Mar 15 at 13:31

Just another method, without regex.

function is_clean ($string)
{
    // Strip the two allowed special characters, then require the rest to be alphanumeric.
    return ctype_alnum(str_replace(array('-', '_'), '', $string));
}

Maybe I'll find the time later today to compare the performance, but I guess 'efficient' in your question referred to the code, not the execution time? Letharion did the work for me :)
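
A small sketch of one behavioural difference between the two answers, assuming only strings are passed in: ctype_alnum() returns false for an empty string, so the str_replace version rejects "" and strings made up only of '-' and '_', while the regex version accepts them. The function names below are hypothetical, chosen so both versions can sit in one file.

```php
<?php
// Regex version from the first answer (renamed for this comparison).
function is_clean_regex($string) {
    return !preg_match('/[^a-z\d_-]/i', $string);
}

// ctype version from this answer (renamed for this comparison).
function is_clean_ctype($string) {
    return ctype_alnum(str_replace(array('-', '_'), '', $string));
}

// Print both results side by side for a few made-up inputs.
foreach (array('abc-123', '', '-_', 'a b') as $s) {
    printf(
        "%-10s regex=%s ctype=%s\n",
        var_export($s, true),
        var_export(is_clean_regex($s), true),
        var_export(is_clean_ctype($s), true)
    );
}
```

Which behaviour is right depends on whether an empty value should count as clean in your application.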

I spent some time with this yesterday. ctype alone is fastest, then comes the short preg_match, then ctype with str_replace, and finally the original. In many cases, ctype should be the best way to do this, as it will, unlike the regexp, work with characters like é. However, I didn't post an answer, because I couldn't get it to behave properly when trying it out. – Letharion Mar 14 at 7:37
Maybe we should combine both approaches: only if the plain ctype check fails do we use the regex. That might give the best average performance if we can assume that there are not too many strings with - and _. – mnhg Mar 14 at 11:00
Thanks for the suggestion - if I can get it working then it is probably better than a regex as far as performance is concerned. Will have a play and come back – WebweaverD Mar 15 at 13:34
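
The combination suggested in the comments could look roughly like the sketch below. This is not code from the thread: the name is_clean_combined is hypothetical, and it assumes the ctype extension is enabled and that the input is a string. No timings are claimed here; whether the fast path actually pays off would need measuring on real data.

```php
<?php
// Hypothetical helper combining both answers; not taken from the thread.
function is_clean_combined($string) {
    // Fast path: strings made up only of letters and digits.
    // ctype_alnum() returns false for "", so empty strings fall through.
    if (ctype_alnum($string)) {
        return true;
    }
    // Fallback: reject any character outside a-z, A-Z, 0-9, '_' and '-'.
    return !preg_match('/[^a-z\d_-]/i', $string);
}

var_dump(is_clean_combined('abc123'));    // bool(true), via the fast path
var_dump(is_clean_combined('abc-123_x')); // bool(true), via the regex
var_dump(is_clean_combined('a b'));       // bool(false)
```

Because ctype_alnum() is locale-dependent, a letter such as é may be accepted by the fast path under some locales, whereas this regex never accepts it.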
