This seems like an obvious question but there does not appear to be an answer anywhere at the moment.

I want to use PHP to convert a user's search query into the correct format for a PostgreSQL full-text search. I am a bit of a newbie to regular expressions and I just can't work out where to start.

I am looking to take a query along the lines of Something AND ("Some Phrase" OR Some Other Phrase) and convert it to 'Something' & (('Some' & 'Phrase')|('Some' & 'Other' & 'Phrase'))

If there is a library that does this, please point me to it. I can't seem to find one, although I imagine it is a common problem. Thanks for any help!

up vote 2 down vote accepted

This code will parse the example as given and return the requested output. Be aware that it is somewhat brittle: it assumes a single AND followed by a parenthesised list of phrases separated by OR, so you may need to validate that the expression matches the format you posted.

$input = 'Something AND ("Some Phrase" OR Some Other Phrase)';

$formatted_output = format_input( $input );

// Convert a query to a format suitable for a Postgres full-text search
function format_input( $input ) {
    $output = '';
    // Split on the AND keyword only, so words that merely contain "AND" stay intact
    list ( $part1, $part2 ) = preg_split( '/\s+AND\s+/', $input, 2 );

    // Remove any unnecessary characters and add the first part of the line format
    $output = "'" . str_replace( array( "\"", "'" ), '', trim( $part1 ) ) . "' & (";

    // Get a list of phrases in the query, splitting on the OR keyword only
    $phrases = preg_split( '/\s+OR\s+/', str_replace( array( '(', ')'), '', $part2 ) );

    // Format the phrase
    foreach ( $phrases as &$phrase ) {
        $phrase = encapsulate_phrase( trim ( str_replace( array( "\"", "'"), '', $phrase ) ) );
    }
    unset( $phrase ); // Break the reference left over from the loop

    // Add the formatted phrases to the output
    $output .= '(' . implode( ')|(', $phrases ) . ')';

    // Add the closing parenthesis
    $output .= ')';

    return $output;
}

// Split a search phrase into words, and encapsulate the words
function encapsulate_phrase( $phrase ) {
    $output = '';
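    // Split on runs of whitespace so repeated spaces do not produce empty words
    $words = preg_split( '/\s+/', trim( $phrase ), -1, PREG_SPLIT_NO_EMPTY );

    // Encapsulate each word in single quotes
    foreach ( $words as &$word ) {
        $word = "'" . $word . "'";
    }
    unset( $word ); // Break the reference left over from the loop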

    // Add each word to the output
    $output .= implode (" & ", $words);


    return $output;
}
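
Since format_input() only handles one shape of query, a guard along these lines could reject anything else before it is parsed. This is a rough sketch, not part of the code above: is_supported_query() is a hypothetical helper name, and it assumes terms are plain ASCII letters and digits and that AND and OR are written in upper case.

```php
<?php
// Hypothetical helper: returns true only for queries shaped like
//   term AND (phrase OR phrase ...)
// Assumption: terms consist of ASCII letters and digits only.
function is_supported_query( $input ) {
    // A word is any alphanumeric run that is not the keyword AND or OR
    $word   = '(?!(?:AND|OR)\b)[A-Za-z0-9]+';
    // A phrase is one or more words, optionally wrapped in double quotes
    // (quote balance is not checked, since format_input() strips quotes anyway)
    $phrase = '"?' . $word . '(?:\s+' . $word . ')*"?';

    $pattern = '/^\s*' . $phrase . '\s+AND\s+\(\s*' . $phrase
             . '(?:\s+OR\s+' . $phrase . ')*\s*\)\s*$/';

    return preg_match( $pattern, $input ) === 1;
}
```

You would call it before format_input() and show the user an error when it returns false, rather than sending a malformed string to Postgres.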

You can test your inputs like this:

$desired_output = "'Something' & (('Some' & 'Phrase')|('Some' & 'Other' & 'Phrase'))";

// Compare directly: assert() may be disabled by configuration, or throw on failure in PHP 8
if ( $formatted_output !== $desired_output ) {
    echo "Desired: $desired_output\n";
    echo "Actual:  $formatted_output\n";
}
else {
    echo "Output: $formatted_output\n";
}
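
If you need something more generic than one AND and one group, a token-based approach may be easier to extend than splitting on keywords. The following is only a sketch under some assumptions: to_tsquery_string() is a made-up name, AND and OR must be upper case, every group of words must be separated from the next by an operator or parenthesis, and single quotes inside words are simply dropped. It does not check that parentheses are balanced.

```php
<?php
// Hypothetical generic converter (a sketch, not the code from the answer above).
function to_tsquery_string( $query ) {
    // Tokens: a double-quoted phrase, a parenthesis, or a run of other characters
    preg_match_all( '/"[^"]*"|[()]|[^\s()"]+/', $query, $matches );

    $output = '';
    $words  = array(); // Plain words collected since the last operator

    // Write the collected words as 'a' or ('a' & 'b'), then reset the list
    $flush = function () use ( &$output, &$words ) {
        if ( count( $words ) === 0 ) {
            return;
        }
        $quoted = array();
        foreach ( $words as $word ) {
            $quoted[] = "'" . str_replace( "'", '', $word ) . "'";
        }
        $group   = implode( ' & ', $quoted );
        $output .= count( $quoted ) > 1 ? '(' . $group . ')' : $group;
        $words   = array();
    };

    foreach ( $matches[0] as $token ) {
        if ( $token === 'AND' || $token === 'OR' || $token === '(' || $token === ')' ) {
            // Operators and parentheses end the current run of words
            $flush();
            $output .= ( $token === 'AND' ) ? ' & ' : ( ( $token === 'OR' ) ? '|' : $token );
        } elseif ( $token[0] === '"' ) {
            // A quoted phrase always forms its own group
            $flush();
            $words = preg_split( '/\s+/', trim( $token, '"' ), -1, PREG_SPLIT_NO_EMPTY );
            $flush();
        } else {
            $words[] = $token;
        }
    }
    $flush();

    return $output;
}

echo to_tsquery_string( 'Something AND ("Some Phrase" OR Some Other Phrase)' ) . "\n";
// prints 'Something' & (('Some' & 'Phrase')|('Some' & 'Other' & 'Phrase'))
```

Whatever builds the string, it is safer to pass the result to to_tsquery() as a bound query parameter than to concatenate it into the SQL.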
Thanks George, that certainly gets me started. Need to make it a bit more generic but that has given me a massive headstart! – Dan Winer Jul 1 '11 at 11:15
    
Is there an example of this, but reformatted for MySQL? – Marc Maxson Jul 26 '12 at 19:46
