3

A postgres query selects the elements which have a name column containing any strings from an array:

select "elements".* from "elements" where "elements"."name" ~* 'hap|bir'

This works well.

Now, what is the regex pattern to select the elements only if the column contains every strings of the array (in no particular order)?

3 Answers 3

13

Your query produces the same result as the following non-regex LIKE expression using an array

SELECT * 
  FROM elements 
WHERE lower(name) LIKE ANY(ARRAY['%hap%', '%bir%']);

So, to select from elements only if the column contains every strings of the array, you change it from ANY to ALL

SELECT * 
  FROM elements 
WHERE lower(name) LIKE ALL(ARRAY['%hap%', '%bir%']);

Demo

Sign up to request clarification or add additional context in comments.

10 Comments

nope, it's not actually equivalent to pass two regexps to each string in the set than to pass only one. Regexp matching in a database is difficult to implement when you allow head wilcard expressions (you need a partial match search index, and you have to pass it through all the regexps in the array, making the search sequential, and used n times, when you have an array of different regexps) A single regexp decides in just one pass through the search string, and as such, is more efficient than having n regexps to match.
@LuisColorado : Pardon my ignorance. I couldn't understand clearly what you are saying. If you have an answer feel free to put one.Thanks!
I'm sorry, you have said Your query is actually equivalent to the following non-regex LIKE expression using an array, and it is not. Both queries require passing over the data trying to match a string (the candidate string) through a regexp matcher, but you feed your regexp in one array of n-regex, and the question uses only one. there's a n-to-1 pass difference between both queries. To be exactly equal, you need to construct a single regex that matches the ones in your array, and that's what the | operator does in normal regex. I dont write an answer...
... because this is not an answer to the original question. Should I have an answer to the original question, I had written one.
@LuisColorado : Ah! ok. When I meant equivalent, I said it gives the same results. I didn't worry about the implementation part of it. I don't know how Postgres implements the first query, so I can't say anything about it if it does in n pass or 1 pass.
|
4

Guess what, you can actually combine the regexp match operator, with ALL over an array.

SELECT * 
  FROM elements 
WHERE name ~* ALL(ARRAY['\mhap\M', '\mbir\M'])

This may be particularly advantageous if your search text is not a single field, but a concatenation of several fields.

(Note: \m is start of word, \M is end of word.)

Comments

1

Here is a regex pattern to select the elements only if the column contains every strings of the array (in no particular order):

select "elements".* from "elements" where "elements"."name" ~* '(?=.*?(hap))(?=.*?(bir))'

from here.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.