4

I have table myTable with a JSONB column myJsonb with a data structure that I want to index like:

{
  "myArray": [
    {
      "subItem": {
        "email": "[email protected]"
      }
    },
    {
      "subItem": {
        "email": "[email protected]"
      }
    }
  ]
}

I want to run indexed queries on email like:

SELECT *
FROM mytable
WHERE '[email protected]' IN (
  SELECT lower(
      jsonb_array_elements(myjsonb -> 'myArray')
      -> 'subItem'
      ->> 'email'
  )
);

How do I create a Postgres JSONB index for that?

1 Answer 1

4

If you don't need the lower() in there, the query can be simple and efficient:

SELECT *
FROM   mytable
WHERE  myjsonb -> 'myArray' @> '[{"subItem": {"email": "[email protected]"}}]'

Supported by a jsonb_path_ops index:

CREATE INDEX mytable_myjsonb_gin_idx ON mytable
USING  gin ((myjsonb -> 'myArray') jsonb_path_ops);

But the match is case-sensitive.

Case-insensitive!

If you need the search to match disregarding case, things get more complex.

You could use this query, similar to your original:

SELECT *
FROM   t
WHERE  EXISTS (
   SELECT 1
   FROM   jsonb_array_elements(myjsonb -> 'myArray') arr
   WHERE  lower(arr #>>'{subItem, email}') = '[email protected]'
   );

But I can't think of a good way to use an index for this.

Instead, I would use an expression index based on a function extracting an array of lower-case emails:

Function:

CREATE OR REPLACE FUNCTION f_jsonb_arr_lower(_j jsonb, VARIADIC _path text[])
  RETURNS jsonb LANGUAGE sql IMMUTABLE AS
'SELECT jsonb_agg(lower(elem #>> _path)) FROM jsonb_array_elements(_j) elem';

Index:

CREATE INDEX mytable_email_arr_idx ON mytable
USING  gin (f_jsonb_arr_lower(myjsonb -> 'myArray', 'subItem', 'email') jsonb_path_ops);

Query:

SELECT *
FROM   mytable 
WHERE  f_jsonb_arr_lower(myjsonb -> 'myArray', 'subItem', 'email') @> '"[email protected]"';

While this works with an untyped string literal or with actual jsonb values, it stops working if you pass text or varchar (like in a prepared statement). Postgres does not know how to cast because the input is ambiguous. You need an explicit cast in this case:

... @> '"[email protected]"'::text::jsonb;

Or pass a simple string without enclosing double quotes and do the conversion to jsonb in Postgres:

... @> to_jsonb('[email protected]'::text);

Related, with more explanation:

6
  • Case-insensitivity is a requirement; the latter appears to work. One workaround I was considering to make the former solution workable was to add a normalized column, but I couldn't figure out the syntax for jsonb_array_elements in an UPDATE statement to backfill the data for more than 1 subItem.
    – JJ Zabkar
    Dec 18, 2017 at 20:06
  • Hmm... I'm using JPA, but having trouble specifying the email as a parameter within the single-quotes...
    – JJ Zabkar
    Dec 18, 2017 at 20:53
  • @JJZabkar: You took note of the double-quotes (required for JSON strings) inside the single quotes (denoting Postgres string literals)? So the passed string has to come with surrounding double-quotes. Alternatively, you could extract a Postgres array consisiting of plain text strings. Recent example: dba.stackexchange.com/a/193378/3684 Dec 18, 2017 at 21:17
  • 1
    Yes, took note, but I can't find the right syntax to parameterize it within a JPA nativeQuery. Since you've (quite elegantly) solved the main question here, I believe it's appropriate to move my secondary issue to a new thread: stackoverflow.com/questions/47877601/…
    – JJ Zabkar
    Dec 18, 2017 at 22:52
  • Yes, fully agree. I think I may have an idea what's going wrong and appended some pointers above. Dec 18, 2017 at 23:42

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge that you have read and understand our privacy policy and code of conduct.

Not the answer you're looking for? Browse other questions tagged or ask your own question.