Take the 2-minute tour ×
Stack Overflow is a question and answer site for professional and enthusiast programmers. It's 100% free, no registration required.

In PostgreSQL 9.3, there are multiple ways to build an expression, which points to a json field's nested property:

data->'foo'->>'bar'
data#>>'{foo,bar}'
json_extract_path_text(data, 'foo', 'bar')

Therefore PostgreSQL only use these indexes, if the query's expression is an exact match with the index's expression.

CREATE TABLE json_test_index1(data json);
CREATE TABLE json_test_index2(data json);
CREATE TABLE json_test_index3(data json);

CREATE INDEX ON json_test_index1((data->'foo'->>'bar'));
CREATE INDEX ON json_test_index2((data#>>'{foo,bar}'));
CREATE INDEX ON json_test_index3((json_extract_path_text(data, 'foo', 'bar')));

-- these queries use an index, while all other combinations not:

EXPLAIN SELECT * FROM json_test_index1 WHERE data->'foo'->>'bar' = 'baz';
EXPLAIN SELECT * FROM json_test_index2 WHERE data#>>'{foo,bar}' = 'baz';
EXPLAIN SELECT * FROM json_test_index3 WHERE json_extract_path_text(data, 'foo', 'bar') = 'baz';

My questions are:

Is this behaviour intended? I thought the query optimizer should (at least) use the index with the #>> operator, when the query contains the appropriate call of json_extract_path_text() -- and vice versa.

If I want to use more of these expressions in my application (not just one, f.ex. stick to the -> & ->> operators), what indexes should I build? (I hope, not all of them.)

Are there any chance, that some future Postgres versions' optimizers will understand the equivalence of these expressions?

EDIT:

When i create an additional operator for that:

CREATE OPERATOR ===> (
    PROCEDURE = json_extract_path_text,
    LEFTARG = json,
    RIGHTARG = text[]
);

This query (table from the previous example) still not uses its index:

EXPLAIN SELECT * FROM json_test_index3 WHERE data ===> '{foo,bar}' = 'baz';

Bonus question:

While Postgres expands the operators into function calls (behind the scenes), why this still not using its index?

share|improve this question
    
With the current version (9.3) this is "intended" (i.e. it's just not implemented). 9.4 will include a major improvement in the way JSON is stored internally which enables it to be usefull for GIN indexes. And then the #>> operator should be able to make use of the index (similar to the current capabilities when indexing the hstore data type). If you have "flat" key/value pairs that you need to index, currently your only option is hstore (or wait for 9.4) –  a_horse_with_no_name Apr 24 at 13:53
    
@a_horse_with_no_name While GIN indexes on jsonb fields will give a much more flexible index, it will come with a cost of performance. I need to query only a few properties inside the json object and specific properties' own indexes should be faster. I just wonder, if there are multiple ways to get a property, why i need to build an index for all of them, to make use of one (in every case). Or am i missing something? –  pozs Apr 24 at 14:29
    
No, you are not missing something. A function based index is only used when the query contains the same expression as the one that was used in the index. Regarding the costs argument: I would think the costs for a single GIN index on a jsonb column will probably be less than the combined cost for several indexes (one for each property) that can only be used by a single expression (one for exactly that property). But currently I don't see a way how you can avoid that (provided you have to stick to JSON). –  a_horse_with_no_name Apr 24 at 14:37
    
@a_horse_with_no_name combined cost for several indexes is a fair point. I can wait for GIN indexes. –  pozs Apr 24 at 14:51

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Browse other questions tagged or ask your own question.