Take the 2-minute tour ×
Stack Overflow is a question and answer site for professional and enthusiast programmers. It's 100% free, no registration required.

I can't find a definite answer to this question in the documentation. If a column is an array type, will all the entered values be individually indexed?

I created a simple table with 1 int[] column, and put a unique index on it. What I noticed is that I couldn't add the same array of ints, which leads me to believe the index is a composite of the array items, not an index of each item.

INSERT INTO "Test"."Test" VALUES ('{10, 15, 20}')
INSERT INTO "Test"."Test" VALUES ('{10, 20, 30}')

SELECT * FROM "Test"."Test" WHERE 20 = ANY ("Column1")

Is this index helping this query?

share|improve this question

This question has an open bounty worth +100 reputation from Tregoreg ending tomorrow.

The current answer(s) are out-of-date and require revision given recent changes.

I didn't find the current answers working. Using GIN index on array-typed column does not increase the performance of ANY() operator. Is there really no solution?

3 Answers 3

up vote 46 down vote accepted

Yes you can index an array, but you have to use the array operators and the GIN-index type.

Example:

    CREATE TABLE "Test"("Column1" int[]);
    INSERT INTO "Test" VALUES ('{10, 15, 20}');
    INSERT INTO "Test" VALUES ('{10, 20, 30}');

    CREATE INDEX idx_test on "Test" USING GIN ("Column1");

    -- To enforce index usage because we have only 2 records for this test... 
    SET enable_seqscan TO off;

    EXPLAIN ANALYZE
    SELECT * FROM "Test" WHERE "Column1" @> ARRAY[20];

Result:

Bitmap Heap Scan on "Test"  (cost=4.26..8.27 rows=1 width=32) (actual time=0.014..0.015 rows=2 loops=1)
  Recheck Cond: ("Column1" @> '{20}'::integer[])
  ->  Bitmap Index Scan on idx_test  (cost=0.00..4.26 rows=1 width=0) (actual time=0.009..0.009 rows=2 loops=1)
        Index Cond: ("Column1" @> '{20}'::integer[])
Total runtime: 0.062 ms
share|improve this answer
13  
As the OP surmises, this doesn't actually index individual array values, but instead indexes the entire array. So, while this will help the query in question (see explain plan), this means you can't create unique constraints (easily) on individual array values. That said, if you are using integer arrays, you can use the contrib module "intarray" to index individual array values, which can be much faster in many cases. (IIRC there is some work being done on this for text values, but contributors would probably be welcome to help finish it off). –  xzilla Dec 5 '11 at 18:13

It's now possible to index the individual array elements. For example:

CREATE TABLE test (foo int[]);
INSERT INTO test VALUES ('{1,2,3}');
INSERT INTO test VALUES ('{4,5,6}');
CREATE INDEX test_index on test ((foo[1]));
SET enable_seqscan TO off;

EXPLAIN ANALYZE SELECT * from test WHERE foo[1]=1;
                                                QUERY PLAN                                                    
------------------------------------------------------------------------------------------------------------------
 Index Scan using test_index on test  (cost=0.00..8.27 rows=1 width=32) (actual   time=0.070..0.071 rows=1 loops=1)
   Index Cond: (foo[1] = 1)
 Total runtime: 0.112 ms
(3 rows)

This works on at least Postgres 9.2.1. Note that you need to build a separate index for each array index, in my example I only indexed the first element.

share|improve this answer
8  
Let it not be lost - this approach is hopeless for variable length array where you want to use the ANY() operator. –  Καrτhικ Aug 5 '14 at 11:10
    
This is really not very useful. If you have a fixed number of array elements, you'd rather use individual columns for each element (and plain btree indices) instead of building a more expensive expression index for each array item. Storage of individual columns is much cheaper without array overhead, too. –  Erwin Brandstetter 12 hours ago

@Tregoreg raised a question in the comment to his offered bounty:

I didn't find the current answers working. Using GIN index on array-typed column does not increase the performance of ANY() operator. Is there really no solution?

@Frank's accepted answer tells you to use array operators, which is still correct in 2015.
Per documentation:

the standard distribution of PostgreSQL includes GIN operator classes for one-dimensional arrays, which support indexed queries using these operators:

 <@
 @>
 =
 &&

The complete list of built-in operator classes for GIN indices in the standard distribution is here.

In Postgres indexes are bound to operators, not data types or functions or anything else. That's a heritage from the original Berkeley design of Postgres and very hard to change now. And it's generally working just fine. Here is a thread on pgsql-bugs with Tom Lane commenting on this.

The ANY construct is not an operator and can be used in combination with various operators. When used as = ANY (array_expression) only indexes supporting the = operator qualify, and GIN indexes are out. Postgres is not currently smart enough to derive a GIN-indexable expression from it.

Related answers:

share|improve this answer

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.