Take the 2-minute tour ×
Stack Overflow is a question and answer site for professional and enthusiast programmers. It's 100% free, no registration required.

I've got a PostgreSQL array of strings as a column in a table. I created an index using the GIN method. But ANY queries won't use the index (instead, they're doing a sequential scan of the whole table with a filter). What am I missing?

Here's my migration:

class CreateDocuments < ActiveRecord::Migration
  def up
    create_table :documents do |t|
      t.string :title
      t.string :tags, array: true, default: []
      t.timestamps
    end

    add_index :documents, :tags, using: 'gin'

    (1..100000).each do |i|
      tags = []
      tags << 'even' if (i % 2) == 0
      tags << 'odd' if (i % 2) == 1
      tags << 'divisible by 3' if (i % 3) == 0
      tags << 'divisible by 4' if (i % 4) == 0
      tags << 'divisible by 5' if (i % 5) == 0

      Document.create(
        title: i,
        tags: tags
      )
    end
  end

  def down
    drop_table :documents
  end
end

Here's my query, with the resulting number of rows.

Document.where("'divisible by 5' = ANY (tags)").explain
    Document Load (249.8ms)  SELECT "documents".* FROM "documents" WHERE ('divisible by 5' = ANY (tags))
    D, [2014-03-07T17:09:49.689709 #41937] DEBUG -- :   Document Load (249.8ms)  SELECT "documents".* FROM "documents" WHERE ('divisible by 5' = ANY (tags))
    => EXPLAIN for: SELECT "documents".* FROM "documents"  WHERE ('divisible by 5' = ANY (tags))
                       QUERY PLAN
    -----------------------------------------------------------------
    Seq Scan on documents  (cost=0.00..3500.00 rows=20057 width=69)
      Filter: ('divisible by 5'::text = ANY ((tags)::text[]))
    (2 rows)

Document.where("'divisible by 5' = ANY (tags)").length
    Document Load (258.0ms)  SELECT "documents".* FROM "documents" WHERE ('divisible by 5' = ANY (tags))
    D, [2014-03-07T17:09:55.536517 #41937] DEBUG -- :   Document Load (258.0ms)  SELECT "documents".* FROM "documents" WHERE ('divisible by 5' = ANY (tags))
    => 20000
share|improve this question
    
How many rows does the table products have? –  Igor Romanchenko Mar 7 at 16:37
    
What is the cardinality of that column?How many unique records reported to the number of rows? –  Mihai Mar 7 at 16:43
    
@IgorRomanchenko The table has just over 100,000 rows –  mohith Mar 7 at 17:50
    
@Mihai There are over 100,000 unique rows. The column in question has over 18,000 unique values –  mohith Mar 7 at 17:52
    
Probably that ANY keyword make the optimizer choose a sequential scan as the best option.How many rows does that query returns? –  Mihai Mar 7 at 18:02

1 Answer 1

up vote 2 down vote accepted

To work with a GIN index use the <@ ("is contained by") operator instead of the ANY construct.

The manual states here that default GIN indexes currently only support these operators (additional functionality is shipped with extensions):

<@
@>
=
&&

So try this query:

Document.where("'{divisible by 5}' <@ tags").explain

Note that the left hand side needs to be in array notation, too, even if it's a single element. The operator <@ works for arrays. Hence '{divisible by 5}'.

share|improve this answer
    
Brilliant. Thanks! –  mohith Mar 13 at 16:09

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.