
Below is my query. I am trying to get it to use an index scan, but it will only seq scan.

By the way the metric_data table has 130 million rows in it. The metrics table has about 2000 rows in it.

metric_data table columns:

  metric_id integer
, t timestamp
, d double precision
, PRIMARY KEY (metric_id, t)

How can I get this query to use my PRIMARY KEY index?

SELECT
    S.metric,
    D.t,
    D.d
FROM metric_data D
INNER JOIN metrics S
    ON S.id = D.metric_id
WHERE S.NAME = ANY (ARRAY ['cpu', 'mem'])
  AND D.t BETWEEN '2012-02-05 00:00:00'::TIMESTAMP
              AND '2012-05-05 00:00:00'::TIMESTAMP;

EXPLAIN:

Hash Join  (cost=271.30..3866384.25 rows=294973 width=25)
  Hash Cond: (d.metric_id = s.id)
  ->  Seq Scan on metric_data d  (cost=0.00..3753150.28 rows=29336784 width=20)
        Filter: ((t >= '2012-02-05 00:00:00'::timestamp without time zone)
             AND (t <= '2012-05-05 00:00:00'::timestamp without time zone))
  ->  Hash  (cost=270.44..270.44 rows=68 width=13)
        ->  Seq Scan on metrics s  (cost=0.00..270.44 rows=68 width=13)
              Filter: ((sym)::text = ANY ('{cpu,mem}'::text[]))



For testing purposes you can force the use of the index by "disabling" sequential scans in your current session:

SET enable_seqscan = OFF;

Details in the manual. I put "disabling" in quotes because you cannot actually disable sequential table scans; the setting just makes any other available option preferable to Postgres. This proves that the multicolumn index on (metric_id, t) can be used, just not as effectively as an index with t as the leading column.
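A minimal test sketch in a psql session, using the query from the question (reset the setting afterwards so it does not affect other queries):

```sql
SET enable_seqscan = off;  -- session-local planner hint, for testing only

EXPLAIN ANALYZE
SELECT S.metric, D.t, D.d
FROM   metric_data D
JOIN   metrics S ON S.id = D.metric_id
WHERE  S.name = ANY (ARRAY['cpu', 'mem'])
AND    D.t BETWEEN '2012-02-05 00:00:00'::timestamp
               AND '2012-05-05 00:00:00'::timestamp;

RESET enable_seqscan;      -- restore the default planner behavior
```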

You probably get better results (without creating more indexes) if you switch the order in your PRIMARY KEY to (t, metric_id).
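A sketch of swapping the key order. The constraint name metric_data_pkey is an assumption (the Postgres default for this table), and rebuilding the primary key on 130 million rows takes an exclusive lock and a full index rebuild, so test on a copy first:

```sql
-- Assumes the default PK constraint name metric_data_pkey
ALTER TABLE metric_data DROP CONSTRAINT metric_data_pkey;
ALTER TABLE metric_data ADD PRIMARY KEY (t, metric_id);
```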

Note that you do not normally have to force a better query plan by manual intervention. If setting enable_seqscan = off leads to a much better plan, something is probably off with your installation. Consider this related answer.

Setting this flag made that query above run in 150ms compared to 45secs on my machine. Thanks! –  Jeff Jan 28 '13 at 5:51
Very instructive answer. And incredible results. –  klin Jan 28 '13 at 11:16
@Jeff: I added another hint to my answer. –  Erwin Brandstetter Jan 28 '13 at 13:31

You cannot force an index scan in this case, because it will not make the query faster.

You currently have an index on metric_data (metric_id, t), but the server cannot take advantage of it for this query, because the query needs to filter by metric_data.t alone (without metric_id), and there is no index with t as the leading column. The server can use the leading columns of a compound index, but only starting from the first one. For example, a search by metric_id alone would be able to employ this index.

If you create another index on metric_data (t), your query will make use of that index and will work much faster.
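Something like the following (the index name is illustrative):

```sql
-- Hypothetical name; any name works
CREATE INDEX metric_data_t_idx ON metric_data (t);
```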

Also, you should make sure that you have an index on metrics (id).

This is not quite correct. A multi-column index can be used on the second field alone, too. Even though not as effective. Consider this related question on dba.SE. –  Erwin Brandstetter Jan 28 '13 at 4:53

It appears you are lacking suitable FK constraints:

CREATE TABLE metric_data
( metric_id integer
, t timestamp
, d double precision
, PRIMARY KEY (metric_id, t)
, CONSTRAINT metrics_xxx_fk FOREIGN KEY (metric_id) REFERENCES metrics (id)
);

and in table metrics:

CREATE TABLE metrics
( id INTEGER PRIMARY KEY
...
);

Also check whether your statistics are sufficient and fine-grained enough, since you intend to select about 0.2 % of the metric_data table.
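A sketch of raising the per-column statistics target for t and re-gathering statistics (1000 is an arbitrary example target; the default is 100):

```sql
-- Collect more fine-grained statistics for the t column
ALTER TABLE metric_data ALTER COLUMN t SET STATISTICS 1000;
ANALYZE metric_data;
```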

