PostgreSQL index not used for query on range

Question

I'm using PostgreSQL (9.2.0) and have a table of IP ranges. Here's the SQL:

CREATE TABLE ips
(
  id serial NOT NULL,
  begin_ip_num bigint,
  end_ip_num bigint,
  country_name character varying(255),
  CONSTRAINT ips_pkey PRIMARY KEY (id )
)

I've added indices on both begin_ip_num and end_ip_num:

CREATE INDEX index_ips_on_begin_ip_num
  ON ips
  USING btree
  (begin_ip_num );

CREATE INDEX index_ips_on_end_ip_num
  ON ips
  USING btree
  (end_ip_num );

The Query being used is:

SELECT "ips".* FROM "ips" WHERE (3065106743 BETWEEN begin_ip_num AND end_ip_num);

The problem is that my BETWEEN query is only using the index on begin_ip_num. After using the index, it filters the result using end_ip_num. Here's the EXPLAIN ANALYZE result:

Index Scan using index_ips_on_begin_ip_num on ips  (cost=0.00..2173.83 rows=27136 width=76) (actual time=16.349..16.350 rows=1 loops=1)
Index Cond: (3065106743::bigint >= begin_ip_num)
Filter: (3065106743::bigint <= end_ip_num)
Rows Removed by Filter: 47596
Total runtime: 16.425 ms

I've already tried various combinations of indices including adding a composite index on both begin_ip_num and end_ip_num.

I don't see your query and your version of Postgres? This obviously isn't a "a table of ip addresses", but a table of IP ranges. Please explain what you are trying to achieve. — Erwin Brandstetter, Jan 18 '13 at 21:22
If IP ranges have a significant impact in the db's target business than you should consider using the ip4r extension. — Clodoaldo Neto, Jan 19 '13 at 10:07

Erwin Brandstetter · Accepted Answer · 2013-01-18 22:39:38Z

Try a multicolumn index, but with reversed order on the second column:

CREATE INDEX index_ips_begin_end_ip_num ON ips (begin_ip_num, end_ip_num DESC);

Ordering is mostly irrelevant for a single-column index, since it can be scanned backwards almost as fast. But it is important for multicolumn indexes.

With the index I propose, Postgres can scan the first column and find the address, where the rest of the index fulfills the first condition. Then it can, for each value of the first column, return all rows that fulfill the second condition, until the first one fails. Then jump to the next value of the first column, etc.
This is still not very effective and Postgres may be faster just scanning the first index column and filtering for the second. Very much depends on your data distribution.

What would really help here is a GiST index for a int8range column, available since PostgreSQL 9.2.

Barring that, you can check out this closely related answer on dba.SE with a rather sophisticated regime with partial indexes. Advanced stuff, but it delivers great performance.

Either way, CLUSTER using the multicolumn index from above can help performance:

CLUSTER ips USING index_ips_begin_end_ip_num

This way, candidates fulfilling your first condition are packed onto the same or adjacent data pages. Can help performance a lot with if you have lots of rows per value of the first column. Else it is not effective.

Also, is autovaccuum running or have you run ANALYZE on the table? You need current statistics for Postgres to pick appropriate query plans.

Thanks a lot! I've settled with the composite index. Adding DESC to end_ip_num did make a difference. Just as a side not, the composite index you've mentioned should only work >= 9.2.0, right? — Zain Zafar, Jan 18 '13 at 22:32
@ZainZafar: The multicolumn (= composite) index I mention works in all versions this side of the millennium (at least since 7.3, probably longer). The linked solution on dba.SE has been tested with 9.1. The range type in combination with a GiST index requires 9.2. — Erwin Brandstetter, Jan 18 '13 at 22:36

pbnelson · Answer 2 · 2014-03-26 20:18:11Z

I had exactly this same problem on a nearly identical dataset from maxmind.com's free geiop table. I solved it using Erwin's tip about range types and GiST indexes. The GiST index was key. Without it I was querying at best about 3 rows per second. With it I queried nearly 500000 rows in under 10 seconds! Since Erwin didn't post detailed instructions on how to do this, I thought I'd add them, here...

First of all, you must add a new column having the range type, note that int8range is required for bigint types. Next set its values appropriately, note that the '[]' parameter indicates to make the range inclusive at lower and upper bounds (rtfm). Finally add the index, note that the GiST index is where all the performance advantage comes from.

alter table ips add column iprange int8range;
update ips set iprange=int8range(begin_ip_num, end_ip_num, '[]');
create index index_ips_on_iprange on ips using gist (iprange);

Having laid the groundwork, you can now use the '<@' contained-by operator to search specific addresses against the table. See http://www.postgresql.org/docs/9.2/static/functions-range.html

SELECT "ips".* FROM "ips" WHERE (3065106743 <@ iprange);

a1ex07 · Answer 3 · 2013-01-18 21:32:46Z

up vote 0 down vote

I believe your query looks like WHERE [constant] BETWEEN begin_ip_num AND end_ipnum or

As far as I know Postgres doesn't have "AND-EQUAL " access plan, so you need to add a composite index on 2 columns as suggested by Erwin Brandstetter.

answered Jan 18 '13 at 21:32

a1ex07
23.7k42859

I've tried using a composite index and it results in a sequential scan. – Zain Zafar Jan 18 '13 at 21:46

add a comment |

asked	1 year ago
viewed	1015 times
active	4 months ago

current community

your communities

more stack exchange communities

PostgreSQL index not used for query on range

3 Answers 3

Your Answer

Not the answer you're looking for? Browse other questions tagged postgresql database-design index between or ask your own question.

Visit Chat

Linked

Hot Network Questions

current community

your communities

more stack exchange communities

PostgreSQL index not used for query on range

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged postgresql database-design index between or ask your own question.

Visit Chat

Linked

Related

Hot Network Questions