PostgreSQL 8.4; Three tables - store (~100k, pk id, fk supplier_id & item_id), supplier(~10 pk supplier_id), item(~1000 pk item_id);
I created the following query to get the data I need:
SELECT store.quantity, store.price, x.supplier_name
FROM store NATURAL JOIN
(SELECT * FROM item NATURAL JOIN supplier) AS x
WHERE store.price > 500 AND store.quantity > 0 AND
store.quantity < 100 AND
x.item_name = 'SomeName';
The query plan:
Nested Loop (cost=20.76..6513.55 rows=8 width=229)
-> Hash Join (cost=20.76..6511.30 rows=8 width=15)
Hash Cond: (store.item_id = item.item_id)
-> Seq Scan on store (cost=0.00..6459.00 rows=8388 width=23)
Filter: ((price > 500::numeric) AND (quantity > 0) AND (quantity < 100))
-> Hash (cost=20.75..20.75 rows=1 width=8)
-> Seq Scan on item (cost=0.00..20.75 rows=1 width=8)
Filter: ((item_name)::text = 'SomeName'::text)
-> Index Scan using supplier_pkey on supplier (cost=0.00..0.27 rows=1 width=222)
Index Cond: (supplier.supplier_id = store.supplier_id)
Now the aim is to reduce the cost by more than 30% by optimizing the query itself. The only instances of this problem I found were solved by modifying the table or the server settings, but I am looking to do this by modifying nothing else than the query and that's where I fell short in research.
Clearly the issue to be solved is the Seq Scan, which brings me to thinking I need to arrange it so that the scanning/filtering is applied only to a subset of the store table - but iirc you need to scan the table in any such case, so maybe use something else than a Seq Scan? Index scan isn't going to help since I wouldn't be filtering by the index... I'm puzzled here because this seems more of a choice that the PostgreSQL optimizer makes and not something I can change at will...
(If you're wondering, this was part of an assignment and I'm asking here because I have spent quite a few hours researching the problem failing to find anything relevant, and I just gave up on it, but I'm still curious...)