EXPLAIN ANALYZE
SELECT count(*)
FROM "businesses"
WHERE (
source = 'facebook'
OR EXISTS(
SELECT *
FROM provider_business_map pbm
WHERE
pbm.hotstepper_business_id=businesses.id
AND pbm.provider_name='facebook'
)
);
PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=233538965.74..233538965.75 rows=1 width=0) (actual time=116169.720..116169.721 rows=1 loops=1)
-> Seq Scan on businesses (cost=0.00..233521096.48 rows=7147706 width=0) (actual time=11.284..116165.646 rows=3693 loops=1)
Filter: (((source)::text = 'facebook'::text) OR (alternatives: SubPlan 1 or hashed SubPlan 2))
SubPlan 1
-> Index Scan using idx_provider_hotstepper_business on provider_business_map pbm (cost=0.00..16.29 rows=1 width=0) (never executed)
Index Cond: (((provider_name)::text = 'facebook'::text) AND (hotstepper_business_id = businesses.id))
SubPlan 2
-> Index Scan using idx_provider_hotstepper_business on provider_business_map pbm (cost=0.00..16.28 rows=1 width=4) (actual time=0.045..5.685 rows=3858 loops=1)
Index Cond: ((provider_name)::text = 'facebook'::text)
Total runtime: 116169.820 ms
(10 rows)
This query takes over a minute and it's doing a count that results in ~3000. It seems the bottleneck is the sequential scan but I'm not sure what index I would need on the database to optimize this. It's also worth noting that I haven't tuned postgres so if there's any tuning that would help it may be worth considering. Although my DB is 15GB and I don't plan on trying to fit all of that in memory anytime soon so I'm not sure changing RAM related values would help a lot.