Take the 2-minute tour ×
Stack Overflow is a question and answer site for professional and enthusiast programmers. It's 100% free, no registration required.
SELECT c.my_type || '-' || cs.my_site, cs.id
FROM table2 cs
JOIN table1 c ON cs.id = c.id
WHERE c.status = 'ACTIVE'

EXPLAIN ANALYZE:

Hash Join  (cost=927.36..66271.66 rows=1141298 width=24) (actual time=9.221..888.461 rows=825843 loops=1)
  Hash Cond: (cs.id = c.id)
  ->  Seq Scan on table2 cs  (cost=0.00..29630.92 rows=1652792 width=21) (actual time=0.006..320.526 rows=1668338 loops=1)
  ->  Hash  (cost=757.70..757.70 rows=13573 width=7) (actual time=8.995..8.995 rows=13044 loops=1)
        ->  Seq Scan on tabl1 c  (cost=0.00..757.70 rows=13573 width=7) (actual time=0.005..6.005 rows=13044 loops=1)
              Filter: ((status)::text = 'ACTIVE'::text)
Total runtime: 948.947 ms

vs

select c.my_type || '-' || cs.my_site, cs.id 
from (SELECT id, my_site FROM table1 WHERE status = 'ACTIVE') as c
join table2 cs on cs.id = c.id

EXPLAIN ANALYZE

Hash Join  (cost=927.36..66271.66 rows=1141298 width=24) (actual time=8.927..894.598 rows=821287 loops=1)
  Hash Cond: (cs.id = c.id)
  ->  Seq Scan on table2 cs  (cost=0.00..29630.92 rows=1652792 width=21) (actual time=0.005..331.769 rows=1663782 loops=1)
  ->  Hash  (cost=757.70..757.70 rows=13573 width=7) (actual time=8.706..8.706 rows=13044 loops=1)
        ->  Seq Scan on table1  (cost=0.00..757.70 rows=13573 width=7) (actual time=0.005..6.044 rows=13044 loops=1)
              Filter: ((status)::text = 'ACTIVE'::text)
Total runtime: 954.941 ms

I would expect the second query to be relatively faster than the first.. but it doesn't seem to be. Keep in mind that these are very large tables.

share|improve this question
    
Please show us the results from EXPLAIN ANALYZE, otherwise it's impossible to help you. –  Frank Heikens May 30 '14 at 15:20
    
Updated with the explain analyze –  user1243716 May 30 '14 at 15:28
3  
If you want us to help optimize a query, you need to show us the table and index definitions, as well as row counts for each of the tables. Maybe your tables are defined poorly. Maybe the indexes aren't created correctly. Maybe you don't have an index on that column you thought you did. Without seeing the table and index definitions, we can't tell. We also need row counts because that can affect query optimization greatly. If you know how to do an EXPLAIN or get an execution plan, put the results in the question as well. If you have no indexes, visit use-the-index-luke.com ASAP. –  Andy Lester May 30 '14 at 15:32
3  
Please explain why you would expect the second to perform better? You seem to have demonstrated that the Postgres optimizer is pretty smart. Your expectation is wrong. –  Gordon Linoff May 30 '14 at 15:32
    
In both cases your statictics seem to be off (compare the expected vs observed number of rows). An index might help, but I cannot tell the N:M structure from here or the cardinality of status = 'ACTIVE' –  wildplasser May 30 '14 at 15:35

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Browse other questions tagged or ask your own question.