I have the following query:
Query:
SELECT
a1.id id,
((SELECT SUM(SFF.pnl)
FROM positions SFF
JOIN tag_grid_relations SF ON SFF.stock_id = SF.stock_id
WHERE SF.tag_value_id = a1.id AND SFF.fund_id = 3 AND SFF.date_id >= 42369)) c0
FROM tag_values a1 LEFT JOIN entities a2 ON a2.id = a1.entity_id
WHERE a1.tag_id = 642 AND a2.archive_on IS NULL;
Plan:
Nested Loop Left Join (cost=0.72..4531364.79 rows=157 width=4) (actual time=60.321..8007.218 rows=160 loops=1)
Filter: (a2.archive_on IS NULL)
-> Index Scan using tag_values_tag_id_name_key on tag_values a1 (cost=0.28..29.56 rows=159 width=8) (actual time=0.009..0.394 rows=160 loops=1)
Index Cond: (tag_id = 642)
-> Index Scan using entities_pkey on entities a2 (cost=0.43..2.45 rows=1 width=12) (actual time=0.008..0.009 rows=1 loops=160)
Index Cond: (id = a1.entity_id)
SubPlan 1
-> Aggregate (cost=28859.50..28859.51 rows=1 width=7) (actual time=50.026..50.026 rows=1 loops=160)
-> Merge Join (cost=5578.68..28498.68 rows=144330 width=7) (actual time=27.161..49.928 rows=248 loops=160)
Merge Cond: (sf.stock_id = sff.stock_id)
-> Sort (cost=5578.25..5589.54 rows=4517 width=4) (actual time=0.189..0.195 rows=12 loops=160)
Sort Key: sf.stock_id
Sort Method: quicksort Memory: 25kB
-> Subquery Scan on sf (cost=5213.70..5304.04 rows=4517 width=4) (actual time=0.163..0.177 rows=12 loops=160)
-> HashAggregate (cost=5213.70..5258.87 rows=4517 width=8) (actual time=0.147..0.156 rows=12 loops=160)
" Group Key: stock_fields.tag__2003, stock_fields.stock_id"
-> Append (cost=19.13..5191.12 rows=4517 width=8) (actual time=0.040..0.135 rows=12 loops=160)
-> Bitmap Heap Scan on stock_fields (cost=19.13..188.21 rows=167 width=8) (actual time=0.008..0.008 rows=0 loops=160)
Recheck Cond: ((tag__2003 IS NOT NULL) AND (tag__2003 = a1.id))
-> Bitmap Index Scan on stock_fields_tag__2003_idx (cost=0.00..19.09 rows=167 width=0) (actual time=0.006..0.006 rows=0 loops=160)
Index Cond: ((tag__2003 IS NOT NULL) AND (tag__2003 = a1.id))
-> Bitmap Heap Scan on stock_fields stock_fields_1 (cost=583.49..4907.24 rows=4300 width=8) (actual time=0.006..0.006 rows=0 loops=160)
Recheck Cond: ((tag__2021 IS NOT NULL) AND (tag__2021 = a1.id))
-> Bitmap Index Scan on stock_fields_tag__2021_idx (cost=0.00..582.41 rows=4300 width=0) (actual time=0.005..0.005 rows=0 loops=160)
Index Cond: ((tag__2021 IS NOT NULL) AND (tag__2021 = a1.id))
-> Index Scan using stock_fields_tag__2048_idx on stock_fields stock_fields_2 (cost=0.41..3.46 rows=2 width=8) (actual time=0.006..0.006 rows=0 loops=160)
Index Cond: ((tag__2048 IS NOT NULL) AND (tag__2048 = a1.id))
-> Index Scan using stock_fields_tag__2034_idx on stock_fields stock_fields_3 (cost=0.41..2.44 rows=1 width=8) (actual time=0.012..0.098 rows=12 loops=160)
Index Cond: ((tag__2034 IS NOT NULL) AND (tag__2034 = a1.id))
-> Index Scan using stock_fields_extension1_f102897_idx on stock_fields_extension1 (cost=0.29..44.59 rows=47 width=8) (actual time=0.004..0.004 rows=0 loops=160)
Index Cond: ((f102897 IS NOT NULL) AND (f102897 = a1.id))
-> Materialize (cost=0.43..20847.44 rows=45820 width=11) (actual time=0.023..51.730 rows=34342 loops=130)
-> Index Scan using positions_fund_id_stock_id_date_id_key on positions sff (cost=0.43..20732.89 rows=45820 width=11) (actual time=0.021..39.147 rows=34342 loops=130)
Index Cond: ((fund_id = 3) AND (date_id >= 42369))
Planning time: 2.149 ms
Execution time: 8007.411 ms
The above query takes >8s most of which is spent materializing the same subquery for each row 130 times:
-> Materialize (cost=0.43..20847.44 rows=45820 width=11) (actual time=0.023..52.512 rows=34342 loops=130)
-> Index Scan using positions_fund_id_stock_id_date_id_key on positions sff (cost=0.43..20732.89 rows=45820 width=11) (actual time=0.021..39.828 rows=34342 loops=130)
Index Cond: ((fund_id = 3) AND (date_id >= 42369))
Why is PostgreSQL doing this?