Why is count(*) returning multiple results in this subquery (postgresql)?

Question

I want to group the results of a count(*) query into value buckets. I'm testing this on the dellstore2 postgresql sample database. My query below returns the right answers, but does so once for each row in the table (several thousand identical results). I can fix this by adding LIMIT 1 the the end of the query but I'd like to understand why I'm getting the duplicates in case it points to a wider problem with my approach. The query is:

SELECT
    (SELECT count(*)
        FROM
            orders
        WHERE
            totalamount > 0 AND totalamount <= 100) AS ">0 <= 100",
    (SELECT count(*)
        FROM
            orders
        WHERE
            totalamount > 100 AND totalamount <= 200) AS ">100 <= 200"
...
FROM
    orders;

EDIT Andomar's answer also allowed me to find the following approach (adapted from an example in SQL in a nutshell (O'Reilly)). This lets me have the buckets in one column, with a row for each bucket/answer pairing. I thought I'd include it for anyone with that use-case:

SELECT CASE
        WHEN totalamount IS NULL THEN 'Unknown'
        WHEN totalamount <= 100 THEN 'Not more than 100'
        WHEN totalamount <= 200 THEN 'Not more than 200'
        ELSE 'Over 200'
    END "Bucket",
    COUNT(*) "Number of results"
FROM
    orders
GROUP BY CASE
        WHEN totalamount IS NULL THEN 'Unknown'
        WHEN totalamount <= 100 THEN 'Not more than 100'
        WHEN totalamount <= 200 THEN 'Not more than 200'
        ELSE 'Over 200'
    END
ORDER BY
    MIN(totalamount);

It seems there are a few closing parentheses missing. Please post the real code. — wildplasser, Commented Mar 20, 2012 at 12:11
Thanks wildplasser — now fixed (I'm working across two unconneted machines so couldn't cut and paste). — cms_mgr, Commented Mar 20, 2012 at 12:18

Andomar · Accepted Answer · 2012-03-20 12:13:54Z

6

You're selecting every row from orders, and then for each row, the subqueries are evaluated.

Consider this approach instead:

select  count(case when 0 < totalamount and totalamount <= 100 then 1 end)
            as "<0,100]"
,       count(case when 100 < totalamount and totalamount <= 200 then 1 end)
            as "<100,200]"
from    Orders

This would calculate both aggregates in a single table scan.

answered Mar 20, 2012 at 12:13

Andomar

239k54 gold badges387 silver badges412 bronze badges

Thanks Andomar, that's perfect.
– cms_mgr
Commented Mar 20, 2012 at 12:28

Add a comment |

Jesse van Assen · Accepted Answer · 2012-03-20 12:10:43Z

3

Because you included them as a subquery. From the looks of it, both the count subqueries are executed for each row in the orders table. If you omit the from piece from the main query you should only get one row.

For example, the query

SELECT 'John Doe' FROM orders

returns 'John Doe' for each row in the orders table.

SELECT 'John Doe'

only selects one row.

Disclaimer: this is sql server behaviour.

answered Mar 20, 2012 at 12:10

Jesse van Assen

2,3002 gold badges16 silver badges19 bronze badges

Thanks Jesse, this worked but I've ticked Andomar's solution above because I felt his approach to the query was more readable than my original.
– cms_mgr
Commented Mar 20, 2012 at 12:29

Add a comment |

Collectives™ on Stack Overflow

Why is count(*) returning multiple results in this subquery (postgresql)?

2 Answers 2

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Linked

Related