1

I want to group the results of a count(*) query into value buckets. I'm testing this on the dellstore2 postgresql sample database. My query below returns the right answers, but does so once for each row in the table (several thousand identical results). I can fix this by adding LIMIT 1 the the end of the query but I'd like to understand why I'm getting the duplicates in case it points to a wider problem with my approach. The query is:

SELECT
    (SELECT count(*)
        FROM
            orders
        WHERE
            totalamount > 0 AND totalamount <= 100) AS ">0 <= 100",
    (SELECT count(*)
        FROM
            orders
        WHERE
            totalamount > 100 AND totalamount <= 200) AS ">100 <= 200"
...
FROM
    orders;

EDIT Andomar's answer also allowed me to find the following approach (adapted from an example in SQL in a nutshell (O'Reilly)). This lets me have the buckets in one column, with a row for each bucket/answer pairing. I thought I'd include it for anyone with that use-case:

SELECT CASE
        WHEN totalamount IS NULL THEN 'Unknown'
        WHEN totalamount <= 100 THEN 'Not more than 100'
        WHEN totalamount <= 200 THEN 'Not more than 200'
        ELSE 'Over 200'
    END "Bucket",
    COUNT(*) "Number of results"
FROM
    orders
GROUP BY CASE
        WHEN totalamount IS NULL THEN 'Unknown'
        WHEN totalamount <= 100 THEN 'Not more than 100'
        WHEN totalamount <= 200 THEN 'Not more than 200'
        ELSE 'Over 200'
    END
ORDER BY
    MIN(totalamount);
2
  • It seems there are a few closing parentheses missing. Please post the real code. Commented Mar 20, 2012 at 12:11
  • Thanks wildplasser — now fixed (I'm working across two unconneted machines so couldn't cut and paste).
    – cms_mgr
    Commented Mar 20, 2012 at 12:18

2 Answers 2

6

You're selecting every row from orders, and then for each row, the subqueries are evaluated.

Consider this approach instead:

select  count(case when 0 < totalamount and totalamount <= 100 then 1 end)
            as "<0,100]"
,       count(case when 100 < totalamount and totalamount <= 200 then 1 end)
            as "<100,200]"
from    Orders

This would calculate both aggregates in a single table scan.

1
  • Thanks Andomar, that's perfect.
    – cms_mgr
    Commented Mar 20, 2012 at 12:28
3

Because you included them as a subquery. From the looks of it, both the count subqueries are executed for each row in the orders table. If you omit the from piece from the main query you should only get one row.

For example, the query

SELECT 'John Doe' FROM orders

returns 'John Doe' for each row in the orders table.

SELECT 'John Doe'

only selects one row.

Disclaimer: this is sql server behaviour.

1
  • Thanks Jesse, this worked but I've ticked Andomar's solution above because I felt his approach to the query was more readable than my original.
    – cms_mgr
    Commented Mar 20, 2012 at 12:29

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.