Take the 2-minute tour ×
Stack Overflow is a question and answer site for professional and enthusiast programmers. It's 100% free, no registration required.

I want to get a list of email domains and the top user within each domain. My approach is to sum the questions per email grouped by the domain and then get the top user with a window function. However this does not work:

SELECT 
  domain,
  sum(questions_per_email) as questions_per_domain,
  first_value(email) OVER (PARTITION BY domain ORDER BY questions_per_email DESC) as top_user
FROM (
    SELECT email,
           lower(substring(u.email from position('@' in u.email)+1)) as domain,
           count(*) as questions_per_email
      FROM questions q
      JOIN identifiers i ON (q.owner_id = i.id)
      JOIN users u ON (u.identifier_id = i.id)
    GROUP BY email
  ) as per_user
GROUP BY domain, top_user

And Postgres gives the following message:

ERROR:  column "per_user.questions_per_email" must appear in the GROUP BY clause or be used in an aggregate function
LINE 5: ...t_value(email) OVER (PARTITION BY domain ORDER BY questions_...
                                                             ^

I can't really see why this is. I pretty sure that one should be able to use a window function on the aggregated result. Please advice!

Thanks, Kristoffer

share|improve this question
    
You can't mix an aggregate function sum(questions_per_email) as questions_per_domain with windows function first_value(email) OVER ... in the same SELECT, try to change it into sum(questions_per_email) OVER (PARTITION BY domain ORDER BY questions_per_email DESC) –  kordirko Sep 8 '13 at 22:31

1 Answer 1

up vote 1 down vote accepted

you can change your query like this:

with cte1 as (
    SELECT email,
           lower(substring(u.email from position('@' in u.email)+1)) as domain
      FROM questions q
      JOIN identifiers i ON (q.owner_id = i.id)
      JOIN users u ON (u.identifier_id = i.id)
), cte2 as (
    select
        domain, email,
        count(*) as questions_per_email,
        first_value(email) over (partition by domain order by count(*) desc) as top_user
    from cte1
    group by email, domain
)
select domain, top_user, sum(questions_per_email) as questions_per_domain
from cte2
group by domain, top_user

sql fiddle demo

share|improve this answer

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.