Take the 2-minute tour ×
Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

I have some tables that include websites and visitors and I am trying to write a query that produces a 'top 10' for each site, the tables look something like this...

websites
-------
website_id (pk)
url
description

visitors
-------
visitor_id (pk)
ip_address

website_visits
--------
visit_id (pk)
website_id (fk)
visitor_id (fk)
visit_date
visit_duration

I know that I can use two GROUP BY 's to produce a single list ordered by website using the query below...

SELECT w.website_id, v.visitor_id, count(wv.visit_id)
FROM website_visits ws, websites w, visitors v
WHERE wv.website_id = w.website_id and wv.visitor_id = v.visitor_id
GROUP BY w.website_id, v.visitor_id
ORDER BY w.website_id ASC, count(wv.visit_id) DESC

But I would like to improve this by a) only showing the top ten for each site, and b) grouping each top 10 result into an array, so that the results look something like this:

website_id    top_ten
432           [{visitor_id:354,visits:958},{visitor_id:987,visits:971}...etc]
325           [{visitor_id:456,visits:955},{visitor_id:1056,visits:465}...etc]

(I have used JSON here but it could be in any format)

I have experimented with using the SELECT ARRAY(...) function in a subquery but I cannot seem to get it working and I am not sure if thats the correct approach?

Can someone advise? Thanks

Using PostgreSQL 9.0 by the way

share|improve this question

1 Answer 1

up vote 3 down vote accepted
+50

Maybe something like this:

with ranked_visits as (
  SELECT w.website_id, 
         v.visitor_id, 
         count(wv.visit_id) as visits,
         row_number() over (partition by w.website_id order by count(wv.visit_id) desc) as rnk
  FROM website_visits wv
    JOIN websites w ON wv.website_id = w.website_id
    JOIN visitors v ON wv.visitor_id = v.visitor_id
  GROUP BY w.website_id, v.visitor_id
  ORDER BY w.website_id ASC, count(wv.visit_id) DESC
)
select website_id, string_agg('visitor_id: '||visitor_id||',visits:'||visits, ', ')
from ranked_visits
where rnk <= 10
group by website_id;
share|improve this answer
    
This query might give more than 10 results because of dense_rank(). If you switch to row_number(), rnk <= 10 would give a maximum of 10. –  Claes Mogren Oct 16 '14 at 13:27
    
@ClaesMogren: good point, fixed. –  a_horse_with_no_name Oct 16 '14 at 13:58
    
@Great, looks like im on the right tracks however its only returning a few visitors for each website (the original dense_rank is returning more but still not 10 for each site)? –  DaveB Oct 16 '14 at 19:13
    
@a_horse_with_no_name Ah ok got it, it just needed a 'partition by w.website_id' inside the window function (otherwise the rank is based on the ungrouped results)...nice one –  DaveB Oct 16 '14 at 19:22
    
@DaveB: you are right. I forgot that. –  a_horse_with_no_name Oct 16 '14 at 19:23

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.