I have a table whose columns include id and created_at, and I want to use window functions over created_at to count, for each entry, how many entries fall within 48 hours of it. As an example, for the original table:

 id | created_at
----|------------
 01 | 2016/01/04
 02 | 2016/01/05
 03 | 2016/01/05
 04 | 2016/01/06
 05 | 2016/01/07
 06 | 2016/01/08
 07 | 2016/01/08
 08 | 2016/01/09

and the result should be

 id | created_at | count
----|------------|-------
 01 | 2016/01/04 |   4
 02 | 2016/01/05 |   5
 03 | 2016/01/05 |   5
 04 | 2016/01/06 |   7
 05 | 2016/01/07 |   7
 06 | 2016/01/08 |   5
 07 | 2016/01/08 |   5
 08 | 2016/01/09 |   4

The explanation is that transaction 05 was created on 2016/01/07, and since there are 2 transactions on 2016/01/05, 1 on 2016/01/06, 1 on 2016/01/07, 2 on 2016/01/08, and 1 on 2016/01/09, there are a total of 7 transactions within 48 hours of it.
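For concreteness, the sample data above can be reproduced with something like the following sketch (the table name t matches the queries in the answers below but is otherwise an assumption, as is using date for created_at; in the real table it is presumably a timestamp, given the 48-hour window):

-- hypothetical reproduction of the sample data
create table t (
    id         integer primary key,
    created_at date not null
);

insert into t (id, created_at) values
    (1, '2016-01-04'), (2, '2016-01-05'), (3, '2016-01-05'),
    (4, '2016-01-06'), (5, '2016-01-07'), (6, '2016-01-08'),
    (7, '2016-01-08'), (8, '2016-01-09');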

Please consider describing what you have tried and what failed so people can help you – НЛО Mar 14 '16 at 18:23
I haven't tried anything, I have no idea how to even start. – Pedro Carvalho Mar 14 '16 at 18:26

It is better to use a date table that has consecutive dates, in case the dates in your table have gaps. I am also wondering what the role of the id column is. Here is how I would do it without considering the id column.

select row_number() over (order by dt) as id
      ,dt as created_at
      ,cnt1 + cnt2 + cnt3 + cnt4 + cnt5 as cnt
from
(
    -- per-date counts for the date itself and the two days on either side
    select
     date_table.dt
    ,lag(coalesce(cnt, 0), 2, 0)  over (order by date_table.dt) as cnt1
    ,lag(coalesce(cnt, 0), 1, 0)  over (order by date_table.dt) as cnt2
    ,coalesce(cnt, 0)                                           as cnt3
    ,lead(coalesce(cnt, 0), 1, 0) over (order by date_table.dt) as cnt4
    ,lead(coalesce(cnt, 0), 2, 0) over (order by date_table.dt) as cnt5
    from
    date_table left join
    (select created_at, count(*) as cnt from your_table group by created_at) c
    on date_table.dt = c.created_at
) T
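If you don't already have such a date table, one way to build it is sketched below (PostgreSQL syntax, since the other answer relies on generate_series; the names date_table, dt, and your_table follow the query above, and the two-day padding is an assumption so that lag/lead never fall off the edge of the calendar):

-- build a calendar of consecutive dates covering the data range,
-- padded by two days on each side
create table date_table as
select s.d::date as dt
from generate_series(
       (select min(created_at) - interval '2 day' from your_table),
       (select max(created_at) + interval '2 day' from your_table),
       interval '1 day'
     ) as s(d);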
There are several other columns that aren't mentioned; id just identifies each row, and the other columns hold the more interesting data. – Pedro Carvalho Mar 14 '16 at 23:46
So the other columns are not relevant? And you just want total count by day and then by plus/minus 2 day range? I tested my query with the database I work with and it works. – Lizzy Mar 15 '16 at 17:51
48 hours, not 2 days, I edited that in the question. – Pedro Carvalho Mar 15 '16 at 18:38

Using window functions for this purpose is challenging because of the duplicate days. You can get the results using a join or correlated subquery:

select t.*,
       (select count(*)
        from t t2
        where t2.created_at between t.created_at - interval '2 days'
                                and t.created_at + interval '2 days'
       ) as cnt
from t;

EDIT:

You could use window functions by doing a cumulative sum by date and then joining back. This is, of course, a bit challenging because of holes in the dates. But, something like this:

with c as (
      -- one row per calendar date (padded around the data range),
      -- with the per-day count and a running total
      select d.dte, count(t.created_at) as cnt,
             sum(count(t.created_at)) over (order by d.dte) as cumecnt
      from (select generate_series(min(created_at) - interval '3 day',
                                   max(created_at) + interval '2 day',
                                   interval '1 day')
            from t
           ) d(dte) left join
           t
           on d.dte = t.created_at
      group by d.dte
     )
-- the window count is the running total at created_at + 2 days
-- minus the running total on the day just before the window starts
select t.*, cmax.cumecnt - cmin.cumecnt as cnt
from t join
     c cmin
     on cmin.dte = t.created_at - interval '3 day' join
     c cmax
     on cmax.dte = t.created_at + interval '2 day';
The problem with that is that there are 500 thousand rows in the table, so that's really slow. – Pedro Carvalho Mar 14 '16 at 18:35
You should try with an index on created_at. It might not be quite as slow as you think. – Gordon Linoff Mar 14 '16 at 18:36
I'm not sure what you mean by "try with an index on created_at". – Pedro Carvalho Mar 14 '16 at 18:48
@PedroCarvalho . . . The subquery might not be so expensive if you have an index on created_at. – Gordon Linoff Mar 14 '16 at 23:26
What I mean is: what is "an index on created_at"? – Pedro Carvalho Mar 14 '16 at 23:42
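For reference, the index Gordon is referring to could be created roughly like this (PostgreSQL syntax; the table name t follows his query and is otherwise an assumption):

-- lets the correlated subquery find the 2-day range without scanning the whole table
create index t_created_at_idx on t (created_at);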
