Postgres data aggregation with variable columns

Question

I have a table of data containing time log information.

create table "time_records" (
    "id" serial NOT NULL PRIMARY KEY,
    "start" timestamp not null,
    "end" timestamp not null,
    "duration" double precision not null,
    "project" varchar(255) not null,
    "case" integer not null,
    "title" text not null,
    "user" varchar(255) not null
);

Here's a few lines of data:

"id","start","end","duration","project","case","title","user"
"1","2014-02-01 11:54:00","2014-02-01 12:20:00","26.18","Project A","933","Something done here","John Smith"
"2","2014-02-02 12:34:00","2014-02-02 15:00:00","146","Project B","990","Something else done","Joshua Kehn"
"3","2014-02-02 17:57:00","2014-02-02 18:39:00","41.38","Project A","933","Another thing done","Bob Frank"
"4","2014-02-03 09:30:00","2014-02-03 11:41:00","131","Project A","983","iOS work","Joshua Kehn"
"5","2014-02-03 10:22:00","2014-02-03 13:29:00","187.7","Project C","966","Created views for things","Alice Swiss"

I can pull bits and pieces of information out of this. For example, a list of every project with time logged between two dates or every person that worked between two dates.

What I would like is to be able to generate a report with the date and then each project across the top with the total amount of time logged for that project.

SELECT
    start::date,
    sum(duration / 60) as "time logged",
    project
FROM
    time_records
WHERE
    project = 'Project A'
GROUP BY
    start::date, project
ORDER BY
    start::date, project;

However I want multiple columns on the output, so combining a select distinct project with this in some way.

Final output would be something like:

date, project a total, project b total, project c total,
2014-02-01,0.5, 0.3, 10,
2014-02-02,1.3, 20, 3,
2014-02-03,20, 10, 10
...

I can get the total per-date per-project with something like:

SELECT
    start::date,
    sum(duration / 60) as "time logged",
    project
FROM
    time_records
GROUP BY
    start::date, project
ORDER BY
    start::date, project;

But then I have multiple dates in rows per-project. I need it to be one date with per-project totals on separate lines.

Does this make sense / is possible with just SQL not writing some code after the query?

Check out the 'crosstab' function from the 'tablefunc' module. — Craig Ringer
– Craig Ringer, Commented Mar 2, 2014 at 22:43

Community · Accepted Answer · 2017-05-23 10:26:11Z

1

For a "pivot" table or cross tabulation use the crosstab() function of the additional module tablefunc.

Table definition

Given this sanitized table definition without reserved SQL key words as identifiers (that's a big no-no, even if you can force it with double-quoting):

CREATE TEMP TABLE time_records (
    id serial PRIMARY KEY,
    t_start timestamp not null,
    t_end timestamp not null,
    duration double precision not null,
    project text not null,
    t_case integer not null,
    title text not null,
    t_user text not null
);

Query

Note how I use the variant with two parameters to deal with missing items in the result properly.

SELECT *
FROM  crosstab (
   $$
   SELECT t_start::date
         , project
         , round(sum(duration / 60)::numeric, 2) AS time_logged
   FROM    time_records
   GROUP   BY 1,2
   ORDER   BY 1,2
   $$
  ,$$VALUES ('Project A'), ('Project B'),('Project C')$$
  ) AS t (
      t_start   date
    , project_a text
    , project_b text
    , project_c text
  );

Result:

t_start    | project_a | project_b | project_c
-----------|-----------|-----------|----------
2014-02-01 | 0.44      |           |
2014-02-02 | 0.69      | 2.43      |
2014-02-03 | 2.18      |           | 3.13

Tested with Postgres 9.3.

Explanation, details and links in this related answer:
PostgreSQL Crosstab Query

edited May 23, 2017 at 10:26

CommunityBot

11 silver badge

answered Mar 3, 2014 at 0:56

Erwin Brandstetter

667k159 gold badges1.2k silver badges1.3k bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

Josh K Over a year ago

This looks perfect, thanks for the refresher on the reserved words. Given that, is it possible to have project_a etc generated from the table itself, possibly from a subquery?

Erwin Brandstetter Over a year ago

@JoshK: You would need two steps: 1. build a statement like the above based on data in your table. 2. execute that statement. SQL demands to know the return type, so this is not simply possible in a single step. I wrote a lot more under this related question.

Joachim Isaksson · Accepted Answer · 2014-03-03 05:49:35Z

1

A simple way would be to do a "manual" pivot using CASE;

SELECT DATE("start"),
 SUM(CASE WHEN "project"='Project A' THEN duration/60 ELSE 0 END) "Project A",
 SUM(CASE WHEN "project"='Project B' THEN duration/60 ELSE 0 END) "Project B",
 SUM(CASE WHEN "project"='Project C' THEN duration/60 ELSE 0 END) "Project C"
FROM time_records
GROUP BY DATE("start");

An SQLfiddle to test with.

You should be able to do something similar using CROSSTAB(), but I don't have access to PostgreSQL to load the module and test a query using that :-/

edited Mar 3, 2014 at 5:49

answered Mar 2, 2014 at 19:23

Joachim Isaksson

182k28 gold badges296 silver badges307 bronze badges

Collectives™ on Stack Overflow

Postgres data aggregation with variable columns

2 Answers 2

Table definition

Query

2 Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Table definition

Query

2 Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related