(Using Postgres 9.1)
My program deals with insecticide sprayers attempting to spray a number of actual units based on an original target. For example, the sprayer Joe was supposed to spray 10 targets but he actually sprayed 7.
I am provided a table which is the summary of all planned targets (column= targets) and actual targets (column= actual), along with some other data, including the sprayer's name. Here's the schema:
CREATE TABLE spray_summary
(
id character varying(1),
target integer,
ref_id character varying(1),
actual integer,
sprayer character varying(25)
)
The data is a denormalized join between (id, target) and (ref_id, actual, sprayer), but for now this table is all I have to work with. Here's the full results:
SELECT * FROM spray_summary
+----+--------+--------+--------+---------+ | id | target | ref_id | actual | sprayer | +----+--------+--------+--------+---------+ | a | 1 | "l" | 10 | "Joe" | | a | 1 | "m" | 10 | "Joe" | | a | 1 | "p" | 10 | "Joe" | | c | 3 | "n" | 10 | "Joe" | | c | 3 | "o" | 10 | "Joe" | +----+--------+--------+--------+---------+
You can see that id value "a" is repeated three times due to the one-to-many join and "c" repeats twice. Given this, what I want to do is show the SUM of target values, the SUM of "actual" values, and the sprayer. I used the following query:
SELECT SUM(target) targets, SUM(actual) actuals, sprayer FROM spray_summary GROUP BY sprayer
Which returned the result:
+--------+--------+---------+ | target | actual | sprayer | +--------+--------+---------+ | 9 | 50 | "Joe" | +--------+--------+---------+
While the sum is correct (5 * 10 = 50) for actual values, the target values is being multiplied because the dataset is denormalized. I want "target" to be unique relative to id and sprayer, so I tried a window function:
SELECT SUM(target) OVER(PARTITION BY sprayer, id),
sprayer,
SUM(actual)
FROM spray_summary
GROUP BY sprayer, target, id
This gives me the results:
+--------+--------+---------+ | target | actual | sprayer | +--------+--------+---------+ | 1 | 30 | "Joe" | | 3 | 20 | "Joe" | +--------+--------+---------+
Still not right! The right solution would give the following:
+--------+--------+---------+ | target | actual | sprayer | +--------+--------+---------+ | 4 | 50 | "Joe" | +--------+--------+---------+
But no matter how much I try to tweak the window function the rows are split because I have to GROUP BY target which breaks up the rows. Any ideas? I know this can be rewritten by joining the table with itself, once for SUM target and once for SUM actual, but I don't have that option. I can only modify the column definitions.
Thanks in advance.
EDIT: I know this can be solved by pushing some of the logic into a sub-query, but I'm looking for a column-level solution if that's possible. The SQL is auto-generated, so I don't have much control over the structure, but I can modify the column definitions, hence I'm looking for a column-level solution, like a window function.
Worst case scenario is Postgres can't solve this at the column level and I'll have to re-write the SQL generators.