Tell me more ×
Stack Overflow is a question and answer site for professional and enthusiast programmers. It's 100% free, no registration required.

This is how the data is formatted:

item_name | item_serial | sub_group | conc_stuff | other_data | more_data
----------+-------------+-----------+------------+------------+-----------
foo        bar-01-a      widget      b-3          towel        dent
foo        bar-02-a      widget      a-1          42           mouse
foo        bar-03-a      widget      p-1          babel        dolphin
foo3       bar-21-f      widget      f-1          42           marvin
foo3       bar-22-f      widget      x-1          poetry       vogon

I have gotten the query to perform the way I want it to, the problem is I need to return more data.

SELECT item_name, 
       array_to_string(array_agg(conc_stuff), ',') as stuff 
FROM dataset 
WHERE some_selector = 'X' 
GROUP BY item_name 
ORDER BY item_name;

I have tried what seems simple yet logical to me:

SELECT item_name, 
       item_serial, 
       sub_group, 
       array_to_string(array_agg(conc_stuff), ',') as stuff 
FROM dataset 
WHERE some_selector = 'X' 
GROUP BY item_name 
ORDER BY item_name;

I need to return something that looks like this:

item_name | item_serial | sub_group |   stuff
----------+-------------+-----------+-------------
foo        bar-01-a      widget      a-1,b-3,p-1
foo3       bar-21-f      widget      f-1,x-1,g-5
foo6       bar-81-z      widget      r-1,d-8,w-0

instead of just this:

item_name |   stuff
----------+--------------
foo        a-1,b-3,p-1
foo3       f-1,x-1,g-5
foo6       r-1,d-8,w-0

When I try to add additional fields to the query, I get :

ERROR: column "stuff.item_serial" must appear in the GROUP BY clause or be used in an aggregate function

But I dont want to GROUP BY item_serial, I just want it to be returned with the aggregate, right?

Do I need to run a subquery? Im sure this is simple. The if there are multiple methods, which is most efficient? Some of the text I will be concatenating are coordinates (LARGE string of text).

share|improve this question
1  
Is item_serial a unique column? If there is more than one item_serial for a given item_name value, which one should be displayed? – a_horse_with_no_name Sep 11 '12 at 19:13
none of the columns are, all basic. there can be multiple item_serials that belong to different sub_group – roberthuttinger Sep 11 '12 at 19:14
So which item_serial (or sub_group) do you want to take for a specific item_name? – a_horse_with_no_name Sep 11 '12 at 19:36
very sorry I just noticed that error... editing – roberthuttinger Sep 11 '12 at 19:57
1  
But the accepted answer will not return your example output. – a_horse_with_no_name Sep 11 '12 at 20:20
show 2 more comments

2 Answers

up vote 2 down vote accepted

You need to pick one value for each name you have. You can't have each name only returned once but the item_serial value multiple times.

Picking one value out of those that are there for a group value is done through aggregate functions:

SELECT item_name,
       min(item_serial),
       min(sub_group),
       array_to_string(array_agg(conc_stuff),',') AS stuff
FROM dataset
WHERE sub_group = 'widget'
GROUP BY item_name
ORDER BY item_name;

This picks th4e "first" item_serial and sub_group for each name.

If you want the last value, use max instead.

But the important thing to understand is that you have to pick one value for the non-grouped columns. And you need to tell the DBMS exactly which one it should use by supplying an aggregate function which picks one value.

SQLFiddle example: http://www.sqlfiddle.com/#!1/58009/1

share|improve this answer
brilliant! this solves the secondary issue! I checked slightly too soon! though @greg did supply a correct solution for the question I asked. I apologize for the confusion. and +10 for introducing me to sqlfiddle! – roberthuttinger Sep 11 '12 at 20:20

Postgres 8.4 is somewhat stricter than more recent versions of Pg regarding this. When using a GROUP BY, all fields not containing an aggregate functions need to be grouped.

From Pg9.1, if you GROUP BY primary key, all other non grouped fields can be omitted.

Hope it helps.

Edit:

SELECT 
  item_name, 
  item_serial, 
  sub_group, 
  array_to_string(array_agg(conc_stuff), ',') as stuff 
FROM 
  dataset 
WHERE 
  some_selector = 'X' 
GROUP BY 
  item_name,
  item_serial,
  sub_group 
ORDER BY 
  item_name;
share|improve this answer
so I can select the others if I group by even though I dont need them grouped? Now I see the example it makes sense. Ill try this method, thanks! – roberthuttinger Sep 11 '12 at 19:12
your answer solves the issue I started with, though after posting I realized that wasnt my issue. it was a faulty line of logic on my part. – roberthuttinger Sep 11 '12 at 20:14

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.