Take the 2-minute tour ×
Stack Overflow is a question and answer site for professional and enthusiast programmers. It's 100% free, no registration required.

I have a query that worked fine on MySQL and now doesn't on PostgreSQL This is the query:

  SELECT "users".* 
    FROM "users" 
    JOIN "favorites" ON "favorites"."user_id" = "users"."id"
   WHERE users.id NOT IN (2)
     AND favorites.favoritable_id IN (1)
GROUP BY favorites.user_id 
ORDER BY RANDOM() 
   LIMIT 5

This is the error:

column "users.id" must appear in the GROUP BY clause or be used in an aggregate function

I have looked up this error but couldn't really find out what I must to do fix this.

share|improve this question

3 Answers 3

up vote 4 down vote accepted

MySQL (and SQLite for that matter) do not require you to specify every column in the GROUP BY clause that are not wrapped in aggregate functions (IE: MIN, MAX, COUNT, SUM, etc). This is by design, and is also ANSI spec. However, this means that the values for those columns will be arbitrary -- they can not be guaranteed every time the query is run.

PostgreSQL/Oracle/SQL Server require you to either specify each column not wrapped in aggregate functions in the group by, or re-write the query so it is not necessary. Use:

  SELECT u.*
    FROM USERS u
   WHERE EXISTS (SELECT NULL
                   FROM FAVORITES f
                  WHERE f.user_id = u.id
                    AND f.favoritable_id = 1)
     AND u.id != 2
ORDER BY RANDOM()
   LIMIT 5
share|improve this answer
    
Can I still use the IN () syntax? I use that because normally tons of values will be there, not just one. –  Draiken Aug 10 '11 at 14:51
    
I never used this EXISTS before (probably because I suck with pure SQL and only used mysql in my life) but, is it preferred instead of joins (where applicable) or is this specific to this case? –  Draiken Aug 10 '11 at 14:58
    
@Draiken: Yes, if you aren't returning columns from the table you're joining to - IN or EXISTS are preferable. Because using a join can inflate the result set if a parent record has more than one child. –  OMG Ponies Aug 11 '11 at 1:39

MySQL is awfully lax when it comes to the GROUP BY-clause. In MySQL, you can add columns to your select clause that are not in the GROUP BY or an aggregate functions. Other RDBMS do not allow this, as it makes no sense semantically. MySQL just returns more or less random row values that fit the given query.

From the MySQL Doc

In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the select list that are not named in the GROUP BY clause. [...] MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause.

What you have to do is, find out what the query is supposed to do and rewrite it.

share|improve this answer

Another option in PostgreSQL is the distinct on clause:

select distinct on (col1, col2) * from users;

this will do pretty much what mysql was doing.

share|improve this answer

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.