Take the 2-minute tour ×
Stack Overflow is a question and answer site for professional and enthusiast programmers. It's 100% free, no registration required.

I'm working out a query that I've ran successfully in MySQL for a while, but in PostgreSQL it's not working

Here's the query:

    SELECT
  CONCAT('SELECT SUBSTRING(year1,7,4) as year2,',
  GROUP_CONCAT(sums),
  ' FROM yourtable GROUP BY year2')
FROM (
SELECT CONCAT('SUM(ame=\'', ame, '\') AS `', ame, '`') sums
FROM yourtable
GROUP BY ame
ORDER BY COUNT(*) DESC
LIMIT 3
) sm
INTO @sql;

PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

Fiddle created here for MySQL query http://sqlfiddle.com/#!2/f1ca5e/3

The first query generates query text that the second one executes. The generated text is something like:

SELECT 
    SUBSTRING(year1,7,4) as year2,
    SUM(ame='tim') AS `tim`,
    SUM(ame='kim') AS `kim`,
    SUM(ame='pol') AS `pol` 
FROM yourtable
GROUP BY year2;
share|improve this question
2  
Usually this type of situation can be solved by simply Googling, Binging or searching on other search engines something like: PREPARE in PostgreSQL or DEALLOCATE in PostgreSQL etc. –  Radu Gheorghiu Jun 6 at 9:16
    
It is generally helpful if you describe what the query is supposed to do, too. But thanks for showing complete code + SQLFiddle. –  Craig Ringer Jun 7 at 3:02

2 Answers 2

Your query appears to:

  • Find the three names that appear most in the overall data set
  • Extract the year from the data, producing (year, name) pairs
  • and for each year count the number of times each name appears
  • Output the count for each name in a separate column with the name as its alias

All of this, except the final step, can be done in plain SQL.

SELECT
  extract(year FROM year1) AS "year",
  ame,
  count(ame) AS ame_count
FROM yourtable
WHERE ame IN (
  SELECT ame
  FROM yourtable
  GROUP BY ame
  ORDER BY count(ame) DESC
  LIMIT 3
)
GROUP BY 1, 2;

per http://sqlfiddle.com/#!15/be97f/7

The last bit is what's called a cross-tabulation or pivot. Unfortunately, PostgreSQL sucks at these, they're really annoying to write. You have to write them yourself with dynamic SQL or use the crosstab function from the tablefunc extension. I'll do the latter.

SELECT 
    *
FROM
    crosstab('
      SELECT
        extract(year FROM year1)::integer AS "year",
        ame,
        count(ame)::integer AS ame_count
      FROM yourtable
      WHERE ame IN (
        SELECT ame
        FROM yourtable
        GROUP BY ame
        ORDER BY count(ame) DESC
        LIMIT 3
      )
      GROUP BY 1, 2
      ORDER BY 1, 2
    ','
      SELECT ame
      FROM yourtable
      GROUP BY ame
      ORDER BY count(ame) DESC
      LIMIT 3
    ')
top_names("year" integer, "first" integer, "second" integer, "third" integer); 

but even then, the column aliases aren't what you want:

 year | first | second | third 
------+-------+--------+-------
 2001 |     1 |      1 |      
 2002 |     1 |      1 |      
 2003 |     2 |        |     3
 2004 |     1 |      1 |      
(4 rows)

Unfortunately, you'll have the same problem in custom PL/PgSQL. There's no way for a function to specify the column aliases dynamically at runtime. But we can be a bit tricky, and have the function generate a query as a cursor and return the cursor, then fetch the cursor. That leads to something like this monstrosity:

BEGIN;

DO
LANGUAGE plpgsql
$$
DECLARE
    top_names text[];
    name_sql text := '';
    "name" text;
    curs refcursor := 'top_names';
BEGIN
    SELECT INTO STRICT top_names 
       array_agg(ame)
    FROM (
      SELECT
        ame
      FROM yourtable
      GROUP BY ame
      ORDER BY count(ame) DESC
      LIMIT 3
    ) topnames(ame);

  FOREACH "name" IN ARRAY top_names
  LOOP
     name_sql := format(
                   '%s, sum(case when ame = %L then 1 end) as %I',
                   name_sql, "name", "name"
                 );
  END LOOP;

  OPEN curs FOR EXECUTE format(
  '
    SELECT
      extract(year FROM year1) AS "year"
      %s
    FROM yourtable
    WHERE ame = ANY(%L)
    GROUP BY 1
    ORDER BY 1
  ',
  name_sql,
  top_names
  );

  RAISE NOTICE '%', curs;
END;
$$;

FETCH ALL FROM "top_names";

COMMIT;

producing:

regress=> FETCH ALL FROM "top_names";
 year | tim | pol | kim 
------+-----+-----+-----
 2001 |   1 |   1 |    
 2002 |   1 |   1 |    
 2003 |   2 |     |   3
 2004 |   1 |   1 |    
(4 rows)

If you want to include years with zero matches you'll need to do a left outer join on the list of distinct years, too.

share|improve this answer

You should to translate your code to PL/pgSQL language - PostgreSQL doesn't support dynamic SQL on SQL level - it is supported only from PL/pgSQL language - but highly probably you have to little bit redesign your code, because PostgreSQL functions have to have known result columns before execution. You can use a cursors too

see a article about these techniques Using cursors for generating cross tables

share|improve this answer
    
My answer below shows how, and reminds me how how incredibly annoying the inability to specify a dynamic record type from inside a pl/pgsql function is. –  Craig Ringer Jun 7 at 7:02
    
@CraigRinger - this use case is one reason, why I push a "real" stored procedures to Postgres. –  Pavel Stehule Jun 7 at 7:31
    
Couldn't agree more - multiple resultsets and proper dynamic resultsets really require true stored procs. Also something we'll need if we're ever going to have autonomous transactions. –  Craig Ringer Jun 7 at 9:30

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.