Converting MySQL Query to PostgreSQL

Question

I'm working out a query that I've ran successfully in MySQL for a while, but in PostgreSQL it's not working

Here's the query:

    SELECT
  CONCAT('SELECT SUBSTRING(year1,7,4) as year2,',
  GROUP_CONCAT(sums),
  ' FROM yourtable GROUP BY year2')
FROM (
SELECT CONCAT('SUM(ame=\'', ame, '\') AS `', ame, '`') sums
FROM yourtable
GROUP BY ame
ORDER BY COUNT(*) DESC
LIMIT 3
) sm
INTO @sql;

PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

Fiddle created here for MySQL query http://sqlfiddle.com/#!2/f1ca5e/3

The first query generates query text that the second one executes. The generated text is something like:

SELECT 
    SUBSTRING(year1,7,4) as year2,
    SUM(ame='tim') AS `tim`,
    SUM(ame='kim') AS `kim`,
    SUM(ame='pol') AS `pol` 
FROM yourtable
GROUP BY year2;

Usually this type of situation can be solved by simply Googling, Binging or searching on other search engines something like: PREPARE in PostgreSQL or DEALLOCATE in PostgreSQL etc. — Radu Gheorghiu, Jun 6 at 9:16
It is generally helpful if you describe what the query is supposed to do, too. But thanks for showing complete code + SQLFiddle. — Craig Ringer, Jun 7 at 3:02

Craig Ringer · Answer 1 · 2014-06-07 06:59:39Z

Your query appears to:

Find the three names that appear most in the overall data set
Extract the year from the data, producing (year, name) pairs
and for each year count the number of times each name appears
Output the count for each name in a separate column with the name as its alias

All of this, except the final step, can be done in plain SQL.

SELECT
  extract(year FROM year1) AS "year",
  ame,
  count(ame) AS ame_count
FROM yourtable
WHERE ame IN (
  SELECT ame
  FROM yourtable
  GROUP BY ame
  ORDER BY count(ame) DESC
  LIMIT 3
)
GROUP BY 1, 2;

per http://sqlfiddle.com/#!15/be97f/7

The last bit is what's called a cross-tabulation or pivot. Unfortunately, PostgreSQL sucks at these, they're really annoying to write. You have to write them yourself with dynamic SQL or use the crosstab function from the tablefunc extension. I'll do the latter.

SELECT 
    *
FROM
    crosstab('
      SELECT
        extract(year FROM year1)::integer AS "year",
        ame,
        count(ame)::integer AS ame_count
      FROM yourtable
      WHERE ame IN (
        SELECT ame
        FROM yourtable
        GROUP BY ame
        ORDER BY count(ame) DESC
        LIMIT 3
      )
      GROUP BY 1, 2
      ORDER BY 1, 2
    ','
      SELECT ame
      FROM yourtable
      GROUP BY ame
      ORDER BY count(ame) DESC
      LIMIT 3
    ')
top_names("year" integer, "first" integer, "second" integer, "third" integer);

but even then, the column aliases aren't what you want:

 year | first | second | third 
------+-------+--------+-------
 2001 |     1 |      1 |      
 2002 |     1 |      1 |      
 2003 |     2 |        |     3
 2004 |     1 |      1 |      
(4 rows)

Unfortunately, you'll have the same problem in custom PL/PgSQL. There's no way for a function to specify the column aliases dynamically at runtime. But we can be a bit tricky, and have the function generate a query as a cursor and return the cursor, then fetch the cursor. That leads to something like this monstrosity:

BEGIN;

DO
LANGUAGE plpgsql
$$
DECLARE
    top_names text[];
    name_sql text := '';
    "name" text;
    curs refcursor := 'top_names';
BEGIN
    SELECT INTO STRICT top_names 
       array_agg(ame)
    FROM (
      SELECT
        ame
      FROM yourtable
      GROUP BY ame
      ORDER BY count(ame) DESC
      LIMIT 3
    ) topnames(ame);

  FOREACH "name" IN ARRAY top_names
  LOOP
     name_sql := format(
                   '%s, sum(case when ame = %L then 1 end) as %I',
                   name_sql, "name", "name"
                 );
  END LOOP;

  OPEN curs FOR EXECUTE format(
  '
    SELECT
      extract(year FROM year1) AS "year"
      %s
    FROM yourtable
    WHERE ame = ANY(%L)
    GROUP BY 1
    ORDER BY 1
  ',
  name_sql,
  top_names
  );

  RAISE NOTICE '%', curs;
END;
$$;

FETCH ALL FROM "top_names";

COMMIT;

producing:

regress=> FETCH ALL FROM "top_names";
 year | tim | pol | kim 
------+-----+-----+-----
 2001 |   1 |   1 |    
 2002 |   1 |   1 |    
 2003 |   2 |     |   3
 2004 |   1 |   1 |    
(4 rows)

If you want to include years with zero matches you'll need to do a left outer join on the list of distinct years, too.

Pavel Stehule · Answer 2 · 2014-06-06 10:25:12Z

up vote 1 down vote

You should to translate your code to PL/pgSQL language - PostgreSQL doesn't support dynamic SQL on SQL level - it is supported only from PL/pgSQL language - but highly probably you have to little bit redesign your code, because PostgreSQL functions have to have known result columns before execution. You can use a cursors too

see a article about these techniques Using cursors for generating cross tables

answered Jun 6 at 10:25

Pavel Stehule
8,223921

My answer below shows how, and reminds me how how incredibly annoying the inability to specify a dynamic record type from inside a pl/pgsql function is. – Craig Ringer Jun 7 at 7:02

@CraigRinger - this use case is one reason, why I push a "real" stored procedures to Postgres. – Pavel Stehule Jun 7 at 7:31

Couldn't agree more - multiple resultsets and proper dynamic resultsets really require true stored procs. Also something we'll need if we're ever going to have autonomous transactions. – Craig Ringer Jun 7 at 9:30

add a comment |

asked	4 months ago
viewed	70 times
active	4 months ago

current community

your communities

more stack exchange communities

Converting MySQL Query to PostgreSQL

2 Answers 2

Your Answer

Not the answer you're looking for? Browse other questions tagged mysql sql postgresql or ask your own question.

Hot Network Questions

current community

your communities

more stack exchange communities

Converting MySQL Query to PostgreSQL

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Not the answer you're looking for? Browse other questions tagged mysql sql postgresql or ask your own question.

Related

Hot Network Questions