
I have an application at Location A (LA-MySQL) that uses a MySQL database, and another application at Location B (LB-PSQL) that uses a PostgreSQL database. (By location I mean physically distant places on different networks, if that matters.)

I need to update one table at LB-PSQL so that it stays synchronized with LA-MySQL, but I don't know what the best practices in this area are.

Also, the table I need to update at LB-PSQL does not necessarily have the same structure as the one at LA-MySQL. (But I think that isn't a problem, since the fields I need to update on LB-PSQL can accommodate the data from the LA-MySQL fields.)

Given this, what are the best practices, usual methods, or references for doing this kind of thing?

Thanks in advance for any feedback!

Do you need it synchronous (slow; changes are visible on the replica at the moment of the master commit) or asynchronous (fast, but changes on the replica may only be seen after some delay)? If asynchronous, what kind of delay can you live with: a couple of seconds, a couple of hours, a day? –  Tometzky Jan 12 '11 at 17:27
    
@Tometzky, it can perfectly well be an asynchronous task. As for the delay, I can live with something between a day and a week. –  acm Jan 12 '11 at 17:42

3 Answers

Accepted answer

If both servers are on different networks, the only option I see is to export the data from MySQL into a flat file.

Then transfer the file (e.g. via FTP or something similar) to the PostgreSQL server and import it there using COPY.

I would recommend importing the flat file into a staging table. From there you can use SQL to move the data to the appropriate target table. That gives you the chance to do data conversion or to update existing rows.
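
To make the staging-table step concrete, here is a sketch in SQL. The table and column names (staging_customers, customers, id, name) are invented for illustration, and INSERT ... ON CONFLICT needs PostgreSQL 9.5+; on older versions an UPDATE of matching rows followed by an INSERT ... WHERE NOT EXISTS does the same job:

```sql
-- load the flat file into the staging table
COPY staging_customers (id, name) FROM '/tmp/export.csv' WITH (FORMAT csv);

-- upsert from staging into the target table, converting data as needed
INSERT INTO customers (id, name)
SELECT id, trim(name) FROM staging_customers
ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name;

-- clear the staging table for the next run
TRUNCATE staging_customers;
```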

If that transformation is more complicated, you might want to think about using an ETL tool (e.g. Kettle) to do the migration on the target server.

Thank you, but I was expecting some kind of "automated process" solution; exporting, transferring via FTP, and importing isn't ideal. –  acm Jan 12 '11 at 14:23
There will be no 100% automated process without some up-front effort put in. You will have to do something. You could look at an ETL tool (Pentaho, maybe), as they are designed for this type of task, but it still requires some development work to create the ETL process. –  Bob Jan 12 '11 at 15:02

Just create a script on LA that will do something like this (bash sample):

TMPFILE=`mktemp` || { echo "mktemp failed" 1>&2; exit 1; }
pg_dump --column-inserts --data-only --no-password \
  --host="LB_hostname" --username="username" \
  --table="tablename" "databasename" \
  | awk '/^INSERT/ {i=1} {if(i) print} # ignore everything up to the first INSERT' \
  > "$TMPFILE" \
  || { echo "pg_dump failed" 1>&2; exit 1; }
(echo "begin; truncate tablename;"; cat "$TMPFILE"; echo 'commit;') \
  | mysql "databasename" \
  || { echo "mysql failed" 1>&2; exit 1; }
rm "$TMPFILE"

And set it to run, for example, once a day in cron. You'd need a '.pgpass' file for the PostgreSQL password and a MySQL option file for the MySQL password.
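
The two password files might look like this (hostname, database, user, and password are all placeholders):

```
# ~/.pgpass -- must be chmod 0600; format is host:port:database:user:password
LB_hostname:5432:databasename:username:secret

# ~/.my.cnf -- read automatically by the mysql client
[client]
user = username
password = secret
```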

This should be fast enough for fewer than a million rows.

Could you explain the awk line please? –  DrColossos Jan 12 '11 at 18:50
    
There's a comment - "ignore everything to first INSERT". pg_dump generates some lines for configuration that are incompatible with other databases - this awk ignores everything until a line starting with "INSERT" shows up. –  Tometzky Jan 12 '11 at 18:58
    
I've read the comment ;) I was more interested in the why ("pg_dump generates some lines for [...]"). –  DrColossos Jan 12 '11 at 19:03
    
@Tometzky: doesn't this do it the wrong way round? To my understanding MySQL should be the source and Postgres the target. Your solution uses Postgres as the source as far as I can tell. –  a_horse_with_no_name Jan 12 '11 at 20:54
    
Indeed, it's MySQL (source) to Postgres (target)! –  acm Jan 13 '11 at 9:46
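
For the curious, the awk filter from the script above can be tried offline against a fabricated snippet of pg_dump-style output (the SET and INSERT lines below are made up for illustration):

```shell
# Simulate pg_dump --column-inserts output: some SET preamble lines
# followed by the actual INSERT statements.
printf '%s\n' \
  "SET statement_timeout = 0;" \
  "SET client_encoding = 'UTF8';" \
  "INSERT INTO tablename (id, name) VALUES (1, 'foo');" \
  "INSERT INTO tablename (id, name) VALUES (2, 'bar');" \
| awk '/^INSERT/ {i=1} {if(i) print}'
# Only the two INSERT lines survive; the SET preamble is dropped.
```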

Not a turnkey solution, but here is some code to help with this task using triggers. For brevity, the following assumes no deletes or updates. Needs PostgreSQL >= 9.1.

1) Prepare two new tables, mytable_a and mytable_b, with the same columns as the source table to be replicated:

CREATE TABLE  mytable_a AS TABLE mytable WITH NO DATA;
CREATE TABLE  mytable_b AS TABLE mytable WITH NO DATA;

-- trigger function which copies data from mytable to mytable_a on each insert
CREATE OR REPLACE FUNCTION data_copy_a() RETURNS trigger AS $data_copy_a$
    BEGIN
        INSERT INTO mytable_a SELECT NEW.*;
        RETURN NEW;
    END;
$data_copy_a$ LANGUAGE plpgsql;

-- start trigger
CREATE TRIGGER data_copy_a AFTER INSERT ON mytable FOR EACH ROW EXECUTE PROCEDURE data_copy_a();

Then when you need to export:

-- move data from mytable_a -> mytable_b without stopping trigger
WITH d_rows AS (DELETE FROM mytable_a RETURNING * )  INSERT INTO mytable_b SELECT * FROM d_rows; 

-- export data from mytable_b -> file
\copy mytable_b to '/tmp/data.csv' WITH DELIMITER ',' csv; 

-- empty table
TRUNCATE mytable_b;

Then you may import data.csv into MySQL.
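
The MySQL import at the end could be done with something like the following (path and table name are placeholders, and the LOCAL variant requires local_infile to be enabled on the server):

```sql
LOAD DATA LOCAL INFILE '/tmp/data.csv'
INTO TABLE mytable
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n';
```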

