Stack Overflow is a community of 4.7 million programmers, just like you, helping each other.

Join them; it only takes a minute:

Sign up
Join the Stack Overflow community to:
  1. Ask programming questions
  2. Answer and help your peers
  3. Get recognized for your expertise

I am using SQLAlchemy 1.0.11 to migrate an existing database from MS SQL 2012 to PostgreSQL 9.2 (upgrade to 9.5 planned).

I've been reading about this and found a couple of different sources (Tyler Lesmann, Inada Naoki, Stefan Urbanek, and Mathias Fussenegger) with a similar approach for this task:

  1. Connect to both databases
  2. Reflect the tables of the source database
  3. Iterate over the tables and for each table
    1. Create an equal table in the target database
    2. Fetch rows in the source and insert them in the target database

Code

Here is a short example using the code from the last reference.

from sqlalchemy import create_engine, MetaData

src = create_engine('mssql://user:pass@host/database?driver=ODBC+Driver+13+for+SQL+Server')
dst = create_engine('postgresql://user:pass@host/database')

meta = MetaData()
meta.reflect(bind=src)

tables = meta.tables

for tbl in tables:
    data = src.execute(tables[tbl].select()).fetchall()
    if data:
        dst.execute(tables[tbl].insert(), data)

I am aware that fetching all the rows at the same time is a bad idea, it can be done with an iterator or with fetchmany, but that is not my issue now.

Problem 1

All the four examples fail with my databases. One of the errors I get is related to a column of type NVARCHAR:

sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) type "nvarchar" does not exist
LINE 5:  "desigOperador" NVARCHAR(100) COLLATE "SQL_Latin1_General_C...
                         ^
 [SQL: '\nCREATE TABLE "Operators" (\n\t"idOperador" INTEGER NOT NULL, \n\t"idGrupo" INTEGER, \n\t"desigOperador" NVARCHAR(100) COLLATE "SQL_Latin1_General_CP1_CI_AS", \n\t"Rua" NVARCHAR(200) COLLATE "SQL_Latin1_General_CP1_CI_AS", \n\t"Localidade" NVARCHAR(200) COLLATE "SQL_Latin1_General_CP1_CI_AS", \n\t"codPostal" NVARCHAR(10) COLLATE "SQL_Latin1_General_CP1_CI_AS", \n\tdataini DATETIME, \n\tdataact DATETIME, \n\temail NVARCHAR(50) COLLATE "SQL_Latin1_General_CP1_CI_AS", \n\turl NVARCHAR(50) COLLATE "SQL_Latin1_General_CP1_CI_AS", \n\tPRIMARY KEY ("idOperador")\n)\n\n']

My understanding from this error is that PostgreSQL doesn't have NVARCHAR but VARCHAR, which should be equivalent. I thought that SQLAlchemy would automatically map both of them to String in its layer of abstraction, but perhaps it doesn't work that way in this case.

Question: Should I define all the classes/tables beforehand, for instance, in models.py, in order to avoid errors like this? If so, how would that integrate with the given (or other) workflow?

In fact, this error was obtained running the code from Urbanek, where I can specify which tables I want to copy. Running the sample above, leads me to...

Problem 2

The MS SQL installation is a geodatabase that is using ArcSDE (Spatial Database Engine). For that reason, some of the columns are of a non-defaultGeometry type. On the PostgreSQL side, I am using PostGIS 2.

When trying to copy tables with those types, I get warnings like these:

/usr/local/lib/python2.7/dist-packages/sqlalchemy/dialects/mssql/base.py:1791: SAWarning: Did not recognize type 'geometry' of column 'geom'
  (type, name))
/usr/local/lib/python2.7/dist-packages/sqlalchemy/dialects/mssql/base.py:1791: SAWarning: Did not recognize type 'geometry' of column 'shape'

Those are later followed by another error (this one was actually thrown when executing the provided code above):

sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) relation "SDE_spatial_references" does not exist
LINE 1: INSERT INTO "SDE_spatial_references" (srid, description, aut...
                    ^

I think that it failed to create the columns referred in the warnings, but the error was thrown at a later step when those columns were needed.

Question: The question is an extension of the previous one: how to do the migration with custom (or defined somewhere else) types?

I know about GeoAlchemy2 that can be used with PostGIS. GeoAlchemy supports MS SQL Server 2008, but in that case I guess I'm stuck with SQLAlchemy 0.8.4 (perhaps with less nice features). Also, I found here that it is possible to do the reflection using types defined by GeoAlchemy. However, my questions remain.

Possibly related

Edit

When I saw the error referring SDE_spatial_references I thought that it could be something related to ArcSDE, because the same machine also has ArcGIS for Server installed. Then I've learned that MS SQL Server also has some Spatial Data Types, and then I confirmed this is the case.

share|improve this question

This question has an open bounty worth +50 reputation from iled ending tomorrow.

This question has not received enough attention.

I would appreciate if an answer also points to possible faults in the logic/reasoning of how I am addressing this task.

    
You could try asking it on dba.stackexchange.com guys there may offer you other options for such data migration. +1 though, well written question. – Jorge Campos Jan 29 at 4:27
    
@JorgeCampos I appreciate your comment. Actually, I didn't think about DBA.SE. I did a quick search over there and it seems that there are less hits about SQLAlchemy. Nevertheless, should I ask for this question to be migrated? Obrigado! – iled Jan 29 at 15:53
    
I think you should ask this question there focusing on the migration data since they are experts on such scenarios. Just mention that you have tried it with SQLAlchemy to see that if they have alternatives to this approach. I think that you don't need to ask to migrate this one since it is a perfectly programming issue related with SQLAlchemy. De nada :) – Jorge Campos Jan 29 at 16:17

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Browse other questions tagged or ask your own question.