Join the Stack Overflow Community
Stack Overflow is a community of 6.6 million programmers, just like you, helping each other.
Join them; it only takes a minute:
Sign up

I'm looking for an efficient way to take a raw sql file and have it executed synchronously against a postgres database, akin to if you ran it through psql.

I have an sql file which creates all databases, imports data, etc. I need to execute this using node.js but cannot find any module which does this automatically. For the node.js application itself, we use node-postgres ('pg'), knex.js and bookshelf.js. I assume though that pg is best for this.

One alternative I can think of is to read the full file, split it by semicolons, replace newlines with spaces, trim any duplicate space, then feed it into pg one by one in a manner that they're executed sequentially, not asynchronously. I'm a little surprised if this is truly the most efficient way and also if no libraries exist yet to solve this. I'm a little hesitant to jump into it seeing as SQL syntax can itself be a little challenging and I might accidentally mash it up.

Some clarifications in advance:

  • psql cannot be used as it's not installed on the target machine
  • I've chosen to develop and source control sql statements in sql native form, because it's a lot easier for a DBA to use and manipulate it
share|improve this question
    
It's harder than your first thought because you need to distinguish literals from keywords within the SQL. As a trivial example, split by semicolons can't work with insert into table values(';'); – Daniel Vérité Mar 25 '14 at 14:47

You can just separate consequent queries with a semicolon when passed to client.query

That works:

var pg = require('pg');

pg.connect('postgres://test:test@localhost/test', function(err, client, done){
        client.query('CREATE TABLE test (test VARCHAR(255)); INSERT INTO test VALUES(\'test\') ');
        done();
});

And consequently, that works too:

var pg = require('pg');
var fs = require('fs');

var sql = fs.readFileSync('init_database.sql').toString();

pg.connect('postgres://test:test@localhost/test', function(err, client, done){
    if(err){
        console.log('error: ', err);
        process.exit(1);
    }
    client.query(sql, function(err, result){
        done();
        if(err){
            console.log('error: ', err);
            process.exit(1);
        }
        process.exit(0);
    });
});
share|improve this answer
    
I tried this initially but found problems: * Newlines and tabs found traditionally in SQL files could not be stuffed into client.query, (resolvable with some extra steps) * Very difficult error handling per-query (which I have admittedly also left out of my example here) – rgareth Apr 1 '14 at 8:24
    
@rgareth I had completely no problems with newlines and tabs in SQL files whatsoever. You can add some \ns (even \r\ns) and \ts to my example and it will still work. – OhJeez Apr 3 '14 at 8:33
    
that worked for me for ibm_db lib. I was facing the same problem also, but one question raised up to my mind. What about SQL Injection? – user2670818 Jan 18 '16 at 8:48
up vote 3 down vote accepted

I've written the following function which works for my case. It would have been much more simpler if it weren't for:

  • Using batch to manage concurrency
  • Having the tricky PostgreSQL COPY case to consider

Code snippet:

function processSQLFile(fileName) {

  // Extract SQL queries from files. Assumes no ';' in the fileNames
  var queries = fs.readFileSync(fileName).toString()
    .replace(/(\r\n|\n|\r)/gm," ") // remove newlines
    .replace(/\s+/g, ' ') // excess white space
    .split(";") // split into all statements
    .map(Function.prototype.call, String.prototype.trim)
    .filter(function(el) {return el.length != 0}); // remove any empty ones

  // Execute each SQL query sequentially
  queries.forEach(function(query) {
    batch.push(function(done) {
      if (query.indexOf("COPY") === 0) { // COPY - needs special treatment
        var regexp = /COPY\ (.*)\ FROM\ (.*)\ DELIMITERS/gmi;
        var matches = regexp.exec(query);
        var table = matches[1];
        var fileName = matches[2];
        var copyString = "COPY " + table + " FROM STDIN DELIMITERS ',' CSV HEADER";
        var stream = client.copyFrom(copyString);
        stream.on('close', function () {
          done();
        });
        var csvFile = __dirname + '/' + fileName;
        var str = fs.readFileSync(csvFile);
        stream.write(str);
        stream.end();
      } else { // Other queries don't need special treatment
        client.query(query, function(result) {
          done();
        });
      }
    });
  });
}

Beware that this would fail if you used semicolons anywhere except to terminate SQL statements.

share|improve this answer
    
Can you add details on what batch is? – James Ward May 16 '16 at 23:03
    
github.com/visionmedia/batch – rgareth Jun 27 '16 at 8:36

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.