3

I'd like my Python script to read some data out of a postgresql dump file. The Python will be running on a system without postgresql, and needs to process the data in a dump file.

It looks fairly straightforward to parse the CREATE TABLE calls to find the column names, then the INSERT INTO rows to build the contents. But I'm sure there would be quite a few gotchas in doing this reliably. Does anyone know of a module which will do this?

6
  • Why do you want to read a PG dump? What is your use case? Oct 28, 2013 at 15:22
  • I've said my use case - I want to read some postgres data, I need to do it on a system that doesn't have postgres, but I do have access to a database dump.
    – xorsyst
    Oct 28, 2013 at 15:40
  • providing the input file format would be helpful (I believe postgresql has several ways you can export the data). Any chance of getting the data via csv?
    – cmd
    Oct 28, 2013 at 15:47
  • @cmd Interesting idea. Unfortunately the dump is provided to me by a third party. It's a pretty standard SQL affair though, I was hoping someone else had already written a parser for this.
    – xorsyst
    Oct 28, 2013 at 16:26
  • 1
    beware that there are several formats available for a pg_dump command (pure SQL, SQL made for pgsql, compressed format, using inserts, or not, etc). Seems to me that writing a full SQL parser is a very huge work, seems really wrong.
    – regilero
    Oct 28, 2013 at 17:51

2 Answers 2

Help us improve our answers.

Are the answers below sorted in a way that puts the best answer at or near the top?

1

So ... 7 years later, there is now a Python package written specifically for this:

pip install pgdumplib

From the PyPI page, pgdumplib is a "Python3 library for reading and writing pg_dump files using the custom format."

2
  • 2
    this should rather be a comment and not an answer. Because it merely gives a suggestion and not an implementation.
    – mnm
    Sep 30, 2020 at 1:24
  • Thanks, I shall have to try that and compare speed! Now ... is there one for mysql? ;)
    – xorsyst
    Oct 1, 2020 at 6:15
1

Thanks for all the comments, even if they are mostly "don't do this!" ;)

Given:

  1. The dump is always produced in the same format from a 3rd-party system
  2. I need to be able to automate reading it on another 3rd-party system without postgres

I've gone for writing my own basic parser, which is doing a good enough job for what I require.

2
  • looking for the same thing. Would you mind sharing a link to your script?
    – kthouz
    Jul 6, 2020 at 20:30
  • @kthouz, sorry what I do have is very specific to my particular requirements and I'm not able to share it.
    – xorsyst
    Jul 7, 2020 at 9:50

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service, privacy policy and cookie policy

Not the answer you're looking for? Browse other questions tagged or ask your own question.