Any dump format: restore and query
The most practical thing to do is restore them to a temporary PostgreSQL database then query the database. It's by far the simplest option. If you have a non-superuser with createdb
rights you can do this pretty trivially and safely with pg_restore
.
SQL-format
If it's a plaintext (.sql) format dump, if desperate and you know the dumps were not created with the --inserts
or --column-inserts
options and you don't use the same table name in multiple schemas, you could just search for the text
COPY tablename (
at the start of a line, then read the COPY
-format data (see below) until you find \.
at the start of a line.
If you do use the same table name in different schemas you have to parse the dump to find the SET search_path
entry for the schema you want, then start looking for the desired table COPY
statement.
Custom-format
However, if the dump is in the PostgreSQL custom format, which you should always prefer and request by using -Fc
with pg_dump
, it is IIRC really a tar file with a custom header. You can either seek within it to find the tar header then extract it, or you can use pg_restore
to list its header and then extract individual tables.
For this task I'd do the latter. To list tables in the dump:
pg_restore --list out.dump
To dump a particular table as tab-separated COPY
format by qualified name, e.g. table address
in schema public
:
pg_restore -n public -t address out.dump
The output has a bunch of stuff you can't get pg_restore
to skip at the start, but your script can just look for the word COPY
(uppercase) at the start of a line and start reading on the next line, until it reaches a \.
at the end of a line. For details on the format see the PostgreSQL manual on COPY
Of course you need the pg_restore
binary for this.
Make sure there is no PGDATABASE
environment variable set when you invoke pg_restore
. Otherwise it'll restore to a DB instead of printing output to stdout.
pg_restore... | psql ...
, or usepg_restore ...|grep ...
for simple cases. 4) or just restore into a (temporary) database and work from there.