Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. Join them; it only takes a minute:

Sign up
Here's how it works:
  1. Anybody can ask a question
  2. Anybody can answer
  3. The best answers are voted up and rise to the top

I have a quick question for you all:

First, I did a pg_dump on a JIRA database that I was hosting in a PostgreSQL 8.3 database. Size of the database after vacuum full was 217132652 (aproximately 207 MBytes).

Then, I restored that JIRA database on a PostgreSQL 9.4 server with the following command:

$ psql -X -v ON_ERROR_STOP=1 -d jira2 -U jira -h localhost < jiradb2017_03_12.sql

I am assuming that the restore would exit on error (since I used ON_ERROR_STOP=1 when calling psql command), and the sql script finished correctly (despite some warnings not related to data restore).

I ended up with a database with a size of 158019348 (approximately 151 MBytes).

So, what's the story here? Can I just assume database was restored successfully and PostgreSQL optimized its storage (somewhere between 8.3 and 9.4 versions) engine and is using space more efficiently?

Thanks!

share|improve this question
2  
Pablo, have you tried restoring to 8.3 and checking the size? The would confirm or eliminate any effect of the version cahnge – Jack Douglas 15 hours ago
up vote 8 down vote accepted

When you restore a database you have all the information on it packed, with no empty space between rows (or in indices), unless some specific settings are in place (basically: FILLFACTOR for tables and FILLFACTOR for indices).

On the other hand, when your database has been in use for some time, and you've had your share of inserts, updates and deletes, free unused space will appear. This is because of the way PostgreSQL and Multiversion Concurrency Control, a.k.a. MVCC work. MVCC allows for less lockings, but you pay a price in terms of "space":

  1. Every UPDATE is equivalent to an INSERT together with a DELETE, with the overhead (at least in terms of space used) associated to both.
  2. When you have several transactions running, and every one is INSERTing, UPDATEing or DELETEing, you have simultaneously several copies of every row involved.
  3. The space allocated to these row versions won't be freed immediately after commit, and for a while, will be unused space within the files where your table data (and indexes) is being stored.

Autovacuum takes care of this space being made reusable by default, or you could have some specific procedure for routine vacuuming.

This fact already can explain the size change.

Optimizations between versions probably also took place; and can explain further improvements. Optimizations could also have been made for speed and not for size, and the actual size could actually grow from one version to the next. I really don't know the specifics to be able to tell.

To distinguish between the two effects, if you're curious, you could just, as @Jack Douglas suggests, restore your database on 8.3. It will most probably shrink in size. If it shrinks to less than 151 MB (a size smaller that what you get with 9.4 version), then the version change actually made your DB grow.


For a better understanding of MVCC, look at Bruce Momjian's presentation.

share|improve this answer
    
Awesome answer - great suggestions! – Mr. Kennedy 13 hours ago

Your Answer

 
discard

By posting your answer, you agree to the privacy policy and terms of service.

Not the answer you're looking for? Browse other questions tagged or ask your own question.