I am writing a program that has persistent data (the MELT monitor, related to GCC MELT). The data is persistent because most executions are expected to read and overwrite it. (In particular, that data consists of abstract syntax trees.)
Currently, this persistent data is an SQLite database. Of course, I am backing it up in textual format, as an SQL dump, and I want to manage this textual dump with a version control system (probably git, but perhaps also subversion).
Unless I take special precautions, these SQL dumps will probably have quite long lines (e.g. several dozen kilobytes), since wide SQL columns would probably contain JSON text of many kilobytes.
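To check how long the lines really are, something like the following could be run on the dump. It is only a minimal sketch: the function name `line_length_stats` and the 4096-byte threshold are my own illustrative choices, not anything git or svn prescribes.

```python
def line_length_stats(dump: bytes, limit: int = 4096) -> dict:
    """Summarize line lengths (in bytes) of a textual dump.

    `limit` is an arbitrary, hypothetical threshold used only to count
    the lines considered "long"; it has no special meaning to git or svn.
    """
    # bytes.splitlines() splits on \n, \r and \r\n and drops the terminators.
    lengths = [len(line) for line in dump.splitlines()]
    return {
        "lines": len(lengths),
        "longest": max(lengths, default=0),
        "over_limit": sum(1 for n in lengths if n > limit),
    }
```

For example, `line_length_stats(open("state-monimelt.sql", "rb").read())` would report the number of lines, the longest line in bytes, and how many lines exceed the threshold.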
Would git (and svn) be happier with shorter lines? In particular, would they run slower with long lines, or produce repositories that use much more disk space?
I'm not particularly interested in the diff commands (e.g. git diff or svn diff), because I expect that running them on such a textual dump is not very useful to human developers.
I am coding the persistence routines, so I am able to change the format slightly (e.g. to add newlines in some JSON text, which sqlite3 dump is dumping verbatim).
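To make that format change concrete, here is a rough sketch of what I mean by adding newlines to the JSON text before storing it. The helper name `wrap_json_text` is hypothetical and the real persistence code is not in Python; this only illustrates the idea of re-serializing the JSON with one element per line.

```python
import json


def wrap_json_text(compact: str, indent: int = 1) -> str:
    """Re-serialize JSON text so that each element sits on its own line.

    The parsed value is unchanged; only insignificant whitespace is added.
    A single very long JSON string value still stays on one line, since
    JSON strings cannot contain raw newlines.
    """
    value = json.loads(compact)
    return json.dumps(value, indent=indent, ensure_ascii=False)


if __name__ == "__main__":
    # Hypothetical compact JSON, standing in for a wide column value.
    compact = json.dumps(
        {"kind": "node", "sons": [{"id": i} for i in range(3)]},
        separators=(",", ":"),
    )
    print(wrap_json_text(compact))
```

The stored text gets somewhat larger because of the added whitespace, but it parses back to the same value, so the reading code would not need to change.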
If you are curious, my current code snapshot is on http://starynkevitch.net/Basile/monimelt-bgc-21apr2014.tar.bz2 and contains a state-monimelt.sql dump file (which will of course become much larger, perhaps near or above a megabyte).