17 Dec 2014 Sql Backups vs CSV Backups
Database Backups can take several formats. Backup size, backup time, and restore time will vary for each of these formats.
In this post, we will use a 13 GB database with a total of 66 million rows to compare and evaluate the backup and restore time of the CSV and SQL backup format on a 4 Gig RAM and 2 cpu cores with mysql 5.5 cloud instance.
SQL Backups
SQL dump file is a dump of data or schema or both in SQL executable format, which can be inserted directly into MySQL using:
[code]shell> mysql < sqldump.sql[/code]
Or from within MySQL:
[code]mysql> SOURCE sqldump.sql;[/code]
Either way, it will execute SQL statements directly into MySQL.
One of the most popular tools for taking MySQL backups is “mysqldump”. “mysqldump” can be used to take SQL or CSV backups for all databases, certain databases or tables in addition to multiple functionalities.
[code]# time mysqldump –single-transaction testdb tbl1 > dump.sql[/code]
The usage of –single-transaction ensures that a consistent backup of the InnoDB tables will be taken. –single-transaction works by putting mysql mode to REPEATABLE READ and issue a START TRANSACTION statement before taking the backup.
The output of the previous command is:
[code]real 1m44.418s
user 0m48.184s
sys 0m11.621s
[/code]
The size of the sql dump is:
[code]# du -h dump.sql
2.5G dump.sql
[/code]
Delimited text backup
CSV or (Comma Separated Values) backup is a file contain only the data of tables separated by specific delimiter (comma for example). There are two methods that can be used to take CSV backups from MySQL:
The first method is to use mysqldump with –tab option which generates the CSV file plus the schema file The second method is to use LOAD DATA INTO OUTFILE command from within mysql which results in generating normal text file with specific delimiter specified by FIELDS TERMINATED BY ‘,’
Testing the same database but with different kind of the backup, it results in:
[code]# time mysqldump –single-transaction –tab=/tmp/ dbtest
real 0m45.793s
user 0m0.006s
sys 0m0.000s[/code]
[code] # du -h /tmp/*
4.0K /tmp/table1.sql
2.2G /tmp/table1.txt
[/code]
The total size of the database CSV dump is 2.2G, which is 12% less than the SQL backup.
Restoring the dump
Restoring is a straight-forward process with both types of dumps. Initially, without any tweaks to the MySQL configuration, the restoring time will be horrifying – it took more than a day to restore the dump.
Tweaking The Restore Performance
There are some tweaks steps that needs to be done before importing data into MySQL:
– Since we use InnoDB tables, the innodb_buffer_pool_size needs to be increased to contain the biggest chunk of the database when performing insert operations.
[code]innodb_buffer_pool_size = 3000M[/code]
– Stopping the autocommit will result into improving the restore time by not committing each transaction.
[code]autocommit = 0[/code]
– Setting innodb_flush_log_at_trx_commit to 2, will result into flushing the data from the buffer to the log file at each commit. Combining this option with the stopping of autocommit will improve the restore performance dramatically.
[code]innodb_flush_log_at_trx_commit = 2[/code]
– Also turning off the unique and foreign key checks temporarily (turns off the check for uniqueness and foreign key constraints on insert operations).
[code]> SET unique_checks=0;[/code]
[code]> SET foreign_key_checks=0;[/code]
After configuring the server, we re-compare the restore time between the SQL and CSV backups.
Restore SQL Backup
[code]# time mysql dbtest < dump.sql
<span class="message_content">real 146m36.868s
user 0m37.566s
sys 0m2.672s </span>[/code]
which means an average of 7500 row insertion per second.
Restore CSV Backup
[code]# mysql dbtest < /tmp/table1.sql
# time mysql -e ‘LOAD DATA INFILE "/tmp/table1.txt" INTO TABLE dbtest.tbl1;’
[/code]
[code]real 120m34.836s
user 0m35.566s
sys 0m2.752s[/code]
Conclusion
Logical backups are expensive and slow compared to the raw data backup but more reliable. By comparing between the SQL logical backups and the delimited text file logical backup it turns out that the size of the CSV backup is less than the sql backup and takes less time in backup and restore.
Backup Type | Backup Size | Backup Time | Restore Time |
SQL | 2.5 G | 1min 44sec | 146 min |
CSV | 2.2 G (12%) | 0min 45sec (42%) | 120 min (14%) |