If you have multiple websites running on one machine (or on multiple machines), you may run into the same desire as I did: backing them all up at once instead of backing up each website individually.
If you have a single website, it’s easy.
E.g. if it’s a WordPress site, you could install UpdraftPlus and configure it to make a backup every week and push it to an external location (Dropbox, S3, …).
On some of my servers there are Drupal websites, WordPress websites, Matomo (Piwik) systems, Mautic systems, … so I wanted a more uniform way of backing up.
In the end, all of these websites & systems consist of a set of files and a (MySQL, in my case) database, so it is easier to make one backup that covers all the websites at once.
I am using DigitalOcean and yes, you can enable backups there as well (a backup is made every week and each backup is kept for 4 weeks = the retention period).
But I decided to go for another backup solution, for a few reasons:
- I wanted the backups isolated from the machine / company where the websites run. You never know when something ‘bad’ happens to the company behind the server. It could really be anything: bankruptcy, fire, a hack, …
- I didn’t want backups at the system level, mainly because restoring at that level would be a ‘restore all’ or ‘restore none’ scenario. In some scenarios you only want to restore specific websites whilst leaving the others alone (the others might even have newer/fresher/more accurate data in their database already). Also, if I want to restore a database I don’t want to deal with database-internal files; I’d rather work with SQL dumps (that’s why the script below includes a mysqldump step).
- I want to control, to a certain point, WHEN exactly the backup is scheduled (so it has the least performance impact).
STEP1: Make a Dropbox app
Generate an access token using OAuth 2.
https://www.dropbox.com/developers/apps/create
Preferably don’t give access to your full Dropbox (limit it to “App folder” instead of “Full Dropbox”).
Also make sure you enable “files.content.write” in the permissions tab. This will also automatically enable “files.metadata.read”.
Note: by NOT enabling “files.content.read”, you make sure the script only allows one-way communication (apart from reading metadata). Meaning even if someone got hold of your API key (see further) and/or the script, they can ‘only’ push data to your Dropbox and not, e.g., read previously stored data.
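If you want a quick sanity check that the generated access token actually works, you can call the standard Dropbox “get_current_account” endpoint with curl (this assumes the default “account_info.read” permission, which scoped apps get out of the box; the token value is of course a placeholder):
# should return your account details as JSON if the token is valid
curl -X POST https://api.dropboxapi.com/2/users/get_current_account --header "Authorization: Bearer <your-access-token>"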
FYI: See all your previous (already connected) apps.
https://www.dropbox.com/account/connected_apps
https://www.dropbox.com/developers/apps
STEP2: Backup To Dropbox
I used the existing “backup to dropbox” script at https://pypi.org/project/backup-to-dropbox/
TIP: make a ‘dedicated’ python environment.
Why? I noticed that when trying to run “pip install backup-to-dropbox” on the different ‘raw’ machines, I ran into different errors. By making one dedicated, predictable environment, you prevent that ‘unstable’ behavior.
cd /srv/users/serverpilot/
python3.6-sp -m venv .venv
source .venv/bin/activate
pip install backup-to-dropbox
TIP: to exit the created Python environment, type:
deactivate
Backing up a full directory is as easy as running this command:
backup-to-dropbox.py --api-key <api-key> --backup-name myserver1-logs --max-backups 10 /srv/users/serverpilot/apps/
Scheduling the backup using CRON
Of course we want to run this automatically. So we will schedule the script using Cron.
As we are using an isolated python environment, we need to do something (not so) ‘special’. Two options:
Option 1
From within cron call a wrapper bash script.
28 2 * * 1 /srv/users/serverpilot/apps/0SCRIPTS/cron_wrapper_script.sh 2>&1 | mail -s "Backup: 123" [email protected]
cron_wrapper_script.sh:
#!/bin/bash
cd /srv/users/serverpilot
source /srv/users/serverpilot/.venv/bin/activate
backup-to-dropbox.py --api-key <api-key> --backup-name myserver1-logs --max-backups 10 /srv/users/serverpilot/apps/
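One extra check (an assumption about your setup rather than part of the original instructions): cron can only run the wrapper if it is executable.
chmod +x /srv/users/serverpilot/apps/0SCRIPTS/cron_wrapper_script.sh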
Option 2
Directly in cron, do the following:
0 3 * * * cd /srv/users/serverpilot && /srv/users/serverpilot/.venv/bin/python /srv/users/serverpilot/.venv/bin/backup-to-dropbox.py --api-key <api-key> --backup-name myserver1-logs --max-backups 10 /srv/users/serverpilot/
EXTENDED VERSION
Now, we of course want to back up both the files and the databases.
So we first dump the databases using a couple of script lines.
DUMP DATABASE: Simple Version
This version keeps both a gzipped and a raw SQL file (the raw file is needed for the extended version below):
cd /srv/users/serverpilot/apps/
mysql -s -r -u root -e 'show databases' | grep -Ev 'Database|mysql|information_schema|phpmyadmin|performance_schema|sys'|
while read db
do
mysqldump --max_allowed_packet=2048M -u root $db -r /srv/users/serverpilot/apps/${db}.sql; gzip -c /srv/users/serverpilot/apps/${db}.sql > ${db}.sql.gz;
done
Straight to a zipped file:
cd /srv/users/serverpilot/apps/
mysql -s -r -u root -e 'show databases' | grep -Ev 'Database|mysql|information_schema|phpmyadmin|performance_schema|sys'|
while read db
do
mysqldump --max_allowed_packet=2048M -u root $db | gzip > ${db}.sql.gz;
done
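As a side note, and tying back to the reasoning above about restoring individual websites: restoring one of these dumps later is just a matter of feeding it back into MySQL. A minimal sketch, assuming the target database (here called “mydb”) already exists:
# restore a single gzipped dump into an existing database
gunzip -c mydb.sql.gz | mysql -u root mydb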
DUMP DATABASE: Extended Version
I used this script before to only send SQL files if they contained new data (using a simple size check). For the final script, I won’t use this extended version.
cd /srv/users/serverpilot/apps/
mysql -s -r -u root -e 'show databases' | grep -Ev 'Database|mysql|information_schema|phpmyadmin|performance_schema|sys'|
while read db
do
mysqldump --max_allowed_packet=2048M -u root $db -r /srv/users/serverpilot/apps/${db}.sql; gzip -c /srv/users/serverpilot/apps/${db}.sql > ${db}2.sql.gz;
#du -b is too precise, as it reports a difference even when there is none
size_previous=$(du -k ${db}.sql.gz | cut -f 1)
#size_previous=$(stat --printf='%s\n' ${db}.sql.gz)
size_new=$(du -k ${db}2.sql.gz | cut -f 1)
#size_new=$(stat --printf='%s\n' ${db}2.sql.gz)
echo $size_previous
echo $size_new
COUNT=$((size_previous - size_new))
echo $COUNT
pwd
rm ${db}.sql
if [ "$size_previous" -eq "$size_new" ]
then
echo "${db} same! It's ok to delete the newly created"
rm ${db}2.sql.gz
else
echo "${db} NOT the same! Delete the old one and rename the new one"
rm ${db}.sql.gz
mv ${db}2.sql.gz ${db}.sql.gz
fi
done
Improved version
Sometimes I noticed memory issues (because of large file transfers?).
After a reboot of the machine it seemed to work again, but I wanted a more elegant solution.
This is the error I got:
Traceback (most recent call last):
File "/srv/users/serverpilot/.venv/bin/backup-to-dropbox.py", line 58, in <module>
main()
File "/srv/users/serverpilot/.venv/bin/backup-to-dropbox.py", line 51, in main
backup_service.backup_paths(args.paths)
File "/srv/users/serverpilot/.venv/lib/python3.6/site-packages/backup_to_dropbox/services.py", line 28, in backup_paths
self.__dropbox_client.upload_file(backup_file, self._get_dropbox_path(filename))
File "/srv/users/serverpilot/.venv/lib/python3.6/site-packages/backup_to_dropbox/clients.py", line 39, in upload_file
chunk = file_to_upload.read(DropboxClient.SINGLE_REQ_UPLOAD_SIZE_LIMIT)
MemoryError
One way is probably to clear some caches before running the script:
www.tecmint.com/clear-ram-memory-cache-buffer-and-swap-space-on-linux/
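For reference, this is roughly what that approach looks like (run as root; it flushes the page cache, dentries and inodes, which is a rather blunt instrument and not the route I ended up taking):
# write dirty pages to disk, then drop page cache, dentries and inodes
sync; echo 3 > /proc/sys/vm/drop_caches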
But finally I came up with a solution: do multiple “backup sessions” instead of one large backup session.
So instead of backing up one main parent folder, I run the backup command iteratively: once for every subdirectory.
A nice side advantage of this is that, when browsing my Dropbox in a browser, I can easily navigate through the different tar.gz files. When it was one BIG tar.gz, Dropbox would not let me browse it, since it does not allow you to preview archives that are too big.
Wrapping it up.
So finally this is the script I came up with. Thanks to Google, Dropbox and DigitalOcean 😉
#!/bin/bash
#Part 1: dump databases to Apps Folder
NAME_IN_DROPBOX=machineXYZ
NR_BACKUPS=4
cd /srv/users/serverpilot/apps/
mkdir -p 0DATABASES
mysql -s -r -u root -e 'show databases' | grep -Ev 'Database|mysql|information_schema|phpmyadmin|performance_schema|sys'|
while read db
do
mysqldump --max_allowed_packet=2048M -u root $db | gzip > 0DATABASES/${db}.sql.gz;
done
#Part 2: Transfer Apps Folder to Dropbox
for directory in */ ; do
echo $directory
FINAL_NAME="${NAME_IN_DROPBOX}/${directory}"
echo $FINAL_NAME
source /srv/users/serverpilot/.venv/bin/activate
backup-to-dropbox.py --api-key <YourAPIKEY> --backup-name "$FINAL_NAME" --max-backups $NR_BACKUPS "/srv/users/serverpilot/apps/$directory"
deactivate
done
#Part 3: Clean Up
mail -s "backup to dropbox: $NAME_IN_DROPBOX" [email protected] <<< 'testing message body'
Nevertheless, I sometimes still got MemoryErrors. I fixed that by increasing the swap space (from 512 MB to 1 GB) at the system level (these are Linux/Ubuntu servers).
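For completeness, this is roughly how you add (or grow) swap via a swap file on Ubuntu; the path /swapfile and the 1G size are just example values:
sudo fallocate -l 1G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
# make the swap file persistent across reboots
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab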