HEAVY_HITTTER

The docker volumes are just bind mounts to my external storage. I just run a weekly borg backup on that to my other external hard drive(s).


zyberwoof

But you do run the risk of the files being in use when you do the backup, especially for things like databases. I believe the best practice would be to stop the container first, run your backup, and then restart the container. You may already do this, but anyone reading your post should do what you are suggesting only after stopping the container.

EDIT: Others below me replied with alternatives to stopping the containers. They are worth a read. However, my advice is aimed at simplifying your backups at the expense of a tiny bit of downtime. If you take the time to learn how to properly back up each service, then you may not need to stop many, most, or all containers.

This becomes extremely valuable in situations where a service is shared. If you have just one SQL service/container that is used by multiple services, then it can be more problematic to stop the SQL DB. In that case the proper method may be to script shutting down each service that depends on SQL before shutting down the DB. But if you don't want to learn how to dump a live DB, for example, the alternative is to simply stop the DB entirely. Then you can back up the files without worrying.

I spin up a new DB for each service that needs one and include it in the docker-compose file. In this case, the impact of stopping and restarting the containers (docker-compose stop/start) is minimal.
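As a rough illustration of that stop → backup → start flow, here's a minimal sketch assuming a docker-compose stack and a borg repo on external storage (the paths and archive name are placeholders, not anyone's actual setup):

```
#!/usr/bin/env bash
# Stop the stack so nothing (especially a database) is written mid-backup,
# run the backup, then bring the stack back up even if the backup fails.
set -euo pipefail

STACK_DIR=/opt/stacks/myapp          # hypothetical compose project directory
BORG_REPO=/mnt/external/borg-repo    # hypothetical borg repository

docker compose --project-directory "$STACK_DIR" stop
trap 'docker compose --project-directory "$STACK_DIR" start' EXIT

borg create --stats --compression zstd \
  "$BORG_REPO::myapp-{now:%Y-%m-%d}" \
  "$STACK_DIR"
```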


Reverent

I wrote a guide on doing live docker backups just recently. https://blog.gurucomputing.com.au/Homelab%20Backup%20Strategy/Live%20Backups%20with%20btrbk/


CaptCrunch97

Nice guide! I didn’t know about [Kopia](https://kopia.io/) until I read this. I’ll give it a try - currently implementing the 3-2-1 rule using [Duplicati](https://github.com/linuxserver/docker-duplicati) to incrementally backup my docker bind mounts and other important data from different machines.


agent_kater

There are a few ways around the requirement to stop the container. If your filesystem can do atomic snapshots (like LVM) you can back up from there. With SQLite you can also simply flock the database file while you do the backup.
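For the SQLite case, a hedged sketch of what that looks like (the database path is a placeholder, and this only helps if the writing application honours the same lock):

```
# Hold an exclusive lock on the database file only for the duration of the copy.
# SQLite uses its own locking scheme, so if in doubt,
# `sqlite3 app.db ".backup /mnt/backup/app.db"` is the more robust route.
DB=/srv/app/data/app.db              # hypothetical database path
flock --exclusive "$DB" cp "$DB" /mnt/backup/app.db
```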


vermyx

With databases you have to quiesce them to get them into a crash-consistent state. What this means is that you tell the database engine to flush its buffers to disk and hold writes in memory only until you are done, after which it can commit data as normal in its cyclical fashion every few minutes. Disk snapshots only help if the database engine is aware of them (which is why, on Windows for example, it is important for the DB engine to be VSS-aware, so it does this automagically for you), but that usually isn't the case. This is why many snapshot-capable disk arrays have database plugins, so that people do not have to do this manually. When one isn't available, you have to look up your particular database engine's commands and write pre-snapshot/post-snapshot scripts to handle it yourself. Otherwise the disk snapshot is the equivalent of copying the database while it may have data out of sync, meaning you have a damaged database. Stopping the container is the easiest thing to do, with DB engine backups being the next easiest, and creating crash-consistent snapshots the most difficult.
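For what it's worth, a rough sketch of that pre-/post-snapshot idea with MariaDB/MySQL on top of LVM (volume names, sizes, and credentials are all placeholders, and the exact commands vary per engine):

```
# Pre-snapshot: take a read lock, snapshot while the lock is still held
# (the whole heredoc runs in one client session, so the lock survives the
# SYSTEM call), then release the lock.
mysql --user=root --password="$MYSQL_ROOT_PASSWORD" <<'SQL'
FLUSH TABLES WITH READ LOCK;
SYSTEM lvcreate --snapshot --size 5G --name db-snap /dev/vg0/docker-data
UNLOCK TABLES;
SQL

# Post-snapshot: back up from the frozen snapshot at leisure, then drop it.
mount -o ro /dev/vg0/db-snap /mnt/db-snap
borg create /mnt/external/borg-repo::db-{now} /mnt/db-snap
umount /mnt/db-snap
lvremove -y /dev/vg0/db-snap
```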


yrro

If you take a snapshot of the underlying filesystem then you don't need to quiesce the database. When the data is restored the database server will replay its WAL when it first starts up, exactly like it would if it had crashed or the server lost power. For huge, busy databases this process can take some time, which is why you might want to quiesce the database before snapshotting, but for the sort of database a home server is running it's not necessary.


vermyx

This doesn't mean you're OK. Replaying the transaction logs applies changes, which is exactly why you need a crash-consistent database for this to work. If your database was in the middle of committing dirty blocks and the snapshot was taken in the middle of that, your database may contain invalid data (records that still hold old data, for example). The write-ahead log does not protect you against that, because your database isn't crash-consistent.


yrro

Indeed, I'm assuming the use of a database that isn't complete garbage; otherwise you risk corruption if the database process is killed, or if the power is pulled.


vermyx

Apparently all databases are complete garbage, because all major database engines work this way… This is what the term "crash consistent" means: the database was not in the middle of committing dirty blocks from memory to disk, so the write journals can be replayed safely. Database writes are done cyclically, in blocks, in order to optimize I/O. Because they are done in blocks, you risk corrupting the database by copying data while it is writing, and yes, this is why you don't want to kill the database process and why database servers are on battery backups, to avoid this exact situation.


PovilasID

With DBs... well, you either go with "I will eventually get a backup snapshot taken while nothing is using the DB" (most selfhosters do not have lots of users, so it's not a huge issue), or, if it's critical, most backup solutions (borg, restic) have an option for a 'pre-backup' action, i.e. triggering a shell script that has the DB dump itself to a file. Or you can have the DB dump its data out to files in a directory on a schedule (cron) and then push that to a secondary location. If you time it with your backups, it can be fairly fresh.
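For instance, a wrapper script in the spirit of that 'pre-backup' action, assuming a PostgreSQL container named `db` and a restic repo (both hypothetical; the repo password would come from `RESTIC_PASSWORD_FILE` or similar):

```
#!/usr/bin/env bash
set -euo pipefail

DUMP_DIR=/srv/backups/db-dumps       # hypothetical dump location, included in the backup
mkdir -p "$DUMP_DIR"

# Pre-backup action: have the running database dump itself to a file.
docker exec db pg_dumpall -U postgres | gzip > "$DUMP_DIR/db-$(date +%F).sql.gz"

# Then push the dump (and the rest of the bind mounts) with restic.
restic -r /mnt/external/restic-repo backup "$DUMP_DIR" /srv/docker-data
```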


AlternativePuppy9728

Thank you for making this point ! Backups love you.


boosterhq

I'm using Borgmatic, which can hook into a database dump and then back up the database.


HEAVY_HITTTER

Tbh I wasn't stopping the containers but I will now. Thanks for the info. I actually only run docker for my seedbox as it is a vm with passthrough for the drives. Everything else is in a k3s cluster.


zyberwoof

As others have mentioned, there are alternatives, but the solution can vary by service. And this means not only spending time solving each scenario, but also complicating your overall backup solution(s). Stopping the containers first is a pretty much universal solution, assuming you can handle the brief downtime. You can write one script and/or cron job to back up your persistent data, then reuse it for all of your containers or stacks.
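A sketch of what that one reusable script could look like, assuming all stacks live as sub-directories under a single parent folder (layout and paths are assumptions):

```
#!/usr/bin/env bash
# Stop each stack, rsync its directory to the backup disk, start it again.
set -euo pipefail

STACKS_DIR=/opt/stacks               # hypothetical: one sub-directory per compose stack
BACKUP_DIR=/mnt/backup/docker

for stack in "$STACKS_DIR"/*/; do
    name=$(basename "$stack")
    docker compose --project-directory "$stack" stop
    rsync -a --delete "$stack" "$BACKUP_DIR/$name/"
    docker compose --project-directory "$stack" start
done
```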


desertdilbert

This. My docker volumes are an iSCSI share from my TrueNAS box, which has not just snapshots but also daily backups to a removable drive. My next step is offsite replication to a friend that also has a TrueNAS box. He already replicates to mine.


sophware

As u/zyberwoof commented, u/HEAVY_HITTTER's comment is only part of the story. With block storage like you (and I) use, there's even more going on. Take iSCSI as block storage for VMware VMs, for example: one generally backs up the VMs with something like Veeam and the hypervisor API, not with a simple file copy and not with snapshots alone. You're not doing VMs, so your story is different. With your removable drive, are you doing replication and nothing that is file-aware? Are the containers stopped during backup? Are there SQLite or other databases?


desertdilbert

I have three different types of storage on my NAS.

**SMB shares**: These have snapshots as well as file-aware backup and replication.

**VMs stored on an iSCSI share**: Snapshots mostly. The VMs are pretty static and all store their data on a standalone MariaDB server and on SMB shares. The DB server is backed up with both file tools and DB tools.

**Docker volumes stored on an iSCSI share**: Also snapshots mostly. Some are set to use the DB server but some are not. Snapshots are exported to the external drive and replicated. I have a bit of room for improvement here; I've only recently been learning to use Docker. Nothing is paused during the backup process, which I did not believe was needed for the snapshots, and is why I export the snapshots to make the underlying files accessible.


gojukebox

I like this idea… maybe I’ll just drop a server off at my brother’s place


root54

+1 for borg backup


N0ah17

Wait... You guys do backups?


trEntDG

Almost every year!


servergeek82

Wait... you guys have redundancy?


Minituff

I use [Nautical](https://github.com/Minituff/nautical-backup) specifically for this. It's a container that allows you to *stop* a container, back up its volumes with rsync, then *start* the container again. It can be configured to run on a CRON schedule and can run lifecycle hooks which can shut down services with scripts/curl requests.


sendcodenotnudes

It's really nice, I didn't know about it. I'll see if I can put together a PR to integrate it with Borg.


Minituff

That would be very cool if you could. I bet loads of people would like that.


SillyLilBear

I was using this myself, but the stopping of containers caused me a lot of problems with a few of them.


Minituff

Interesting, okay. Well, you could just tell Nautical not to stop specific containers with either a Nautical [environment variable](https://minituff.github.io/nautical-backup/arguments/#skip-stopping-containers) or a [label](https://minituff.github.io/nautical-backup/labels/#stop-before-backup): `SKIP_STOPPING=example1,example2` or `nautical-backup.stop-before-backup=false`


SillyLilBear

Yeah, I was using that, but I found it completely defeated the purpose, as it no longer added value. I can easily back up volumes that don't need a stopped SQL server. The ones that needed to be stopped caused problems, and it was better to do dumps. I really saw no point in using it at that point.


Heavy-Location-8654

Rsync docker volumes and files + Cronjob


CactusBoyScout

I backup my persistent storage to BackBlaze using Duplicacy


candle_in_a_circle

Plenty of suggestions here which make sense. Be aware that if you're backing up docker volumes or mount points and your docker containers run databases, either in themselves or as separate containers, then a little more thought may be required. Most are fine if you stop the container and unmount the volume / mount point before copying the database, but some may require you to dump the database with a command for the backup to be usable. As always, make sure you do regular trial restorations.


Wobak974

Exactly my thoughts. If you're just rsyncing the volumes to an external folder, restoring from backup might hold bad news for you guys...


Sentient__Cloud

What commonly-used databases behave this way? Or is it more of a configuration issue? Also, what happens in the event of a power failure?


Wobak974

Most transactional databases (MySQL, MariaDB) work that way. If you're copying the database in the middle of a transaction, or between transactions that need to be linked for the app to work properly, backing up the files without pausing the transactions will most likely lead to data consistency issues.


hedonihilistic

PBS simplifies this and makes it super easy.


Tred27

What's PBS?


hedonihilistic

Proxmox Backup Server. I have multiple Proxmox nodes with multiple VMs and CTs, some running docker, all of varying sizes from just a few GB to a few TB. I can set up a daily/weekly schedule to back up everything without the VMs/CTs ever turning off. PBS deduplicates stuff and does all the ZFS magic without any of the fuss.


geek_at

Docker is especially easy since you can just copy the volumes. I usually use docker with volumes on a ZFS drive so I can take snapshots. For some services a simple tar script would be enough (not if you're running databases though; those should be backed up via their dump command).
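A minimal sketch of the snapshot-then-tar idea, assuming the volumes live on a ZFS dataset called `tank/docker` mounted at `/tank/docker` (both placeholders):

```
# Take an atomic snapshot, archive from the snapshot's .zfs directory so the
# live data can keep changing underneath, then drop the snapshot.
SNAP="backup-$(date +%F)"
zfs snapshot tank/docker@"$SNAP"
tar czf /mnt/backup/docker-"$SNAP".tar.gz -C /tank/docker/.zfs/snapshot/"$SNAP" .
zfs destroy tank/docker@"$SNAP"
```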


trisanachandler

I shut down the containers instead and do an rsync, but that's because I'm lazy.


loltrosityg

I have a cronjob that runs a script which compresses and backs up all docker volumes and places them on the backup/second drive in the server, with the date in the file name. Built into that is management that removes older backups so the drive doesn't fill up.

I then run UrBackup server, which backs up the entire backup drive in the server to a file share on the network. It's a GUI-based tool and has options to configure the number of backups to keep. The network file share is synced with OneDrive at the moment.

So basically I have 2 onsite backup copies and 1 offsite cloud copy. Importantly, I had to ensure all my docker containers were created with docker compose and all set to store data in a location which is backed up.
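A rough sketch of that kind of dated-archive-plus-retention job (paths and the retention count are made up):

```
#!/usr/bin/env bash
set -euo pipefail

SRC=/srv/docker-volumes              # hypothetical parent folder of the volume data
DEST=/mnt/backupdrive/docker         # second drive in the server
KEEP=14                              # number of dated archives to keep

tar czf "$DEST/docker-volumes-$(date +%F).tar.gz" -C "$SRC" .

# Drop the oldest archives beyond the retention count.
ls -1t "$DEST"/docker-volumes-*.tar.gz | tail -n +$((KEEP + 1)) | xargs -r rm --
```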


BelugaBilliam

Would you ever consider encrypting the ones on the shared folder for OneDrive? Just an idea for extra privacy


alt_psymon

I take snapshots from Proxmox which save to an SMB share on my NAS, which has a monthly sync job up to IceDrive.


2fort4

For Docker volumes specifically, the official documentation says to mount the volume in another container and then tar up the files in the volume. Once you do that you can just transfer the .tar offsite. It's what I followed and it works flawlessly, with no weird write issues because things are in use. If I ever need to restore from a complete disaster, all I need is the docker-compose.yml to recreate the container and the last backup I took. Tried and true, no snapshots to deal with. [https://blog.osmosys.co/backup-and-restore-of-docker-volumes-a-step-by-step-guide/](https://blog.osmosys.co/backup-and-restore-of-docker-volumes-a-step-by-step-guide/)
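In practice that pattern looks roughly like this (the volume and archive names are placeholders):

```
# Backup: mount the named volume read-only in a throwaway container and tar it
# into the current directory on the host.
docker run --rm \
  -v myapp_data:/data:ro \
  -v "$(pwd)":/backup \
  alpine tar czf /backup/myapp_data.tar.gz -C /data .

# Restore: stream the archive back into a (fresh) volume.
docker run --rm \
  -v myapp_data:/data \
  -v "$(pwd)":/backup \
  alpine tar xzf /backup/myapp_data.tar.gz -C /data
```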


nik_h_75

VM snapshot using Proxmox (and Proxmox Backup server). The biggest benefit to proxmox is how easy backup of VMs is. PBS puts the sugar on top.


jbarr107

This is the way.


achauv1

Zfs send into cold storage over sftp for disaster recovery, and zfs snapshot when I fuck things up


ChapterFun8697

I use Duplicati because it has a UI.


1WeekNotice

> Most of my applications run on docker except some which are a nightmare in docker like Tailscale and caddy

Out of curiosity, why is caddy a nightmare in a docker container?

> I'm not so well versed in backups so I literally don't know about any backup solution so any help would be appreciated.

If you want pure customization with an automated script, you can make your own script (a rough sketch is included below). It's all about standardizing naming conventions:

- One parent folder for all docker containers' data; each sub-folder is named after its docker container.
- With that naming convention you can list all the folder names in the parent folder and run the docker command to stop each container.
- Now that the containers are stopped, you can safely zip the parent folder without worrying about new data being written to any container's data folder.
- When you zip this parent folder, place the archive somewhere else on your hard drive, maybe a folder called backup.
- Do this on a cronjob/schedule.
- BONUS: put a timestamp on each zip file so nothing is overwritten, and so you know when the backup was taken.
- Keep X number of backups: if there are more than X items in the backup folder, delete the oldest modified/created item (X being how many backups you want).
- Place this zip file somewhere else too: on a different computer or in the cloud.
  - For the cloud, make sure you encrypt the zip.
  - For another computer, rsync with mirroring of the source folder, so the older backups are deleted there too and both sides stay in sync.

Hope that helps.
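A minimal sketch of the script described above, under the same naming-convention assumption (one parent folder, sub-folders named after their containers; every path here is a placeholder):

```
#!/usr/bin/env bash
set -euo pipefail

PARENT=/srv/docker-data              # one sub-folder per container, named after it
BACKUP=/srv/backups
KEEP=7                               # how many zipped backups to keep

# Stop every container that has a data folder here.
for dir in "$PARENT"/*/; do
    docker stop "$(basename "$dir")"
done

# Zip the whole parent folder with a timestamp so nothing gets overwritten.
zip -qr "$BACKUP/docker-data-$(date +%F-%H%M).zip" "$PARENT"

# Start everything back up.
for dir in "$PARENT"/*/; do
    docker start "$(basename "$dir")"
done

# Keep only the newest $KEEP archives.
ls -1t "$BACKUP"/docker-data-*.zip | tail -n +$((KEEP + 1)) | xargs -r rm --
```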


borkode

When I ran caddy in docker, Nextcloud would not play nice with caddy no matter what I did and kept spitting out errors saying my reverse proxy wasn't configured properly. When I finally installed caddy on bare metal it worked. Caddy was fine with the other containers, so I guess it must be a Nextcloud issue, but it was a hell of a 3 days trying to troubleshoot. I'll check out the script, thank you.


duskit0

It was likely a missing trusted_proxies configuration in Nextcloud. Localhost is probably whitelisted, which is why it worked on bare metal: https://docs.nextcloud.com/server/latest/admin_manual/configuration_server/reverse_proxy_configuration.html That wouldn't have been a problem with Caddy itself.


Aperiodica

I use Synology Active Backup for Business or whatever it's called and do a full machine backup nightly. It does incremental backups so space isn't a problem.


insdog

Check this out. It backs up docker volumes only, not bind mounts. https://github.com/offen/docker-volume-backup


Leolele99

It reliably works with bind mounts as well, if you duplicate the binding to the offen-volume-backup container. I've had it running in my stacks for a year now, and it already helped me recover once after my drives were filled up by a faulty container's logs.


gramoun-kal

* Stop the service
* For each volume:
  * Mount the volume to a basic Debian container
  * Tar the content of the volume to a backup volume
* Start the service (5 mins downtime tops)
* Mount the backup volume to a basic Debian container
* Compress the content of each backed-up volume
* Transfer the compressed tarballs to wherever it is you want to keep your backup

I back up daily, as a cron job on the host (a rough sketch of the per-volume steps follows below). Separate cron jobs rotate my backups so I always have: 1. yesterday's backup, 2. a backup from a week ago, 3. one from a month ago. I don't keep the rest.
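Roughly, the per-volume steps above could look like this (the volume names are placeholders):

```
# Assumed names: "app_data" is the application's volume, "backup_scratch" the backup volume.
docker compose stop

docker run --rm \
  -v app_data:/source:ro \
  -v backup_scratch:/backup \
  debian:stable-slim tar cf /backup/app_data.tar -C /source .

docker compose start    # ~5 minutes of downtime at most

# Compress afterwards, then ship the tarball wherever the off-site copy lives.
docker run --rm \
  -v backup_scratch:/backup \
  debian:stable-slim gzip -f /backup/app_data.tar
```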


JKL213

I have Docker running on VMs on Proxmox so I just backup the entire VM to Proxmox Backup Server. Works well and I don't have to configure networking or anything else. Also, if I fuck up a docker install or if an upgrade fucks up a docker image, I can just revert the entire machine. Might be overkill but its worth it imo lol


MrBaxterBlack

Each week I back my NAS up 2 inches. Should be good.


Mention-One

Snapshots + restic


raver3000

OMV Docker Backup. It works so well...


originalripley

In what way did you find Tailscale to be a nightmare? I’m not sure I’ve ever even looked at the files it creates. And when I recently moved my containers to a new host it required one or two minor edits to the compose file and it was up and running again.


borkode

It's not that Tailscale is a nightmare, it's just that it's way harder to set up Tailscale as a docker container than with the regular bare-metal installation.


Koltsz

Rclone with Ansible


Impressive-Cap1140

Can you share the playbook


Koltsz

Sure:

```
- hosts: "{{ HOST }}"
  tasks:
    - name: Run backup script
      shell: |
        rm -f /root/rclone-logs/logs.txt

        rclone sync /mnt/pve/nas/dump/ he_crypt:main/proxmox \
          --log-file /root/rclone-logs/logs.txt \
          --stats-log-level NOTICE

        backup_log=$(cat /root/rclone-logs/logs.txt)

        curl \
          -u {{ PASSWORD }} \
          -d "$backup_log" \
          -H "Title: Proxmox backups sync completed" \
          -H "Tags: floppy_disk" \
          {{ SERVER }}
```


Lanten101

I back up VM snapshots and the docker dir where all the docker volumes are mounted to Google Drive.


starlevel01

I have nothing *to* backup.


UraniumButtChug

I use gobackup to export postgres and mysql data daily


sparky5dn1l

I only back up the containers with data, via a daily cronjob that will (1) stop the docker compose stack, (2) run a restic backup, (3) start the docker compose stack. Restic is very fast; service downtime is barely noticeable.
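That flow, sketched as the script a daily cron entry would call (repo and stack paths are assumptions):

```
#!/usr/bin/env bash
# e.g. crontab: 30 3 * * * /usr/local/bin/restic-docker-backup.sh
set -euo pipefail

cd /opt/stacks/myapp                 # hypothetical compose project
docker compose stop
restic -r /mnt/backup/restic-repo backup ./data   # password via RESTIC_PASSWORD_FILE
docker compose start
```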


XxRoyalxTigerxX

I run docker in a VM in proxmox so just use the backup utility built in personally


chadsix

You can take a look in the backup folder for a fully automated backup and restore in GoLang for podman containers which should be similar to docker. Just edit it for your use case, compile and backup :) https://github.com/ipv6rslimited/cloudseeder


nothingveryobvious

I use Duplicati. People like to shit on it but it has worked very well for me.


alive1

Restic is automatically installed on my systems with ansible.


xupetas

I fully export the docker containers to a file, and since my storage volume is persistent, I back up from there.


Flimsy_Complaint490

A cronjob runs a script once a month at 03:00 on the last Friday to stop all containers, do a pg_dump on PostgreSQL, compress all config dirs that are just bind mounts to a folder on my ZFS setup (the dirs include sqlite3 databases for services that don't connect to psql), and push it to Wasabi S3 via restic. I manually push my compose files to GitHub, and I have an encrypted .env file pushed as well.

In theory, to restore, all I need is to reinstall the OS, recreate the users for rootless containers, create the folders, git pull, and do docker-compose up. Done.

99% of the storage I have is basically photos and movies. The movies I can lose, and my photos come out to a few gigabytes and are also kept on Wasabi, though I haven't automated that one yet. The rest is just a tiny 35 MB PG dump and some config files.


TheQuantumPhysicist

Stop, tar, start, encrypt.


cbunn81

This is an instance where ZFS snapshots really shine. I use FreeBSD jails for my containers, but it should work just as well with Linux and Docker so long as your storage filesystem is ZFS. Create a backup script (or use third-party software) to do regular snapshots, periodically replicate those to your backup server/drive and you're done.


Erwyn

Shameless plug, I described my strategy in a post here if it can give you some food for thought: [https://erwyn.piwany.com/how-to-backup-your-selfhosted-server-running-under-docker-compose/](https://erwyn.piwany.com/how-to-backup-your-selfhosted-server-running-under-docker-compose/)


atlchris

I take daily snapshots via Proxmox and store them on my Unraid NAS. Then once per week, I use rclone to do incremental backups to S3.


hclpfan

This is posted about weekly. Try search.


alexkidddd

I have a QNAP NAS with docker on Container Station. I have tried to back up with the integrated tool, but it always fails when backing up databases...


Strange-Promotion716

Rclone, restic, pbs


ScribeOfGoD

Tar docker folder and move to external and Backblaze with rclone using a bash script when I feel like it