Server crash during update led to database loss


#1

From GitHub:

enbytv:

I was in the process of updating the server for my Mastodon Docker instance (appdot.net) to v2.5.2 when it crashed due to loss of network connection. On reboot the instance appeared to be new: it was completely empty. This is probably because, despite the relevant lines having been uncommented in docker-compose.yml, the database ended up in /var/lib/docker/volumes.

I have restored from a mirrored backup (from four days ago), which has given me data up until then. I would now like to restore the database from an AWS backup generated using pg_dump, but the pg_restore commands I have tried have all failed.

Honestly I’m unsure how to get the data from the AWS pg_dump backup back into the server without losing more data.


#2

so there are two issues here:

  • your database wasn’t being persisted to disk even though you thought it was. did you ever test that the docker volumes were persistent after you uncommented the lines?

  • your pg_restores are failing. can you give an example of the restore command you’re running and the failure message you’re seeing? and also include the original pg_dump command that generated the backup.


#3

Hi, thanks for moving it over here.

Everything worked fine when I rebooted the server previously (about ten times), and it had all loaded properly then. So as far as I was aware, the docker volume was persistent. But this was the first time I’d used docker in about 3 years, so I didn’t quite know what I was expecting to happen. For what it’s worth it wasn’t clear that I had to uncomment the lines until after I had installed the instance, and the docker install instructions weren’t as good when I did the install as they are now.

I can’t remember the commands I tried and I didn’t write them down, silly me, but it was something like docker exec -i live_db_1 pg_restore --verbose --U postgres -d postgres livedb1backup-Oct-20-18.dump. And the error messages were along the lines of records already existing so not being able to amend the table. The original pg_dump command was docker exec live_db_1 pg_dump -Fc -U postgres postgres > $DESTINATION.

I’m at a point where I think restoring the data from the backup isn’t important - we only lost 100 posts more or less - so all I want to do is just make sure the database is/becomes persistent!

Thanks!


#4

If it was persistent across reboots, then it was persistent. I’m wondering if there’s some other problem happening here: maybe you’re using a different database name, or postgres is refusing to recover the database from the WAL for some reason. Or your edits to docker-compose.yml were later erased when you upgraded, or something like that.

The error with the restore is that you have to restore into a fresh database. Now that you’ve started using the new one, there’s no way to reconcile the two—you’ll have to choose one or the other to keep. The best way to do this is to take the server down, drop the new database, and restore the old one.

Also, you should always use docker-compose run, not docker exec. Docker containers aren’t meant to have more than 1 root process running in them.
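The drop-and-restore sequence described in this post could look roughly like the sketch below. Treat it as an outline under assumptions, not a tested procedure: the service names (web, streaming, sidekiq, db) are guessed from a typical Mastodon docker-compose.yml, the user and database names come from the pg_dump command quoted earlier in the thread, and restore_fresh and run are made-up helper names. It uses docker-compose exec -T so the dump can be piped in on stdin; switching to docker-compose run as suggested above is left open, since the thread doesn't show the connection settings that would need. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only. Names below are assumptions, adjust them to your setup.
DB_SERVICE="${DB_SERVICE:-db}"      # compose service running PostgreSQL (assumed)
DB_USER="${DB_USER:-postgres}"      # from the pg_dump command in the thread
DB_NAME="${DB_NAME:-postgres}"      # from the pg_dump command in the thread

# Print the command unless DRY_RUN=0 is set, so nothing destructive
# happens by accident.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# restore_fresh DUMPFILE: stop the app, replace the database, restore the dump.
restore_fresh() {
  dump="$1"
  [ -r "$dump" ] || { echo "cannot read dump: $dump" >&2; return 1; }

  # Stop everything that writes to the database (service names assumed).
  run docker-compose stop web streaming sidekiq || return 1
  # Drop the database that was created after the crash, then recreate it empty.
  run docker-compose exec -T "$DB_SERVICE" dropdb -U "$DB_USER" "$DB_NAME" || return 1
  run docker-compose exec -T "$DB_SERVICE" createdb -U "$DB_USER" "$DB_NAME" || return 1
  # pg_restore reads the archive from stdin when no file name is given.
  run docker-compose exec -T "$DB_SERVICE" pg_restore --verbose -U "$DB_USER" -d "$DB_NAME" < "$dump" || return 1
  run docker-compose start web streaming sidekiq
}

# Usage (prints the plan; set DRY_RUN=0 to actually execute it):
#   restore_fresh livedb1backup-Oct-20-18.dump
```

Take a copy of the current data directory before running anything like this with DRY_RUN=0, because dropping the database discards whatever was posted since the crash.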


#5

Thanks for the reply. I don’t think I’ve changed the database name or anything. I just checked the docker compose file and it has it listed as ./postgres:/var/lib/postgresql/data which doesn’t seem right for some reason? I’ve looked and /var/lib/postgresql/data doesn’t exist. Where is the database actually stored if it’s not where it says in docker-compose.yml?

I’m fine with having the new database now, it’s no big deal.


#6

./postgres is the location on the host machine, /var/lib/postgresql/data is the location inside the container.


#7

So ./postgres would be inside the Mastodon running folder, or at the root of the machine?


#8

inside of the folder you run the containers from, I believe. (docker-compose up)


#9

So, in practical terms, relative to the docker-compose file.
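To make that concrete, here is a small sketch of how the host side of a volume entry resolves. host_path_for is a made-up helper for illustration, not a Docker command, and the example path is an assumption. It mirrors the rule discussed above: ./postgres is relative to the directory holding docker-compose.yml, while a source with no leading ./, ../ or / is treated by Compose as a named volume rather than a path.

```shell
#!/bin/sh
# Illustration only: host_path_for is a hypothetical helper, not part of Docker.
# It prints where the host side of a "SOURCE:TARGET" volume entry would live.
host_path_for() {
  compose_file="$1"   # path to docker-compose.yml
  source_part="$2"    # left-hand side of the volume entry, e.g. ./postgres
  compose_dir=$(cd "$(dirname "$compose_file")" && pwd) || return 1
  case "$source_part" in
    /*)   printf '%s\n' "$source_part" ;;                          # absolute host path
    ./*)  printf '%s/%s\n' "$compose_dir" "${source_part#./}" ;;   # relative to the compose file
    ../*) printf '%s/%s\n' "$compose_dir" "$source_part" ;;        # also relative to the compose file
    *)    printf 'named volume: %s\n' "$source_part" ;;            # stored and managed by Docker
  esac
}

# Example (the directory is an assumption):
#   host_path_for /home/mastodon/live/docker-compose.yml ./postgres
```

If the resolved directory is empty, the running container is probably not using that entry at all, and docker inspect --format '{{ json .Mounts }}' on the database container should show what it really mounted.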


#10

Okay. I’ve had a look there and there’s nothing there, which I’m confused about. Is it possible that the database is still inside a docker container somewhere else on the machine?


#11

Sorry, but surviving reboots isn’t indicative. Restarting a container, even one without volumes, keeps its data. It’s destroying and re-creating the container that “loses” data, because the new container is no longer connected to the previously used volume.


#12

Ah, my bad. I assumed that rebooting the server would also destroy the container (since that’s how it works in Docker for Mac).


#13

Yes, very likely. Even when a volume is not defined in docker-compose.yml, Docker creates a nameless volume for it. The problem is, of course, that it’s hard to find a nameless volume, and re-creating the container for any reason means a new volume is created and the old one is disconnected. But it should still be among all the volumes. Somewhere in /var/lib/docker, but I don’t remember where. You’d need to look through the directories and find one that looks like it contains PostgreSQL files.

List Docker volumes with docker volume ls. Use docker volume inspect HASH to find where the volume is stored on the filesystem. Look there with standard filesystem tools.
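A rough sketch of that search, in case it helps. It leans on two things: Docker's default local driver keeps each volume under /var/lib/docker/volumes/NAME/_data, and a PostgreSQL data directory contains a file called PG_VERSION. The function name find_pg_volumes is made up for this example, and reading that directory normally needs root.

```shell
#!/bin/sh
# Sketch: list Docker volumes that look like PostgreSQL data directories.
# find_pg_volumes is a hypothetical helper, not a Docker command.
find_pg_volumes() {
  root="${1:-/var/lib/docker/volumes}"   # default location for the local volume driver
  for marker in "$root"/*/_data/PG_VERSION; do
    [ -f "$marker" ] || continue         # glob matched nothing, or not a regular file
    vol_dir=${marker%/_data/PG_VERSION}  # strip the suffix to get the volume directory
    # Print the volume name and the major version recorded by PostgreSQL.
    printf '%s\tPostgreSQL %s\n' "${vol_dir##*/}" "$(cat "$marker")"
  done
}

# The same information through the Docker CLI:
#   docker volume ls -q
#   docker volume inspect --format '{{ .Mountpoint }}' VOLUME_NAME
# Usage of the helper (run as root):
#   find_pg_volumes
```

If more than one volume shows up, the modification times of the files inside are a reasonable hint as to which one was in use before the crash.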


#14

Thanks for the added info on the rebooting thing - and that destroying and recreating would lose the data.

I’ve found a few volumes in /var/lib/docker/volumes and found one containing a Redis file and one containing all the PostgreSQL files. So at least I know where they are now!


#15

Regarding restore errors:

Maybe roll back to the previous Mastodon version, restore the DB, and only then rerun the upgrade?


#16

Also, please check the format of the backup. Plain pg_dump backups are stored as SQL commands in plain text, and those have to be restored with psql, not with pg_restore.
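One quick way to check, as a sketch: custom-format archives (pg_dump -Fc) start with the bytes PGDMP, so looking at the first five bytes of the file tells the two cases apart. dump_format is a made-up helper name, and anything that isn't a custom archive is only reported as "plain-or-other", since tar and directory formats are not handled here.

```shell
#!/bin/sh
# Sketch: guess whether a dump file is a pg_dump custom-format archive.
# dump_format is a hypothetical helper, not a PostgreSQL tool.
dump_format() {
  magic=$(head -c 5 "$1" 2>/dev/null)   # custom archives begin with "PGDMP"
  if [ "$magic" = "PGDMP" ]; then
    echo "custom"           # restore with pg_restore
  else
    echo "plain-or-other"   # plain SQL goes through psql; tar/directory not detected here
  fi
}

# Usage:
#   dump_format livedb1backup-Oct-20-18.dump
# pg_restore --list FILE is another check: it prints the archive's table of
# contents and fails on a plain SQL file.
```

A dump made with -Fc, like the pg_dump command quoted earlier in the thread, should come out as "custom".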