docs: make backup and restore procedure container-first

Marco Allegretti 2026-02-16 11:26:37 +01:00
parent c75a15bc06
commit 932c514666
Protecting your Likwid data.
## What to Back Up
| Component | Location | Priority |
| ----------- | -------- | -------- |
| PostgreSQL database | Database server | Critical |
| Uploaded files | `/uploads` (if configured) | High |
| Configuration | `.env` files | High |
## Database Backup
Likwid's recommended backup mechanism is a logical PostgreSQL dump (via `pg_dump`).
### Where backups live (recommended)
Store backups under the deploy user, next to the repo:
```bash
mkdir -p ~/likwid/backups
```
Retention guidance:
- Keep at least 7 daily backups.
- For production instances, also keep at least 4 weekly backups.
- Keep at least one offsite copy.
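The daily-retention rule above can be automated with a small pruning helper. This is a sketch, not part of the shipped tooling: it assumes dumps live in a single directory and are named `*.dump`, matching the backup commands in this guide. The `prune_backups` name is hypothetical.

```bash
# Sketch: prune local dumps older than a retention window.
# Assumes dumps are plain files named *.dump in one directory.
prune_backups() {
  dir=$1
  keep_days=$2
  mkdir -p "$dir"
  # delete dumps whose modification time is older than keep_days days
  find "$dir" -name '*.dump' -mtime +"$keep_days" -delete
}

# keep roughly the last 7 daily dumps on the host
prune_backups ~/likwid/backups 7
```

Weekly and offsite copies should be made *before* pruning runs, since this deletes anything past the window.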
### Back up now (containerized, recommended)
#### Production compose (`compose/production.yml`)
The production database container is named `likwid-prod-db`.
```bash
ts=$(date +%Y%m%d_%H%M%S)
podman exec -t likwid-prod-db pg_dump -U likwid -F c -d likwid_prod > ~/likwid/backups/likwid_prod_${ts}.dump
```
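A zero-byte dump usually means `pg_dump` failed before the shell redirect wrote anything, so it is worth checking the file before trusting it. A minimal sketch, assuming the directory layout above (`check_dump` is a hypothetical helper name):

```bash
# Fail loudly if a dump file is missing or empty (a zero-byte dump
# usually means pg_dump failed upstream of the shell redirect).
check_dump() {
  [ -s "$1" ] || { echo "dump missing or empty: $1" >&2; return 1; }
}

# check the newest dump, if any exist yet
latest=$(ls -t ~/likwid/backups/*.dump 2>/dev/null | head -n 1)
if [ -n "$latest" ]; then
  check_dump "$latest"
fi
```

For custom-format dumps, running `pg_restore --list` on the file is a stronger check: it reads the archive's table of contents without restoring anything.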
#### Demo compose (`compose/demo.yml`)
The demo database container is named `likwid-demo-db`.
```bash
ts=$(date +%Y%m%d_%H%M%S)
podman exec -t likwid-demo-db pg_dump -U likwid_demo -F c -d likwid_demo > ~/likwid/backups/likwid_demo_${ts}.dump
```
#### Notes
- The `-F c` format is recommended because it is compact and supports `pg_restore --clean`.
- If you are using a shell that does not handle binary stdout redirection well, write the dump inside the container and use `podman cp`.
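The `podman cp` route mentioned above can look like the following. This is a sketch: the container name, role, and database match the production example, but the in-container path under `/tmp` is an assumption.

```bash
ts=$(date +%Y%m%d_%H%M%S)
# write the dump inside the container (avoids binary stdout redirection)
podman exec likwid-prod-db sh -c "pg_dump -U likwid -F c -d likwid_prod -f /tmp/likwid_prod_${ts}.dump"
# copy it out to the host, then clean up inside the container
podman cp "likwid-prod-db:/tmp/likwid_prod_${ts}.dump" ~/likwid/backups/
podman exec likwid-prod-db rm -f "/tmp/likwid_prod_${ts}.dump"
```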
## Recovery
### Restore into a fresh environment (containerized)
This procedure is designed to work for a brand new server (or a clean slate on the same server).
1. Ensure you have backups of:
   - `compose/.env.production` (or `compose/.env.demo`)
   - Reverse proxy config
   - The database dump file (`*.dump`)
1. If you are restoring over an existing instance, stop the stack.
Production:
```bash
cd ~/likwid
podman compose --env-file compose/.env.production -f compose/production.yml down
```
Demo:
```bash
cd ~/likwid
podman compose --env-file compose/.env.demo -f compose/demo.yml -f compose/demo.vps.override.yml down
```
1. If you need an empty database, remove the database volume (destructive).
Production (removes the `likwid_prod_data` volume):
```bash
cd ~/likwid
podman compose --env-file compose/.env.production -f compose/production.yml down -v
```
Demo (removes the `likwid_demo_data` volume):
```bash
cd ~/likwid
podman compose --env-file compose/.env.demo -f compose/demo.yml -f compose/demo.vps.override.yml down -v
```
1. Start only the database container so Postgres recreates the database.
Production:
```bash
cd ~/likwid
podman compose --env-file compose/.env.production -f compose/production.yml up -d postgres
```
Demo:
```bash
cd ~/likwid
podman compose --env-file compose/.env.demo -f compose/demo.yml -f compose/demo.vps.override.yml up -d postgres
```
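Right after the database container starts, Postgres may not yet accept connections, and restoring too early fails. A retry sketch (the `wait_for` helper name is hypothetical; `pg_isready` ships with the Postgres client tools in the container, and the `command -v` guard just skips the check where `podman` is unavailable):

```bash
# Retry a command until it succeeds, with a bounded number of attempts.
wait_for() {
  tries=$1; shift
  until "$@"; do
    tries=$((tries - 1))
    [ "$tries" -gt 0 ] || return 1
    sleep 2
  done
}

# wait up to ~60s for the production database to accept connections
if command -v podman >/dev/null 2>&1; then
  wait_for 30 podman exec likwid-prod-db pg_isready -U likwid -d likwid_prod
fi
```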
1. Restore from the dump:
- Production restore:
```bash
podman exec -i likwid-prod-db pg_restore -U likwid -d likwid_prod --clean --if-exists < /path/to/likwid_prod_YYYYMMDD_HHMMSS.dump
```
- Demo restore:
```bash
podman exec -i likwid-demo-db pg_restore -U likwid_demo -d likwid_demo --clean --if-exists < /path/to/likwid_demo_YYYYMMDD_HHMMSS.dump
```
1. Verify the restore:
```bash
podman exec -t likwid-prod-db psql -U likwid -d likwid_prod -c "SELECT now();"
```
1. Start the full stack again (backend + frontend):
Production:
```bash
cd ~/likwid
podman compose --env-file compose/.env.production -f compose/production.yml up -d
```
Demo:
```bash
cd ~/likwid
podman compose --env-file compose/.env.demo -f compose/demo.yml -f compose/demo.vps.override.yml up -d
```
### Restore notes
- `pg_restore --clean --if-exists` drops existing objects before recreating them.
- If you are restoring between different versions, run the matching app version first, then upgrade normally.
### Point-in-Time Recovery
The demo instance can be reset to initial state:
```bash
./scripts/demo-reset.sh
```
This is destructive and removes all demo data by recreating the demo database volume; on startup the backend runs core migrations and demo seed migrations to restore the initial demo dataset. This is not a backup mechanism.
## Disaster Recovery Plan
### Preparation
1. Document backup procedures
2. Test restores regularly (monthly)
3. Keep offsite backup copies
4. Document recovery steps
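Backups are easier to test and keep offsite when they run on a schedule rather than by hand. A hedged crontab sketch for the deploy user (the 03:30 schedule and filename are assumptions; note that `%` must be escaped as `\%` inside crontab command fields):

```bash
# crontab -e (deploy user): run the containerized dump daily at 03:30
30 3 * * * podman exec -t likwid-prod-db pg_dump -U likwid -F c -d likwid_prod > $HOME/likwid/backups/likwid_prod_$(date +\%Y\%m\%d).dump
```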
### Recovery Steps
1. Provision new server if needed
2. Install Likwid dependencies
3. Restore database from backup
7. Update DNS if server changed
### Recovery Time Objective (RTO)
Target: 4 hours for full recovery
### Recovery Point Objective (RPO)
Target: 24 hours of data loss maximum (with daily backups)
## Testing Backups
Monthly backup test procedure:
1. Create test database
2. Restore backup to test database
3. Run verification queries
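The monthly procedure above can be scripted against a scratch database on the production Postgres container. This is a sketch: the scratch name `likwid_restore_test` and the `users` table are assumptions; substitute tables you expect to be populated.

```bash
DUMP=/path/to/likwid_prod_latest.dump   # assumption: path to the newest production dump
# 1. create a scratch database on the production Postgres container
podman exec likwid-prod-db createdb -U likwid likwid_restore_test
# 2. restore the backup into it
podman exec -i likwid-prod-db pg_restore -U likwid -d likwid_restore_test < "$DUMP"
# 3. run verification queries (table names are assumptions)
podman exec likwid-prod-db psql -U likwid -d likwid_restore_test -c "SELECT count(*) FROM users;"
# clean up the scratch database afterwards
podman exec likwid-prod-db dropdb -U likwid likwid_restore_test
```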