Backup & restore
What to back up
| Data | Location | Backup method |
|---|---|---|
| Postgres | lumen-postgres container | pg_dump |
| Document files | upload_data volume | tar or rsync |
| Model cache | model_cache volume | Skip — re-downloads on boot |
| Redis queue | lumen-redis volume | Skip — ephemeral job state |
Manual backup (pre-migration)
Before any destructive migration, snapshot Postgres:
ssh jaeger "docker exec lumen-postgres pg_dump -U lumen -d lumen \
| gzip > /tmp/lumen-backup-$(date +%Y%m%d-%H%M%S).sql.gz"
# Confirm size
ssh jaeger "ls -lh /tmp/lumen-backup-*.sql.gz"
Move off-host:
# Quote the remote glob so the local shell does not try to expand it
scp 'jaeger:/tmp/lumen-backup-*.sql.gz' ./backups/
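Checking the file size confirms that something was written, but not that the archive is usable. A minimal sketch of a stricter check on the local copy follows; verify_dump is a hypothetical helper, not part of the Lumen tooling, and it assumes the dump is plain-format pg_dump output as produced above.

```shell
# verify_dump FILE: succeeds only if FILE is an intact gzip archive whose
# content looks like a plain-format pg_dump. (Hypothetical helper.)
verify_dump() {
  local file="$1"
  # The file must exist and be non-empty.
  [[ -s "$file" ]] || { echo "missing or empty: $file" >&2; return 1; }
  # gzip -t checks the compressed stream's integrity without extracting it.
  gzip -t "$file" 2>/dev/null || { echo "corrupt gzip: $file" >&2; return 1; }
  # Plain-format pg_dump output opens with a "PostgreSQL database dump" comment.
  gunzip -c "$file" | head -n 20 | grep -q "PostgreSQL database dump" \
    || { echo "not a pg_dump file: $file" >&2; return 1; }
  echo "ok: $file"
}

# Usage:
# verify_dump ./backups/lumen-backup-<timestamp>.sql.gz
```

This only proves the archive is readable and starts like a dump; a test restore into a scratch database is still the stronger check.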
Document files
# On jaeger
docker run --rm -v upload_data:/source:ro \
-v /tmp:/dest alpine \
tar czf /dest/upload-backup.tgz -C /source .
scp jaeger:/tmp/upload-backup.tgz ./backups/
Restore Postgres
Destructive — nukes current DB state. Confirm first.
# Upload backup
scp backup.sql.gz jaeger:/tmp/
# On jaeger — stop services that write to DB
ssh jaeger "docker stop lumen-api lumen-worker"
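# Drop + recreate schema, then restore (assumes all objects live in the default "public" schema)
ssh jaeger "docker exec lumen-postgres psql -U lumen -d lumen \
-c 'DROP SCHEMA public CASCADE; CREATE SCHEMA public;'"
ssh jaeger "gunzip -c /tmp/backup.sql.gz | \
docker exec -i lumen-postgres psql -U lumen -d lumen"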
# Restart services
ssh jaeger "docker start lumen-api lumen-worker"
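Since the restore overwrites the current database, a small guard that requires typing the database name can prevent an accidental run. This is only a sketch; confirm_destructive is a hypothetical helper introduced for illustration.

```shell
# confirm_destructive NAME: reads one line from stdin and succeeds only if
# it matches NAME exactly. (Hypothetical helper.)
confirm_destructive() {
  local expected="$1" answer
  # Prompt on stderr so stdout stays clean for scripting.
  printf 'This will overwrite database "%s". Type its name to continue: ' "$expected" >&2
  # Fail if stdin is closed or empty.
  IFS= read -r answer || return 1
  [[ "$answer" == "$expected" ]]
}

# Usage: gate the first destructive step on the confirmation.
# confirm_destructive lumen && ssh jaeger "docker stop lumen-api lumen-worker"
```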
Restore document files
scp upload-backup.tgz jaeger:/tmp/
ssh jaeger "docker run --rm -v upload_data:/dest \
-v /tmp:/source alpine \
tar xzf /source/upload-backup.tgz -C /dest"
Daily backup cron
Gani runs a daily Hermes cron job (hermes-daily-backup) at 03:00 WIB that:
- pg_dump Postgres
- tar the uploads volume
- Push both to GitHub codename-zen/hermes-backup (private repo)
If you're operating independently, set up a similar cron on jaeger directly:
crontab -e
# Add:
0 3 * * * /usr/local/bin/lumen-backup.sh
Here lumen-backup.sh is a script that runs the pg_dump and then uploads the result to a B2/R2 bucket with rclone.
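One possible shape for such a script is sketched below. The backup directory and the rclone remote name (b2:lumen-backups) are assumptions made for illustration and must match whatever is configured on jaeger; the container name and database credentials are the ones used in the manual steps above.

```shell
#!/usr/bin/env bash
# Sketch of /usr/local/bin/lumen-backup.sh.
# Assumed values (not from this runbook): BACKUP_DIR and RCLONE_REMOTE.
# The remote must already exist in the rclone configuration on jaeger.
BACKUP_DIR="${BACKUP_DIR:-/var/backups/lumen}"
RCLONE_REMOTE="${RCLONE_REMOTE:-b2:lumen-backups}"

# Dump Postgres into a timestamped gzip file and print the file's path.
dump_postgres() {
  local out="$BACKUP_DIR/lumen-backup-$(date +%Y%m%d-%H%M%S).sql.gz"
  mkdir -p "$BACKUP_DIR" || return 1
  # Write to a .partial name first so a failed dump never looks like a backup.
  # pipefail (scoped to the subshell) makes a pg_dump failure fail the pipeline.
  ( set -o pipefail
    docker exec lumen-postgres pg_dump -U lumen -d lumen | gzip > "$out.partial"
  ) || { rm -f "$out.partial"; return 1; }
  mv "$out.partial" "$out" && echo "$out"
}

# Upload one file into the remote path; rclone copy skips identical files.
upload_backup() {
  rclone copy "$1" "$RCLONE_REMOTE"
}

run_backup() {
  local file
  file="$(dump_postgres)" || { echo "pg_dump failed" >&2; return 1; }
  upload_backup "$file" || { echo "upload failed: $file" >&2; return 1; }
  echo "backed up $file to $RCLONE_REMOTE"
}

# When saving this as the cron script, end the file with:
# run_backup
```

The final call is left commented so the functions can be sourced and tried individually before cron runs them unattended.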
Recovery time objective
- Schema restore: ~30 seconds for current DB size
- Full restore: a few minutes (DB) + document file copy time
- Fresh deploy from empty: ~6 minutes (Docker build + embedder model download)
What a broken migration looks like
During an access-control rewrite, combining an enum extension and a schema change in one migration failed because Postgres doesn't allow new enum values to be used in the same transaction in which they were added. Symptom:
ERROR: unsafe use of new value of enum type
Resolution: split into part1_enums.sql (add values, commit) → part2_schema.sql (use values, separate TX). Always test destructive migrations against a pre-migration backup — if it fails mid-run, you have a safe point to restore to.
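The two-part resolution can be sketched as a helper that applies each file in its own transaction and stops at the first failure. The function names apply_sql and apply_migration are hypothetical; the psql options used (ON_ERROR_STOP, --single-transaction, and -f - to read from stdin) are standard.

```shell
# apply_sql FILE: run FILE against the lumen database in its own transaction.
# ON_ERROR_STOP=1 makes psql exit non-zero on the first SQL error;
# --single-transaction wraps the input in BEGIN/COMMIT; "-f -" reads stdin.
apply_sql() {
  docker exec -i lumen-postgres psql -U lumen -d lumen \
    -v ON_ERROR_STOP=1 --single-transaction -f - < "$1"
}

# apply_migration FILE...: apply files in order, one transaction per file,
# so enum values added by an earlier file are committed before a later
# file uses them. Stops at the first failing file.
apply_migration() {
  local f
  for f in "$@"; do
    echo "applying $f" >&2
    apply_sql "$f" || { echo "failed: $f (later files not applied)" >&2; return 1; }
  done
}

# Usage (on jaeger):
# apply_migration part1_enums.sql part2_schema.sql
```

If part 2 fails, part 1 has already been committed, which is why the pre-migration backup remains the rollback point.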
Healthcheck before declaring restore done
# API health
curl https://lumen-api.zenmail.my.id/bootstrap/status
# Should return {"isInitialized":true}
# Test a logged-in endpoint
TOKEN=$(curl -s -X POST .../auth/login ...)
curl -H "Authorization: Bearer $TOKEN" https://lumen-api.zenmail.my.id/projects
# Should return expected projects
# Doc count
ssh jaeger "docker exec lumen-postgres psql -U lumen -d lumen \
-c 'SELECT count(*) FROM documents;'"
If counts match pre-backup numbers, you're restored.
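Comparing counts is easier when the number is captured before the backup and checked mechanically afterwards. A minimal sketch, assuming the same ssh host and container as above; doc_count and check_count are hypothetical helpers.

```shell
# doc_count: print the number of rows in documents as a bare integer.
# -A (unaligned) and -t (tuples only) strip psql's table decoration.
doc_count() {
  ssh jaeger "docker exec lumen-postgres psql -U lumen -d lumen -At \
    -c 'SELECT count(*) FROM documents;'"
}

# check_count EXPECTED: compare the live count with a number recorded
# before the backup. (Hypothetical helper.)
check_count() {
  local expected="$1" actual
  actual="$(doc_count)" || { echo "could not query documents" >&2; return 1; }
  if [[ "$actual" == "$expected" ]]; then
    echo "ok: $actual documents"
  else
    echo "mismatch: expected $expected, got $actual" >&2
    return 1
  fi
}

# Usage: record before=$(doc_count) prior to the backup, then after the restore:
# check_count "$before"
```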