Lumen Docs

Backup & restore

What to back up

| Data | Location | Backup method |
|---|---|---|
| Postgres | lumen-postgres container | pg_dump |
| Document files | upload_data volume | tar or rsync |
| Model cache | model_cache volume | Skip (re-downloads on boot) |
| Redis queue | lumen-redis volume | Skip (ephemeral job state) |

Manual backup (pre-migration)

Before any destructive migration, snapshot Postgres:

ssh jaeger "docker exec lumen-postgres pg_dump -U lumen -d lumen \
  | gzip > /tmp/lumen-backup-$(date +%Y%m%d-%H%M%S).sql.gz"

# Confirm size
ssh jaeger "ls -lh /tmp/lumen-backup-*.sql.gz"

Move off-host:

scp jaeger:/tmp/lumen-backup-*.sql.gz ./backups/
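
File size alone does not prove that a dump is usable. As a minimal sketch (the function name `verify_dump` is illustrative, not part of Lumen), the check below confirms that a copied archive decompresses cleanly and ends with the trailer comment that pg_dump writes at the end of a plain-format dump, which catches truncated files:

```shell
# Hypothetical helper: sanity-check a gzipped plain-format pg_dump file.
# Returns 0 if the file looks complete, non-zero otherwise.
verify_dump() {
  local file="$1"

  # The file must exist and be non-empty.
  [ -s "$file" ] || { echo "FAIL $file: missing or empty" >&2; return 1; }

  # The gzip stream must be intact.
  gzip -t "$file" 2>/dev/null || { echo "FAIL $file: corrupt gzip stream" >&2; return 1; }

  # pg_dump ends a plain-format dump with a "dump complete" comment.
  if ! gzip -dc "$file" | tail -n 5 | grep -q 'PostgreSQL database dump complete'; then
    echo "FAIL $file: pg_dump trailer not found (truncated?)" >&2
    return 1
  fi

  echo "OK $file"
}
```

For example, `for f in ./backups/lumen-backup-*.sql.gz; do verify_dump "$f"; done` checks every copied dump. This only shows that the file is complete, not that it restores cleanly; a test restore is still the real proof.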

Document files

# On jaeger
docker run --rm -v upload_data:/source:ro \
  -v /tmp:/dest alpine \
  tar czf /dest/upload-backup.tgz -C /source .

# Back on your workstation
scp jaeger:/tmp/upload-backup.tgz ./backups/

Restore Postgres

Destructive: this wipes the current DB state. Confirm before running.

# Upload backup
scp backup.sql.gz jaeger:/tmp/

# On jaeger — stop services that write to DB
ssh jaeger "docker stop lumen-api lumen-worker"

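# Drop + recreate schema (assumes the app tables live in the default public schema)
ssh jaeger "docker exec lumen-postgres psql -U lumen -d lumen \
  -c 'DROP SCHEMA public CASCADE; CREATE SCHEMA public;'"

# Restore the dump into the now-empty schema
ssh jaeger "gunzip -c /tmp/backup.sql.gz | \
  docker exec -i lumen-postgres psql -U lumen -d lumen"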

# Restart services
ssh jaeger "docker start lumen-api lumen-worker"

Restore document files

scp upload-backup.tgz jaeger:/tmp/
ssh jaeger "docker run --rm -v upload_data:/dest \
  -v /tmp:/source alpine \
  tar xzf /source/upload-backup.tgz -C /dest"

Daily backup cron

Gani runs a daily Hermes cron job (hermes-daily-backup) at 03:00 WIB that:

  1. Runs pg_dump against Postgres
  2. Tars the uploads volume
  3. Pushes both to the private GitHub repo codename-zen/hermes-backup

If you're operating independently, set up a similar cron on jaeger directly:

crontab -e

# Add:
0 3 * * * /usr/local/bin/lumen-backup.sh

Here lumen-backup.sh is a script you write yourself: it runs the pg_dump and uploads the result to a B2/R2 bucket with rclone.
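
A possible shape for that script is sketched below. It is not the Hermes job, just an illustration under stated assumptions: `BACKUP_DIR`, `RCLONE_DEST` (an rclone remote and bucket you have already configured) and `KEEP_DAYS` are placeholder names, and the uploads archive is included to mirror the Hermes job's steps.

```shell
#!/usr/bin/env bash
# Sketch of /usr/local/bin/lumen-backup.sh. Assumptions (not from the Lumen docs):
#   BACKUP_DIR   scratch directory on jaeger
#   RCLONE_DEST  rclone remote:bucket that is already configured (placeholder)
#   KEEP_DAYS    how long to keep local copies
BACKUP_DIR="${BACKUP_DIR:-/tmp/lumen-backups}"
RCLONE_DEST="${RCLONE_DEST:-b2:my-lumen-backups}"
KEEP_DAYS="${KEEP_DAYS:-7}"

lumen_backup() (
  # Subshell body, so these options do not leak into the caller.
  set -euo pipefail
  stamp="$(date +%Y%m%d-%H%M%S)"
  mkdir -p "$BACKUP_DIR"

  # 1. Dump Postgres (same command as the manual backup, minus ssh).
  docker exec lumen-postgres pg_dump -U lumen -d lumen \
    | gzip > "$BACKUP_DIR/lumen-backup-$stamp.sql.gz"

  # 2. Archive the uploads volume.
  docker run --rm -v upload_data:/source:ro -v "$BACKUP_DIR":/dest alpine \
    tar czf "/dest/upload-backup-$stamp.tgz" -C /source .

  # 3. Copy only the files from this run to the bucket.
  rclone copy "$BACKUP_DIR" "$RCLONE_DEST" --include "*-$stamp.*"

  # 4. Prune local copies older than KEEP_DAYS days.
  find "$BACKUP_DIR" -type f -name '*-backup-*' -mtime "+$KEEP_DAYS" -delete
)
```

To use it as the cron target, end the file with a bare `lumen_backup` call. Because of `set -euo pipefail`, a failed pg_dump stops the run before anything is uploaded, and the script exits non-zero so the failure is visible to whatever monitors the cron job.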

Recovery time objective

  • Schema restore: ~30 seconds for current DB size
  • Full restore: a few minutes (DB) + document file copy time
  • Fresh deploy from empty: ~6 minutes (Docker build + embedder model download)

What a broken migration looks like

During an access-control rewrite, a migration that combined an enum extension with a schema change failed, because Postgres does not allow a new enum value to be used in the same transaction that added it. Symptom:

ERROR: unsafe use of new value of enum type

Resolution: split the migration into part1_enums.sql (add the values, commit) followed by part2_schema.sql (use the values, in a separate transaction). Always take a backup before a destructive migration and test the migration against it: if the run fails midway, you have a safe point to restore to.
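
One way to apply the two parts is sketched below. The function name `apply_split_migration` is hypothetical, and the sketch assumes both files sit in one directory and that each file contains its own BEGIN/COMMIT where it needs one. Each file goes through a separate psql invocation, so everything in part 1 is committed before part 2 starts, and `ON_ERROR_STOP` makes psql exit non-zero on the first failed statement.

```shell
# Hypothetical helper: apply part1_enums.sql, then part2_schema.sql.
# Stops at the first file that fails, so later parts are never applied
# on top of a half-finished earlier part.
apply_split_migration() {
  local dir="${1:-.}"
  local part
  for part in part1_enums.sql part2_schema.sql; do
    echo "applying $part"
    # Separate psql session per file; ON_ERROR_STOP aborts on the first error.
    docker exec -i lumen-postgres psql -U lumen -d lumen \
      -v ON_ERROR_STOP=1 < "$dir/$part" || {
        echo "migration failed in $part; later parts not applied" >&2
        return 1
      }
  done
}
```

Run it on jaeger as `apply_split_migration /path/to/migrations` after taking the pre-migration backup. If it returns non-zero, restore from that backup before retrying.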

Healthcheck before declaring restore done

# API health
curl https://lumen-api.zenmail.my.id/bootstrap/status
# Should return {"isInitialized":true}

# Test a logged-in endpoint
TOKEN=$(curl -s -X POST .../auth/login ...)
curl -H "Authorization: Bearer $TOKEN" https://lumen-api.zenmail.my.id/projects
# Should return expected projects

# Doc count
ssh jaeger "docker exec lumen-postgres psql -U lumen -d lumen \
  -c 'SELECT count(*) FROM documents;'"

If counts match pre-backup numbers, you're restored.
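
Comparing counts only works if the pre-backup number was written down. A minimal sketch of that bookkeeping follows; the function names and the `./backups/doc-count.txt` location are assumptions, not part of Lumen. It records the count next to the backup and compares it after the restore:

```shell
# Hypothetical helpers for the document-count check.

# Print the current row count of the documents table.
# -A (unaligned) and -t (tuples only) make psql print just the number.
doc_count() {
  ssh jaeger "docker exec lumen-postgres psql -U lumen -d lumen -At \
    -c 'SELECT count(*) FROM documents;'"
}

# Run right before taking the backup.
record_doc_count() {
  doc_count > "${1:-./backups/doc-count.txt}"
}

# Run after the restore; returns non-zero on a mismatch.
check_doc_count() {
  local expected actual
  expected="$(cat "${1:-./backups/doc-count.txt}")"
  actual="$(doc_count)"
  if [ "$expected" = "$actual" ]; then
    echo "OK: $actual documents"
  else
    echo "MISMATCH: expected $expected, got $actual" >&2
    return 1
  fi
}
```

Call `record_doc_count` with the API and worker stopped, or immediately before the pg_dump, so that no documents are added between the count and the snapshot.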