Backup and Recovery Checklist — SaaS Builder Playbooks

Use this checklist to verify your SaaS can recover from data loss, bad deploys, accidental deletes, server failure, and provider issues. Focus on backups you can actually restore, not just jobs that appear to run.

This page is for MVPs and small production SaaS apps running on a VPS, Docker, or managed cloud services.

Quick Fix / Quick Setup

Run a minimum viable backup setup now.

bash

# 1) Create a database backup now
mkdir -p /var/backups/myapp
pg_dump -Fc "$DATABASE_URL" > /var/backups/myapp/db-$(date +%F-%H%M).dump

# 2) Archive uploaded files now
rsync -a /var/www/myapp/media/ /var/backups/myapp/media-$(date +%F-%H%M)/

# 3) Copy env/secrets snapshot securely (the copy is created with mode 600 from the start)
install -m 600 /opt/myapp/.env /var/backups/myapp/env-$(date +%F-%H%M).bak

# 4) Verify backup files exist
ls -lh /var/backups/myapp

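# 5) Test restore into a temporary database
# Pick the newest dump; re-running $(date ...) here could yield a different minute than step 1
LATEST_DUMP="$(ls -t /var/backups/myapp/db-*.dump | head -n 1)"
createdb myapp_restore_test
pg_restore -d myapp_restore_test "$LATEST_DUMP"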

# 6) Put backups on a different machine or object storage
# example with aws cli
aws s3 sync /var/backups/myapp s3://YOUR-BACKUP-BUCKET/myapp/

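# 7) Add a daily cron job
# cron does not load your shell profile, so define DATABASE_URL in the crontab or in a wrapper script
crontab -e
# 15 2 * * * pg_dump -Fc "$DATABASE_URL" > /var/backups/myapp/db-$(date +\%F-\%H\%M).dump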

Minimum safe setup:

automated daily database backups
off-server storage
uploaded file backup
env/secrets escrow
a restore test at least monthly

If you only do one thing today, do a backup and a restore test.

What’s happening

Backups fail in production for predictable reasons:

jobs run on the same server that later dies
dumps are created but never copied offsite
uploads are not included
.env and deployment config exist only on the live machine
restore steps are undocumented
backup files are corrupt, empty, or too old to help

A working recovery plan for a small SaaS usually needs all of these:

database backup
uploaded file backup or object storage versioning
environment variable and secret escrow
deployment/config backup
offsite storage
restore drill with written steps
retention policy
monitoring for backup failures

Backups are not complete until you can restore them into a separate environment and boot the app.

Step-by-step implementation

1) Inventory what must be recoverable

At minimum, list:

primary database
uploaded files and private media
.env or secret references
Nginx config
systemd units
Docker Compose files
cron jobs
worker schedules
storage bucket names
webhook endpoints
DNS records that affect app recovery

Example inventory file:

txt

Database:
- postgres://... production primary
- nightly pg_dump custom format
- managed snapshot retention: 7 days

Files:
- /var/www/myapp/media
- S3 bucket: myapp-prod-uploads
- bucket versioning: enabled

Config:
- /etc/nginx/sites-available/myapp.conf
- /etc/systemd/system/myapp.service
- /opt/myapp/docker-compose.yml
- /opt/myapp/.env

Recovery targets:
- RPO: 24h
- RTO: 2h

2) Choose backup method by component

Recommended mapping:

Component	Preferred method	Notes
Postgres	pg_dump -Fc	Portable, flexible restore
MySQL	mysqldump	Test against exact server version
Managed DB	provider snapshots + logical dumps	Use both if possible
Local uploads	rsync, tar, or sync to object storage	Keep separate from code
S3-compatible storage	bucket versioning + replication	Durability is not enough without restore steps
Config files	Git/private repo + encrypted archive	Do not rely on live box only
Secrets	secret manager or encrypted escrow	Never leave as single copy on server

3) Automate database backups

Postgres cron example

bash

mkdir -p /var/backups/myapp
chmod 700 /var/backups/myapp
crontab -e

cron

15 2 * * * /usr/bin/pg_dump -Fc "$DATABASE_URL" > /var/backups/myapp/db-$(date +\%F-\%H\%M).dump

A safer pattern is a script with logging and retention:

bash

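#!/usr/bin/env bash
set -euo pipefail

# Fail early with a clear message; cron does not load your shell profile
: "${DATABASE_URL:?DATABASE_URL is not set}"

BACKUP_DIR="/var/backups/myapp"
STAMP="$(date +%F-%H%M)"
FILE="$BACKUP_DIR/db-$STAMP.dump"

mkdir -p "$BACKUP_DIR"
pg_dump -Fc "$DATABASE_URL" > "$FILE"

# Stop with an error if the dump is empty
test -s "$FILE"
# Delete local dumps older than 7 days
find "$BACKUP_DIR" -name 'db-*.dump' -mtime +7 -delete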

Save it as /usr/local/bin/myapp-db-backup.sh and make it executable:

bash

chmod +x /usr/local/bin/myapp-db-backup.sh

Cron:

cron

15 2 * * * /usr/local/bin/myapp-db-backup.sh >> /var/log/myapp-db-backup.log 2>&1
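
The script above writes straight to the final file name, so an interrupted dump can leave a partial file that looks like a backup. One way to avoid that is to write to a temporary name and rename only on success. The following is a minimal sketch; write_backup_atomically is a hypothetical helper name, not part of any tool.

```bash
#!/usr/bin/env bash
# Sketch: capture a command's output in DEST.partial and rename it to DEST
# only if the command succeeded and produced a non-empty file.
# write_backup_atomically is a hypothetical helper name.

# Usage: write_backup_atomically DEST COMMAND [ARGS...]
write_backup_atomically() (
  dest="$1"
  shift
  tmp="$dest.partial"

  if "$@" > "$tmp" && [ -s "$tmp" ]; then
    # Rename within the same directory, so DEST never exists half-written
    mv "$tmp" "$dest"
  else
    rm -f "$tmp"
    echo "backup failed, nothing written to $dest" >&2
    exit 1
  fi
)
```

In the backup script this could replace the direct redirect, for example: write_backup_atomically "$FILE" pg_dump -Fc "$DATABASE_URL".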

MySQL example

bash

mysqldump --single-transaction --quick --routines --triggers myapp_prod > /var/backups/myapp/db-$(date +%F-%H%M).sql

4) Back up uploaded files

If uploads are local:

bash

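# --delete mirrors removals too, so an accidental delete reaches this copy on the next run;
# pair it with timestamped archives or offsite history
rsync -a --delete /var/www/myapp/media/ /var/backups/myapp/media-latest/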

Timestamped archive example:

bash

tar -czf /var/backups/myapp/media-$(date +%F-%H%M).tar.gz /var/www/myapp/media
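
An archive that was created is not necessarily an archive that is complete. As a rough check, you can compare the number of files listed in the archive with the number in the source directory. This sketch assumes the directory holds only regular files and subdirectories (no symlinks, no newlines in file names); archive_and_verify is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch: archive a directory, then confirm the archive lists as many regular
# files as the source contains. archive_and_verify is a hypothetical helper.

# Usage: archive_and_verify SOURCE_DIR ARCHIVE_PATH
archive_and_verify() (
  src="$1"
  archive="$2"

  # -C stores paths relative to the parent directory, e.g. "media/..."
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || exit 1

  expected="$(find "$src" -type f | wc -l)"
  # Directory entries end in "/" in the listing; count everything else
  actual="$(tar -tzf "$archive" | grep -vc '/$' || true)"

  if [ "$expected" -ne "$actual" ]; then
    echo "archive has $actual files, source has $expected" >&2
    exit 1
  fi
  echo "verified $actual files in $archive"
)
```

A count match does not prove the contents are intact, so treat this as a quick sanity check alongside real restore tests.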

If uploads are in S3 or compatible object storage:

enable bucket versioning
document restore commands
consider bucket replication for another region/account

AWS versioning:

bash

# Enable versioning on the bucket that holds the uploads themselves
aws s3api put-bucket-versioning \
  --bucket YOUR-UPLOADS-BUCKET \
  --versioning-configuration Status=Enabled

5) Back up secrets and config

Keep copies of these outside the app server:

.env
deployment manifests
Docker Compose files
Nginx config
systemd units
cron jobs
TLS renewal config
DNS records
webhook endpoints and callback URLs

Example archive, readable only by its owner:

bash

# Run in a subshell so the restrictive umask applies only to this archive
(
  umask 077
  tar -czf /tmp/myapp-config-$(date +%F-%H%M).tar.gz \
    /opt/myapp/.env \
    /etc/nginx/sites-available/myapp.conf \
    /etc/systemd/system/myapp.service \
    /opt/myapp/docker-compose.yml
)

If secret exposure is a concern, prefer encrypted storage or a secret manager. Also review:

Security Checklist

6) Push backups off the primary server

A backup stored only on the machine it protects is lost along with that machine.

Example to S3:

bash

aws s3 sync /var/backups/myapp s3://YOUR-BACKUP-BUCKET/myapp/

Example to another VPS:

bash

rsync -az /var/backups/myapp/ backupuser@backup-host:/srv/backups/myapp/

Minimum rule:

at least one copy off the primary server
ideally in another zone, region, or provider
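
Copying a file offsite does not confirm that the copy is intact. One option is to write a checksum manifest before syncing and verify it wherever the copy lands. This is a sketch assuming GNU coreutils and findutils (sha256sum, sort -z, xargs -r); the function names and the SHA256SUMS file name are illustrative choices.

```bash
#!/usr/bin/env bash
# Sketch: checksum manifest for a backup directory.
# Assumes GNU sha256sum, sort -z, and xargs -r. Names are placeholders.

# Usage: write_manifest DIR   (writes DIR/SHA256SUMS)
write_manifest() (
  cd "$1" || exit 1
  # Hash every file except the manifest itself, in a stable order
  find . -type f ! -name SHA256SUMS -print0 \
    | sort -z \
    | xargs -0 -r sha256sum > SHA256SUMS
)

# Usage: verify_manifest DIR  (succeeds only if every listed file matches)
verify_manifest() (
  cd "$1" || exit 1
  sha256sum --check --quiet SHA256SUMS
)
```

Run write_manifest on the primary server before the sync, then run verify_manifest on the backup host or on a downloaded copy.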

7) Define retention

Use multiple restore points.

Example minimum retention for a small SaaS:

daily backups: 7 to 14 days
weekly backups: 4 to 8 weeks
monthly backups: 3 to 12 months

You need enough history to recover from:

accidental deletes discovered late
bad migrations
silent corruption
a security compromise that changed data before anyone noticed
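
A single "delete after 7 days" rule cannot produce weekly or monthly restore points. The sketch below picks which dumps to keep by counting restore points: the newest N dumps, the newest dump from each of the most recent ISO weeks, and the newest dump from each of the most recent months. It assumes the db-YYYY-MM-DD-HHMM.dump naming used on this page, GNU date, and file names without spaces; select_backups_to_keep is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch: choose which dumps to keep under a daily/weekly/monthly policy.
# Assumes names like db-2024-03-15-0215.dump and GNU date.
# select_backups_to_keep is a hypothetical helper.

# Usage: ls /var/backups/myapp | select_backups_to_keep DAILY WEEKLY MONTHLY
# Reads bare file names on stdin and prints the names to keep.
select_backups_to_keep() (
  daily="$1"
  weekly="$2"
  monthly="$3"
  # Newest first: the date stamp in the name sorts chronologically
  names="$(grep '^db-' | sort -r)"

  {
    # The newest DAILY dumps
    printf '%s\n' "$names" | awk -v n="$daily" 'NF && NR <= n'

    # The newest dump from each of the most recent WEEKLY ISO weeks
    printf '%s\n' "$names" | while read -r name; do
      [ -n "$name" ] || continue
      day="$(printf '%s' "$name" | cut -c4-13)"   # YYYY-MM-DD
      printf '%s %s\n' "$(date -d "$day" +%G-W%V)" "$name"
    done | awk '!seen[$1]++ { print $2 }' | awk -v n="$weekly" 'NR <= n'

    # The newest dump from each of the most recent MONTHLY calendar months
    printf '%s\n' "$names" \
      | awk 'NF && !seen[substr($0, 4, 7)]++' \
      | awk -v n="$monthly" 'NR <= n'
  } | sort -u
)
```

Anything in the directory that the function does not print is a candidate for deletion. Review the list before wiring it to an actual delete step.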

8) Write a recovery runbook

Document exact commands and order.

Recommended runbook structure:

txt

1. Put app into maintenance mode
2. Confirm most recent valid backup set
3. Restore database to temp target
4. Restore uploads/media
5. Restore .env and deployment config
6. Start database/app/workers
7. Run validation checks
8. Switch traffic or disable maintenance mode
9. Monitor logs/errors
10. Record incident timeline and follow-ups

Include:

owner
where backups live
credentials path
restore commands
service restart order
RPO/RTO targets
rollback criteria

9) Test restore regularly

Restore into:

staging
a temporary VM
a temporary database
a separate Docker environment

Postgres restore test:

bash

createdb restore_test_db
pg_restore -d restore_test_db /var/backups/myapp/latest.dump
psql restore_test_db -c "\dt"

Validation checks after restore:

app boots
login works
dashboard loads
uploads are accessible
billing flows still function
email sending works
webhooks verify correctly
workers process jobs
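
Checks like these are easier to repeat when they run from one script that reports each result. Here is a minimal sketch of such a runner; run_check is a hypothetical helper, and the example commands in the comments (including the health URL) are placeholders for whatever your app exposes.

```bash
#!/usr/bin/env bash
# Sketch: run named post-restore checks and count the failures.
# run_check and the example commands below are placeholders.

FAILED_CHECKS=0

# Usage: run_check "description" COMMAND [ARGS...]
run_check() {
  description="$1"
  shift
  if "$@" > /dev/null 2>&1; then
    echo "PASS  $description"
  else
    echo "FAIL  $description"
    FAILED_CHECKS=$((FAILED_CHECKS + 1))
  fi
}

# Example calls (hypothetical endpoint and database name):
#   run_check "app boots"       curl -fsS http://localhost:8000/health
#   run_check "tables restored" psql restore_test_db -c '\dt'
#   [ "$FAILED_CHECKS" -eq 0 ] || exit 1
```

Flows such as billing and email usually still need a manual or test-mode check, since a passing command does not prove the end-to-end path works.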

Figure: recovery flowchart showing database restore, file restore, app boot, validation checks, and go-live decision.

10) Monitor backup jobs

Alert on:

missed scheduled jobs
zero-byte dump files
low disk space
failed upload to offsite storage
restore test failures
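
Several of these conditions can be detected with a small script run after each backup window. The sketch below covers three of them: a missing or stale dump, a zero-byte dump, and low disk space. Function names and thresholds are placeholders, and it assumes a find that supports -mmin.

```bash
#!/usr/bin/env bash
# Sketch: two checks a monitoring job could run after each backup window.
# Function names and thresholds are placeholders; assumes find supports -mmin.

# Usage: check_latest_backup DIR MAX_AGE_MINUTES
check_latest_backup() (
  dir="$1"
  max_age="$2"
  latest="$(ls -t "$dir"/db-*.dump 2>/dev/null | head -n 1)"

  if [ -z "$latest" ]; then
    echo "ALERT: no dumps found in $dir"; exit 1
  fi
  if [ ! -s "$latest" ]; then
    echo "ALERT: $latest is zero bytes"; exit 1
  fi
  # find prints the path only if it was modified less than max_age minutes ago
  if [ -z "$(find "$latest" -mmin "-$max_age")" ]; then
    echo "ALERT: $latest is older than $max_age minutes"; exit 1
  fi
  echo "OK: $latest"
)

# Usage: check_disk_usage MOUNT_POINT MAX_PERCENT
check_disk_usage() (
  used="$(df -P "$1" | awk 'NR == 2 { gsub(/%/, "", $5); print $5 }')"
  if [ "$used" -gt "$2" ]; then
    echo "ALERT: $1 is ${used}% full"; exit 1
  fi
  echo "OK: $1 is ${used}% full"
)
```

Both functions exit non-zero on an alert, so a cron wrapper or monitoring agent can forward the output to whatever channel you already watch.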

If you already maintain production checks, also review:

SaaS Production Checklist

Common causes

Backups are written to the same server that later fails
Database dumps run but never get copied offsite
Uploaded media is omitted from backup scope
Secrets and environment variables exist only on the production box
Restore steps are undocumented or depend on tribal knowledge
Backups are corrupted, zero-byte, or incomplete due to disk space issues
Cron jobs fail silently because of missing environment variables or permissions
Retention is too short, so the only good restore point has already been deleted
Version mismatch between dump tool and database server breaks restore
Teams assume managed hosting means application-level data is fully recoverable without testing

Debugging tips

Use these commands to verify backup jobs, files, and restore readiness.

Scheduler and logs

bash

crontab -l
systemctl status cron || systemctl status crond
journalctl -u cron -n 100 --no-pager || journalctl -u crond -n 100 --no-pager
grep -R "backup\|pg_dump\|mysqldump" /etc/cron* /opt /srv 2>/dev/null

Backup files and disk space

bash

ls -lh /var/backups/myapp
du -sh /var/backups/myapp
df -h

Database tool and connectivity checks

bash

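# Confirm the variable is set without printing credentials to the terminal or logs
[ -n "${DATABASE_URL:-}" ] && echo "DATABASE_URL is set" || echo "DATABASE_URL is missing"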
pg_dump --version
pg_restore --version
pg_isready -d "$DATABASE_URL"
mysqldump --version
mysql --version

Postgres restore checks

bash

createdb restore_test_db
pg_restore -l /var/backups/myapp/latest.dump | head
pg_restore -d restore_test_db /var/backups/myapp/latest.dump
psql restore_test_db -c "\dt"

Object storage checks

bash

aws s3 ls s3://YOUR-BACKUP-BUCKET/myapp/
aws s3 cp /var/backups/myapp/latest.dump s3://YOUR-BACKUP-BUCKET/myapp/latest.dump

File backup checks

bash

rsync --dry-run -a /var/www/myapp/media/ /var/backups/myapp/media-test/
tar -tzf backup.tar.gz | head

Docker checks

bash

docker ps
docker exec -it your-db-container pg_dump --version

If restore failures are caused by app connectivity after recovery, also review:

/database-connection-errors
/database-migration-strategy

Checklist

Coverage

✓ Database backups are enabled
✓ Uploaded files are backed up or versioned
✓ Secrets and environment variables have a secure second copy
✓ Deployment-specific config is backed up
✓ Recovery steps are written down

Storage and retention

✓ At least one backup copy is off the primary server
✓ Backup access follows least privilege
✓ Backups are encrypted at rest and in transit
✓ Retention includes daily, weekly, and monthly restore points
✓ Disk usage and backup storage usage are monitored

Validation

✓ Backup files are non-zero and recent
✓ Restore has been tested in a separate environment
✓ App boots successfully after restore
✓ Login, dashboard, uploads, billing, and email are validated
✓ Restore time is within target RTO
✓ Restore point freshness is within target RPO

Operational readiness

✓ Backup jobs do not depend on a developer laptop
✓ Alerts go to a real channel
✓ Runbook includes commands, owners, and restart order
✓ Backups are taken before risky migrations or infra changes
✓ Last restore drill was completed within 30 days

FAQ

Do I need both database backups and server snapshots?

Usually yes. Database dumps are portable and easier to restore selectively. Snapshots help recover full machine state faster. For small SaaS apps, combine logical database backups with offsite file/config backups at minimum.

How often should I test restores?

At least monthly for active production apps, and always before major infrastructure or database changes. Also test after changing backup scripts, storage providers, or database versions.

Should I back up the whole application codebase?

Code should already live in Git and your CI/CD system. Back up deployment-specific config, environment files, Nginx/systemd/Docker setup, and anything not reproducible from source control.

If I use S3 for uploads, do I still need backups?

Yes. Durable storage is not the same as a recovery plan. Enable versioning where possible and document how to restore or roll back deleted or overwritten objects.

What is the minimum viable backup setup for an MVP SaaS?

Daily automated database dump, offsite storage, uploaded file backup or object storage versioning, a secure copy of production env vars, and one tested restore procedure.

What should I exclude from backups?

Caches, temporary files, build artifacts, dependency directories, and anything easily reproducible. Focus on data, uploads, secrets, and deployment-specific config.
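
With tar, exclusions can be expressed as patterns. The patterns below are examples only, chosen for illustration; adjust them to the directories your stack actually generates. archive_without_junk is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch: archive a directory while skipping reproducible content.
# The exclude patterns are illustrative; archive_without_junk is a placeholder name.

# Usage: archive_without_junk SOURCE_DIR ARCHIVE_PATH
archive_without_junk() (
  src="$1"
  archive="$2"
  # Excludes are listed before the path so they apply to it
  tar -czf "$archive" \
    --exclude='node_modules' \
    --exclude='__pycache__' \
    --exclude='.cache' \
    --exclude='*.tmp' \
    -C "$(dirname "$src")" "$(basename "$src")"
)
```

List the archive with tar -tzf afterwards to confirm that uploads and config are still included and only the intended paths were skipped.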

Final takeaway

A backup is only real if you have restored it successfully.

For a small SaaS, the minimum standard is:

automated backups
offsite copies
protected secrets
regular restore drills

Treat restore testing as part of production readiness, not as an optional ops task.