Database Migration Strategy

The essential playbook for implementing a database migration strategy in your SaaS.

A migration strategy is the process you use to change your production database safely during deploys. For MVPs and small SaaS products, the goal is simple: make schema changes predictable, reversible where possible, and compatible with the running app during rollout.

This page covers a practical deployment-focused migration workflow for Flask, FastAPI, PostgreSQL, and common CI/CD setups.

Quick Fix / Quick Setup

bash
# 1) Back up production before schema changes
pg_dump "$DATABASE_URL" -Fc -f backup-predeploy.dump

# 2) Review pending migrations locally
alembic heads
alembic history --verbose
alembic upgrade head --sql

# 3) Run migration in production before switching traffic
alembic upgrade head

# 4) Verify schema state
alembic current
psql "$DATABASE_URL" -c "SELECT version_num FROM alembic_version;"

# 5) Restart app only after migration succeeds
sudo systemctl restart gunicorn

Use expand-and-contract changes for live systems: add nullable columns first, deploy app code that writes both formats if needed, backfill data, then remove old columns in a later deploy. Avoid destructive schema changes in the same release as app changes that depend on them.


What’s happening

Database migrations apply versioned schema changes such as:

  • creating tables
  • adding columns
  • creating indexes
  • updating constraints
  • running small data transforms

Deploy failures usually happen when app code and schema are out of sync during rollout.

Safe deploys require temporary compatibility:

  • old app version must still work while migration completes
  • new app version must work immediately after migration
  • destructive changes must be delayed until old code is fully gone

Safe default process for small SaaS:

  1. back up production
  2. review migration SQL
  3. run migrations once
  4. verify schema state
  5. restart or switch traffic
  6. monitor

Process Flow

backup → review → migrate → verify → app reload → monitor

Step-by-step implementation

1. Use a real migration tool

For Flask or FastAPI with SQLAlchemy, use Alembic.

Install:

bash
pip install alembic
alembic init migrations

Typical project layout:

txt
app/
migrations/
alembic.ini

Store migration files in git. Review them like code.
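
Alembic needs a database URL. Rather than hard-coding it in alembic.ini, one common pattern (a sketch; it assumes you export DATABASE_URL, as elsewhere in this guide) is to override it in migrations/env.py:

```python
# migrations/env.py (excerpt)
# Read the database URL from the environment so the same migration
# files run unchanged in dev, staging, and production.
import os

from alembic import context

config = context.config

# Overrides the sqlalchemy.url placeholder in alembic.ini
config.set_main_option("sqlalchemy.url", os.environ["DATABASE_URL"])
```

This keeps credentials out of version control and makes the deploy script's DATABASE_URL the single source of truth.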

If you are still organizing your app structure, see Structuring a Flask/FastAPI SaaS Project.

2. Generate migrations, then review them manually

Example:

bash
alembic revision --autogenerate -m "add billing columns"

Autogenerated output is a draft, not a final answer.

Check for unsafe operations:

  • accidental table drops
  • column type rewrites
  • non-null constraints without defaults or backfill
  • index creation on large tables without concurrent strategy
  • rename operations that should be expand-and-contract instead
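
As one concrete case from the list above: building an index on a large table locks writes unless you use PostgreSQL's CREATE INDEX CONCURRENTLY, which must run outside a transaction. A hedged Alembic sketch (the table and index names here are hypothetical):

```python
from alembic import op

def upgrade():
    # CREATE INDEX CONCURRENTLY cannot run inside a transaction block,
    # so use Alembic's autocommit escape hatch.
    with op.get_context().autocommit_block():
        op.create_index(
            "ix_orders_created_at",
            "orders",
            ["created_at"],
            postgresql_concurrently=True,
        )

def downgrade():
    with op.get_context().autocommit_block():
        op.drop_index(
            "ix_orders_created_at",
            table_name="orders",
            postgresql_concurrently=True,
        )
```

Autogenerate will not produce this for you; it emits a plain create_index, which is exactly the kind of draft you rewrite during review.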

Preview SQL before merge:

bash
alembic upgrade head --sql

Also inspect migration history:

bash
alembic heads
alembic history --verbose

3. Design migrations for safe deploys

Prefer additive schema changes first:

  • add nullable columns
  • add new tables
  • add new indexes
  • add compatible constraints carefully

Avoid in the same release:

  • dropping columns used by running code
  • renaming columns directly
  • changing column types on large hot tables
  • destructive cleanup plus app rollout together

Safe column rename pattern:

  1. add new_column
  2. deploy app that writes both old_column and new_column
  3. backfill existing rows
  4. switch reads to new_column
  5. remove old_column in a later release
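
Step 2 above, the dual-write phase, lives in application code rather than in a migration. A minimal sketch (the column names match the rename example used in this guide; the helper and model are hypothetical):

```python
def set_user_name(user, name):
    """Write both columns while old and new app versions coexist."""
    user.old_name = name   # legacy column, still read by the old app version
    user.full_name = name  # new column, read after the switch
    return user
```

Once reads have switched to full_name and the old column is dropped, this helper collapses back to a single assignment.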

Example migration:

python
from alembic import op
import sqlalchemy as sa

def upgrade():
    op.add_column("users", sa.Column("full_name", sa.String(), nullable=True))

def downgrade():
    op.drop_column("users", "full_name")
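
That migration is the expand step. The matching contract step, run only in a later release after the backfill in step 8 has finished and no deployed code touches old_name, might look like this sketch:

```python
from alembic import op
import sqlalchemy as sa

def upgrade():
    # Safe only once every running app version reads and writes full_name
    op.alter_column("users", "full_name", nullable=False)
    op.drop_column("users", "old_name")

def downgrade():
    # Structurally reversible, but the dropped old_name data is gone:
    # a real rollback here means restoring from backup.
    op.add_column("users", sa.Column("old_name", sa.String(), nullable=True))
    op.alter_column("users", "full_name", nullable=True)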

4. Test against realistic data

Do not test only against an empty local database.

At minimum:

  • restore a recent snapshot locally or in staging
  • run migration timing tests
  • verify app startup against the migrated schema
  • confirm rollback or recovery steps

Example local snapshot restore:

bash
createdb myapp_staging_test
pg_restore -d myapp_staging_test backup-predeploy.dump
DATABASE_URL=postgresql:///myapp_staging_test alembic upgrade head

If you need a broader deployment pipeline around this, see CI/CD Pipeline for SaaS Deployment.

5. Back up production before release

Before applying schema changes:

bash
pg_dump "$DATABASE_URL" -Fc -f backup-predeploy.dump

If you rely on managed backups or point-in-time recovery, verify that:

  • backups are recent
  • restore permissions work
  • restore runbook exists
  • restore target and timing are known

Do not treat untested backups as a rollback plan.

6. Run migrations as a dedicated release step

Do not run migrations on every app container startup.

Problems with startup migrations:

  • multiple instances can race
  • failed migration blocks boot
  • release order becomes unpredictable
  • rollback becomes harder

Preferred patterns:

VPS / systemd

Create a release script:

bash
#!/usr/bin/env bash
set -euo pipefail

export DATABASE_URL="postgresql://user:pass@localhost/appdb"

pg_dump "$DATABASE_URL" -Fc -f "/var/backups/app-$(date +%F-%H%M%S).dump"
alembic upgrade head
alembic current
sudo systemctl restart gunicorn

Example run:

bash
bash deploy/release.sh

For server setup details, see Environment Setup on VPS.

Docker / Compose

Run migrations in a one-off container:

bash
docker compose run --rm app alembic upgrade head
docker compose up -d app

Check state:

bash
docker compose run --rm app alembic current
docker compose logs app

CI/CD release job

Typical sequence:

  1. build image
  2. deploy artifact to target environment
  3. run migration job once
  4. verify schema
  5. switch traffic or restart app
  6. monitor

For zero-downtime requirements, both old and new app versions must tolerate the in-between schema state. See Zero Downtime Deployment.

7. Verify immediately after migration

Run verification commands:

bash
alembic current
psql "$DATABASE_URL" -c "SELECT version_num FROM alembic_version;"
psql "$DATABASE_URL" -c "\dt"
psql "$DATABASE_URL" -c "\d+ users"

Check application health after restart:

bash
curl -f http://127.0.0.1:8000/health

If app startup fails after migration, inspect logs:

bash
journalctl -u gunicorn -n 200 --no-pager
docker compose logs app

8. Separate schema changes from heavy backfills

Large updates should not block deploys.

Avoid putting large data rewrites inside migrations such as:

  • updating millions of rows
  • re-computing derived values for every record
  • moving large JSON fields to normalized tables during request-serving hours

Instead:

  1. deploy additive schema
  2. deploy app compatibility logic
  3. run separate backfill job
  4. verify completion
  5. remove compatibility code later

Example backfill script:

python
from sqlalchemy import create_engine, text

engine = create_engine("postgresql://user:pass@localhost/appdb")

BATCH_SIZE = 1000

while True:
    # One transaction per batch: progress is committed incrementally,
    # so a failure or restart loses at most one batch of work.
    with engine.begin() as conn:
        rows = conn.execute(text("""
            SELECT id, old_name
            FROM users
            WHERE full_name IS NULL
              AND old_name IS NOT NULL  -- skip rows that can never converge
            LIMIT :limit
        """), {"limit": BATCH_SIZE}).fetchall()

        if not rows:
            break

        for row in rows:
            conn.execute(text("""
                UPDATE users
                SET full_name = :full_name
                WHERE id = :id
            """), {"id": row.id, "full_name": row.old_name})

9. Document rollback and recovery

Do not assume alembic downgrade -1 is enough.

Some migrations are reversible. Many are not.

Keep a simple decision model:

  • app issue only, schema still valid: deploy hotfix
  • reversible migration issue: tested downgrade
  • destructive or partial migration issue: restore from backup or execute manual repair
  • large data corruption risk: stop writes and restore carefully

A simple decision tree for hotfix vs downgrade vs restore-from-backup:

  • Hotfix: apply a targeted code fix forward without reverting the migration
  • Downgrade: revert application code to a previous version, keeping DB state as-is
  • Restore from backup: restore the database from backup, then restart the app against the clean state

Common causes

Common reasons production migrations fail:

  • running breaking schema changes before the app is compatible
  • skipping backups or restore validation
  • accepting autogenerated migrations without review
  • long-running locks from indexes, type changes, or constraints
  • running migrations from multiple instances at the same time
  • embedding large backfills inside schema migrations
  • different migration histories across environments
  • wrong DATABASE_URL or environment variables
  • missing production DB privileges
  • restarting the app before migration finished successfully

Debugging tips

Use these commands during review or incident response:

bash
alembic current
alembic heads
alembic history --verbose
alembic upgrade head --sql
alembic downgrade -1
psql "$DATABASE_URL" -c "SELECT version_num FROM alembic_version;"
psql "$DATABASE_URL" -c "\dt"
psql "$DATABASE_URL" -c "\d+ users"
psql "$DATABASE_URL" -c "SELECT pid, state, query FROM pg_stat_activity;"
psql "$DATABASE_URL" -c "SELECT * FROM pg_locks l JOIN pg_stat_activity a ON l.pid = a.pid;"
pg_dump "$DATABASE_URL" -Fc -f backup.dump
journalctl -u gunicorn -n 200 --no-pager
docker compose logs app
docker compose run --rm app alembic current

What to check:

Migration state mismatch

If app says a table or column is missing:

bash
alembic current
psql "$DATABASE_URL" -c "SELECT version_num FROM alembic_version;"

Confirm the production database actually reached the expected revision.

Lock contention

If migration hangs:

bash
psql "$DATABASE_URL" -c "SELECT pid, state, query FROM pg_stat_activity;"
psql "$DATABASE_URL" -c "SELECT * FROM pg_locks l JOIN pg_stat_activity a ON l.pid = a.pid;"

Look for:

  • long-running transactions
  • blocked DDL
  • background jobs touching the same table

Wrong target database

Before running production migration, print the current connection target from your deploy script:

bash
echo "$DATABASE_URL"

Verify host, database name, and user.
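
Note that echoing the raw URL also prints the password into deploy logs. A small sketch (the helper name is mine) that reports the target without the secret:

```python
from urllib.parse import urlparse

def describe_target(database_url: str) -> str:
    """Return host, database, and user from a DATABASE_URL, omitting the password."""
    u = urlparse(database_url)
    return f"host={u.hostname} db={u.path.lstrip('/')} user={u.username}"

# e.g. describe_target("postgresql://app:secret@localhost/appdb")
# -> "host=localhost db=appdb user=app"
```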

App restart failures after migration

Check service logs:

bash
journalctl -u gunicorn -n 200 --no-pager
docker compose logs app

If you are troubleshooting post-release behavior, see Debugging Production Issues. If migration connectivity itself is failing, see Database Connection Errors.


Checklist

  • Migration file reviewed manually
  • SQL preview checked for locks or destructive operations
  • Production backup completed or PITR confirmed
  • Migration tested on staging or realistic snapshot
  • Schema change is backward-compatible with rolling deploy
  • Backfill separated from schema change when needed
  • Release step defined in CI/CD or deploy script
  • Post-migration verification commands prepared
  • Monitoring and error tracking ready during release
  • Rollback or restore plan documented

For final pre-launch and post-launch checks, use the SaaS Production Checklist.


FAQ

Should I run migrations before or after deploying new code?

For backward-compatible changes, either can work during a controlled rollout. In practice, run migrations as a release step before switching traffic or before restarting the app, as long as old code remains compatible with the new schema.

How do I handle a column rename safely?

Do not rename in one step on a live system. Add the new column, dual-write, backfill existing data, switch reads to the new column, then remove the old column in a later release.

Are database downgrades enough for rollback?

Not always. Some migrations are not safely reversible, especially destructive ones. Keep backups and a restore procedure as part of your rollback plan.

Can I put large backfills inside Alembic migrations?

Avoid that for production deploys. Large updates can lock rows, run too long, or fail midway. Use separate jobs or scripts with progress tracking.

What is the minimum safe process for an indie SaaS?

Review migration SQL, test on staging or a snapshot, back up production, run migration once in a controlled release step, verify schema state, then monitor after deploy.


Final takeaway

The safest migration strategy for small SaaS products is repeatable and explicit:

  • review
  • back up
  • migrate
  • verify
  • monitor

Favor backward-compatible schema changes. Separate heavy backfills from deployment. Make database changes a first-class release step, not an implicit side effect of app startup.