Performance Checklist
A practical playbook for checking and fixing performance in a small SaaS.
Use this checklist to verify the main performance layers of a small SaaS before launch and after major changes. It focuses on practical bottlenecks: slow queries, missing caching, oversized assets, blocking requests, bad worker setup, and weak monitoring. The goal is to catch the common issues that make MVPs feel slow in production.
Related production checklists:
- SaaS Production Checklist
- Security Checklist
- Auth System Checklist
- Metrics and Performance Monitoring
- Database Performance Monitoring
Quick Fix / Quick Setup
Run this after deployment or before launch. If TTFB is high, start with app/database profiling. If static assets are slow, fix compression, caching headers, and file delivery first.
# Quick production performance sweep
# 1) Check app/server resource pressure
uptime
free -m
df -h
ps aux --sort=-%mem | head
ps aux --sort=-%cpu | head
# 2) Check web/app errors and slow behavior
sudo journalctl -u gunicorn -n 200 --no-pager
sudo tail -n 200 /var/log/nginx/access.log
sudo tail -n 200 /var/log/nginx/error.log
# 3) Check database pressure (Postgres)
psql "$DATABASE_URL" -c "select pid, now()-query_start as duration, state, query from pg_stat_activity where state <> 'idle' order by duration desc limit 10;"
psql "$DATABASE_URL" -c "select query, calls, total_exec_time, mean_exec_time from pg_stat_statements order by total_exec_time desc limit 10;"
# 4) Validate HTTP performance
curl -o /dev/null -s -w 'dns=%{time_namelookup} connect=%{time_connect} ttfb=%{time_starttransfer} total=%{time_total}\n' https://yourdomain.com/
# 5) Check worker/queue backlog
redis-cli ping
redis-cli llen default
# 6) Test compression and caching headers
curl -I https://yourdomain.com/static/app.css
curl -H 'Accept-Encoding: gzip' -I https://yourdomain.com/
What’s happening
Performance problems usually come from a small set of bottlenecks:
- slow database queries
- too few app workers
- blocking network calls
- oversized frontend assets
- missing cache layers
- background jobs running in the request path
A checklist is more useful than isolated tuning because most SaaS latency issues are cross-layer:
- app code
- database
- web server
- queue workers
- browser delivery
For small SaaS products, the target is not maximum complexity. The target is:
- predictable response times
- stable memory use
- enough monitoring to catch regressions early
Process Flow
Step-by-step implementation
1. Define performance budgets
Set concrete targets before tuning.
Example starting budgets:
- homepage TTFB: < 500ms
- API p95: < 800ms
- DB query p95 for common reads: < 100ms
- background queue delay: < 30s
- CSS/JS initial payload: as small as possible, preferably < 300KB gzipped for critical paths
- hero images and dashboard assets: resized and compressed
Track these in release reviews and in your main production checklist at SaaS Production Checklist.
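Budgets are only useful if they are checked mechanically. A minimal sketch, assuming the budget names and limits above (all values illustrative), that flags metrics over budget during a release review:

```python
# Hypothetical budget check: compare measured p95 values (ms) against the
# starting budgets above. Names and limits are illustrative assumptions.
BUDGETS_MS = {
    "homepage_ttfb": 500,
    "api_p95": 800,
    "db_read_p95": 100,
    "queue_delay": 30_000,
}

def over_budget(measured: dict) -> list:
    """Return the names of metrics whose measured value exceeds its budget."""
    return [name for name, limit in BUDGETS_MS.items()
            if measured.get(name, 0) > limit]

# Example: one metric over budget
print(over_budget({"homepage_ttfb": 620, "api_p95": 750}))  # ['homepage_ttfb']
```

Feed it measured p95 values from your monitoring and fail the review if the list is non-empty.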
2. Enable request timing and query visibility
If you cannot see slow requests, do not optimize yet.
Minimum setup:
- access logs with request duration
- app-level timing for key routes
- Postgres query visibility with pg_stat_statements
- error tracking and APM if available
Enable pg_stat_statements in Postgres:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
Example checks:
psql "$DATABASE_URL" -c "select query, calls, total_exec_time, mean_exec_time from pg_stat_statements order by total_exec_time desc limit 20;"
psql "$DATABASE_URL" -c "select pid, now()-query_start as duration, wait_event_type, wait_event, state, query from pg_stat_activity where state <> 'idle' order by duration desc limit 20;"
3. Profile the slowest endpoints
Identify the slowest 5 endpoints first.
Check:
- endpoint path
- request duration
- SQL time
- external API time
- response size
- auth/session overhead
- cache hit/miss behavior
Use timing output from curl:
curl -o /dev/null -s -w 'dns=%{time_namelookup} connect=%{time_connect} appconnect=%{time_appconnect} ttfb=%{time_starttransfer} total=%{time_total}\n' https://yourdomain.com/dashboard
If authenticated pages are much slower than public pages, inspect:
- session backend
- user/tenant lookup queries
- permission checks
- dashboard aggregate queries
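App-level timing for key routes can start as small as a WSGI wrapper. A minimal, framework-agnostic sketch (the threshold and log destination are assumptions; wire it into your stack's logger):

```python
import time

# Minimal WSGI timing sketch: wraps any WSGI app and logs slow requests.
# slow_ms and the log callable are assumptions; adjust for your stack.
def timing_middleware(app, slow_ms=500, log=print):
    def wrapped(environ, start_response):
        start = time.perf_counter()
        try:
            return app(environ, start_response)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            if elapsed_ms >= slow_ms:
                log(f"SLOW {environ.get('PATH_INFO', '?')} {elapsed_ms:.0f}ms")
    return wrapped
```

Wrap the application object once at startup (e.g. `app = timing_middleware(app)`); the `finally` block ensures timing is recorded even when the handler raises.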
4. Optimize database access
Common fixes:
- add missing indexes
- remove N+1 queries
- paginate large result sets
- avoid SELECT *
- reduce repeated counts/aggregations in dashboards
- use precomputed summaries where needed
Examples of indexes often missed in SaaS apps:
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_projects_tenant_id ON projects(tenant_id);
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_invoices_tenant_status ON invoices(tenant_id, status);
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_events_created_at ON events(created_at DESC);
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_members_user_id ON members(user_id);
Quick checks for table/index usage:
SELECT schemaname, relname, seq_scan, seq_tup_read, idx_scan, idx_tup_fetch
FROM pg_stat_user_tables
ORDER BY seq_scan DESC
LIMIT 20;
SELECT indexrelname, relname, idx_scan
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC
LIMIT 20;
If you see high sequential scans on large tables, investigate indexing.
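The N+1 fix is the same in any ORM or raw SQL: fetch the child rows for the whole page in one query instead of one query per parent. A sketch using sqlite3 in-memory for the demo (table and column names are illustrative, not from this article):

```python
import sqlite3

# Demo schema: illustrative tenant-scoped tables, sqlite3 in-memory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE projects (id INTEGER PRIMARY KEY, tenant_id INTEGER, name TEXT);
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, project_id INTEGER, title TEXT);
    INSERT INTO projects VALUES (1, 1, 'alpha'), (2, 1, 'beta');
    INSERT INTO tasks VALUES (1, 1, 't1'), (2, 1, 't2'), (3, 2, 't3');
""")

projects = conn.execute(
    "SELECT id, name FROM projects WHERE tenant_id = 1").fetchall()

# N+1 pattern (avoid): one tasks query per project in the loop.
# for pid, _ in projects:
#     conn.execute("SELECT title FROM tasks WHERE project_id = ?", (pid,))

# Fix: fetch all tasks for the page of projects in a single IN query.
ids = [pid for pid, _ in projects]
placeholders = ",".join("?" * len(ids))
rows = conn.execute(
    f"SELECT project_id, title FROM tasks "
    f"WHERE project_id IN ({placeholders}) ORDER BY project_id, id", ids
).fetchall()
tasks_by_project = {}
for pid, title in rows:
    tasks_by_project.setdefault(pid, []).append(title)
print(tasks_by_project)  # {1: ['t1', 't2'], 2: ['t3']}
```

Two queries total regardless of page size, instead of one per row; most ORMs expose this as eager loading (e.g. select-related/prefetch mechanisms).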
5. Add caching deliberately
Do not add Redis everywhere by default.
Good cache candidates:
- pricing/config data
- tenant settings
- dashboard summary cards
- expensive computed responses
- rate limit state
- session storage if required by your stack
Define:
- cache key format
- TTL
- invalidation rule
- tenant/user isolation
Example cache key pattern:
tenant:{tenant_id}:dashboard_summary:v3
user:{user_id}:permissions:v2
pricing:public:v1
Rules:
- include tenant or user context where needed
- version keys when response shape changes
- avoid caching mutable objects without invalidation strategy
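The cache-aside pattern with the key conventions above can be sketched in a few lines. The backend here is a plain dict with expiry timestamps so the example is self-contained; in production you would swap in Redis with the same key format and TTL:

```python
import time

# Cache-aside sketch. _store stands in for Redis: {key: (expires_at, value)}.
_store = {}

def cache_get_or_set(key, ttl_s, compute):
    """Return the cached value for key, recomputing after ttl_s seconds."""
    hit = _store.get(key)
    if hit and hit[0] > time.time():
        return hit[1]
    value = compute()
    _store[key] = (time.time() + ttl_s, value)
    return value

calls = []
def expensive_summary():
    calls.append(1)  # stand-in for slow aggregate queries
    return {"open_invoices": 4}

key = "tenant:42:dashboard_summary:v3"  # versioned, tenant-scoped key
cache_get_or_set(key, 60, expensive_summary)
cache_get_or_set(key, 60, expensive_summary)  # second call served from cache
print(len(calls))  # 1 — computed only once
```

Bumping the `v3` suffix invalidates every tenant's entry at once when the response shape changes, without touching individual keys.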
6. Move blocking work to background jobs
These should usually not run in the web request:
- email sending
- image processing
- webhook retries
- exports
- report generation
- billing sync tasks
- non-critical third-party updates
Check queue health:
redis-cli ping
redis-cli llen default
If queue depth grows and workers are healthy, inspect:
- slow job handlers
- retry loops
- large payloads
- dead jobs
- missing worker concurrency
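The shape of the change is the same whatever queue library you use: the request handler enqueues and returns, and a worker does the slow part. An in-process sketch with a thread and `queue.Queue` standing in for RQ/Celery plus Redis (all names illustrative):

```python
import queue
import threading

# In-process stand-in for a job queue: (function, args) tuples.
jobs = queue.Queue()

def worker():
    while True:
        fn, args = jobs.get()
        if fn is None:          # shutdown sentinel
            break
        fn(*args)
        jobs.task_done()

sent = []
def send_welcome_email(user_id):
    sent.append(user_id)        # stand-in for the slow SMTP call

threading.Thread(target=worker, daemon=True).start()

def signup_handler(user_id):
    # ... create the user synchronously, then enqueue instead of blocking ...
    jobs.put((send_welcome_email, (user_id,)))
    return "201 Created"

signup_handler(7)
jobs.join()                     # tests only; a web request never waits here
print(sent)  # [7]
```

The request returns as soon as the job is enqueued; retry and dead-letter behavior then live in the worker, not in the request path.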
7. Tune app concurrency
Set Gunicorn/Uvicorn worker counts based on CPU and memory, not guesswork.
Example Gunicorn config:
# gunicorn.conf.py
bind = "0.0.0.0:8000"
workers = 3
threads = 2
worker_class = "gthread"
timeout = 30
graceful_timeout = 30
keepalive = 5
max_requests = 1000
max_requests_jitter = 100
accesslog = "-"
errorlog = "-"
Review after deploy:
- CPU saturation
- memory per worker
- request queueing
- worker restarts
- timeout rates
Check process pressure:
ps aux --sort=-%mem | head
ps aux --sort=-%cpu | head
top -o %CPU
vmstat 1 5
iostat -xz 1 3
If memory grows over time, inspect:
- worker recycling
- in-process caches
- large ORM objects retained too long
- unbounded task payloads
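A common starting heuristic for sync/gthread workers is `2 * cores + 1`, capped by available memory. A sketch of that calculation; the 300 MB per-worker figure is an assumption, measure your own app's per-worker RSS and adjust:

```python
import os

def suggested_workers(cores=None, mem_available_mb=2048, per_worker_mb=300):
    """Starting worker count: 2*cores+1, capped by memory headroom.

    per_worker_mb is an assumed per-worker RSS; measure yours.
    """
    cores = cores or os.cpu_count() or 1
    by_cpu = 2 * cores + 1
    by_mem = max(1, mem_available_mb // per_worker_mb)
    return min(by_cpu, by_mem)

# 2 cores, 2 GB free: CPU suggests 5, memory allows 6 -> 5
print(suggested_workers(cores=2, mem_available_mb=2048))  # 5
```

Treat the result as a starting point only; confirm with the post-deploy checks above (CPU saturation, memory per worker, request queueing).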
8. Optimize static asset delivery
Static assets should not be a hidden bottleneck.
Requirements:
- minified CSS/JS
- gzip or Brotli enabled
- fingerprinted filenames
- long-lived cache headers for versioned assets
- image resizing
- optional CDN/object storage if traffic justifies it
Example Nginx config:
gzip on;
gzip_types text/plain text/css application/javascript application/json application/xml image/svg+xml;
gzip_min_length 1024;
location /static/ {
alias /var/www/app/static/;
expires 1y;
add_header Cache-Control "public, immutable";
access_log off;
}
Validate headers:
curl -I https://yourdomain.com/static/app.css
curl -H 'Accept-Encoding: gzip' -I https://yourdomain.com/
Check for:
- Cache-Control
- Content-Encoding
- correct Content-Type
- static files being served by Nginx, not app workers
9. Add timeouts and retry rules for external dependencies
Never allow external APIs to hang web workers indefinitely.
Cover:
- Stripe
- email provider
- OAuth providers
- internal HTTP services
- webhook deliveries
Rules:
- set connect timeout
- set read timeout
- retry only where safe
- use background jobs for non-critical retries
- add circuit breaking or failure isolation if traffic grows
If p95 is poor but CPU is low, external I/O is often the problem.
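A stdlib-only sketch of the timeout and retry rules above: an explicit timeout on every outbound call plus a bounded retry loop. The URL, timeout, and retry counts are illustrative; retry only safe, idempotent requests like this GET:

```python
import socket
import urllib.error
import urllib.request

def fetch_with_timeout(url, timeout_s=5.0, retries=2):
    """GET url with a hard timeout and a bounded number of retries."""
    last_err = None
    for attempt in range(retries + 1):
        try:
            with urllib.request.urlopen(url, timeout=timeout_s) as resp:
                return resp.read()
        except (urllib.error.URLError, socket.timeout) as err:
            last_err = err  # bounded retry; never loop forever in a worker
    raise last_err
```

The key property is that the call fails fast and loudly instead of pinning a web worker; non-critical retries beyond this budget belong in a background job.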
10. Add monitoring and alerts
Minimum performance monitoring:
- p50/p95 request latency
- 4xx/5xx rates
- CPU
- memory
- disk
- DB connections
- slow queries
- queue depth
- worker health
- uptime
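p50/p95 can be computed directly from access-log durations before any APM is in place. A nearest-rank percentile sketch (the sample durations are illustrative):

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile; good enough for a quick log sweep."""
    ordered = sorted(values)
    k = max(0, math.ceil(pct / 100 * len(ordered)) - 1)
    return ordered[k]

# Request durations in ms, e.g. parsed from access logs with timing enabled
durations = [80, 95, 110, 120, 140, 150, 160, 200, 450, 1200]
print(percentile(durations, 50), percentile(durations, 95))  # 140 1200
```

Note how one slow outlier dominates p95 while p50 stays flat, which is exactly why the checklist tracks both.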
11. Re-test after changes
After every optimization:
- compare baseline vs new metrics
- verify no correctness regressions
- verify cache invalidation
- verify queue behavior
- verify memory remains stable over time
Do not keep tuning without confirming measurable gains.
Common causes
- Missing database indexes on tenant_id, user_id, status, created_at, or foreign keys
- N+1 ORM queries in dashboards, admin tables, and API serializers
- Too many synchronous tasks inside request handlers
- App worker count too low or too high for available memory
- No caching headers on static files
- Static files served by the app instead of Nginx or object storage
- Large unoptimized images and JavaScript bundles
- External API calls without timeouts causing hung workers
- Connection pool exhaustion between app and database
- Long-running database transactions or locks
- Queue workers down, causing delayed async work to fall back into user-facing flows
- Memory leaks from unbounded in-process caches or large object retention
- No observability, so regressions are only noticed after users report them
Debugging tips
Start by isolating the layer before changing infrastructure.
Basic host checks
uptime
free -m
df -h
ps aux --sort=-%cpu | head
ps aux --sort=-%mem | head
top -o %CPU
vmstat 1 5
iostat -xz 1 3
App and web logs
sudo journalctl -u gunicorn -n 200 --no-pager
sudo tail -n 200 /var/log/nginx/access.log
sudo tail -n 200 /var/log/nginx/error.log
HTTP timing
curl -o /dev/null -s -w 'dns=%{time_namelookup} connect=%{time_connect} ttfb=%{time_starttransfer} total=%{time_total}\n' https://yourdomain.com/
Database pressure
psql "$DATABASE_URL" -c "select pid, now()-query_start as duration, state, query from pg_stat_activity where state <> 'idle' order by duration desc limit 10;"
psql "$DATABASE_URL" -c "select query, calls, total_exec_time, mean_exec_time from pg_stat_statements order by total_exec_time desc limit 10;"
Queue checks
redis-cli ping
redis-cli llen default
Static asset checks
curl -I https://yourdomain.com/static/app.css
curl -H 'Accept-Encoding: gzip' -I https://yourdomain.com/
Diagnostic rules
- Compare app latency vs database latency before changing server size
- Use access logs with request timing to isolate slow endpoints first
- If only authenticated pages are slow, inspect DB queries, session storage, and permission checks
- If CPU is low but responses are slow, suspect I/O waits, DB locks, external APIs, or queue contention
- If memory keeps rising after deploys, inspect worker recycling and large caches
- If static assets are slow but HTML is fast, inspect compression, cache headers, and asset size
- If p95 is bad but p50 is fine, look for lock contention, cache misses, or one expensive code path
Checklist
Application
- ✓ Request timing enabled for main endpoints
- ✓ No debug mode, dev reloaders, or verbose SQL logging left on in production
- ✓ Expensive computations are cached or precomputed where appropriate
- ✓ External API calls have connect/read timeouts and retry rules
- ✓ File uploads and report generation do not block web requests
Database
- ✓ Slow query visibility is enabled
- ✓ Indexes exist for frequent filters, joins, and tenant/user lookups
- ✓ N+1 query patterns are removed from dashboard and list pages
- ✓ Pagination is used for large result sets
- ✓ Connection pool settings are sane for app worker count
Caching
- ✓ Redis or equivalent is used only where it reduces repeated expensive work
- ✓ Cache keys include tenant/user context where required
- ✓ Cache invalidation strategy is defined for mutable data
Static assets
- ✓ CSS/JS are minified and compressed
- ✓ Cache-Control headers are set for versioned assets
- ✓ Images are resized and modern formats considered where useful
Workers
- ✓ Background queue is running and monitored
- ✓ Retries and dead-letter behavior are defined for critical jobs
Infrastructure
- ✓ CPU, memory, disk, and open file limits are monitored
- ✓ Web server and app timeouts are configured intentionally
Monitoring
- ✓ p95 latency, error rate, queue depth, and DB health alerts exist
Release process
- ✓ Performance is checked after major feature launches and migration-heavy releases
- ✓ Performance review is included in SaaS Production Checklist
- ✓ Monitoring review is included in Monitoring Checklist
- ✓ Security changes did not accidentally degrade performance in Security Checklist
- ✓ Auth/session changes were reviewed in Auth System Checklist
Related guides
- Deploy SaaS with Nginx + Gunicorn
- Database Performance Monitoring
- Metrics and Performance Monitoring
- Scaling Basics (Vertical & Horizontal)
- Monitoring Checklist
FAQ
What should I optimize first?
Start with measurement, then fix the slowest endpoints and queries before changing servers or adding complexity.
Do I need Redis for every SaaS?
No. Use it when repeated reads, sessions, rate limits, or background jobs justify it.
What is a good first latency target?
For core pages and APIs, aim for consistent sub-second behavior with a reasonable p95 under normal load.
Should I optimize frontend or backend first?
Whichever is dominating user wait time. Measure TTFB, asset weight, and render delays before deciding.
How often should I run this checklist?
Before launch, after major features, after infra changes, and during recurring production reviews.
What should I check before scaling servers?
Confirm whether the bottleneck is actually compute. Slow queries, blocking external API calls, missing caching, or static asset issues are often cheaper and more effective to fix first.
How do I know if the database is the bottleneck?
Look for slow query durations, lock waits, high connection counts, and endpoints whose latency tracks query time. pg_stat_statements and slow query logs are the fastest way to confirm this.
Should every background task be moved off the request path?
Move anything user-facing that can safely be async: email, report generation, image processing, webhook retries, and non-critical syncing. Keep only work required for immediate correctness in the request path.
What is the minimum monitoring needed for performance?
Track request latency, error rate, CPU, memory, DB health, queue depth, and uptime. Add alerts for sustained p95 latency increases and worker or database failures.
How often should this checklist be reviewed?
Run it before launch, after major releases, after infrastructure changes, and during periodic production reviews to catch regressions early.
Final takeaway
Performance is a release discipline, not a one-time tuning task.
For a small SaaS, the biggest wins usually come from:
- query fixes
- background jobs
- caching
- asset delivery
- basic monitoring
Use this checklist to establish a baseline, catch regressions early, and only add complexity when metrics justify it.