There's a moment of terror when you need to alter a production table with millions of rows. Run the wrong migration and your site goes down for hours.
But with careful planning, you can change schemas without any downtime. Here's how.
The Problem
New code that expects the new schema, running against the old schema (or old code against the new schema), crashes. The order of operations matters.
Rule 1: Backwards Compatible First
Make changes that work with both old AND new code:
Adding a Column
Safe approach:
-- Step 1: Add nullable column
ALTER TABLE users ADD COLUMN email_verified boolean;
-- Step 2: Backfill (in batches!)
UPDATE users SET email_verified = false
WHERE id IN (
  SELECT id FROM users
  WHERE email_verified IS NULL
  LIMIT 10000
);
-- (PostgreSQL has no UPDATE ... LIMIT, hence the subquery)
-- Step 3: Deploy code that uses new column
-- Step 4: Add constraint (after all rows filled)
ALTER TABLE users
ALTER COLUMN email_verified SET NOT NULL;
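Step 4's SET NOT NULL takes an exclusive lock while it scans every row. On PostgreSQL you can keep that lock brief by validating through a CHECK constraint first (a sketch; the constraint name is arbitrary):

```sql
-- Add the constraint without checking existing rows (brief lock only)
ALTER TABLE users
ADD CONSTRAINT users_email_verified_not_null
CHECK (email_verified IS NOT NULL) NOT VALID;

-- Validate separately; this scans the table but doesn't block writes
ALTER TABLE users VALIDATE CONSTRAINT users_email_verified_not_null;

-- On PostgreSQL 12+, SET NOT NULL sees the valid constraint and skips the scan
ALTER TABLE users ALTER COLUMN email_verified SET NOT NULL;
ALTER TABLE users DROP CONSTRAINT users_email_verified_not_null;
```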
Don't do this:
-- On PostgreSQL 10 and earlier, this rewrites every row
-- while holding an exclusive lock
ALTER TABLE users
ADD COLUMN email_verified boolean NOT NULL DEFAULT false;
On PostgreSQL 11+ a constant default is stored as metadata and this is cheap, but on older versions (and on MySQL tables without online DDL) it can lock the table for minutes or hours.
Renaming a Column
Never rename directly. Use the expand-contract pattern:
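The pattern runs in phases: expand (add the new column), migrate (dual-write, backfill, switch reads), contract (drop the old column). Phase 1 is a plain additive migration (assuming the column is text; adjust the type to match yours):

```sql
-- Phase 1 (expand): add the new column alongside the old one
ALTER TABLE users ADD COLUMN full_name text;
-- Then backfill full_name = name in batches (see "Large Data Backfills")
```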
// Phase 2-4: Write to both columns, read from the new one
async function updateUser(id: string, name: string) {
  await db.query(`
    UPDATE users
    SET name = $1, full_name = $1
    WHERE id = $2
  `, [name, id]);
}

async function getUser(id: string) {
  const result = await db.query(`
    SELECT full_name AS name FROM users WHERE id = $1
  `, [id]);
  return result.rows[0];
}
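Once every running instance writes and reads only full_name, finish with the contract phase:

```sql
-- Contract: drop the old column after the dual-write code is fully deployed
ALTER TABLE users DROP COLUMN name;
```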
Deleting a Column
- Remove all code that reads/writes the column
- Deploy and verify
- Wait for all old instances to drain
- Drop the column
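In PostgreSQL the drop itself is metadata-only and fast, but it still needs a brief exclusive lock, and that request can queue behind long-running queries while blocking everything behind it. A lock_timeout makes the migration fail fast instead of stalling traffic (the 5s value and column name here are illustrative):

```sql
SET lock_timeout = '5s';
-- Fails fast if the lock can't be acquired; just retry later
ALTER TABLE users DROP COLUMN legacy_flags;
```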
Large Data Backfills
Never update millions of rows at once:
async function backfillInBatches() {
  const BATCH_SIZE = 5000;
  const sleep = (ms: number) => new Promise((res) => setTimeout(res, ms));
  let processed = 0;
  while (true) {
    // PostgreSQL has no UPDATE ... LIMIT, so select the batch with a
    // subquery; SKIP LOCKED avoids stalling behind rows another writer holds
    const result = await db.query(`
      UPDATE users
      SET email_verified = false
      WHERE id IN (
        SELECT id FROM users
        WHERE email_verified IS NULL
        LIMIT $1
        FOR UPDATE SKIP LOCKED
      )
      RETURNING id
    `, [BATCH_SIZE]);
    processed += result.rowCount;
    console.log(`Processed ${processed} rows`);
    if (result.rowCount < BATCH_SIZE) break;
    // Small delay to reduce load
    await sleep(100);
  }
}
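If email_verified isn't indexed, every batch rescans the table to find remaining rows. A variant that walks the primary key instead (a sketch, assuming a dense numeric id; $1 is the highest id handled in the previous batch):

```sql
UPDATE users SET email_verified = false
WHERE id > $1 AND id <= $1 + 5000
  AND email_verified IS NULL;
```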
Index Creation
A plain CREATE INDEX blocks writes to the table for the entire build. Use CONCURRENTLY:
-- This blocks writes
CREATE INDEX idx_users_email ON users(email);
-- This doesn't block (but takes longer)
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);
Note: CREATE INDEX CONCURRENTLY can't run inside a transaction block, so tell your migration tool not to wrap it in one.
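One more wrinkle: if a concurrent build fails (for example, a unique index hits a duplicate), it leaves an INVALID index behind that still adds write overhead. Drop it and retry:

```sql
-- Failed builds show as INVALID in \d users (or via pg_index.indisvalid)
DROP INDEX CONCURRENTLY IF EXISTS idx_users_email;
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);
```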
Migration Checklist
Before running:
- [ ] Tested on production-size dataset?
- [ ] Migration is backwards compatible?
- [ ] Batched for large tables?
- [ ] Indexes created concurrently?
- [ ] Rollback plan ready?
After running:
- [ ] Monitor query performance
- [ ] Check for locks
- [ ] Verify data integrity
Tools That Help
- gh-ost - Online schema migrations for MySQL
- pg-osc - Similar for PostgreSQL
- Flyway / Liquibase - Migration management
Wrapping Up
Zero-downtime migrations take more steps, but they let you ship changes without the 3am maintenance windows. Plan ahead, go in phases, and always have a rollback ready.
