- Learn
- PostgreSQL
- PostgreSQL VACUUM: Maintenance and Optimization
PostgreSQL VACUUM: Maintenance and Optimization
Keep your PostgreSQL database healthy and fast.
Why VACUUM Exists
PostgreSQL uses a multi-version concurrency control (MVCC) system. When you UPDATE or DELETE a row, PostgreSQL doesn't immediately remove the old data—it marks it as "dead" and keeps it around for active transactions that might still need it.
Over time, these dead rows accumulate. VACUUM cleans them up.
Basic VACUUM
Run VACUUM on a specific table:
Or all tables in the database:
This marks dead rows as reusable space but doesn't return space to the operating system.
VACUUM FULL
For serious space reclamation:
VACUUM FULL rewrites the entire table, removing all dead space and shrinking the file. But there's a catch:
Warning: VACUUM FULL locks the table exclusively. No reads or writes during the operation. For large tables, this can take hours. Only use this during maintenance windows.
VACUUM ANALYZE
Combine cleanup with statistics gathering:
ANALYZE updates statistics that the query planner uses. After bulk data changes, this helps PostgreSQL choose better query plans.
Autovacuum: Let PostgreSQL Handle It
PostgreSQL runs autovacuum in the background. It automatically vacuums tables when dead rows exceed thresholds.
Check if autovacuum is working:
Tuning Autovacuum
Default settings are conservative. For high-write tables, you might need to tune:
Or globally in postgresql.conf:
autovacuum_vacuum_scale_factor = 0.1
autovacuum_analyze_scale_factor = 0.05
autovacuum_vacuum_cost_delay = 10ms
When to Run Manual VACUUM
Autovacuum handles most cases, but run manual VACUUM after:
- Bulk deletes — Deleted millions of rows? Don't wait for autovacuum.
- Bulk updates — Each update creates dead rows.
- After initial data load — Fresh tables need statistics.
Monitoring Bloat
Dead rows cause "bloat"—tables larger than they need to be. Check table bloat:
If dead_pct is consistently high (>20%), autovacuum might not be keeping up.
VACUUM vs VACUUM FULL vs pg_repack
| Method | Locks Table? | Reclaims Disk Space? | Speed | |--------|-------------|---------------------|-------| | VACUUM | No | No (marks reusable) | Fast | | VACUUM FULL | Yes (exclusive) | Yes | Slow | | pg_repack | No | Yes | Medium |
pg_repack is an extension that reclaims space without exclusive locks—best of both worlds for production systems.
Transaction ID Wraparound
PostgreSQL uses 32-bit transaction IDs. After ~2 billion transactions, it needs to "freeze" old rows to prevent wraparound. VACUUM handles this.
Check how close you are:
If xid_age approaches freeze_max, PostgreSQL will force aggressive vacuuming—or in extreme cases, shut down to prevent corruption.
Quick Reference
| Command | Description |
|---------|-------------|
| VACUUM table | Clean dead rows, mark space reusable |
| VACUUM FULL table | Rewrite table, reclaim disk space (locks!) |
| VACUUM ANALYZE table | Clean + update statistics |
| ANALYZE table | Update statistics only |
| VACUUM VERBOSE table | Show detailed progress |
Best Practices
-
Trust autovacuum — It handles most cases. Don't disable it.
-
Tune for hot tables — High-write tables need lower thresholds.
-
Monitor bloat — Check
pg_stat_user_tablesregularly. -
VACUUM after bulk operations — Don't wait for autovacuum after large deletes/updates.
-
Avoid VACUUM FULL in production — Use pg_repack if you need space reclamation without downtime.
-
Watch for wraparound warnings — Don't ignore "preventing transaction ID wraparound" messages.
Summary
VACUUM is essential PostgreSQL maintenance that runs automatically via autovacuum. For most databases, you just need to monitor it's working. For high-write workloads, tune autovacuum thresholds. After bulk operations, run manual VACUUM. And if you need to reclaim disk space without downtime, use pg_repack instead of VACUUM FULL.