PostgreSQL Performance: Indexes, Query Plans, and Avoiding Common Bottlenecks

2026年6月19日 Tech & Tools

PostgreSQL is the most capable open-source relational database and the default choice for most new applications, but it requires understanding its internals to avoid common performance traps. Here are the patterns that cause most PostgreSQL performance problems and how to fix them.

Understanding EXPLAIN ANALYZE

The most important performance tool: `EXPLAIN ANALYZE SELECT …;` shows the actual query execution plan with timing. Key things to look for: Seq Scan vs Index Scan: a Seq Scan (sequential scan, reading every row) on a large table where you are filtering by a specific value is the most common source of slowness. The fix: ensure an index exists on the column(s) you are filtering by. Index Scan vs Index Only Scan: an Index Only Scan is more efficient — it means PostgreSQL can satisfy the query entirely from the index without reading the heap (table data). This happens when all columns you select are included in the index. Nested Loop vs Hash Join vs Merge Join: for joins, PostgreSQL chooses the join algorithm. Nested Loop is efficient for small result sets; Hash Join is typical for larger joins; Merge Join requires sorted inputs. If the planner chooses a Nested Loop on a large table join, it may be working with incorrect statistics — run ANALYZE. Rows estimates: the planner estimates how many rows each step will produce. If the estimate is wildly wrong (e.g., estimates 1 row, actually processes 50,000), the plan will be wrong. Fix: run VACUUM ANALYZE to update statistics. Cost numbers: EXPLAIN shows cost in abstract units. The total cost of the plan is what you want to minimise, but actual timing from EXPLAIN ANALYZE is more actionable.

Index Types and When to Use Each

B-tree (default): good for equality (`=`), range (`>`, `<`, `BETWEEN`), and `ORDER BY`. The default index type — use it for most purposes. Hash: faster than B-tree for equality-only lookups, but not WAL-logged before PostgreSQL 10 and less commonly needed. GIN (Generalized Inverted Index): for full-text search, JSONB containment (`@>`), and array operations. If you search inside JSONB documents or text columns frequently, GIN is the right index type. GiST (Generalized Search Tree): for geometric types, range types, and full-text search. BRIN (Block Range Index): for naturally-ordered large tables (time-series, append-only logs). Very small index, efficient for range scans on monotonically increasing columns like timestamps. Partial indexes: index only a subset of rows. Example: `CREATE INDEX ON orders (status) WHERE status = ‘pending’;` — if 99% of orders are completed and you only query pending orders, this index is far smaller and faster than a full index on status. Covering indexes: include additional columns in the index to enable Index Only Scans. `CREATE INDEX ON users (email) INCLUDE (name, role);` — a query selecting name and role by email can be answered without reading the table.

Common Bottlenecks and Fixes

N+1 query problem: executing one query to get a list, then one query per item in the list. Fix: use JOIN or IN clause to fetch in one query, or use a library that does this automatically. Missing index on foreign key: when you JOIN on a foreign key column that has no index, PostgreSQL does a sequential scan. In PostgreSQL, foreign key constraints do NOT automatically create an index (unlike MySQL). Always create indexes on foreign key columns. `SELECT *` on wide tables: selecting all columns on a table with many columns or large JSONB/text columns transfers unnecessary data. Select only the columns you need. LIKE ‘%pattern%’ is not indexable: a leading wildcard prevents B-tree index use. Solutions: pg_trgm extension with GIN index enables substring search; full-text search with GIN for text content. Connection overhead: PostgreSQL handles each connection with a separate process. Hundreds of short-lived connections (typical in serverless or auto-scaling environments) create significant overhead. Fix: PgBouncer connection pooler sits between the application and PostgreSQL, maintaining a pool of database connections. Lock contention: long-running transactions hold locks. `SELECT * FROM pg_locks JOIN pg_stat_activity USING (pid)` shows what is holding locks. Autovacuum lag: PostgreSQL’s MVCC model accumulates dead row versions (tuples) that autovacuum must clean up. On high-write tables, check `n_dead_tup` in `pg_stat_user_tables` — if this is large, autovacuum may need tuning (increase `autovacuum_vacuum_scale_factor` or `autovacuum_vacuum_cost_delay`).

作者：

链接：https://www.sunqi.org/postgresql-performance-guide.html

文章版权归作者所有，未经允许请勿转载。

PostgreSQL Performance: Indexes, Query Plans, and Avoiding Common Bottlenecks

Understanding EXPLAIN ANALYZE

Index Types and When to Use Each

Common Bottlenecks and Fixes

探索站点内容