PostgreSQL Performance: The Index Strategies That Actually Matter

Most PostgreSQL performance issues are index problems. This is a practical guide to the indexing strategies that solve the majority of real-world performance problems.

Why Indexes Matter More Than Hardware

A sequential table scan of a 10 million row table checks every row. With an appropriate index, the same query touches only the relevant rows — potentially 1,000x fewer. This difference is not compensated by faster hardware: a 10x faster server still needs to check all 10 million rows without the index. Understanding query execution plans (EXPLAIN ANALYZE) is the prerequisite for meaningful optimisation. The output shows whether a query uses an index (Index Scan, Index Only Scan) or a sequential scan (Seq Scan), and the estimated vs actual row counts. A Seq Scan on a large table for a selective query almost always indicates a missing or unused index.

B-Tree Indexes: The Default

CREATE INDEX idx_users_email ON users(email) — the standard B-tree index, appropriate for: equality lookups (WHERE email = ?), range queries (WHERE created_at BETWEEN ? AND ?), ORDER BY on indexed columns, and JOIN conditions. B-tree indexes support all comparison operators (<, <=, =, >=, >). They are the right choice for the vast majority of indexing needs. A composite index on (user_id, created_at) can satisfy queries filtering on user_id alone OR filtering on both user_id AND created_at — but not queries filtering on created_at alone (the leading column rule). Design composite indexes to start with the most selective filter or the equality condition, then the range condition.

Partial Indexes: Often Overlooked

A partial index indexes only rows that satisfy a WHERE condition: CREATE INDEX idx_orders_pending ON orders(created_at) WHERE status = ‘pending’. If your table has 10 million orders and only 50,000 are pending, the partial index is 200x smaller, faster to create, faster to query, and uses far less disk space and memory. Common use cases: indexing only active/non-deleted records (soft-delete tables), indexing only recent data, indexing only the subset of rows that queries actually need to find. Check if your most common queries have a consistent WHERE condition that filters to a small subset of the table — if so, a partial index is likely the right optimisation.

The N+1 Problem and Index Usage

The N+1 query problem (loading a collection then making one query per item to load related data) produces many individually fast queries that collectively degrade performance. Example: loading 100 users then querying the orders table 100 times for each user’s orders. Fix: use a JOIN (loads all data in one query) or eager loading in your ORM. For the JOIN to be efficient, the foreign key column in the related table must be indexed: CREATE INDEX idx_orders_user_id ON orders(user_id). Missing foreign key indexes are one of the most common index omissions — many ORMs and schema tools don’t create them automatically. Check pg_stat_user_tables for tables with high seq_scan counts and low idx_scan counts as a starting point for finding missing indexes.

上一篇 AI生成内容的伦理:创作者需要思考的问题
下一篇 PostgreSQL性能:真正重要的索引策略