PostgreSQL Performance Tuning: Complete Optimization Guide
Optimize PostgreSQL database performance with indexing strategies, query optimization, configuration tuning, and monitoring best practices for production workloads.
Performance Fundamentals
Key Performance Factors:
- Proper indexing strategy
- Query optimization
- Configuration tuning
- Hardware resources (CPU, RAM, I/O)
- Connection pooling
- Vacuuming and maintenance
Monitoring Goals:
- Query response time under 100ms for simple queries
- Connection pool efficiency above 80%
- Cache hit ratio above 95%
- Index usage for filtered queries
- Minimal table scans on large tables
Indexing Strategies
B-Tree Indexes (Default)
Best for equality and range queries:
-- Single column index
CREATE INDEX idx_users_email ON users(email);
-- Multi-column index (order matters!)
CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date);
-- Partial index (conditional)
CREATE INDEX idx_active_users ON users(email)
WHERE status = 'active';
-- Index with included columns (covering index)
CREATE INDEX idx_users_email_covering ON users(email)
INCLUDE (name, created_at);
Usage:
-- Uses idx_users_email
SELECT * FROM users WHERE email = 'john@example.com';
-- Uses idx_orders_customer_date
SELECT * FROM orders
WHERE customer_id = 123 AND order_date >= '2024-01-01';
GIN Indexes (Generalized Inverted Index)
Best for array, JSONB, full-text search:
-- JSONB index
CREATE INDEX idx_products_metadata ON products
USING GIN (metadata jsonb_path_ops);
-- Array index
CREATE INDEX idx_tags ON articles USING GIN (tags);
-- Full-text search
CREATE INDEX idx_content_search ON articles
USING GIN (to_tsvector('english', content));
Usage:
-- JSONB query
SELECT * FROM products
WHERE metadata @> '{"category": "electronics"}';
-- Array containment
SELECT * FROM articles WHERE tags @> ARRAY['postgresql', 'performance'];
-- Full-text search
SELECT * FROM articles
WHERE to_tsvector('english', content) @@ to_tsquery('postgresql & performance');
BRIN Indexes (Block Range Index)
Best for large, naturally ordered tables:
CREATE INDEX idx_events_timestamp ON events
USING BRIN (created_at);
Advantages:
- Tiny index size (often orders of magnitude smaller than a comparable B-tree)
- Great for time-series data
- Minimal maintenance overhead
Trade-offs:
- Less precise than B-tree
- Only effective if data physically ordered
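Whether a column is physically well ordered can be checked from the planner statistics; a minimal sketch for the events table above:
-- correlation near 1 or -1 means the column closely follows physical row order (a good fit for BRIN)
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'events' AND attname = 'created_at';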
Hash Indexes
Best for exact equality only:
CREATE INDEX idx_users_id_hash ON users USING HASH (id);
Note: a B-tree index is usually the better choice because of its versatility
Query Optimization
Using EXPLAIN ANALYZE
Identify slow queries:
EXPLAIN ANALYZE
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON u.id = o.customer_id
WHERE u.created_at >= '2024-01-01'
GROUP BY u.id, u.name
ORDER BY order_count DESC
LIMIT 100;
Key Metrics:
- Seq Scan: Table scan (bad for large tables)
- Index Scan: Using index (good)
- Nested Loop: Join method (expensive for large datasets)
- Hash Join: Efficient join for large tables
- Execution Time: Actual query runtime
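Adding the BUFFERS option shows how much of the work was served from shared buffers versus disk, which complements the cache hit ratio check later in this guide; a minimal example against the orders table used above:
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 123;
-- Look for "Buffers: shared hit=... read=..." in the plan nodes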
Avoid N+1 Queries
BAD: Multiple queries:
users = db.query("SELECT * FROM users")
for user in users:
    # One extra query per user: N+1 round trips in total
    orders = db.query(f"SELECT * FROM orders WHERE customer_id = {user.id}")
GOOD: Single join query:
results = db.query("""
SELECT u.*, o.id AS order_id, o.total
FROM users u
LEFT JOIN orders o ON u.id = o.customer_id
""")
Optimize Subqueries
BAD: Subquery in SELECT:
SELECT
u.name,
(SELECT COUNT(*) FROM orders WHERE customer_id = u.id) AS order_count
FROM users u;
GOOD: Use JOIN with GROUP BY:
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON u.id = o.customer_id
GROUP BY u.id, u.name;
Use EXISTS Instead of IN
BAD: IN with subquery:
SELECT * FROM users
WHERE id IN (SELECT customer_id FROM orders WHERE total > 1000);
GOOD: EXISTS (stops at first match):
SELECT * FROM users u
WHERE EXISTS (
SELECT 1 FROM orders o
WHERE o.customer_id = u.id AND o.total > 1000
);
Limit Result Sets
-- Always use LIMIT for pagination
SELECT * FROM large_table
ORDER BY created_at DESC
LIMIT 50 OFFSET 100;
-- Better: Use cursor-based pagination
SELECT * FROM large_table
WHERE id > last_seen_id
ORDER BY id
LIMIT 50;
Configuration Tuning
Memory Settings
postgresql.conf:
# Shared buffers (25% of RAM)
shared_buffers = 8GB
# Effective cache size (75% of RAM)
effective_cache_size = 24GB
# Work memory per operation
work_mem = 64MB
# Maintenance work memory
maintenance_work_mem = 2GB
# WAL buffers
wal_buffers = 16MB
Guidelines:
- shared_buffers: 25% of total RAM (up to ~40%)
- effective_cache_size: 50-75% of total RAM
- work_mem: depends on concurrency; a common formula is (RAM - shared_buffers) / (max_connections * 3)
- maintenance_work_mem: higher values speed up VACUUM and index creation
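Rather than raising work_mem globally, it can be increased for a single session running a heavy sort or aggregation; a minimal sketch reusing the orders table from earlier examples:
-- Raise work_mem only for this session's report query, then put it back
SET work_mem = '256MB';
SELECT customer_id, SUM(total) AS lifetime_total
FROM orders
GROUP BY customer_id
ORDER BY lifetime_total DESC;
RESET work_mem;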
Connection Settings
# Maximum concurrent connections
max_connections = 200
# Use connection pooling (PgBouncer) for higher concurrency
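A quick way to see how close the server is to the connection limit:
SELECT count(*) AS current_connections FROM pg_stat_activity;
SHOW max_connections;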
Connection Pooling with PgBouncer:
[databases]
mydb = host=localhost port=5432 dbname=production
[pgbouncer]
listen_addr = *
listen_port = 6432
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
Write Performance
# WAL settings for performance
wal_level = replica
synchronous_commit = off # Accept small data loss for speed
wal_compression = on
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
Trade-offs:
- synchronous_commit = off: faster writes, but recently committed transactions can be lost on a crash
- Higher checkpoint_timeout: fewer I/O spikes, but longer crash recovery time
Autovacuum Tuning
# Autovacuum settings
autovacuum = on
autovacuum_max_workers = 4
autovacuum_naptime = 30s
autovacuum_vacuum_scale_factor = 0.1
autovacuum_analyze_scale_factor = 0.05
Purpose:
- Remove dead tuples
- Update statistics for query planner
- Prevent transaction ID wraparound
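For very large or write-heavy tables, per-table overrides are often more effective than global settings; a sketch using the orders table from earlier examples:
-- Vacuum orders after roughly 2% dead tuples instead of the global 10%
ALTER TABLE orders SET (autovacuum_vacuum_scale_factor = 0.02);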
Monitoring and Diagnosis
Slow Query Log
# Log queries slower than 100ms
log_min_duration_statement = 100
log_line_prefix = '%t [%p]: user=%u,db=%d,app=%a,client=%h '
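Both settings can be changed without editing postgresql.conf by hand, and neither requires a restart; a minimal sketch:
ALTER SYSTEM SET log_min_duration_statement = 100;
SELECT pg_reload_conf();  -- apply the change without restarting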
Active Queries
SELECT
pid,
now() - query_start AS duration,
state,
query
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY duration DESC;
Blocking Queries
SELECT
blocked_locks.pid AS blocked_pid,
blocked_activity.query AS blocked_query,
blocking_locks.pid AS blocking_pid,
blocking_activity.query AS blocking_query
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_locks blocking_locks
ON blocking_locks.locktype = blocked_locks.locktype
AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted;
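Once a blocking pid has been identified, it can be cancelled or, as a last resort, terminated (12345 below is a placeholder pid):
SELECT pg_cancel_backend(12345);     -- cancel the blocking query
SELECT pg_terminate_backend(12345);  -- terminate the whole session if cancelling is not enough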
Cache Hit Ratio
SELECT
sum(heap_blks_read) AS heap_read,
sum(heap_blks_hit) AS heap_hit,
sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) AS cache_hit_ratio
FROM pg_statio_user_tables;
Target: Above 0.95 (95% hit ratio)
Index Usage
SELECT
schemaname,
relname,
indexrelname,
idx_scan,
idx_tup_read,
idx_tup_fetch
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
Action: Drop unused indexes to reduce write overhead and reclaim space
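An example of dropping such an index without blocking concurrent writes (idx_unused_example is a placeholder name):
DROP INDEX CONCURRENTLY idx_unused_example;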
Table Bloat
SELECT
schemaname,
relname,
pg_size_pretty(pg_total_relation_size(relid)) AS size,
n_dead_tup,
n_live_tup,
ROUND(n_dead_tup * 100.0 / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_ratio
FROM pg_stat_user_tables
WHERE n_live_tup > 0
ORDER BY n_dead_tup DESC;
Solution: VACUUM FULL or pg_repack for severe bloat
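For example, a full rewrite of a badly bloated table (the table name is a placeholder; note that VACUUM FULL takes an exclusive lock for the duration):
VACUUM (FULL, VERBOSE, ANALYZE) orders;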
Advanced Optimization
Partitioning Large Tables
Range Partitioning:
CREATE TABLE events (
id SERIAL,
event_type VARCHAR(50),
event_data JSONB,
created_at TIMESTAMP NOT NULL
) PARTITION BY RANGE (created_at);
CREATE TABLE events_2024_01 PARTITION OF events
FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE events_2024_02 PARTITION OF events
FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
-- Create indexes on partitions
CREATE INDEX ON events_2024_01(event_type);
CREATE INDEX ON events_2024_02(event_type);
Benefits:
- Faster queries on time-ranges
- Easier data archival
- Improved maintenance performance
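The archival benefit in practice: an old partition can be detached and handled independently of the live table, as in this sketch using the partitions above:
-- Detach the oldest partition; it becomes a standalone table that can be dumped or archived
ALTER TABLE events DETACH PARTITION events_2024_01;
-- Drop it once the data has been archived elsewhere
DROP TABLE events_2024_01;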
Materialized Views
Pre-computed aggregates:
CREATE MATERIALIZED VIEW daily_sales_summary AS
SELECT
DATE(created_at) AS sale_date,
product_category,
COUNT(*) AS transaction_count,
SUM(amount) AS total_sales
FROM sales
GROUP BY DATE(created_at), product_category;
CREATE UNIQUE INDEX ON daily_sales_summary (sale_date, product_category);
-- Refresh periodically
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_sales_summary;
Benefits:
- Instant query response for complex aggregations
- Reduce database load
- CONCURRENTLY option allows reads during refresh
Parallel Query Execution
# Enable parallel execution
max_parallel_workers_per_gather = 4
max_parallel_workers = 8
parallel_tuple_cost = 0.1
Encourage the planner to parallelize a query:
SET max_parallel_workers_per_gather = 4;
SELECT COUNT(*) FROM large_table
WHERE category = 'electronics';
Best For:
- Large table scans
- Aggregations on millions of rows
- Data warehouse queries
Backup and Recovery
Physical Backups (pg_basebackup)
pg_basebackup -D /backup/pgdata -Ft -z -P
Advantages:
- Full cluster backup
- Fastest restore
- Point-in-time recovery with WAL
Logical Backups (pg_dump)
# Single database
pg_dump -Fc mydb > mydb.dump
# All databases
pg_dumpall > all_databases.sql
# Restore
pg_restore -d mydb mydb.dump
Advantages:
- Selective restore
- Cross-version compatible
- Text format for inspection
Point-in-Time Recovery (PITR)
# Enable WAL archiving
wal_level = replica
archive_mode = on
archive_command = 'cp %p /archive/%f'
Restore to specific time:
# In postgresql.conf (PostgreSQL 12+; older releases used recovery.conf), plus an empty recovery.signal file
restore_command = 'cp /archive/%f %p'
recovery_target_time = '2024-01-15 14:30:00'
High Availability
Streaming Replication
Primary server:
wal_level = replica
max_wal_senders = 3
wal_keep_size = 1GB
Standby server:
primary_conninfo = 'host=primary.example.com port=5432'
restore_command = 'cp /archive/%f %p'
Benefits:
- Real-time replication
- Read replicas for scaling
- Automatic failover with tools (Patroni, repmgr)
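Replication health can be checked from the primary; a minimal sketch:
-- Connected replicas and how far each one lags behind in bytes
SELECT client_addr, state, sent_lsn, replay_lsn,
pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;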
Connection Pooler (PgBouncer)
Setup:
[databases]
production = host=localhost port=5432
[pgbouncer]
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 20
Benefits:
- Reduced connection overhead
- Higher concurrency
- Connection reuse
Best Practices Summary
- Index wisely: Cover filtered columns, avoid over-indexing
- Use EXPLAIN ANALYZE: Understand query execution plans
- Tune memory settings: shared_buffers, work_mem based on workload
- Connection pooling: Use PgBouncer for high concurrency
- Regular VACUUM: Keep autovacuum properly configured
- Monitor cache hit ratio: Target above 95%
- Partition large tables: Improve query and maintenance performance
- Use materialized views: For complex, repeated aggregations
- Enable query logging: Log slow queries for optimization
- Regular backups: Automated daily backups with PITR
Troubleshooting Common Issues
Slow queries: Run EXPLAIN ANALYZE, check for seq scans on large tables
High CPU usage: Identify expensive queries, optimize or add indexes
Memory issues: Reduce work_mem or max_connections, add RAM
Connection pool exhausted: Increase pool size or optimize queries
Bloat: Run VACUUM or pg_repack on bloated tables
Lock contention: Identify blocking queries, optimize transaction duration
Bottom Line
PostgreSQL performance requires continuous optimization through proper indexing, query tuning, and configuration. Start with EXPLAIN ANALYZE to identify bottlenecks. Use connection pooling for concurrency. Monitor cache hit ratio and index usage. Partition large tables and use materialized views for analytics. Regular maintenance via autovacuum is critical. With proper tuning, PostgreSQL can sustain hundreds of thousands of transactions per second on modern hardware.