Dualo
SQL Essentials

Query optimization — anti-patterns to avoid

The top 10 mistakes that turn fast queries slow: SELECT *, function on indexed column, N+1, implicit casts, deep pagination.

2 min read

**Diagnosing slow queries — a checklist**: (1) EXPLAIN ANALYZE to get actual vs estimated row counts + per-operator timing; (2) check for Seq Scans on large tables — missing index or non- predicate; (3) check join algorithms — Nested Loop on a 10M × 10M is a bug; (4) check sort operations — can an index eliminate the sort? (5) check buffer stats in Postgres (`EXPLAIN (ANALYZE, BUFFERS)`) — heavy disk reads on what should be cached.

**** (Search-ARGument-able) is the key concept. A predicate `col OPERATOR value` is sargable when the engine can apply it via index seek/range. Any function, expression, or implicit cast on the column breaks sargability. Stay on the value side: `amount * 1.2 > 100` non-sargable; rewrite as `amount > 100 / 1.2`.

**Parameter sniffing / plan caching**: engines cache execution plans keyed by query text + parameter-value samples. A plan optimized for parameter=common-value runs badly for parameter=rare-value and vice versa. Mitigations: OPTION (RECOMPILE) in SQL Server for volatile queries; `SET plan_cache_mode = force_custom_plan` in Postgres; parameterize by sniffable dimension.

Bind parameters vs string interpolation: ALWAYS use bind parameters ($1, ?). Security (SQL injection), but also performance (plan cache reuse) and correctness (types, NULLs). String interpolation is an anti-pattern.

Index choice gotchas: (i) function expression index available? CREATE INDEX ON users(LOWER(email)) supports WHERE LOWER(email) = ?; (ii) partial index for common filter: CREATE INDEX ON orders(created_at) WHERE status = 'pending' — smaller, faster for pending-only queries; (iii) covering index with INCLUDE: CREATE INDEX ON orders(customer_id) INCLUDE (amount, status) enables index-only scans.

Batching, not looping: instead of N queries in a loop, send one query with VALUES or ... IN (a, b, c, ...). For bulk inserts, use COPY (Postgres) / LOAD DATA (MySQL) / BULK INSERT (SQL Server) — 10-100× faster than per-row INSERT.

**Connection pooling**: opening a Postgres connection costs 1-5ms + memory. Use a pooler (pgbouncer, pgpool, or server-side) with transaction-level pooling for app tier. Max connections matter — Postgres max_connections default 100; a 50-app-server setup with 200 Cloud Run instances opening 10 connections each = 2000 pool requests, instantly exhausts. Use connection multiplexing.

Query timeout: set per-statement timeout (SET statement_timeout = '30s' in Postgres) at the session or role level. Prevents runaway queries from taking down the DB. Separate timeouts for OLTP (short) and reports (long).

Observability for slow queries: pg_stat_statements in Postgres (aggregate stats per query template), SQL Server Query Store, MySQL performance_schema + slow_query_log. Essential. Without them, you optimize by anecdote.

Grounded on https://use-the-index-luke.com/