Postgres: planner internals (data structures and flow)

July 08, 2025

PostgreSQL’s planner is built around a set of data structures that help it safely and efficiently create query plans. The planner decides when filters can be safely moved, identifies expressions that must be equal, avoids unnecessary sorting or joins, and handles other correctness constraints like outer joins and null semantics.

Planning is more than costing

Query planning is often thought of purely as choosing the cheapest execution path—selecting the fastest scans or joins based solely on cost estimates. But PostgreSQL’s planner must also ensure correctness, maintain SQL semantics, and guarantee safe transformations. This involves deciding:

These considerations require the planner to rely on structured metadata beyond simple numeric costs.

RestrictInfo: metadata for qualifiers

Rather than modifying expressions directly, PostgreSQL wraps WHERE and JOIN clauses in RestrictInfo structs. These wrappers store metadata such as:

This makes clause movement safe and simplifies optimization.

EquivalenceClass: managing equalities

Optimizations often depend on knowing two expressions will produce identical results at runtime. PostgreSQL manages this with EquivalenceClass structs, representing sets of expressions known to be equal.

Built from mergejoinable equality clauses (e.g., t1.a = t2.a), EquivalenceClasses allow:

RelOptInfo: planner’s relation representation

Each table or join has a corresponding RelOptInfo struct, representing the planner’s internal view of a relation (distinct from parser-level constructs like RangeTblEntry). It contains:

This structure enables flexible reasoning about scan and join strategies during planning.

Once a relation is represented by a RelOptInfo, the planner begins populating it with possible execution strategies.

Path: representing candidate plans

Each possible plan for a relation is represented by a Path. Examples include sequential scans, index scans, bitmap scans, and parallel scans. Each Path has:

Paths are accumulated in RelOptInfo.pathlist. The planner discards suboptimal paths using add_path().

PathKeys: symbolic sort orders

PathKeys represent the sort order of tuples produced by a Path. Built from EquivalenceClass members and sort operators, PathKeys enable:

By using symbolic representation, PathKeys let planners efficiently compare sort orders across candidate plans.

Additional structures: JoinInfo, ParamPathInfo, and PlaceHolderVar

All the structures discussed above support the planner’s bottom-up strategy for assembling a query plan.

Planning in stages

PostgreSQL’s planner works bottom-up: it plans scan paths for base relations first, then builds join paths by combining those. This staged approach involves:

At each stage, the planner generates multiple alternative paths and discards those that are clearly worse — for example, paths that are more expensive and provide no useful physical property like sort order or parallel safety. This happens through the add_path() mechanism, which enforces a dominance check.

Aggressive pruning is essential to keep planning tractable. Without it, the number of paths and join combinations would grow too quickly to handle, especially in queries with many joins.

Some optimizers (like those based on the Cascades framework) represent each subquery or join input as a logical expression tree — a normalized operator-level AST. These optimizers memoize each such expression and apply transformation rules to generate equivalent alternatives (e.g., commutativity, associativity, filter pushdown). Each new form is stored and considered separately, and the optimizer defers choosing a physical implementation until all rewrites have been explored. In contrast, PostgreSQL keeps the logical query structure mostly fixed and explores only physical alternatives — represented as Path nodes — for each part of the plan. It memoizes the cheapest paths for each relation or join combination, but aggressively prunes weaker ones early and avoids tracking multiple logically equivalent forms.

Closing thoughts

The planner’s internal structures evolved to balance correctness, extensibility, and planning efficiency. While not perfect or exhaustive, the combination of clause wrappers, symbolic equality tracking, and staged path generation supports a wide range of query patterns without relying on heavyweight memoization or rule systems. It’s a pragmatic design, and still improving.


© 2024 Amit Langote. Hosted by GitHub. Powered by Jekyll