Postgres: what to work on

March 23, 2026

There are broadly two kinds of work one can do in the PostgreSQL project. One kind involves tackling something whose scope is wide from the start: a new feature that touches the planner, the executor, and the storage layer, or an interface overhaul that requires coordinating changes across multiple subsystems. The other kind involves noticing something specific and localized, understanding it deeply enough to fix it cleanly, and shipping the result in a way that is obviously correct and obviously beneficial. Both kinds of work matter, but they require different habits of attention, and it is easy to let one crowd out the other.

The wide-scope work tends to be what people remember. Declarative partitioning, logical replication, parallel query: these are the kinds of contributions that are cited in release announcements and conference talks. They also tend to be the work that takes years to reach a stable state and that carries real risk of not landing in a given release cycle because the surface area is too large to review all at once. That risk is not a reason to avoid such projects, but it is worth understanding honestly.

The localized work is different in character. It often starts not with a design but with a discomfort: a nagging feeling that something is slower than it should be, or that a data structure is being used in a way that made sense once but no longer fits the code around it. Acting on that discomfort requires a different kind of readiness than the wide-scope work does.

One thing that helps is reading code without a specific goal. Most of the time, you open a source file because you need to fix something or understand an interface well enough to extend it. But the useful observations, like noticing that a list is being scanned linearly in a hot path, or that a memory context is being allocated and freed in a tighter loop than the surrounding code makes obvious, come more easily when you are not already focused on something else. Browsing through heapam.c or tuplesort.c with no particular agenda is different from reading those files while trying to figure out why your patch is failing a test. Both are necessary. The unfocused reading is the one that tends to get skipped.

Benchmarking things that feel fine is another habit worth developing. Most performance work is reactive: you write a patch, you measure it, you report the improvement. But some of the most interesting findings come from measuring code that nobody suspected was slow. The observation that some algorithm is underperforming its theoretical potential usually does not come from inspecting the algorithm. It comes from running a workload, seeing a number that does not match your intuition, and then going to look at why. Building a habit of running microbenchmarks on parts of the system you did not write, just to calibrate your sense of what is normal, pays off in unexpected ways. Andres Freund’s upcoming talk at pgconf.dev, “Profiling Postgres Perils”, goes into exactly the kind of discipline this requires.

It also helps to keep a running list of things that seem off but that you are not in a position to investigate right then. When you are deep in a large patch series, you encounter strange corners of the code regularly, and the instinct is usually to note the strangeness and move on because you have something else to finish. Writing those observations down, even vaguely, gives you a pool of starting points for the next time you have a few days without a clear agenda. Some of them will turn out to be nothing. A few will turn into something genuinely useful.

The completed small fixes in git history are also worth studying closely, not just for the technical content but for the shape of the work. What was the initial observation? How was the problem scope defined? How did the author frame the explanation in the commit message? Reading a commit that measurably improves performance on a real workload, and understanding the chain of reasoning that led from suspicion to proof to fix, is instructive in a way that reading the code alone is not. The reasoning is usually not recorded anywhere else. David Rowley’s tuple deforming improvements that went into v18, and the further work he is doing for v19, are good examples. So is Tomas Vondra’s fast path locking improvement in v18.

The deeper issue, though, is that these two modes of working are hard to interleave. Wide-scope work requires holding a large amount of context in your head and managing many simultaneous concerns: design choices that depend on each other, review feedback that pulls in different directions, benchmark results that complicate the story you thought you were telling. Localized work requires something closer to the opposite, a relaxed attention that can notice things sideways, follow a hunch without knowing where it leads, and stop when the finding turns out to be uninteresting. Trying to do both at the same time tends to mean doing neither well.

One approach that helps is to treat them as separate phases rather than parallel tracks. Finishing a large push and then deliberately spending a few weeks in exploratory mode before starting the next one is not the same as being unproductive. The exploratory time tends to generate the seeds of the next round of focused smaller work, and it also refreshes the parts of your attention that the sustained effort has worn down.

There is also something to be said for getting better at identifying the minimum useful slice of a large project early, before the full scope is clear. The instinct with wide-scope work is to hold the whole vision in your head and push toward it. But the parts of large projects that actually land in a given release are usually the parts that can be explained and reviewed independently of the rest. Thinking early about which slice that is, and structuring the work so that slice can ship even if the surrounding context is still in flux, makes the large projects feel less risky to the people reviewing them and more likely to actually make it into a release.

Neither kind of work is more important than the other. PostgreSQL needs both people who will spend three years getting something architecturally right and people who will spend three days making something measurably faster. The goal is not to stop doing one in favor of the other but to stay genuinely sharp at both, which mostly means not letting the urgency of the large work completely crowd out the quieter habits that make the small work possible.


© 2025 Amit Langote