Rootconf 2025 Annual Conference - 16th and 17th May

Rootconf 2025 Annual Conference - 16th and 17th May

On platforms, distributed data systems & security

Tickets

Loading…

Rohan Reddy Alleti

@rohanreddyalleti

What Lies Beneath PostgreSQL: The Mechanics of Writes, MVCC and VACUUM (I am still thinking about the title, might change)

Submitted Apr 9, 2025

Overview

If you assume an UPDATE tweaks only a row, think again. In PostgreSQL, every write is a story of versioning, visibility, and vacuuming. This tech talk dives straight into the internals, MVCC, dead tuples, transaction IDs , and shows exactly what happens when your app writes to the database.

We’ll break down how PostgreSQL handles concurrent transactions using MVCC, what INSERTs and UPDATEs do on disk, and why bloat builds up even when you’re just “updating a value.” You’ll get a clear view of how PostgreSQL handles space, tracks row versions, and why VACUUM isn’t optional, it’s critical for data integrity and long-term performance.

We will use tools like pageinspect extension and the pg_stat views to inspect what’s happening under the hood. We’ll walk through how indexes are touched during updates by understanding a Postgres optimization called Heap Only Tuples (HOT) and how we can leverage it to improve write throughput, and how transaction wraparound can sneak up and cause real damage, unless VACUUM steps in.

Takeaways

  • Understand MVCC at the byte level: row versions, xmin/xmax, and tuple visibility.
  • See exactly how INSERTs and UPDATEs behave on disk, and how they differ.
  • Understanding the working of VACUUM and its importance
  • Learn how indexes respond to updates and how postgres keeps them efficient (HOT chain).

Who is this for

If you’re curious to know what’s happening behind the scenes during reads and writes in Postgres, you’re at the right place. We’ll uncover these internals, which will make you more equipped to make informed decisions when designing and optimizing your databases.

I originally delivered a condensed down version of this this as an internal tech talk for my colleagues, and later translated it into a written blog, which you can find here. The blog gained positive traction and engagement on Reddit. This tech talk would include the contents from the blog as well as additional details on vacuum and impact on indexes.

About me

Hey 👋 I am Rohan Reddy Alleti, a software developer and solution consultant at Sahaj Software, primarily in backend development, cloud computing, and Retrieval-Augmented Generation (RAG) systems. I’m particularly interested in database internals, system-level behaviour, and the mechanical sympathy between hardware and software. I enjoy digging deep into how things work under the hood and strive to build systems that are both efficient and elegant.

mail: rohanjnr44@gmail.com
linkedin: http://linkedin.com/in/rohanalleti

I also presented a session on Retrieval Augmented Generation at Javafest 2024 recently, the video can be found here.

Undecided
I could do a demo for the above talk or could just use screenshots in slides if thats makes the talk faster. I do plan on showcasing performance metrics with and without HOT (heap only tuples) in use, etc but still not finalised on this.
Draft Slides

Comments

{{ gettext('Login to leave a comment') }}

{{ gettext('Post a comment…') }}
{{ gettext('New comment') }}
{{ formTitle }}

{{ errorMsg }}

{{ gettext('No comments posted yet') }}

In-person conference

Hosted by

We care about site reliability, cloud costs, security and data privacy

Supported by

Gold Sponsor

Coinbase (NASDAQ: COIN) is on a mission to increase economic freedom for more than 1 billion people.

Gold Sponsor

Nutanix is a global leader in cloud software, offering organizations a single platform for running apps and data across clouds.

Sponsor

Facets.cloud is the orchestration platform for infra, CI/CD, and config. Built for self-service, governance, and AI readiness.

Sponsor

Thoughtworks is a global technology consultancy that integrates strategy, design and engineering to drive digital innovation.