RA
Rohan Reddy Alleti
@rohanreddyalleti
What Lies Beneath PostgreSQL: The Mechanics of Writes, MVCC and VACUUM (I am still thinking about the title, might change)
Submitted Apr 9, 2025
Topic of your submission:
Databases
Type of submission:
30 mins talk
I am submitting for:
Rootconf Annual Conference 2025
Overview
If you assume an UPDATE tweaks only a row, think again. In PostgreSQL, every write is a story of versioning, visibility, and vacuuming. This tech talk dives straight into the internals, MVCC, dead tuples, transaction IDs , and shows exactly what happens when your app writes to the database.
We’ll break down how PostgreSQL handles concurrent transactions using MVCC, what INSERTs and UPDATEs do on disk, and why bloat builds up even when you’re just “updating a value.” You’ll get a clear view of how PostgreSQL handles space, tracks row versions, and why VACUUM isn’t optional, it’s critical for data integrity and long-term performance.
We will use tools like pageinspect extension and the pg_stat views to inspect what’s happening under the hood. We’ll walk through how indexes are touched during updates by understanding a Postgres optimization called Heap Only Tuples (HOT) and how we can leverage it to improve write throughput, and how transaction wraparound can sneak up and cause real damage, unless VACUUM steps in.
Takeaways
- Understand MVCC at the byte level: row versions, xmin/xmax, and tuple visibility.
- See exactly how INSERTs and UPDATEs behave on disk, and how they differ.
- Understanding the working of VACUUM and its importance
- Learn how indexes respond to updates and how postgres keeps them efficient (HOT chain).
Who is this for
If you’re curious to know what’s happening behind the scenes during reads and writes in Postgres, you’re at the right place. We’ll uncover these internals, which will make you more equipped to make informed decisions when designing and optimizing your databases.
I originally delivered a condensed down version of this this as an internal tech talk for my colleagues, and later translated it into a written blog, which you can find here. The blog gained positive traction and engagement on Reddit. This tech talk would include the contents from the blog as well as additional details on vacuum and impact on indexes.
About me
Hey 👋 I am Rohan Reddy Alleti, a software developer and solution consultant at Sahaj Software, primarily in backend development, cloud computing, and Retrieval-Augmented Generation (RAG) systems. I’m particularly interested in database internals, system-level behaviour, and the mechanical sympathy between hardware and software. I enjoy digging deep into how things work under the hood and strive to build systems that are both efficient and elegant.
mail: rohanjnr44@gmail.com
linkedin: http://linkedin.com/in/rohanalleti
I also presented a session on Retrieval Augmented Generation at Javafest 2024 recently, the video can be found here.
Undecided
I could do a demo for the above talk or could just use screenshots in slides if thats makes the talk faster. I do plan on showcasing performance metrics with and without HOT (heap only tuples) in use, etc but still not finalised on this.
{{ gettext('Login to leave a comment') }}
{{ gettext('Post a comment…') }}{{ errorMsg }}
{{ gettext('No comments posted yet') }}