Building a distributed transactional database

Design MVCC databases with real-world consistency guarantees

🧱 Workshop: Building a Distributed Transactional Database

🎯 Target Audience

  • Senior engineers with a background in distributed systems, transactions, and concurrency control
  • Researchers and PhD students

📘 Abstract

This workshop explores the design space of globally consistent transactional databases by incrementally building a distributed Multi-Version Concurrency Control (MVCC) key-value store using Hybrid Logical Clocks (HLC).

Participants will implement Snapshot Isolation (SI) and examine how linearizable reads can be achieved under bounded clock skew without hardware-assisted time, using uncertainty detection and restart techniques.

The workshop emphasizes:

  • Correctness invariants
  • Timestamp ordering
  • Trade-offs between commit-wait and retry-based designs

It provides a unified conceptual framework for understanding systems such as Google Spanner, CockroachDB, and YugabyteDB.


✅ Learning outcomes

By the end of the workshop, participants will be able to:

  • Reason about time, ordering, and causality in distributed databases
  • Implement Hybrid Logical Clocks
  • Build an MVCC storage engine on top of a RocksDB-like store
  • Implement Snapshot Isolation
  • Achieve linearizable reads under bounded clock skew
  • Understand the limits of clock-based approaches

📋 Prerequisites

  • Java 21+
  • Basic understanding of distributed systems


🧩 Workshop Modules


Module 1: Time and Order in Distributed Systems (60 minutes)

🎯 Objective

Understand why physical clocks fail and implement Hybrid Logical Clocks.

📚 Concepts

  • Physical clock drift and NTP limitations
  • Logical clocks (Lamport) and causality
  • Hybrid Logical Clocks (HLC): combining physical time with logical ticks

💻 Coding Tasks

1. Implement HybridTimestamp

Value object holding:

  • wallClockTime
  • logicalTicks
  • Comparison logic
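
A minimal sketch of what such a value object could look like, assuming millisecond wall-clock time and an integer tick counter. The field names follow the task list; the helpers `isAfter` and `addTicks` are hypothetical and only there for illustration.

```java
// Sketch of a hybrid timestamp value object; not the workshop's reference solution.
record HybridTimestamp(long wallClockTime, int logicalTicks)
        implements Comparable<HybridTimestamp> {

    HybridTimestamp {
        if (wallClockTime < 0 || logicalTicks < 0) {
            throw new IllegalArgumentException("timestamp components must be non-negative");
        }
    }

    // Order by physical time first; logical ticks break ties within the same millisecond.
    @Override
    public int compareTo(HybridTimestamp other) {
        int byWallClock = Long.compare(wallClockTime, other.wallClockTime);
        return byWallClock != 0
                ? byWallClock
                : Integer.compare(logicalTicks, other.logicalTicks);
    }

    // Hypothetical convenience helper.
    boolean isAfter(HybridTimestamp other) {
        return compareTo(other) > 0;
    }

    // Hypothetical helper: a later timestamp within the same millisecond.
    HybridTimestamp addTicks(int ticks) {
        return new HybridTimestamp(wallClockTime, logicalTicks + ticks);
    }
}
```

Using a record gives value-based `equals` and `hashCode` for free, which matters once timestamps become part of storage keys.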

2. Implement HybridClock

  • now()
  • Handling physical time updates
  • Handling same-millisecond logical ticks
  • tick(remoteTime) (Lamport rule)
  • MAX_OFFSET sanity checks
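
The clock operations listed above can be sketched as follows. This follows the commonly published HLC update rules; the nested `Timestamp` record stands in for `HybridTimestamp` so the block compiles on its own, and the `MAX_OFFSET_MS` value and the injected `LongSupplier` are assumptions made for illustration.

```java
import java.util.function.LongSupplier;

// Sketch of a hybrid logical clock; not the workshop's reference solution.
final class HybridClock {
    // Compact stand-in for HybridTimestamp so this block is self-contained.
    record Timestamp(long wallClockTime, int logicalTicks) {}

    // Assumed bound in milliseconds; the workshop's actual MAX_OFFSET is not stated.
    static final long MAX_OFFSET_MS = 500;

    private final LongSupplier physicalClock; // e.g. System::currentTimeMillis
    private long latestWall = 0;
    private int latestTicks = 0;

    HybridClock(LongSupplier physicalClock) {
        this.physicalClock = physicalClock;
    }

    // Local event: never move backwards, even if the physical clock does.
    synchronized Timestamp now() {
        long physical = physicalClock.getAsLong();
        if (physical > latestWall) {
            latestWall = physical;
            latestTicks = 0;
        } else {
            latestTicks++; // same millisecond, or physical clock went backwards
        }
        return new Timestamp(latestWall, latestTicks);
    }

    // Message receipt: merge the remote timestamp (Lamport-style max rule).
    synchronized Timestamp tick(Timestamp remote) {
        long physical = physicalClock.getAsLong();
        if (remote.wallClockTime() - physical > MAX_OFFSET_MS) {
            throw new IllegalStateException("remote timestamp too far ahead of local clock");
        }
        long maxWall = Math.max(physical, Math.max(latestWall, remote.wallClockTime()));
        if (maxWall == latestWall && maxWall == remote.wallClockTime()) {
            latestTicks = Math.max(latestTicks, remote.logicalTicks()) + 1;
        } else if (maxWall == latestWall) {
            latestTicks++;
        } else if (maxWall == remote.wallClockTime()) {
            latestTicks = remote.logicalTicks() + 1;
        } else {
            latestTicks = 0; // physical clock is strictly ahead of both
        }
        latestWall = maxWall;
        return new Timestamp(latestWall, latestTicks);
    }
}
```

Injecting the physical clock as a `LongSupplier` makes it easy to simulate drift and backward jumps in tests without touching the system clock.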

📁 Artifacts

  • src/main/java/org/example/txn/HybridTimestamp.java
  • src/main/java/org/example/txn/HybridClock.java

Module 2: The Storage Layer with RocksDB (60 minutes)

🎯 Objective

Build a versioned key-value store using RocksDB.

📚 Concepts

  • LSM Trees and RocksDB basics
  • MVCC mapping
  • Key design: UserKey + Timestamp (descending) → Value
  • Why descending timestamps enable efficient latest-version seeks
  • Iterators, scan, reverseScan
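
To make the key design concrete, here is one possible encoding, offered as a sketch under simplifying assumptions: user keys are UTF-8 strings that never contain a 0x00 byte, and the timestamp is a single non-negative `long` rather than a full hybrid timestamp. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Sketch of a memcomparable versioned key; not the workshop's reference encoding.
final class VersionedKeyCodec {
    // Layout: userKey bytes | 0x00 separator | 8-byte big-endian (~timestamp).
    // Assumes user keys never contain 0x00 and timestamps are non-negative.
    static byte[] encode(String userKey, long timestamp) {
        byte[] key = userKey.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(key.length + 1 + Long.BYTES)
                .put(key)
                .put((byte) 0)
                .putLong(~timestamp) // bitwise NOT reverses the unsigned byte order
                .array();
    }

    static long decodeTimestamp(byte[] encoded) {
        return ~ByteBuffer.wrap(encoded, encoded.length - Long.BYTES, Long.BYTES).getLong();
    }

    // Unsigned lexicographic comparison, matching a bytewise key comparator.
    static int compare(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }
}
```

Because newer versions of a key sort first, a forward seek to `encode(key, readTimestamp)` lands directly on the newest version at or below the read timestamp, with no backward iteration needed.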

💻 Coding Tasks

1. Set up RocksDB

org.rocksdb:rocksdbjni:9.x.x

Initialize RocksDB instance.

2. Implement MVCCStore

  • put(key, timestamp, value)
    rocksDB.put(serialize(key, timestamp), value)

  • get(key, readTimestamp)
    rocksDB.seek(serialize(key, readTimestamp))

Challenge: efficient memcomparable key encoding.
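
Before wiring in RocksDB, the put and seek semantics can be tried out on an in-memory sorted map. The sketch below is an assumption-laden stand-in: it uses a `ConcurrentSkipListMap` ordered by user key ascending and timestamp descending, with `ceilingEntry` playing the role that an iterator seek plays in RocksDB. `InMemoryMvccStore` and `VersionKey` are hypothetical names.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentSkipListMap;

// In-memory sketch of MVCC put/get; a stand-in for the RocksDB-backed store.
final class InMemoryMvccStore {
    record VersionKey(String key, long timestamp) {}

    // Same order as the encoded keys: user key ascending, timestamp descending.
    private static final Comparator<VersionKey> ORDER = (a, b) -> {
        int byKey = a.key().compareTo(b.key());
        return byKey != 0 ? byKey : Long.compare(b.timestamp(), a.timestamp());
    };

    private final ConcurrentSkipListMap<VersionKey, String> versions =
            new ConcurrentSkipListMap<>(ORDER);

    void put(String key, long timestamp, String value) {
        versions.put(new VersionKey(key, timestamp), value);
    }

    // "Seek" to (key, readTimestamp): the first entry at or after that position
    // is the newest version with timestamp <= readTimestamp, provided it still
    // belongs to the same user key.
    Optional<String> get(String key, long readTimestamp) {
        Map.Entry<VersionKey, String> entry =
                versions.ceilingEntry(new VersionKey(key, readTimestamp));
        if (entry == null || !entry.getKey().key().equals(key)) {
            return Optional.empty();
        }
        return Optional.of(entry.getValue());
    }
}
```

The check on the user key after the seek is easy to forget: without it, a read below the oldest version of one key would return the newest version of the next key.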

3. Refactor

Replace ConcurrentSkipListMap with RocksDBManager.

📁 Artifacts

  • src/main/java/org/example/txn/MVCCStore.java
  • src/main/java/org/example/txn/RocksDBStorage.java

Module 3: Transactions and Snapshot Isolation (60 minutes)

🎯 Objective

Implement transaction lifecycle and Snapshot Isolation.

📚 Concepts

  • Snapshot reads at T_start
  • Write intents and buffering
  • Transaction states: Running, Committed, Aborted
  • Atomic commit and visibility

💻 Coding Tasks

Read Path

  • Read at T_read

  • Resolve intents:

    • If intent is mine → read

    • If intent is other:

      • Committed with T_commit ≤ T_read → visible
      • Committed with T_commit > T_read → ignore
      • Aborted / Pending → ignore
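
The intent-resolution rules above can be sketched as a single function. The record shapes here are hypothetical and the workshop's `TransactionRecord` may look different; the sketch also assumes that a commit timestamp equal to the read timestamp counts as visible.

```java
import java.util.Optional;

// Sketch of snapshot-read visibility for a single intent.
final class IntentVisibility {
    enum TxnStatus { PENDING, COMMITTED, ABORTED }

    // Hypothetical shapes, for illustration only.
    record TxnRecord(String txnId, TxnStatus status, long commitTimestamp) {}
    record Intent(String txnId, String value) {}

    // Returns the intent's value if the reader may see it. On empty, the caller
    // falls back to the newest committed version at or below readTimestamp.
    static Optional<String> resolve(Intent intent, TxnRecord owner,
                                    String readerTxnId, long readTimestamp) {
        if (intent.txnId().equals(readerTxnId)) {
            return Optional.of(intent.value()); // a transaction reads its own writes
        }
        // Assumption: commitTimestamp == readTimestamp is treated as visible.
        boolean committedInSnapshot = owner.status() == TxnStatus.COMMITTED
                && owner.commitTimestamp() <= readTimestamp;
        return committedInSnapshot ? Optional.of(intent.value()) : Optional.empty();
    }
}
```

Ignoring pending intents keeps the sketch simple; a real system also has to decide whether the reader should wait for, or push, a pending transaction whose commit timestamp could still land below the read timestamp.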

Write Path

  • Write intent with TxnId
  • Detect write–write conflicts
  • Check committed versions newer than T_start
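
One way to picture the two conflict checks is the sketch below. It keeps only the newest commit timestamp and the current intent owner per key, which is a deliberate simplification; `WriteConflictDetector`, `recordCommit`, and `releaseIntent` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of write-write conflict detection under Snapshot Isolation.
final class WriteConflictDetector {
    static final class WriteConflictException extends RuntimeException {
        WriteConflictException(String message) {
            super(message);
        }
    }

    // Per key: id of the transaction holding a pending intent (hypothetical layout).
    private final Map<String, String> intentOwner = new HashMap<>();
    // Per key: timestamp of the newest committed version.
    private final Map<String, Long> latestCommit = new HashMap<>();

    synchronized void recordCommit(String key, long commitTimestamp) {
        latestCommit.merge(key, commitTimestamp, Long::max);
    }

    // Refuse the write if another transaction holds an intent on the key, or if
    // a version was committed after this transaction's snapshot (T_start).
    synchronized void writeIntent(String key, String txnId, long startTimestamp) {
        String owner = intentOwner.get(key);
        if (owner != null && !owner.equals(txnId)) {
            throw new WriteConflictException(key + " has a pending intent from " + owner);
        }
        Long committedAt = latestCommit.get(key);
        if (committedAt != null && committedAt > startTimestamp) {
            throw new WriteConflictException(
                    key + " was committed at " + committedAt + ", after snapshot " + startTimestamp);
        }
        intentOwner.put(key, txnId);
    }

    synchronized void releaseIntent(String key, String txnId) {
        intentOwner.remove(key, txnId);
    }
}
```

Failing fast on a foreign intent is only one policy; waiting for the other transaction to finish is another, and the choice affects both latency and abort rates.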

Commit Path

  • commit(txnId)
  • Mark COMMITTED with T_commit
  • Apply intents asynchronously
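
The commit steps above hinge on one atomic state change, followed by cleanup that can lag behind. The sketch below illustrates that split with in-memory maps; all names are hypothetical, and the "committed data" map keeps only the latest value per key for brevity.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a commit path: atomic status flip, then asynchronous intent application.
final class CommitPath {
    enum State { RUNNING, COMMITTED, ABORTED }
    record TxnRecord(State state, long commitTimestamp) {}

    private final Map<String, TxnRecord> records = new ConcurrentHashMap<>();
    // txnId -> (key -> buffered value)
    private final Map<String, Map<String, String>> intents = new ConcurrentHashMap<>();
    // Simplified committed view: latest value per key only.
    private final Map<String, String> committedData = new ConcurrentHashMap<>();

    void begin(String txnId) {
        records.put(txnId, new TxnRecord(State.RUNNING, -1));
        intents.put(txnId, new ConcurrentHashMap<>());
    }

    void write(String txnId, String key, String value) {
        intents.get(txnId).put(key, value);
    }

    // The RUNNING -> COMMITTED flip is the commit point; readers that meet an
    // intent can consult the record even before the intent has been applied.
    CompletableFuture<Void> commit(String txnId, long commitTimestamp) {
        TxnRecord current = records.get(txnId);
        TxnRecord committed = new TxnRecord(State.COMMITTED, commitTimestamp);
        if (current == null || current.state() != State.RUNNING
                || !records.replace(txnId, current, committed)) {
            throw new IllegalStateException("transaction " + txnId + " is not running");
        }
        return CompletableFuture.runAsync(() -> committedData.putAll(intents.remove(txnId)));
    }

    TxnRecord record(String txnId) {
        return records.get(txnId);
    }

    String committedValue(String key) {
        return committedData.get(key);
    }
}
```

Because visibility is decided by the transaction record rather than by the applied intents, a crash between the flip and the cleanup leaves the transaction committed, and the intents can be resolved later.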

(Optional: separate data and status storage)

📁 Artifacts

  • TxnClient.java
  • TransactionRecord.java

Module 4: Distributed Consistency — Clock-Bound Wait (60 minutes)

🎯 Objective

Achieve external consistency (linearizability) using HLC.

📚 Concepts

  • Clock skew & uncertainty window
    T_true ∈ [T_now - ε, T_now + ε], where T_now is the local clock reading and T_true is the actual time
  • Why causal transactions may appear in the future
  • Commit-wait vs restart strategies
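
For comparison with the restart strategy, commit-wait can be reduced to a small calculation. The sketch below assumes that `maxSkewMillis` bounds the difference between any two nodes' clocks, and that a commit may be acknowledged once the local clock is strictly past `commitTimestamp + maxSkewMillis`. The class and method names are hypothetical.

```java
// Sketch of commit-wait: delay the acknowledgement until every node's clock
// must have passed the commit timestamp.
final class CommitWait {
    // Milliseconds still to wait, given the local clock reading.
    // After the wait, localNow - maxSkewMillis > commitTimestamp holds, so any
    // clock within maxSkewMillis of ours has also passed commitTimestamp.
    static long waitMillis(long commitTimestamp, long localNow, long maxSkewMillis) {
        return Math.max(0, commitTimestamp + maxSkewMillis + 1 - localNow);
    }

    static void awaitSafe(long commitTimestamp, long maxSkewMillis)
            throws InterruptedException {
        long wait = waitMillis(commitTimestamp, System.currentTimeMillis(), maxSkewMillis);
        if (wait > 0) {
            Thread.sleep(wait);
        }
    }
}
```

The trade-off is visible in the formula: every commit pays up to the full skew bound in latency, whereas a restart-based design pays only when a read actually encounters an uncertain version.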

💻 Coding Tasks

  1. Simulate uncertainty (MAX_CLOCK_SKEW)
  2. Detect commits in uncertainty window:
     T_commit ∈ [T_read - ε, T_read]
  3. Trigger read restart with T_restart = T_commit + ε
  4. Client retry logic
📁 Artifacts

  • StorageReplica.java (handleRead updates)
  • TxnClient.java (retry logic)

Module 5: Putting It All Together (30 minutes)

🎯 Objective

Run end-to-end simulations.

▶ Tasks

  • Run StorageReplicaCommitTest (Snapshot Isolation)
  • Run ClockBoundWaitTest (External Consistency)

About the instructor

Unmesh Joshi is a Distinguished Engineer at Thoughtworks. He is a software architecture enthusiast who believes that understanding the principles of distributed systems is as essential today as understanding web architecture or object-oriented programming was in the last decade. For the last two years, he has been publishing patterns of distributed systems on martinfowler.com.
In 2023, he authored the book Patterns of Distributed Systems, published by Addison-Wesley Professional. The book is an essential catalog of patterns aimed at enhancing comprehension, communication, and education on distributed system design.
He has also conducted various training sessions on this topic. Twitter: @unmeshjoshi

How to attend this workshop

This workshop is open to Rootconf annual members.
Participation is limited to 30 people. Seats are available on a first-come, first-served basis. 🎟️

Contact information ☎️

For inquiries about the workshop, contact +91-7676332020 or write to info@hasgeek.com

Hosted by

We care about site reliability, cloud costs, security and data privacy