Sujit Kamthe

@sujitkamthe

Engineering Reliable and Scalable Realtime Change Data Capture Pipelines

Submitted May 18, 2025

Abstract

Change Data Capture (CDC) is essential for real-time data architectures, but when handling critical financial transactions, even the slightest data loss or inconsistency is unacceptable. In this talk, we’ll share our experience designing and building a highly reliable, low-latency CDC pipeline for a finance use case where data integrity, availability, and observability were top priorities.

We’ll dive into the key challenges we faced: ensuring exactly-once processing, handling schema evolution, mitigating network failures, and optimizing for performance without compromising consistency. Beyond data replication, we’ll cover how we implemented real-time monitoring and alerting to proactively identify and resolve CDC failures. We’ll also discuss data reconciliation strategies, such as audit logs, and validation mechanisms to ensure data correctness across systems.
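As a rough illustration of the kind of reconciliation check mentioned above, the sketch below compares a source and a target snapshot by primary key and per-row digest. It is not the pipeline described in the talk: the function names (`row_digest`, `reconcile`), the in-memory dictionaries, and the sample transaction IDs are all hypothetical, and a production check would more likely read both sides in batches from the actual databases.

```python
import hashlib
from typing import Dict, List, Tuple

# A row is modeled as a tuple of column values already rendered as strings
# (an assumption made to keep the sketch self-contained).
Row = Tuple[str, ...]


def row_digest(row: Row) -> str:
    """Return a stable SHA-256 digest of a row's column values."""
    # The unit separator keeps ("ab", "c") distinct from ("a", "bc").
    joined = "\x1f".join(row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def reconcile(source: Dict[str, Row], target: Dict[str, Row]) -> Dict[str, List[str]]:
    """Compare two snapshots keyed by primary key and report discrepancies."""
    missing = sorted(k for k in source if k not in target)
    unexpected = sorted(k for k in target if k not in source)
    mismatched = sorted(
        k
        for k in source
        if k in target and row_digest(source[k]) != row_digest(target[k])
    )
    return {"missing": missing, "unexpected": unexpected, "mismatched": mismatched}


if __name__ == "__main__":
    # Hypothetical sample data: t2 differs in amount, t3 never arrived,
    # and t4 exists only on the target side.
    src = {"t1": ("100.00", "INR"), "t2": ("5.00", "INR"), "t3": ("7.50", "USD")}
    tgt = {"t1": ("100.00", "INR"), "t2": ("5.01", "INR"), "t4": ("1.00", "USD")}
    print(reconcile(src, tgt))
```

Comparing digests rather than full rows keeps the comparison cheap when rows are wide, at the cost of only reporting that a row differs, not which column.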

Whether you’re working with financial data or other mission-critical workloads, this talk will provide practical insights for building resilient, fault-tolerant CDC pipelines with strong guarantees for data integrity and observability.

Takeaways

  1. A deep dive into the real-world engineering challenges of implementing a real-time data replication pipeline
  2. Strategies to enhance observability in CDC pipelines using real-time monitoring, alerting, and metrics to proactively detect and resolve issues
  3. Approaches to designing end-to-end validation and reconciliation workflows
  4. Lessons learned from operating mission-critical CDC in production

Audience

The talk is for

  • Platform/Data Engineers building real-time data pipelines
  • DevOps Engineers managing the deployment and scaling of data pipelines
  • Cloud Infra and SRE Teams responsible for performance, availability, and cost control

Bio

Sujit is a technology leader with over 14 years of experience in building large-scale, high-performance distributed systems. A full-stack polyglot developer, he specializes in Functional Programming, Microservices, Data Engineering, and DevOps. Sujit has led complex data engineering projects, optimization problems, and cloud-native architectures.
LinkedIn: https://www.linkedin.com/in/sujitkamthe/

Reference Links

Slides: https://docs.google.com/presentation/d/1zlOGIgOcxbQmVq0b85GHnIyP0XG2J__DkitfgikkPvU/edit?usp=sharing

Updated slides:
https://docs.google.com/presentation/d/1O36m1n51xhSiZ8lDermUOaN3dgaZA57CgjuAyeqvTz0/edit?usp=sharing
