BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//HasGeek//NONSGML Funnel//EN
DESCRIPTION:It worked in theory. Let’s talk about production.
X-WR-CALDESC:It worked in theory. Let’s talk about production.
NAME:Topical Edition on Databases
X-WR-CALNAME:Topical Edition on Databases
REFRESH-INTERVAL;VALUE=DURATION:PT12H
SUMMARY:Topical Edition on Databases
TIMEZONE-ID:Asia/Kolkata
X-PUBLISHED-TTL:PT12H
X-WR-TIMEZONE:Asia/Kolkata
BEGIN:VEVENT
SUMMARY:Introduction to Topical Edition on Databases
DTSTART:20260613T043000Z
DTEND:20260613T044000Z
DTSTAMP:20260609T235854Z
UID:session/6nT1vhP4XZJF6LWHneHH6v@hasgeek.com
SEQUENCE:2
CREATED:20260531T044054Z
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260603T164807Z
LOCATION:Auditorium - TERI Auditorium\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Introduction to Topical Edition on Databases in Auditorium in 
 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Incremental Computation
DTSTART:20260613T044000Z
DTEND:20260613T053500Z
DTSTAMP:20260609T235854Z
UID:session/NuFLkvWed2fh9X72jeQyYC@hasgeek.com
SEQUENCE:7
CATEGORIES:30-minute talk – technical deep dive
CREATED:20260531T044111Z
DESCRIPTION:Incremental computations repeatedly evaluate a function on som
 e input values that are "changing".  The goal of an efficient implementati
 on is to "reuse" previously computed results: when presented with a new ch
 ange to the input\, an incremental computation should only perform work pr
 oportional to the size of the changes of the input\, rather than to the si
 ze of the entire dataset.\n\nIn databases "incremental computation" is kno
 wn as Incremental View Maintenance (IVM)\, and has long been a central pro
 blem of database theory and practice.\n\nWe describe a set of simple ideas
  that together completely solve the IVM problem for arbitrary queries (in
 cluding recursive queries and essentially all queries that can be written 
 in SQL):\n  - representing changes as a first-class object\n  - treating a
 ll computations as (stateful) stream computations\n  - a trivial algorithm
  for converting any standard stream computation into an incremental comput
 ation\n\nThis work has received the 2023 VLDB best paper award\, and the 2
 024 ACM SIGMOD research highlights award.\n\n## Takeaways\nThese ideas are
  not just a pretty theory: they are very practical. Feldera is a start-up 
 which has built an incremental query engine which maintains incrementally
  arbitrary collections of views described in SQL\; the incremental mainten
 ance produces many orders of magnitude reduction in query latency and comp
 utational resource usage compared with traditional batch SQL query engines
 .\n\n## Who is this for?\nAny person interested in databases\, including t
 heoreticians and practitioners.  Only basic knowledge of data structures i
 s required for understanding this presentation.\n\n## About the presenter\
 nMihai Budiu is chief scientist at Feldera\, an early-stage startup.  He h
 as a Ph.D. in computer science from Carnegie Mellon University.  He was pr
 eviously employed at VMware Research\, Barefoot Networks\, and Microsoft R
 esearch.  Four of his papers have received "test of time" awards. He is th
 e acting PMC chair for the Apache Calcite project.
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260603T164810Z
LOCATION:Auditorium - TERI Auditorium\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/inc
 remental-computation-NuFLkvWed2fh9X72jeQyYC
BEGIN:VALARM
ACTION:display
DESCRIPTION:Incremental Computation in Auditorium in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Database systems: a decade of disruption and innovation
DTSTART:20260613T054000Z
DTEND:20260613T062500Z
DTSTAMP:20260609T235854Z
UID:session/Tc1fwbPNPxPVCkUTFhKTMe@hasgeek.com
SEQUENCE:7
CATEGORIES:30-minute talk – technical deep dive
CREATED:20260531T044200Z
DESCRIPTION:The past decade has been highly eventful for Database systems\
 , to say the least. \n\nDatabases have transitioned from on-prem system so
 ftware running in controlled environments to highly scalable and elastic s
 ervices that run on commodity hardware in the cloud. They have evolved to 
 handle diverse\, complex workloads and multi-modal data while adhering to 
 the ever-growing demands of security\, compliance and data governance. \nT
 his transformation has been possible due to several fundamental breakthrou
 ghs in both technology and business. This talk traces the evolution of dat
 abase systems over the past decade and describes some of the key ideas tha
 t have paved the way. The talk concludes by highlighting a few open challe
 nges and opportunities for innovation in database systems\, as we enter th
 e era of AI.\n\n### Take-aways from the keynote\n1. Understand the story o
 f evolution of database systems over the past decade\n2. Learn about some 
 deep systems innovations that have had tremendous impact both on industry 
 and academia\n3. Get a glimpse of what lies in store for the future of dat
 abases in the era of AI \n\n### Intended audience\nAnyone who is fascinate
 d by database systems and is interested in getting a bird's eye view of th
 e past\, present and future of this fundamental branch of computer science
 .\n\n### About the presenter\nKarthik Ramachandra heads the Azure SQL DB
  R&D Organization in India. Prior to this\, he was a researcher at Microso
 ft Research India and Microsoft Gray Systems Lab. Karthik's areas of inter
 est include query processing and optimization in large scale databases and
  data management systems. \nKarthik has a PhD in computer science from Ind
 ian Institute of Technology Bombay and a B.Tech. from BMS College of Engin
 eering\, Bangalore. \nHis doctoral thesis titled “Holistic Optimization 
 of Database Applications” won an honorable mention for the ACM SIGMOD Ji
 m Gray Doctoral Dissertation Award in 2015. 
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260603T164812Z
LOCATION:Auditorium - TERI Auditorium\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/dat
 abase-systems-a-decade-of-disruption-and-innovation-Tc1fwbPNPxPVCkUTFhKTMe
BEGIN:VALARM
ACTION:display
DESCRIPTION:Database systems: a decade of disruption and innovation in Aud
 itorium in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Break
DTSTART:20260613T062500Z
DTEND:20260613T065000Z
DTSTAMP:20260609T235854Z
UID:session/HJbQ56SCVfRARvVQgjWh5r@hasgeek.com
SEQUENCE:7
CREATED:20260531T044237Z
LAST-MODIFIED:20260603T164823Z
LOCATION:Polaris School of Technology\, Bengaluru
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Break in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Migrating Panacea.AI's 5TB/day Log Platform from Manticore to Clic
 kHouse...and the lessons learnt
DTSTART:20260613T065000Z
DTEND:20260613T071000Z
DTSTAMP:20260609T235854Z
UID:session/4fNgna4hYE3Swf7PGnyGXb@hasgeek.com
SEQUENCE:9
CATEGORIES:15-minute talk – focused engineering experience
CREATED:20260531T044525Z
DESCRIPTION:## Description\n\nWe sized the storage layer for Nutanix's Pan
 acea.AI platform — 5 TB and 5 billion log lines a day — three differen
 t ways and got three answers an order of magnitude apart. Same workload\, 
 same retention\, same ingest rate\; the engines disagreed on storage by 37
 × and on CPU by 3×.\n\n| Sizing for 5 TB/day\, 30-day retention | Disk |
  RAM | CPU cores | Compression | Ingest |\n| :---- | :---- | :---- | :----
  | :---- | :---- |\n| Inverted-index\, heavy | 750 TB | 24 TB | 1\,200+ | 
 1.5 : 1 | \\~25k rows/s |\n| Inverted-index\, lean | 500 TB | 10 TB | 800 
 | 1 : 1 | \\~50k rows/s |\n| **Columnar (what we ship)** | **20 TB** | **4
  TB** | **400** | **10 : 1** | **870k+ rows/s** |\n\nThis 15-minute talk i
 s the engineering case for why that gap exists\, and what it takes to land
  on the small end of it in production.\n\nWe'll cover why log analytics is
  unusually well-suited to a columnar layout — access patterns that are a
 lmost always per-bundle and time-windowed\, compression headroom on raw lo
 g messages\, and the operational economics that fall out of those two — 
 and the four schema patterns that did the real work in production:\n\n| Sc
 hema pattern | What it does | What it bought us |\n| :---- | :---- | :----
  |\n| `Delta+ZSTD` codec stacking | Stacks delta encoding under ZSTD on lo
 g columns | 9.76 TB raw → 480 GB on disk (20.8×)\; up to 993× on monot
 onic IDs |\n| `LowCardinality(String)` | Dictionary-encodes high-frequency
  strings (levels\, hosts\, services) | Smaller marks\, faster filters\, be
 tter cache hit rate |\n| `tokenbf_v1` skip indexes | Bloom-filter-based sk
 ip indexes on tokenized log text | Replaces full-text indexing for substri
 ng search |\n| Monthly partitions \\+ `ttl_only_drop_parts=1` | Drops whol
 e parts on TTL instead of mutating | Self-maintaining cluster\, no DBA |\n
 \nThe cluster today holds 154 billion rows across logs\, metrics\, traces\
 , and AI-generated incident reports\, sustains 870k+ inserts/sec on a sing
 le node\, and runs without a dedicated DBA. We'll close with the one searc
 h-latency trade-off we accepted to land here\, what it cost\, what it didn
 't\, and the framework we now use to re-evaluate it quarterly.\n\n---\n\n#
 # Key Takeaways\n\n1. **Engine choice is a sizing decision\, not a feature
  decision.** The same 5 TB/day workload sized at 750 TB\, 500 TB\, or 20 T
 B depending on the storage model — the table above is the artifact you a
 ctually defend in a design review.  \n2. **Why log analytics fits a column
 ar engine** — access patterns\, compression headroom\, operational econo
 mics — together with the four production schema patterns (`Delta+ZSTD`\,
  `LowCardinality`\, `tokenbf_v1`\, partition-aligned TTLs) that delivered 
 the 37× disk reduction.  \n3. **The trade-off we accepted:** substring se
 arch latency moved from inverted-index speed to bloom-filter speed on a qu
 ery class used in \\<15% of sessions. We'll show the query-mix that made t
 he call defensible.\n\n---\n\n## Target Audience\n\nSREs\, platform engine
 ers\, and DBAs operating high-volume log telemetry. Engineering managers a
 nd architects evaluating storage engines for petabyte-class observability\
 , log search\, or AI workloads. Anyone running an inverted-index backend t
 oday who suspects the access patterns of their workload have outgrown the 
 model they started with.\n\n---\n\n## Bio\n\n**Sohham Seal** — SDE-2 at 
 Nutanix on the Panacea AI platform\; works on the columnar ingestion and q
 uery layer that powers AI-driven incident triage across Nutanix’s custom
 er fleet. His recent AI-related work includes GNN-based recommendation sys
 tems\, biometric security\, and EMG-based gesture prediction (IEEE TIFS\, 
 PCEMS Best Paper). Will present in Bangalore.\n\n**Mohit Gurnani** — SDE
 -4 at Nutanix\; architects and leads Panacea.AI\, an agentic AI platform p
 rocessing 20 PB+ of observability data annually across 29\,000+ enterprise
  clusters — ClickHouse for high-cardinality analytics\, Kubernetes-nativ
 e event-driven pipelines\, and LangGraph agents on top. IEEE-published aut
 hor\; previously presented at ICDMAI 2017\\. Designed the migration descri
 bed in this talk. Will join Q\\&A virtually from Columbus\, OH.  
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260603T164825Z
LOCATION:Auditorium - TERI Auditorium\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/mig
 rating-panacea-ais-5tb-day-log-platform-from-manticore-to-clickhouse-and-t
 he-lessons-learnt-4fNgna4hYE3Swf7PGnyGXb
BEGIN:VALARM
ACTION:display
DESCRIPTION:Migrating Panacea.AI's 5TB/day Log Platform from Manticore to 
 ClickHouse...and the lessons learnt in Auditorium in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Rolling Your Own Database (Safely!): Property-based Testing at Sca
 le
DTSTART:20260613T071500Z
DTEND:20260613T075000Z
DTSTAMP:20260609T235854Z
UID:session/H3BkKdaBtnvMNhGEqHvZQ@hasgeek.com
SEQUENCE:6
CATEGORIES:30-minute talk – technical deep dive
CREATED:20260531T044623Z
DESCRIPTION:\n## Description\nThere are real advantages to building a spec
 ialized database: better performance\, less impedance mismatch\, and lower
  operational cost. But the conventional wisdom against rolling your own ex
 ists for a reason. ACID is *hard*\, and general-purpose systems are reliab
 le precisely because they've been battle-hardened over decades. That calcu
 lus has recently changed - cheap object storage and a rich ecosystem of op
 en-source primitives have collapsed the cost of building a new database. M
 odern testing methods are doing the same for the cost of trusting one.\n\n
 In this session\, I'll present Pangolin\, a production OLAP database used 
 at Antithesis\, an autonomous software testing company whose deterministic
  hypervisor and fuzzer simulate billions of program states. The resulting 
 data\, billions of rows with monstrous cardinalities\, little schema\, and
  complex query patterns\, breaks general-purpose databases. We’ll cover 
 the design decisions that let Pangolin handle this workload\, and the test
 ing strategy that made it possible for three engineers to design and deplo
 y it in only 14 months: a PBT-forward approach to development where the sa
 me properties catch correctness bugs at the source\, watch for violations 
 in production\, and guide fuzzing in deterministic simulation. We'll walk 
 through real bugs from each layer\, why we'd never have caught them with c
 onventional testing\, and what lessons other teams building reliable softw
 are can take away.\n\n## Takeaways\n- **Properties test what your code sho
 uld do\, not what you remember to check.** Example-based tests catch bugs 
 you’ve already thought of. PBT lets you bake in invariants and find coun
 terexamples hidden in the seams between assumptions.\n- **One property\, m
 any jobs.** The same invariant represents failures in local testing\, logs
  in production\, and guides fuzzing in simulation. One specification can r
 eplace a stack of test suites and lets a small team ship a reliable databa
 se in the time it usually takes to write one.\n  \n## Target Audience\nEng
 ineers interested in specialized data systems\, property-based testing\, o
 r simulation testing as practical tools for shipping correct and reliable 
 production software fast!\n\n## Bio\nShomik Ghose is a database engineer a
 t Antithesis\, where he helped design and build Pangolin. Shomik is a fan 
 of distributed systems\, software correctness and dogs.\n\nhttps://drive.g
 oogle.com/file/d/1hBTAh-84wDqT93WtnFqfBmsVVrid7Anu/view?usp=sharing\n\n\n
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260603T164827Z
LOCATION:Auditorium - TERI Auditorium\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/rol
 ling-your-own-database-safely-property-based-testing-at-scale-H3BkKdaBtnvM
 NhGEqHvZQ
BEGIN:VALARM
ACTION:display
DESCRIPTION:Rolling Your Own Database (Safely!): Property-based Testing at
  Scale in Auditorium in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Lunch
DTSTART:20260613T075000Z
DTEND:20260613T085000Z
DTSTAMP:20260609T235854Z
UID:session/2Hat5kit59HFHf1XiAeWKd@hasgeek.com
SEQUENCE:5
CREATED:20260531T044800Z
LAST-MODIFIED:20260603T164829Z
LOCATION:Polaris School of Technology\, Bengaluru
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Lunch in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:What Breaks When Aerospike Hits 6 Million QPS
DTSTART:20260613T085000Z
DTEND:20260613T092500Z
DTSTAMP:20260609T235854Z
UID:session/33CryR5n9qr3bDKNSJyhP5@hasgeek.com
SEQUENCE:4
CATEGORIES:30-minute talk – technical deep dive
CREATED:20260531T044921Z
DESCRIPTION:**Category:** War Stories & Lessons Learned\n\n---\n\n## Abstr
 act\n\nWhen your database is in the critical path of every ad auction\, fa
 ilure isn't abstract. A misconfigured cluster costs you money in real time
 . A CPU spike at 1AM means your bidder is throttling while your competitor
 s are not.\n\nInMobi's DSP runs multiple purpose-built Aerospike clusters 
 on Kubernetes\, peaking at 6 million QPS across workloads that have almost
  nothing in common — real-time user segment lookups\, ML embedding servi
 ng\, frequency-cap enforcement\, event deduplication. After years of runni
 ng this at scale\, we've collected a set of failures and near-misses that 
 the documentation doesn't warn you about.\n\nThis talk goes through a few 
 of them — incidents where the root cause turned out to be a default we n
 ever questioned\, a data model decision that looked fine on day one\, or a
  capacity assumption that held until it suddenly didn't. Each one taught u
 s something we couldn't have learned without the production traffic to tri
 gger it.\n\nBeyond the failures\, we'll cover what we've built around Aero
 spike to keep it operational: caching layers\, circuit-breaker patterns tu
 ned per cluster\, and the observability that now gives us early warning be
 fore things go wrong.\n\n---\n\n## Key Takeaways\n\n- Default database con
 figurations are tuned for correctness\, not for extreme QPS — understand
 ing what each setting trades off is what separates operating from just run
 ning a database\n- Data models that look fine at low scale can become infr
 astructure problems over time — record growth is a design concern\, not 
 just a storage concern\n- Every database has hidden resource costs that on
 ly surface at migration time — know your overhead before you need to\n- 
 Resilience at this scale isn't about the database being reliable — it's 
 about designing every layer around it to degrade gracefully when it isn't\
 n\n---\n\n## Target Audience\n\n- Senior engineers and architects operatin
 g databases in the critical path of production traffic\n- Platform and SRE
  engineers managing stateful\, high-throughput systems on Kubernetes\n- En
 gineers using or evaluating Aerospike at scale — or running any low-late
 ncy key-value store under real load\n- Engineers who have hit — or expec
 t to hit — scaling limits in production and want to know what breaks fir
 st and why\n\n---\n\n## Speaker Bios\n\n**Shivam Gupta**\nShivam is a Staf
 f Software Engineer at InMobi in the DSP platform — the real-time biddin
 g infrastructure that processes millions of ad auctions per second across 
 InMobi's global footprint.\n\n>shivam.gupta@inmobi.com\n>\n>## Slide deck\
 nhttps://docs.google.com/presentation/d/1_-TPbgA_TyI2Iinl-6VhvuHRUy1d4cwzd
 lxR8ul6Tt0/edit?usp=sharing
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260603T164831Z
LOCATION:Auditorium - TERI Auditorium\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/wha
 t-breaks-when-aerospike-hits-6-million-qps-33CryR5n9qr3bDKNSJyhP5
BEGIN:VALARM
ACTION:display
DESCRIPTION:What Breaks When Aerospike Hits 6 Million QPS in Auditorium in
  5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:No Stats\, No Problem: Building Feedback-Driven Optimizers for L
 akehouses
DTSTART:20260613T093000Z
DTEND:20260613T100500Z
DTSTAMP:20260609T235854Z
UID:session/MDK1mGGnGYRP4VCLaFvxUP@hasgeek.com
SEQUENCE:8
CATEGORIES:30-minute talk – technical deep dive
CREATED:20260531T044835Z
DESCRIPTION:Modern query optimizers were designed assuming that the engine
  has reasonably good statistics: row counts\, NDVs\, histograms\, column c
 orrelations\, table freshness\, and reliable cost models. In many lakehous
 e environments\, that assumption breaks down. This talk is about building 
 a query optimizer that can survive in that world. We will discuss a set of
  practical techniques for planning under limited statistics: LEO-style lea
 rning from executed queries\, equivalence sets to compensate for missing N
 DV and semantic constraints\, auto-stats driven by “magic number” sens
 itivity analysis\, and complementary learning from both data and query exe
 cution.\n\nThe second half of the talk presents a research direction: onli
 ne parametric query optimization for recurring BI workloads. Many lakehous
 e queries are templatized: the same SQL shape runs repeatedly with differe
 nt customer IDs\, time windows\, geographies\, product lines\, or account 
 bindings. Most bindings behave like the common case\, but a few create pla
 n cliffs — for example\, a whale account or an unusually broad date rang
 e. We will examine how an optimizer can learn compact parameter-risk regio
 ns\, maintain a bounded set of useful plans\, and reduce tail-latency regr
 et.\n\nAttendees will leave with a mental model for optimizer design when 
 statistics are incomplete by default: what to estimate\, what to learn\, w
 hat to collect\, what to treat as uncertain\, and where robust planning be
 ats blind adaptivity.\n\nThe session is aimed at database engineers\, quer
 y
  optimizer developers\, data platform teams\, and practitioners running an
 alytical SQL over lakehouse or object-store-backed systems.\n\nSlide: http
 s://docs.google.com/presentation/d/1oze2xvJuLeavgb9S2LCQrZRoZiLDldZD/edit?
 usp=sharing&ouid=103492819106331508633&rtpof=true&sd=true\n\nSweta Singh l
 eads the SQL query optimizer team at e6data. She has over two decades of e
 xperience in database systems\, query optimization\, distributed systems\,
  performance engineering\, and workload management. Before e6data\, she sp
 ent 19 years on the IBM Db2 development team.  Her work spans cost-based o
 ptimization\, statistics approximation\, learning optimizers\, join enumer
 ation\, workload management\, distributed systems and OLTP performance eng
 ineering. \nRenu Pinky Sumam is a Senior Software Engineer on the Query Op
 timizer team at e6data\, with nearly 19 years of experience across relatio
 nal database technology\, cloud systems and AI. Before joining e6data\, sh
 e worked at IBM on Db2 and IBM Cloud Object Storage\, where she helped rea
 rchitect the Cloud Object Storage billing infrastructure into a serverless
 \, cloud-native architecture.\n
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260603T164833Z
LOCATION:Auditorium - TERI Auditorium\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/no-
 stats-no-problem-building-feedback-driven-optimizers-for-lakehouses-MDK1mG
 GnGYRP4VCLaFvxUP
BEGIN:VALARM
ACTION:display
DESCRIPTION:No Stats\, No Problem: Building Feedback-Driven Optimizers f
 or Lakehouses in Auditorium in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:From Timeout to Sub-Second: Solving Scale-Dependent Deadlocks in D
 istributed Systems
DTSTART:20260613T101000Z
DTEND:20260613T104500Z
DTSTAMP:20260609T235854Z
UID:session/4scoctu9SpKoMr7G2rmvR3@hasgeek.com
SEQUENCE:10
CATEGORIES:15-minute talk – focused engineering experience
CREATED:20260531T044720Z
DESCRIPTION:**Abstract**\nIn highly coordinated distributed systems like A
 pache HBase\, operations often rely on global barriers\, synchronized proc
 edures that require every node to reach a consensus point before moving fo
 rward. At extreme scales\, these barriers become highly sensitive to threa
 d contention and coordination overhead. This talk details a real-world pro
 duction incident at Flipkart where a critical disaster recovery pipeline h
 it a hard "60-second wall"\, consistently failing due to a hidden architec
 tural flaw.\n\n\nWe will walk through the journey of diagnosing a scale-de
 pendent deadlock that only manifested in 50+ node production clusters whil
 e remaining invisible in smaller staging environments. Attendees will lear
 n how a seemingly harmless\, redundant synchronous RPC call from a worker 
 node back to the central coordinator created a circular dependency in the 
 Master’s RPC handlers\, causing the entire cluster-wide log roll procedu
 re to time out.\nThe session covers the debugging methodology used to prov
 e the deadlock\, including the use of synchronized\, multi-instance thread
  dumps across hundreds of nodes. Finally\, we discuss the architectural sh
 ift required to solve it: decoupling local worker tasks from synchronous c
 allbacks during time-sensitive global barriers.\n\n\n\n**Key Takeaways**\n
 1) _The Circular Dependency_: Understand how synchronous RPC calls within 
 a blocked coordinator thread lead to distributed deadlocks.\n2) _Practical
  Approach_: A practical guide to using synchronized\, multi-instance threa
 d dumps (taken at fixed intervals) to definitively prove a thread is block
 ed rather than just slow.\n3) _Hard Metrics_: See how removing a single re
 dundant check reduced the rolllog procedure time from a mandated 60\,000ms
  timeout failure down to just a few hundred milliseconds.\n4) _Architectur
 al Rule of Thumb_: Never allow workers to make synchronous callbacks to a 
 coordinator that is currently parked waiting for those same workers.\n\n\n
 **Target Audience**\nThis session is designed for Backend Engineers\, Syst
 ems Designers\, and SREs who are interested in database internals and the 
 practical approaches in building and scaling distributed stateful systems.
 \n\n**Slides Deck**\nhttps://docs.google.com/presentation/d/1ylQRYiXmmRqha
 3vJ0aEtzxNiG5mKG-Td8KQiOHQNsPM/edit?usp=sharing\n\n\n**About Me**\nI am V
 arun Mishra\, a senior software engineer (SDE-III) at Flipkart\, wher
 e I work on centrally managed platforms. We are solving for high-scal
 e distributed systems and their reliability. I have more than 7 year
 s of experience in software development and more than 5 years workin
 g on databases.\n
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260603T164835Z
LOCATION:Auditorium - TERI Auditorium\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/fro
 m-timeout-to-sub-second-solving-scale-dependent-deadlocks-in-distributed-s
 ystems-4scoctu9SpKoMr7G2rmvR3
BEGIN:VALARM
ACTION:display
DESCRIPTION:From Timeout to Sub-Second: Solving Scale-Dependent Deadlocks 
 in Distributed Systems in Auditorium in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Break
DTSTART:20260613T104500Z
DTEND:20260613T111000Z
DTSTAMP:20260609T235854Z
UID:session/16qAgf6i8Pynjd5UXgDXpg@hasgeek.com
SEQUENCE:5
CREATED:20260531T045010Z
LAST-MODIFIED:20260603T164837Z
LOCATION:Polaris School of Technology\, Bengaluru
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Break in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Fast on Paper\, Slow in Reality: What We Got Wrong About Performan
 ce
DTSTART:20260613T111000Z
DTEND:20260613T114500Z
DTSTAMP:20260609T235854Z
UID:session/Rg5k9Qb8M4ekUazbaWVDYL@hasgeek.com
SEQUENCE:6
CATEGORIES:30-minute talk – technical deep dive
CREATED:20260531T045028Z
DESCRIPTION:## Description\n\nIn distributed systems engineering\, a desig
 n that is "correct on paper" is only the beginning\; the real challenge is
  making it "fast in reality." This session offers a transparent post-morte
 m of the architectural assumptions we made while building a distributed ke
 y-value store from scratch in Go\, and why several of those assumptions co
 llapsed under production-grade pressure. We’ll move beyond high-level de
 sign to deconstruct the hidden performance bottlenecks within standard dis
 tributed patterns\, exploring how generalized 2-Phase Commit (2PC) became 
 a crippling bottleneck\, why our waiting list built on Go’s standard mut
 ex became a global point of contention\, and why our initially "standard" 
 transactional steps led to redundant network and disk I/O that unexpectedl
 y doubled our latency.\n\nBy deconstructing these failures\, we provide a 
 practical roadmap for building distributed stateful systems that perform a
 s well in production as they do on paper. We will discuss our remediation 
 journey: from bypassing protocol stages for localized transactions to impl
 ementing storage-layer batching and eliminating redundant network calls to
  local nodes. Attendees will leave with a clear understanding of how to br
 idge the gap between theoretical correctness and reality in high-scale dis
 tributed databases.\n\n## Takeaways\n\n- **Protocol Fast-Paths**: Learn ho
 w to identify "safe paths" in distributed transactions to bypass the 2PC t
 ax and significantly reduce latency for shard-local operations.\n- **Lock 
 Partitioning**: Practical strategies for managing high-concurrency bottlen
 ecks in Go by moving from global locks to partitioned lock groups (using c
 oncurrent maps like xsync) to isolate contention across different request 
 paths and correlation IDs.\n- **Defensive Storage Design**: Why storage-la
 yer pagination and I/O batching are critical for preventing "OOM" and late
 ncy spikes during large-scale range queries and high-throughput operations
 .\n- **Scaling Inter-Node IO**: How moving from single to multiple persist
 ent outbound connectors per partition can dramatically increase replicatio
 n throughput and resiliency.\n\n## Target Audience\n\nThis session is desi
 gned for Backend Engineers\, Systems Designers\, and SREs who are interest
 ed in database internals and the practical performance trade-offs inherent
  in building and scaling distributed stateful systems.\n\n## Bio\nSarthak 
 Makhija is a Principal Architect at [Caizin](https://caizin.ai/) specializ
 ing in storage engines and distributed systems. While at ThoughtWorks\, he
  led the development of a strongly consistent\, distributed key-value stor
 age engine in Go from scratch.\n\nHe is a contributor to the book [Pattern
 s of Distributed Systems](https://learning.oreilly.com/library/view/-/9780
 138222246/) and writes about database internals on his blog\, [tech-lesso
 ns.in](https://tech-lessons.in/). \n\nSarthak also conducts workshops on t
 he "Internals of key-value storage engines: LSM-trees and beyond" and Rust
 .\n
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260603T164839Z
LOCATION:Auditorium - TERI Auditorium\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/fas
 t-on-paper-slow-in-reality-what-we-got-wrong-about-performance-Rg5k9Qb8M4e
 kUazbaWVDYL
BEGIN:VALARM
ACTION:display
DESCRIPTION:Fast on Paper\, Slow in Reality: What We Got Wrong About Perfo
 rmance in Auditorium in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Databases Were Not Designed For This
DTSTART:20260613T115000Z
DTEND:20260613T122500Z
DTSTAMP:20260609T235854Z
UID:session/DaWePnStj81as4CqAket3U@hasgeek.com
SEQUENCE:7
CATEGORIES:30-minute talk – technical deep dive
CREATED:20260531T045101Z
DESCRIPTION:### Description\nDatabases were not designed for agents. They 
 were built around a set of implicit assumptions: callers issue predictable
  queries\, connections are short-lived\, bad queries fail loudly\, and sch
 emas are a contract with engineers. Agentic systems break every one of the
 se assumptions. Agents reason their way to queries\, hold connections whil
 e an LLM thinks\, retry operations unpredictably\, and read your schema as
  natural language -- so a column named `flg_1` is a bug\, not a style choi
 ce.\n\nThe session walks through each broken assumption and the concrete f
 ix for it. Role-level timeouts\, per-agent database roles with minimum pri
 vilege\, soft deletes with agent identity tracking\, append-only event log
 s with idempotency key constraints\, and query tagging. Every fix is a SQL
  or Python snippet you can take back and apply the same week.\n\n### Takea
 ways\n\n1. Your database is not broken -- your assumptions are. Agents exp
 ose implicit contracts baked into traditional schema design\, connection p
 ooling\, and query monitoring that were never written down anywhere.\n \n2
 . A defensive data layer is not optional for agentic systems. Idempotency 
 keys\, append-only logs\, and per-agent roles are the difference between a
  recoverable mistake and silent data corruption.\n\n### Target Audience\n\
 nBackend and platform engineers who are integrating LLM agents into produc
 tion systems or are about to. Also useful for database administrators and 
 engineering leads making architectural decisions about agentic workloads\,
  or those who want to improve the robustness and observability posture of 
 their databases.\n\n### Bio\n\nArpit Bhayani is a Principal Engineer II at
  Razorpay\, where he is working at the intersection of Data and AI. In the
  past\, he was India Tech Lead for GCP Memorystore (providing managed Redi
 s to GCP customers) and GCP Dataproc (providing managed Spark ecosystem to
  GCP customers).\n\nHe writes and teaches about database internals\, syste
 m design\, and engineering fundamentals at arpitbhayani.me\, and shares co
 ntent with a large engineering audience on YouTube\, LinkedIn\, and Twitte
 r.\n
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260603T164841Z
LOCATION:Auditorium - TERI Auditorium\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/dat
 abases-were-not-designed-for-this-DaWePnStj81as4CqAket3U
BEGIN:VALARM
ACTION:display
DESCRIPTION:Databases Were Not Designed For This in Auditorium in 5 minute
 s
TRIGGER:-PT5M
END:VALARM
END:VEVENT
END:VCALENDAR
