BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//HasGeek//NONSGML Funnel//EN
DESCRIPTION:It worked in theory. Let’s talk about production.
X-WR-CALDESC:It worked in theory. Let’s talk about production.
NAME:Topical Edition on Databases
X-WR-CALNAME:Topical Edition on Databases
REFRESH-INTERVAL;VALUE=DURATION:PT12H
SUMMARY:Topical Edition on Databases
TIMEZONE-ID:Asia/Kolkata
X-PUBLISHED-TTL:PT12H
X-WR-TIMEZONE:Asia/Kolkata
BEGIN:VEVENT
SUMMARY:Introduction to Topical Edition on Databases
DTSTART:20260613T043000Z
DTEND:20260613T044000Z
DTSTAMP:20260727T065252Z
UID:session/6nT1vhP4XZJF6LWHneHH6v@hasgeek.com
SEQUENCE:2
CREATED:20260531T044054Z
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260603T164807Z
LOCATION:Auditorium - TERI\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Introduction to Topical Edition on Databases in Auditorium in 
 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Incremental Computation
DTSTART:20260613T044000Z
DTEND:20260613T053500Z
DTSTAMP:20260727T065252Z
UID:session/NuFLkvWed2fh9X72jeQyYC@hasgeek.com
SEQUENCE:8
CATEGORIES:30-minute talk – technical deep dive
CREATED:20260531T044111Z
DESCRIPTION:Incremental computations repeatedly evaluate a function on som
 e input values that are "changing".  The goal of an efficient implementati
 on is to "reuse" previously computed results: when presented with a new ch
 ange to the input\, an incremental computation should only perform work pr
 oportional to the size of the changes of the input\, rather than to the si
 ze of the entire dataset.\n\nIn databases "incremental computation" is kno
 wn as Incremental View Maintenance (IVM)\, and has long been a central pro
 blem of database theory and practice.\n\nWe describe a set of simple ideas
  which combined solve completely the IVM problem for arbitrary queries (in
 cluding recursive queries and essentially all queries that can be written 
 in SQL):\n  - representing changes as a first-class object\n  - treating a
 ll computations as (stateful) stream computations\n  - a trivial algorithm
  for converting any standard stream computation into an incremental comput
 ation\n\nThis work has received the 2023 VLDB best paper award\, and the 2
 024 ACM SIGMOD research highlights award.\n\n## Takeaways\nThese ideas are
  not just a pretty theory: they are very practical. Feldera is a start-up 
 which has built an incremental query engine which maintains  incrementally
  arbitrary collections of views described in SQL\; the incremental mainten
 ance produces many orders of magnitude reduction in query latency and comp
 utational resource usage compared with traditional batch SQL query engines
 .\n\n## Who is this for?\nAny person interested in databases\, including t
 heoreticians and practitioners.  Only basic knowledge of data structures i
 s required for understanding this presentation.\n\n## About the presenter\
 nMihai Budiu is chief scientist at Feldera\, an early-stage startup.  He h
 as a Ph.D. in computer science from Carnegie Mellon University.  He was pr
 eviously employed at VMware Research\, Barefoot Networks\, and Microsoft R
 esearch.  Four of his papers have received "test of time" awards. He is th
 e acting PMC chair for the Apache Calcite project.
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260718T102450Z
LOCATION:Auditorium - TERI\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/inc
 remental-computation-NuFLkvWed2fh9X72jeQyYC
BEGIN:VALARM
ACTION:display
DESCRIPTION:Incremental Computation in Auditorium in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Database systems: a decade of disruption and innovation
DTSTART:20260613T054000Z
DTEND:20260613T062500Z
DTSTAMP:20260727T065252Z
UID:session/Tc1fwbPNPxPVCkUTFhKTMe@hasgeek.com
SEQUENCE:8
CATEGORIES:30-minute talk – technical deep dive
CREATED:20260531T044200Z
DESCRIPTION:The past decade has been highly eventful for Database systems\
 , to say the least. \n\nDatabases have transitioned from on-prem system so
 ftware running in controlled environments to highly scalable and elastic s
 ervices that run on commodity hardware in the cloud. They have evolved to 
 handle diverse\, complex workloads and multi-modal data while adhering to 
 the ever-growing demands of security\, compliance and data governance. \nT
 his transformation has been possible due to several fundamental breakthrou
 ghs in both technology and business. This talk traces the evolution of dat
 abase systems over the past decade and describes some of the key ideas tha
 t have paved the way. The talk concludes by highlighting a few open challe
 nges and opportunities for innovation in database systems\, as we enter th
 e era of AI.\n\n### Take-aways from the keynote\n1. Understand the story o
 f evolution of database systems over the past decade\n2. Learn about some 
 deep systems innovations that have had tremendous impact both on industry 
 and academia\n3. Get a glimpse of what lies in store for the future of dat
 abases in the era of AI \n\n### Intended audience\nAnyone who is fascinate
 d by database systems and is interested in getting a bird's eye view of th
 e past\, present and future of this fundamental branch of computer science
 .\n\n### About the presenter\nKarthik Ramachandra heads the Azure SQL DB
  R&D Organization in India. Prior to this\, he was a researcher at Microso
 ft Research India and Microsoft Gray Systems Lab. Karthik's areas of inter
 est include query processing and optimization in large scale databases and
  data management systems. \nKarthik has a PhD in computer science from Ind
 ian Institute of Technology Bombay and a B.Tech. from BMS College of Engin
 eering\, Bangalore. \nHis doctoral thesis titled “Holistic Optimization 
 of Database Applications” won an honorable mention for the ACM SIGMOD Ji
 m Gray Doctoral Dissertation Award in 2015. 
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260720T052252Z
LOCATION:Auditorium - TERI\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/dat
 abase-systems-a-decade-of-disruption-and-innovation-Tc1fwbPNPxPVCkUTFhKTMe
BEGIN:VALARM
ACTION:display
DESCRIPTION:Database systems: a decade of disruption and innovation in Aud
 itorium in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Break
DTSTART:20260613T062500Z
DTEND:20260613T065000Z
DTSTAMP:20260727T065252Z
UID:session/HJbQ56SCVfRARvVQgjWh5r@hasgeek.com
SEQUENCE:7
CREATED:20260531T044237Z
LAST-MODIFIED:20260603T164823Z
LOCATION:Polaris School of Technology\, Bengaluru
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Break in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Migrating Panacea.AI's 5TB/day Log Platform from Manticore to Clic
 khouse...and the lessons learnt
DTSTART:20260613T065000Z
DTEND:20260613T071000Z
DTSTAMP:20260727T065252Z
UID:session/4fNgna4hYE3Swf7PGnyGXb@hasgeek.com
SEQUENCE:10
CATEGORIES:15-minute talk – focused engineering experience
CREATED:20260531T044525Z
DESCRIPTION:## Description\n\nWe sized the storage layer for Nutanix's Pan
 acea.AI platform — 5 TB and 5 billion log lines a day — three differen
 t ways and got three answers an order of magnitude apart. Same workload\, 
 same retention\, same ingest rate\; the engines disagreed on storage by 37
 × and on CPU by 3×.\n\n| Sizing for 5 TB/day\, 30-day retention | Disk |
  RAM | CPU cores | Compression | Ingest |\n| :---- | :---- | :---- | :----
  | :---- | :---- |\n| Inverted-index\, heavy | 750 TB | 24 TB | 1\,200+ | 
 1.5 : 1 | \\~25k rows/s |\n| Inverted-index\, lean | 500 TB | 10 TB | 800 
 | 1 : 1 | \\~50k rows/s |\n| **Columnar (what we ship)** | **20 TB** | **4
  TB** | **400** | **10 : 1** | **870k+ rows/s** |\n\nThis 15-minute talk i
 s the engineering case for why that gap exists\, and what it takes to land
  on the small end of it in production.\n\nWe'll cover why log analytics is
  unusually well-suited to a columnar layout — access patterns that are a
 lmost always per-bundle and time-windowed\, compression headroom on raw lo
 g messages\, and the operational economics that fall out of those two — 
 and the four schema patterns that did the real work in production:\n\n| Sc
 hema pattern | What it does | What it bought us |\n| :---- | :---- | :----
  |\n| `Delta+ZSTD` codec stacking | Stacks delta encoding under ZSTD on lo
 g columns | 9.76 TB raw → 480 GB on disk (20.8×)\; up to 993× on monot
 onic IDs |\n| `LowCardinality(String)` | Dictionary-encodes high-frequency
  strings (levels\, hosts\, services) | Smaller marks\, faster filters\, be
 tter cache hit rate |\n| `tokenbf_v1` skip indexes | Bloom-filter-based sk
 ip indexes on tokenized log text | Replaces full-text indexing for substri
 ng search |\n| Monthly partitions \\+ `ttl_only_drop_parts=1` | Drops whol
 e parts on TTL instead of mutating | Self-maintaining cluster\, no DBA |\n
 \nThe cluster today holds 154 billion rows across logs\, metrics\, traces\
 , and AI-generated incident reports\, sustains 870k+ inserts/sec on a sing
 le node\, and runs without a dedicated DBA. We'll close with the one searc
 h-latency trade-off we accepted to land here\, what it cost\, what it didn
 't\, and the framework we now use to re-evaluate it quarterly.\n\n---\n\n#
 # Key Takeaways\n\n1. **Engine choice is a sizing decision\, not a feature
  decision.** The same 5 TB/day workload sized at 750 TB\, 500 TB\, or 20 T
 B depending on the storage model — the table above is the artifact you a
 ctually defend in a design review.  \n2. **Why log analytics fits a column
 ar engine** — access patterns\, compression headroom\, operational econo
 mics — together with the four production schema patterns (`Delta+ZSTD`\,
  `LowCardinality`\, `tokenbf_v1`\, partition-aligned TTLs) that delivered 
 the 37× disk reduction.  \n3. **The trade-off we accepted:** substring se
 arch latency moved from inverted-index speed to bloom-filter speed on a qu
 ery class used in \\<15% of sessions. We'll show the query-mix that made t
 he call defensible.\n\n---\n\n## Target Audience\n\nSREs\, platform engine
 ers\, and DBAs operating high-volume log telemetry. Engineering managers a
 nd architects evaluating storage engines for petabyte-class observability\
 , log search\, or AI workloads. Anyone running an inverted-index backend t
 oday who suspects the access patterns of their workload have outgrown the 
 model they started with.\n\n---\n\n## Bio\n\n**Sohham Seal** — SDE-2 at 
 Nutanix on the Panacea AI platform\; works on the columnar ingestion and q
 uery layer that powers AI-driven incident triage across Nutanix’s custom
 er fleet. His recent AI-related work includes GNN-based recommendation sys
 tems\, biometric security\, and EMG-based gesture prediction (IEEE TIFS\, 
 PCEMS Best Paper). Will present in Bangalore.\n\n**Mohit Gurnani** — SDE
 -4 at Nutanix\; architects and leads Panacea.AI\, an agentic AI platform p
 rocessing 20 PB+ of observability data annually across 29\,000+ enterprise
  clusters — ClickHouse for high-cardinality analytics\, Kubernetes-nativ
 e event-driven pipelines\, and LangGraph agents on top. IEEE-published aut
 hor\; previously presented at ICDMAI 2017\\. Designed the migration descri
 bed in this talk. Will join Q\\&A virtually from Columbus\, OH.  
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260720T061056Z
LOCATION:Auditorium - TERI\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/mig
 rating-panacea-ais-5tb-day-log-platform-from-manticore-to-clickhouse-and-t
 he-lessons-learnt-4fNgna4hYE3Swf7PGnyGXb
BEGIN:VALARM
ACTION:display
DESCRIPTION:Migrating Panacea.AI's 5TB/day Log Platform from Manticore to 
 Clickhouse...and the lessons learnt in Auditorium in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Sponsored session: The evolving OLAP stack: a conversation with Fi
 rebolt
DTSTART:20260613T065000Z
DTEND:20260613T073000Z
DTSTAMP:20260727T065252Z
UID:session/G9bGQHZJynboC3ojVFoGqZ@hasgeek.com
SEQUENCE:3
CREATED:20260611T075933Z
DESCRIPTION:Agentic workloads are changing what we ask of analytical datab
 ases. This fireside chat explores how the OLAP space is evolving - and wha
 t it takes to build for what's coming next.
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260611T080452Z
LOCATION:Conference Room - TERI\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Sponsored session: The evolving OLAP stack: a conversation wit
 h Firebolt in Conference Room in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Rolling Your Own Database (Safely!): Property-based Testing at Sca
 le
DTSTART:20260613T071500Z
DTEND:20260613T075000Z
DTSTAMP:20260727T065252Z
UID:session/H3BkKdaBtnvMNhGEqHvZQ@hasgeek.com
SEQUENCE:7
CATEGORIES:30-minute talk – technical deep dive
CREATED:20260531T044623Z
DESCRIPTION:\n## Description\nThere are real advantages to building a spec
 ialized database: better performance\, less impedance mismatch\, and lower
  operational cost. But the conventional wisdom against rolling your own ex
 ists for a reason. ACID is *hard*\, and general-purpose systems are reliab
 le precisely because they've been battle-hardened over decades. That calcu
 lus has recently changed - cheap object storage and a rich ecosystem of op
 en-source primitives have collapsed the cost of building a new database. M
 odern testing methods are doing the same for the cost of trusting one.\n\n
 In this session\, I'll present Pangolin\, a production OLAP database used 
 at Antithesis\, an autonomous software testing company whose deterministic
  hypervisor and fuzzer simulate billions of program states. The resulting 
 data\, billions of rows with monstrous cardinalities\, little schema\, and
  complex query patterns\, breaks general-purpose databases. We’ll cover 
 the design decisions that let Pangolin handle this workload\, and the test
 ing strategy that made it possible for three engineers to design and deplo
 y it in only 14 months: a PBT-forward approach to development where the sa
 me properties catch correctness bugs at the source\, watch for violations 
 in production\, and guide fuzzing in deterministic simulation. We'll walk 
 through real bugs from each layer\, why we'd never have caught them with c
 onventional testing\, and what lessons other teams building reliable softw
 are can take away.\n\n## Takeaways\n- **Properties test what your code sho
 uld do\, not what you remember to check.** Example-based tests catch bugs 
 you’ve already thought of. PBT lets you bake in invariants and find coun
 terexamples hidden in the seams between assumptions.\n- **One property\, m
 any jobs.** The same invariant represents failures in local testing\, logs
  in production\, and guides fuzzing in simulation. One specification can r
 eplace a stack of test suites and lets a small team ship a reliable databa
 se in the time it usually takes to write one.\n  \n## Target Audience\nEng
 ineers interested in specialized data systems\, property-based testing\, o
 r simulation testing as practical tools for shipping correct and reliable 
 production software fast!\n\n## Bio\nShomik Ghose is a database engineer a
 t Antithesis\, where he helped design and build Pangolin. Shomik is a fan 
 of distributed systems\, software correctness and dogs.\n\nhttps://drive.g
 oogle.com/file/d/1hBTAh-84wDqT93WtnFqfBmsVVrid7Anu/view?usp=sharing\n\n\n
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260720T060543Z
LOCATION:Auditorium - TERI\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/rol
 ling-your-own-database-safely-property-based-testing-at-scale-H3BkKdaBtnvM
 NhGEqHvZQ
BEGIN:VALARM
ACTION:display
DESCRIPTION:Rolling Your Own Database (Safely!): Property-based Testing at
  Scale in Auditorium in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Compaction Is not a database-wide decision
DTSTART:20260613T073500Z
DTEND:20260613T080000Z
DTSTAMP:20260727T065252Z
UID:session/ANi42WKXdt4qv3Y3TupF6p@hasgeek.com
SEQUENCE:4
CATEGORIES:15-minute talk – focused engineering experience
CREATED:20260611T080733Z
DESCRIPTION:# Session Description\n\nModern LSM-tree storage engines typic
 ally force a global choice between tiered and leveled compaction. Tiered c
 ompaction offers excellent write throughput but can suffer from read ampli
 fication\, while leveled compaction improves read performance at the cost 
 of additional write amplification. Existing systems generally apply one st
 rategy across the entire database\, implicitly assuming that all data exhi
 bits similar access patterns and workload characteristics.\n\nAmethyst exp
 lores a different approach: treating compaction as a local rather than glo
 bal decision. The system continuously characterizes SSTable behavior using
  lightweight metadata and dynamically selects between compaction strategie
 s at the segment level. In this talk\, we will examine the trade-offs that
  motivated the design\, the challenges of workload characterization\, the 
 mechanisms required to safely transition between policies\, and the result
 s from benchmarking adaptive compaction against traditional LSM configurat
 ions. We will also discuss cases where adaptation helps\, where it fails\,
  and what these results suggest about the future of self-tuning storage en
 gines.\n\n# Key Takeaways\nUnderstand why the traditional choice between t
 iered and leveled compaction remains one of the fundamental trade-offs in 
 LSM-tree design.\nLearn how lightweight workload characterization can enab
 le adaptive compaction policies and the engineering challenges involved in
  building self-tuning storage systems.\n\n# Target Audience\nThis session 
 will be valuable for:\nDatabase engineers and storage engine developers\nD
 istributed systems practitioners\nPerformance and infrastructure engineers
 \nResearchers and students interested in storage systems and database inte
 rnals\nAnyone operating or evaluating LSM-based systems such as RocksDB\, 
 Cassandra\, ScyllaDB\, LevelDB\, or CockroachDB\n\n# Speaker Bio\nSuchitra
  is a Computer Science student with interests in databases\, distributed s
 ystems\, and storage engines. She's currently building Amethyst\, an exper
 imental LSM-tree storage engine that investigates adaptive compaction stra
 tegies and workload-aware storage optimization. Her work focuses on bridgi
 ng ideas from database research and practical systems engineering through 
 hands-on implementation and benchmarking.\n\nNilin is a Software Engine
 ering and Computer Science student specializing in backend systems and dat
 abase internals. As a co-developer of Amethyst\, she specializes in implem
 enting adaptive compaction logic\, physical disk I/O\, and performance ben
 chmarking.
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260611T082033Z
LOCATION:Conference Room - TERI\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/com
 paction-is-not-a-database-wide-decision-ANi42WKXdt4qv3Y3TupF6p
BEGIN:VALARM
ACTION:display
DESCRIPTION:Compaction Is not a database-wide decision in Conference Room 
 in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Lunch
DTSTART:20260613T075000Z
DTEND:20260613T085000Z
DTSTAMP:20260727T065252Z
UID:session/2Hat5kit59HFHf1XiAeWKd@hasgeek.com
SEQUENCE:5
CREATED:20260531T044800Z
LAST-MODIFIED:20260603T164829Z
LOCATION:Polaris School of Technology\, Bengaluru
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Lunch in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:What Breaks When Aerospike Hits 6 Million QPS
DTSTART:20260613T085000Z
DTEND:20260613T092500Z
DTSTAMP:20260727T065252Z
UID:session/33CryR5n9qr3bDKNSJyhP5@hasgeek.com
SEQUENCE:5
CATEGORIES:30-minute talk – technical deep dive
CREATED:20260531T044921Z
DESCRIPTION:**Category:** War Stories & Lessons Learned\n\n---\n\n## Abstr
 act\n\nWhen your database is in the critical path of every ad auction\, fa
 ilure isn't abstract. A misconfigured cluster costs you money in real time
 . A CPU spike at 1AM means your bidder is throttling while your competitor
 s are not.\n\nInMobi's DSP runs multiple purpose-built Aerospike clusters 
 on Kubernetes\, peaking at 6 million QPS across workloads that have almost
  nothing in common — real-time user segment lookups\, ML embedding servi
 ng\, frequency-cap enforcement\, event deduplication. After years of runni
 ng this at scale\, we've collected a set of failures and near-misses that 
 the documentation doesn't warn you about.\n\nThis talk goes through a few 
 of them — incidents where the root cause turned out to be a default we n
 ever questioned\, a data model decision that looked fine on day one\, or a
  capacity assumption that held until it suddenly didn't. Each one taught u
 s something we couldn't have learned without the production traffic to tri
 gger it.\n\nBeyond the failures\, we'll cover what we've built around Aero
 spike to keep it operational: caching layers\, circuit-breaker patterns tu
 ned per cluster\, and the observability that now gives us early warning be
 fore things go wrong.\n\n---\n\n## Key Takeaways\n\n- Default database con
 figurations are tuned for correctness\, not for extreme QPS — understand
 ing what each setting trades off is what separates operating from just run
 ning a database\n- Data models that look fine at low scale can become infr
 astructure problems over time — record growth is a design concern\, not 
 just a storage concern\n- Every database has hidden resource costs that on
 ly surface at migration time — know your overhead before you need to\n- 
 Resilience at this scale isn't about the database being reliable — it's 
 about designing every layer around it to degrade gracefully when it isn't\
 n\n---\n\n## Target Audience\n\n- Senior engineers and architects operatin
 g databases in the critical path of production traffic\n- Platform and SRE
  engineers managing stateful\, high-throughput systems on Kubernetes\n- En
 gineers using or evaluating Aerospike at scale — or running any low-late
 ncy key-value store under real load\n- Engineers who have hit — or expec
 t to hit — scaling limits in production and want to know what breaks fir
 st and why\n\n---\n\n## Speaker Bios\n\n**Shivam Gupta**\nShivam is a Staf
 f Software Engineer at InMobi in the DSP platform — the real-time biddin
 g infrastructure that processes millions of ad auctions per second across 
 InMobi's global footprint.\n\n>shivam.gupta@inmobi.com\n>\n>## Slide deck\
 nhttps://docs.google.com/presentation/d/1_-TPbgA_TyI2Iinl-6VhvuHRUy1d4cwzd
 lxR8ul6Tt0/edit?usp=sharing
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260720T055521Z
LOCATION:Auditorium - TERI\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/wha
 t-breaks-when-aerospike-hits-6-million-qps-33CryR5n9qr3bDKNSJyhP5
BEGIN:VALARM
ACTION:display
DESCRIPTION:What Breaks When Aerospike Hits 6 Million QPS in Auditorium in
  5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:No Stats\, No Problem: Building Feedback-Driven Optimizers for L
 akehouses
DTSTART:20260613T093000Z
DTEND:20260613T100500Z
DTSTAMP:20260727T065252Z
UID:session/MDK1mGGnGYRP4VCLaFvxUP@hasgeek.com
SEQUENCE:9
CATEGORIES:30-minute talk – technical deep dive
CREATED:20260531T044835Z
DESCRIPTION:Modern query optimizers were designed assuming that the engine
  has reasonably good statistics: row counts\, NDVs\, histograms\, column c
 orrelations\, table freshness\, and reliable cost models. In many lakehous
 e environments\, that assumption breaks down. This talk is about building 
 a query optimizer that can survive in that world. We will discuss a set of
  practical techniques for planning under limited statistics: LEO-style lea
 rning from executed queries\, equivalence sets to compensate for missing N
 DV and semantic constraints\, auto-stats driven by “magic number” sens
 itivity analysis\, and complementary learning from both data and query exe
 cution.\n\nThe second half of the talk presents a research direction: onli
 ne parametric query optimization for recurring BI workloads. Many lakehous
 e queries are templatized: the same SQL shape runs repeatedly with differe
 nt customer IDs\, time windows\, geographies\, product lines\, or account 
 bindings. Most bindings behave like the common case\, but a few create pla
 n cliffs — for example\, a whale account or an unusually broad date rang
 e. We will examine how an optimizer can learn compact parameter-risk regio
 ns\, maintain a bounded set of useful plans\, and reduce tail-latency regr
 et.\n\nAttendees will leave with a mental model for optimizer design when 
 statistics are incomplete by default: what to estimate\, what to learn\, w
 hat to collect\, what to treat as uncertain\, and where robust planning be
 ats blind adaptivity.\n\nThe session is aimed at database engineers\, quer
 y
  optimizer developers\, data platform teams\, and practitioners running an
 alytical SQL over lakehouse or object-store-backed systems.\n\nSlide: http
 s://docs.google.com/presentation/d/1oze2xvJuLeavgb9S2LCQrZRoZiLDldZD/edit?
 usp=sharing&ouid=103492819106331508633&rtpof=true&sd=true\n\nSweta Singh l
 eads the SQL query optimizer team at e6data. She has over two decades of e
 xperience in database systems\, query optimization\, distributed systems\,
  performance engineering\, and workload management. Before e6data\, she sp
 ent 19 years on the IBM Db2 development team.  Her work spans cost-based o
 ptimization\, statistics approximation\, learning optimizers\, join enumer
 ation\, workload management\, distributed systems and OLTP performance eng
 ineering. \nRenu Pinky Sumam is a Senior Software Engineer on the Query Op
 timizer team at e6data\, with nearly 19 years of experience across relatio
 nal database technology\, cloud systems and AI. Before joining e6data\, sh
 e worked at IBM on Db2 and IBM Cloud Object Storage\, where she helped rea
 rchitect the Cloud Object Storage billing infrastructure into a serverless
 \, cloud-native architecture.\n
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260720T061648Z
LOCATION:Auditorium - TERI\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/no-
 stats-no-problem-building-feedback-driven-optimizers-for-lakehouses-MDK1mG
 GnGYRP4VCLaFvxUP
BEGIN:VALARM
ACTION:display
DESCRIPTION:No Stats\, No Problem: Building Feedback-Driven Optimizers f
 or Lakehouses in Auditorium in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Birds of Feather (BOF) session: Rethinking data systems for the ag
 e of LLMs
DTSTART:20260613T100500Z
DTEND:20260613T110500Z
DTSTAMP:20260727T065252Z
UID:session/8dcKceFyR7mSP3RBfcn1Si@hasgeek.com
SEQUENCE:5
CATEGORIES:Birds of Feather (BOF) proposals – discussion on focussed top
 ics
CREATED:20260611T082047Z
DESCRIPTION:## Introduction\nOver the past few years\, the center of gravi
 ty in data systems has begun to shift. While traditional database workload
 s were dominated by deterministic transactions and analytical queries\, bo
 th industry and academic evidence now point to a rapid rise in AI-driven\,
  token-based workloads. Analysts estimate that 80% of enterprise data is u
 n
 structured\, yet historically underutilized. Today\, systems are being red
 esigned to make this data queryable through semantic and multimodal interf
 aces\, marking a transition from structured query processing to probabilis
 tic\, context-aware data workflows.\n\nIn this setting\, long-held assumpt
 ions are being challenged: LLM invocations increasingly dominate execution
  cost\, latency\, and even correctness trade-offs\, fundamentally reshapin
 g optimization priorities across the stack. Databases are no longer passiv
 e stores but active participants in reasoning loops—supporting retrieval
 \, context assembly\, and execution for AI agents. \n\nWe’ll look at how
  practitioners and researchers are rethinking optimization in this new set
 ting\, where LLM network calls are the real bottleneck\, and efficiency me
 ans reducing token usage and managing uncertainty. We’ll also explore ho
 w agents are starting to sit between users and databases\, turning queries
  into reasoning loops\, how and what data we are vectorizing and what this
  means for system design. Finally\, we’ll touch on open challenges\, inc
 luding working with time-series and evolving data\, lack of clear benchmar
 ks and ensuring reliability in probabilistic outputs\, and supporting long
 -running\, stateful workflows.\n\n## Contributors \n### [Karthik Ramachand
 ra](https://www.linkedin.com/in/karthikramachandra/)\nPartner Director of 
 Engg & India Site Lead - Azure SQL at Microsoft \, Co-founder CTO at Shrot
 aHouse\n\n\n## Takeaways\nA clear mental model of how data systems are evo
 lving\, especially in the semantic query engine space. \nPractical insight
 s into what’s changing in systems today\, including how teams are optimi
 zing LLM-heavy workloads and translating research ideas into production.\n
 \n## Who is this for?\nEngineers and Researchers building or working on da
 tabases\, data infrastructure\, or AI/data platforms
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260611T082157Z
LOCATION:Conference Room - TERI\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/ret
 hinking-data-systems-for-the-age-of-llms-8dcKceFyR7mSP3RBfcn1Si
BEGIN:VALARM
ACTION:display
DESCRIPTION:Birds of Feather (BOF) session: Rethinking data systems for th
 e age of LLMs in Conference Room in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:From Timeout to Sub-Second: Solving Scale-Dependent Deadlocks in D
 istributed Systems
DTSTART:20260613T101000Z
DTEND:20260613T104500Z
DTSTAMP:20260727T065252Z
UID:session/4scoctu9SpKoMr7G2rmvR3@hasgeek.com
SEQUENCE:11
CATEGORIES:15-minute talk – focused engineering experience
CREATED:20260531T044720Z
DESCRIPTION:**Abstract**\nIn highly coordinated distributed systems like A
 pache HBase\, operations often rely on global barriers\, synchronized proc
 edures that require every node to reach a consensus point before moving fo
 rward. At extreme scales\, these barriers become highly sensitive to threa
 d contention and coordination overhead. This talk details a real-world pro
 duction incident at Flipkart where a critical disaster recovery pipeline h
 it a hard "60-second wall"\, consistently failing due to a hidden architec
 tural flaw.\n\n\nWe will walk through the journey of diagnosing a scale-de
 pendent deadlock that only manifested in 50+ node production clusters whil
 e remaining invisible in smaller staging environments. Attendees will lear
 n how a seemingly harmless\, redundant synchronous RPC call from a worker 
 node back to the central coordinator created a circular dependency in the 
 Master’s RPC handlers\, causing the entire cluster-wide log roll procedu
 re to time out.\nThe session covers the debugging methodology used to prov
 e the deadlock\, including the use of synchronized\, multi-instance thread
  dumps across hundreds of nodes. Finally\, we discuss the architectural sh
 ift required to solve it: decoupling local worker tasks from synchronous c
 allbacks during time-sensitive global barriers.\n\n\n\n**Key Takeaways**\n
 1) _The Circular Dependency_: Understand how synchronous RPC calls within 
 a blocked coordinator thread lead to distributed deadlocks.\n2) _Practical
  Approach_: A practical guide to using synchronized\, multi-instance threa
 d dumps (taken at fixed intervals) to definitively prove a thread is block
 ed rather than just slow.\n3) _Hard Metrics_: See how removing a single re
 dundant check reduced the rolllog procedure time from a mandated 60\,000ms
  timeout failure down to just a few hundred milliseconds.\n4) _Architectur
 al Rule of Thumb_: Never allow workers to make synchronous callbacks to a 
 coordinator that is currently parked waiting for those same workers.\n\n\n
 **Target Audience**\nThis session is designed for Backend Engineers\, Syst
 ems Designers\, and SREs who are interested in database internals and the 
 practical approaches in building and scaling distributed stateful systems.
 \n\n**Slides Deck**\nhttps://docs.google.com/presentation/d/1ylQRYiXmmRqha
 3vJ0aEtzxNiG5mKG-Td8KQiOHQNsPM/edit?usp=sharing\n\n\n**About Me**\nI am Va
 run Mishra\, a senior software engineer (SDE-III) at Flipkart\, where I wo
 rk on centrally managed platforms. We are solving for high-scale distribut
 ed systems and their reliability. I have more than 7 years of experience i
 n software development and more than 5 years working on databases.\n
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260720T062932Z
LOCATION:Auditorium - TERI\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/fro
 m-timeout-to-sub-second-solving-scale-dependent-deadlocks-in-distributed-s
 ystems-4scoctu9SpKoMr7G2rmvR3
BEGIN:VALARM
ACTION:display
DESCRIPTION:From Timeout to Sub-Second: Solving Scale-Dependent Deadlocks 
 in Distributed Systems in Auditorium in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Break
DTSTART:20260613T104500Z
DTEND:20260613T111000Z
DTSTAMP:20260727T065252Z
UID:session/16qAgf6i8Pynjd5UXgDXpg@hasgeek.com
SEQUENCE:5
CREATED:20260531T045010Z
LAST-MODIFIED:20260603T164837Z
LOCATION:Polaris School of Technology\, Bengaluru
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Break in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Fast on Paper\, Slow in Reality: What We Got Wrong About Performan
 ce
DTSTART:20260613T111000Z
DTEND:20260613T114500Z
DTSTAMP:20260727T065252Z
UID:session/Rg5k9Qb8M4ekUazbaWVDYL@hasgeek.com
SEQUENCE:7
CATEGORIES:30-minute talk – technical deep dive
CREATED:20260531T045028Z
DESCRIPTION:## Description\n\nIn distributed systems engineering\, a desig
 n that is "correct on paper" is only the beginning\; the real challenge is
  making it "fast in reality." This session offers a transparent post-morte
 m of the architectural assumptions we made while building a distributed ke
 y-value store from scratch in Go\, and why several of those assumptions co
 llapsed under production-grade pressure. We’ll move beyond high-level de
 sign to deconstruct the hidden performance bottlenecks within standard dis
 tributed patterns\, exploring how generalized 2-Phase Commit (2PC) became 
 a crippling bottleneck\, why our waiting list built on Go’s standard mut
 ex became a global point of contention\, and why our initially "standard" 
 transactional steps led to redundant network and disk I/O that unexpectedl
 y doubled our latency.\n\nBy deconstructing these failures\, we provide a 
 practical roadmap for building distributed stateful systems that perform a
 s well in production as they do on paper. We will discuss our remediation 
 journey: from bypassing protocol stages for localized transactions to impl
 ementing storage-layer batching and eliminating redundant network calls to
  local nodes. Attendees will leave with a clear understanding of how to br
 idge the gap between theoretical correctness and reality in high-scale dis
 tributed databases.\n\n## Takeaways\n\n- **Protocol Fast-Paths**: Learn ho
 w to identify "safe paths" in distributed transactions to bypass the 2PC t
 ax and significantly reduce latency for shard-local operations.\n- **Lock 
 Partitioning**: Practical strategies for managing high-concurrency bottlen
 ecks in Go by moving from global locks to partitioned lock groups (using c
 oncurrent maps like xsync) to isolate contention across different request 
 paths and correlation IDs.\n- **Defensive Storage Design**: Why storage-la
 yer pagination and I/O batching are critical for preventing "OOM" and late
 ncy spikes during large-scale range queries and high-throughput operations
 .\n- **Scaling Inter-Node IO**: How moving from single to multiple persist
 ent outbound connectors per partition can dramatically increase replicatio
 n throughput and resiliency.\n\n## Target Audience\n\nThis session is desi
 gned for Backend Engineers\, Systems Designers\, and SREs who are interest
 ed in database internals and the practical performance trade-offs inherent
  in building and scaling distributed stateful systems.\n\n## Bio\nSarthak 
 Makhija is a Principal Architect at [Caizin](https://caizin.ai/) specializ
 ing in storage engines and distributed systems. While at ThoughtWorks\, he
  led the development of a strongly consistent\, distributed key-value stor
 age engine in Go from scratch.\n\nHe is a contributor to the book [Pattern
 s of Distributed Systems](https://learning.oreilly.com/library/view/-/9780
 138222246/) and writes about database internals on his blog\, [tech-lesso
 ns.in](https://tech-lessons.in/). \n\nSarthak also conducts workshops on t
 he "Internals of key-value storage engines: LSM-trees and beyond" and Rust
 .\n
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260720T053912Z
LOCATION:Auditorium - TERI\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/fas
 t-on-paper-slow-in-reality-what-we-got-wrong-about-performance-Rg5k9Qb8M4e
 kUazbaWVDYL
BEGIN:VALARM
ACTION:display
DESCRIPTION:Fast on Paper\, Slow in Reality: What We Got Wrong About Perfo
 rmance in Auditorium in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Flash talks
DTSTART:20260613T114500Z
DTEND:20260613T121500Z
DTSTAMP:20260727T065252Z
UID:session/LWj7XZaE6u9bpknqo4NwXG@hasgeek.com
SEQUENCE:2
CREATED:20260611T082229Z
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260611T082235Z
LOCATION:Conference Room - TERI\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
BEGIN:VALARM
ACTION:display
DESCRIPTION:Flash talks in Conference Room in 5 minutes
TRIGGER:-PT5M
END:VALARM
END:VEVENT
BEGIN:VEVENT
SUMMARY:Databases were not designed for this
DTSTART:20260613T115000Z
DTEND:20260613T122500Z
DTSTAMP:20260727T065252Z
UID:session/DaWePnStj81as4CqAket3U@hasgeek.com
SEQUENCE:9
CATEGORIES:30-minute talk – technical deep dive
CREATED:20260531T045101Z
DESCRIPTION:### Description\nDatabases were not designed for agents. They 
 were built around a set of implicit assumptions: callers issue predictable
  queries\, connections are short-lived\, bad queries fail loudly\, and sch
 emas are a contract with engineers. Agentic systems break every one of the
 se assumptions. Agents reason their way to queries\, hold connections whil
 e an LLM thinks\, retry operations unpredictably\, and read your schema as
  natural language -- so a column named `flg_1` is a bug\, not a style choi
 ce.\n\nThe session walks through each broken assumption and the concrete f
 ix for it: role-level timeouts\, per-agent database roles with minimum pri
 vilege\, soft deletes with agent identity tracking\, append-only event log
 s with idempotency key constraints\, and query tagging. Every fix is a SQL
  or Python snippet you can take back and apply the same week.\n\n### Takea
 ways\n\n1. Your database is not broken -- your assumptions are. Agents exp
 ose implicit contracts baked into traditional schema design\, connection p
 ooling\, and query monitoring that were never written down anywhere.\n \n2
 . A defensive data layer is not optional for agentic systems. Idempotency 
 keys\, append-only logs\, and per-agent roles are the difference between a
  recoverable mistake and silent data corruption.\n\n### Target Audience\n\
 nBackend and platform engineers who are integrating LLM agents into produc
 tion systems or are about to. Also useful for database administrators and 
 engineering leads making architectural decisions about agentic workloads\,
  or those who want to improve the robustness and observability posture of 
 their databases.\n\n### Bio\n\nArpit Bhayani is a Principal Engineer II at
  Razorpay\, where he is working at the intersection of Data and AI. In the
  past\, he was India Tech Lead for GCP Memorystore (providing managed Redi
 s to GCP customers) and GCP Dataproc (providing managed Spark ecosystem to
  GCP customers).\n\nHe writes and teaches about database internals\, syste
 m design\, and engineering fundamentals at arpitbhayani.me\, and shares co
 ntent with a large engineering audience on YouTube\, LinkedIn\, and Twitte
 r.\n
GEO:12.9634971;77.6380856
LAST-MODIFIED:20260720T063530Z
LOCATION:Auditorium - TERI\nBengaluru\nIN
ORGANIZER;CN=Rootconf:MAILTO:no-reply@hasgeek.com
URL:https://hasgeek.com/rootconf/topical-edition-on-databases/schedule/dat
 abases-were-not-designed-for-this-DaWePnStj81as4CqAket3U
BEGIN:VALARM
ACTION:display
DESCRIPTION:Databases were not designed for this in Auditorium in 5 minute
 s
TRIGGER:-PT5M
END:VALARM
END:VEVENT
END:VCALENDAR