No Stats, No Problem: Building Feedback-Driven Optimizers for Lakehouses

Submitted May 9, 2026

Session type: 30-minute talk (technical deep dive)

Modern query optimizers were designed assuming that the engine has reasonably good statistics: row counts, NDVs, histograms, column correlations, table freshness, and reliable cost models. In many lakehouse environments, that assumption breaks down. This talk is about building a query optimizer that can survive in that world. We will discuss a set of practical techniques for planning under limited statistics: LEO-style learning from executed queries, equivalence sets to compensate for missing NDV and semantic constraints, auto-stats driven by “magic number” sensitivity analysis, and complementary learning from both data and query execution.
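
As a rough illustration of the feedback idea, the sketch below keeps a per-predicate correction factor learned from executed queries and applies it to later estimates. It is a minimal sketch in the spirit of LEO-style learning, not the speakers' implementation: the `FeedbackStore` class, its key format, and the exponential smoothing are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class FeedbackStore:
    """Hypothetical store of cardinality corrections keyed by (table, predicate)."""

    smoothing: float = 0.5  # weight given to the newest observation (assumed value)
    factors: dict = field(default_factory=dict)

    def record(self, key, estimated_rows, actual_rows):
        """Learn from one executed query: compare the estimate with the actual count."""
        # Clamp to 1 so that empty results or zero estimates do not divide by zero.
        observed = max(actual_rows, 1) / max(estimated_rows, 1)
        previous = self.factors.get(key)
        if previous is None:
            self.factors[key] = observed
        else:
            # Blend old and new ratios so one outlier run does not dominate.
            self.factors[key] = (
                (1 - self.smoothing) * previous + self.smoothing * observed
            )

    def adjust(self, key, estimated_rows):
        """Scale a fresh estimate by the learned factor (1.0 when nothing is known)."""
        return estimated_rows * self.factors.get(key, 1.0)


store = FeedbackStore()
key = ("orders", "status = 'open'")
store.record(key, estimated_rows=1_000, actual_rows=10_000)
corrected = store.adjust(key, 1_000)
```

After a single observation in which the estimate was ten times too low, the next estimate for the same predicate is scaled up by that factor; predicates with no feedback keep their original estimate.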

The second half of the talk presents a research direction: online parametric query optimization for recurring BI workloads. Many lakehouse queries are templatized: the same SQL shape runs repeatedly with different customer IDs, time windows, geographies, product lines, or account bindings. Most bindings behave like the common case, but a few create plan cliffs — for example, a whale account or an unusually broad date range. We will examine how an optimizer can learn compact parameter-risk regions, maintain a bounded set of useful plans, and reduce tail-latency regret.
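
One way to picture parameter-risk regions with a bounded plan set is sketched below. This is a deliberately simplified, one-dimensional illustration under stated assumptions, not the research design from the talk: `PlanSelector`, `RiskRegion`, the plan names, and the single numeric feature per binding (for example, rows observed for that customer) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RiskRegion:
    """Hypothetical interval of a binding feature in which a non-default plan wins."""

    low: float
    high: float
    plan: str


class PlanSelector:
    """Keeps a default plan plus a bounded number of learned risk regions."""

    def __init__(self, default_plan, max_plans=3):
        self.default_plan = default_plan
        self.max_plans = max_plans  # budget includes the default plan
        self.regions = []

    def choose(self, value):
        """Pick the plan for a binding; anything outside a known region is the common case."""
        for region in self.regions:
            if region.low <= value <= region.high:
                return region.plan
        return self.default_plan

    def learn(self, value, best_plan):
        """Record that best_plan was the better choice for a binding with this value."""
        if best_plan == self.default_plan:
            return
        for region in self.regions:
            if region.plan == best_plan:
                # Widen the existing region instead of storing another plan.
                # A fuller design would also resolve overlaps between regions.
                region.low = min(region.low, value)
                region.high = max(region.high, value)
                return
        if len(self.regions) + 1 < self.max_plans:
            self.regions.append(RiskRegion(value, value, best_plan))


selector = PlanSelector(default_plan="nested_loop", max_plans=2)
selector.learn(5_000_000, "hash_join")  # a whale account
selector.learn(8_000_000, "hash_join")  # another large binding widens the region
```

With these two observations, bindings between 5,000,000 and 8,000,000 rows are routed to the alternate plan and every other binding keeps the default. Once the plan budget is exhausted, further plans are ignored rather than growing the set.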

Attendees will leave with a mental model for optimizer design when statistics are incomplete by default: what to estimate, what to learn, what to collect, what to treat as uncertain, and where robust planning beats blind adaptivity.

The session is aimed at database engineers, query optimizer developers, data platform teams, and practitioners running analytical SQL over lakehouse or object-store-backed systems.

Slide: https://docs.google.com/presentation/d/1oze2xvJuLeavgb9S2LCQrZRoZiLDldZD/edit?usp=sharing&ouid=103492819106331508633&rtpof=true&sd=true

Sweta Singh leads the SQL query optimizer team at e6data. She has over two decades of experience in database systems, query optimization, distributed systems, performance engineering, and workload management. Before e6data, she spent 19 years on the IBM Db2 development team. Her work spans cost-based optimization, statistics approximation, learning optimizers, join enumeration, workload management, distributed systems, and OLTP performance engineering.

Renu Pinky Sumam is a Senior Software Engineer on the Query Optimizer team at e6data, with nearly 19 years of experience across relational database technology, cloud systems, and AI. Before joining e6data, she worked at IBM on Db2 and IBM Cloud Object Storage, where she helped rearchitect the Cloud Object Storage billing infrastructure into a serverless, cloud-native architecture.

Rootconf topical edition on Databases
