Rows, columns, and consequences

Speak at Rootconf’s Special Edition on Databases

Mario Rozario

@mario_the_legend

The Query That Ran for 47 Days. Then Didn't.

Submitted Apr 29, 2026

{Describe your session in 2 paragraphs}
This talk is built around a single production incident. A batch pipeline had run reliably for 47 consecutive nights, processing between 200 and 300 million rows and feeding a dashboard that an operations team depended on every morning. On night 48, it silently collapsed. No code changed. No configuration changed. By 9am, with a P1 raised and the dashboard down, we were staring at the screen of our observability tool, Viewpoint, which now told a story nobody had anticipated. The query was not slow. It was paralysed. There is a difference, and understanding that difference is what this talk is about.

Teradata’s cost-based optimizer is one of the most sophisticated query planners in the industry, the product of decades of research translated into production software running at petabyte scale. And yet, even the best optimizer is only as good as the information it operates on. When that information is stale, incomplete, or simply surprised by reality, the consequences in a massively parallel environment are not subtle. They are catastrophic.

{Mention 1-2 takeaways from your session}
We will walk through how we diagnosed the incident, namely what the signals were, where we looked, what we found, and why our first fix did not work. Along the way, we will get into how Teradata’s optimizer actually makes its decisions, where its safety nets exist, and crucially, where those safety nets have limits that only show up under specific real-world conditions.

The technical resolution is one part of the story. The other part is what the incident revealed about the broader system: the processes, assumptions, and blind spots that allowed a known class of problem to land without warning. The session mixes theory and practice, and the talk lives in the space between them.

  1. Attendees will leave with a concrete, transferable diagnostic framework: not just a list of tuning tips, but a way of thinking about optimizer failures in massively parallel systems.

  2. How to read EXPLAIN confidence levels as early warning signals, not just plan descriptions.

{Which audiences is your session going to be beneficial for?}

  1. Engineers and DBAs who operate large-scale analytical systems and have experienced query performance that defied easy explanation.
  2. Distributed systems practitioners curious about how massively parallel shared-nothing architectures handle data skew in practice.
  3. Database researchers interested in the gap between published hot-key mitigation techniques and what production constraints actually allow.
  4. Anyone who has stared at an EXPLAIN plan at 6am and wondered what it was actually trying to tell them.

{Add your bio - who you are; where you work}

Mario leads Technical Delivery and Enablement at Teradata, steering a team focused on delivering cutting-edge technical solutions and training across the data analytics space. Over more than a decade working with Teradata’s platform, he has helped global clients unlock actionable insights from their data, with work spanning query optimisation, workload management, large-scale data modelling, and the kind of production incidents that permanently reshape how you think about systems.

Beyond the platform, Mario brings a strong grounding in AI and ML — backed by a Post Graduate Program from the University of Texas and certifications including Google Cloud Platform. He has driven analytics hackathons and trained upwards of 800 associates, building a culture of technical rigour and continuous learning within his teams.

He brings to this talk both the technical depth of someone who has spent years inside Teradata’s internals and the enablement instinct of someone whose job is to make that knowledge useful to others, including at 6am, when a pipeline that ran perfectly for 47 nights suddenly doesn’t.

