Anay Nayak

@anaynayak

Shipping an MLOps Platform: What we let the AI own and what we didn't

Submitted Jun 25, 2026

We built a production ML forecasting platform under a hard deadline with significant instability underneath it:

  1. EAP datasets from a live dimensional modelling migration changing schemas mid-flight
  2. ML framework compatibility failures
  3. A model architecture that went through three revisions
  4. Infrastructure requirements we were learning in real time from a team with established backend systems spanning 2000+ repositories we mostly could not access
  5. A requirement that data scientists own the platform end-to-end while still following infrastructure best practices

With only 2 data scientists and 2 MLOps engineers working through conventional handoffs, we wouldn’t have made it. What held delivery together was a human-directed, AI-assisted development cycle (build, deploy, diagnose, fix, repeat) that kept each iteration moving without waiting on a handoff between people. The foundation held: a second use case shipped on the same platform in 7 days.

This talk is an honest account of what that workflow looked like, what it got right, and what it got wrong. The centrepiece is a temporal leakage failure: the session validated the train/test split as structurally correct, and was wrong in a way that only a domain-aware human review caught after the fact. For a system live across three countries, that miss would have been invisible until the model degraded on live data. We cover how that failure reshaped the boundary between what we let the tool own end-to-end and what we always reviewed ourselves — and what that boundary looks like as a practice.
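As a generic illustration of this class of error (the abstract does not describe the actual split or data, so every name and value below is a hypothetical assumption), here is a minimal Python sketch of a train/test split that passes a structural check while still leaking future rows into training, alongside a chronological split that does not:

```python
from datetime import date, timedelta


def structural_check(train, test):
    """Rows are disjoint: a check a split can pass while still leaking."""
    train_ids = {row["id"] for row in train}
    test_ids = {row["id"] for row in test}
    return train_ids.isdisjoint(test_ids)


def has_temporal_leakage(train, test):
    """True if any training row is dated on or after the earliest test row."""
    latest_train = max(row["day"] for row in train)
    earliest_test = min(row["day"] for row in test)
    return latest_train >= earliest_test


def chronological_split(rows, test_fraction=0.2):
    """Hold out the most recent rows so training only sees the past."""
    ordered = sorted(rows, key=lambda row: row["day"])
    cut = len(ordered) - int(len(ordered) * test_fraction)
    return ordered[:cut], ordered[cut:]


# Hypothetical daily demand series: 100 consecutive days.
rows = [
    {"id": i, "day": date(2024, 1, 1) + timedelta(days=i), "demand": 100 + i}
    for i in range(100)
]

# Interleaved split: every fifth row goes to test. Structurally valid, but
# training rows sit on both sides of most test rows in time.
interleaved_train = [row for row in rows if row["id"] % 5 != 4]
interleaved_test = [row for row in rows if row["id"] % 5 == 4]

# Chronological split: the last 20 days are held out.
time_train, time_test = chronological_split(rows)

print(structural_check(interleaved_train, interleaved_test))      # True
print(has_temporal_leakage(interleaved_train, interleaved_test))  # True
print(has_temporal_leakage(time_train, time_test))                # False
```

The point of the sketch is that both splits look equally correct to a check that only asks whether the rows are disjoint; only a check that knows time ordering matters for forecasting tells them apart.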

Takeaways

  1. How to structure a human-directed AI session for iterative deployment work: custom skills that held up across multiple rewrites, failures, and mid-flight infrastructure changes.
  2. The accountability boundary in practice: the class of errors an AI coding assistant will miss confidently, illustrated by a temporal leakage failure that reached human review and what the catch actually looked like.

Audience

  1. Data and ML engineers using AI coding assistants on production work who want to move beyond one-shot prompting into sustained, iterative workflows
  2. Anyone trying to work out where AI-assisted engineering is genuinely useful and where it still needs a human in the loop

Bio

Anay Nayak is a consultant at Sahaj Software. He worked on building the MLOps platform described in this talk, and he works across data platforms, MLOps, and large-scale system design.

https://docs.google.com/presentation/d/1zOPkXgCAPUsNtWN0jsZPpzsXW1w3yfUTkNX7Jc5osTY/edit


Hosted by

Jumpstart better data engineering and AI futures