Vinayak Kadam

Structured Extraction in Production: State, Schemas, and the Engines Underneath

Submitted Jun 25, 2026

Abstract

Structured extraction from unstructured text is often presented as a solved problem, enabled by LLM frameworks, schema libraries, and built-in tooling. In practice, however, production systems expose a different set of challenges around scale, multi-turn interactions, reliability, and long-term maintainability.

This session explores lessons learned from building real-world extraction pipelines and examines the limits of prompt-centric approaches. It discusses why deterministic application logic often provides a stronger foundation for grounding and state management, why incremental or delta-based updates can outperform full-state regeneration, and how different structured-output mechanisms introduce distinct trade-offs in correctness, latency, and failure handling.
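To make the delta-based idea concrete, here is a minimal sketch of application code that owns the state and merges a small per-turn delta into it, rather than asking the model to regenerate the full state each turn. The field names, the `ExtractionState` class, and the `apply_delta` helper are hypothetical and chosen only for illustration; the deltas are hard-coded here where a real pipeline would receive them from a model call.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ExtractionState:
    """Conversation-level state owned by application code, not by the model."""
    fields: dict[str, Any] = field(default_factory=dict)
    history: list[dict[str, Any]] = field(default_factory=list)


# Hypothetical schema keys, assumed for this example only.
ALLOWED_FIELDS = {"name", "email", "city"}


def apply_delta(state: ExtractionState, delta: dict[str, Any]) -> ExtractionState:
    """Merge a per-turn delta into the state deterministically.

    Unknown keys are dropped and None values are ignored, so a turn that
    mentions nothing new cannot erase previously extracted fields.
    """
    accepted = {
        key: value
        for key, value in delta.items()
        if key in ALLOWED_FIELDS and value is not None
    }
    state.fields.update(accepted)
    state.history.append(accepted)  # keep an audit trail of accepted changes
    return state


state = ExtractionState()
apply_delta(state, {"name": "Asha", "city": "Pune"})                     # turn 1
apply_delta(state, {"city": "Mumbai", "email": None, "mood": "happy"})  # turn 2
print(state.fields)  # {'name': 'Asha', 'city': 'Mumbai'}
```

Because the merge rules live in ordinary code, they can be unit-tested independently of any model, and a malformed or empty delta degrades to a no-op instead of corrupting earlier turns.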

The talk also addresses an underappreciated operational concern: prompts, schemas, and model behavior evolve over time. Small changes to descriptions, guardrails, or underlying models can alter extraction outcomes in subtle ways. Treating these artefacts as versioned, testable components is critical for maintaining reliability in production environments.
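One possible way to treat these artefacts as versioned components is to fingerprint the prompt, schema, and model identifier together, so that any change produces a new version against which evaluation results can be recorded. The sketch below assumes a hypothetical `artefact_version` helper and placeholder schema and model names; it is an illustration of the idea, not a description of the speaker's tooling.

```python
import hashlib
import json


def artefact_version(prompt: str, schema: dict, model: str) -> str:
    """Return a short, stable fingerprint of everything that shapes the output."""
    canonical = json.dumps(
        {"prompt": prompt, "schema": schema, "model": model},
        sort_keys=True,            # key order must not affect the hash
        separators=(",", ":"),     # whitespace must not affect the hash
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


# Two hypothetical schema revisions that differ only in a field description.
schema_v1 = {"city": {"type": "string", "description": "City of residence"}}
schema_v2 = {"city": {"type": "string", "description": "City where the user lives now"}}

v1 = artefact_version("Extract the fields.", schema_v1, "model-a")
v2 = artefact_version("Extract the fields.", schema_v2, "model-a")

# A description-only edit still yields a new version, so evaluation results
# can be stored and compared per fingerprint.
print(v1 != v2)  # True
```

Storing the fingerprint alongside each evaluation run makes it possible to tell whether a shift in extraction quality coincided with a prompt edit, a schema edit, or a model change.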

Attendees will gain practical frameworks for separating extraction concerns from state management, reasoning about schemas as part of the model interface rather than passive documentation, and making deliberate architectural choices instead of relying solely on framework defaults.

Key takeaways

  • Deterministic systems should own state, grounding, and post-processing logic rather than delegating these responsibilities entirely to language models.

  • Structured-output mechanisms are not interchangeable; their underlying behavior influences accuracy, latency, and failure modes.

  • Prompts and schemas are evolving software assets that benefit from versioning, testing, and evaluation-driven development practices.
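As an illustration of the first takeaway, the sketch below shows a deterministic grounding check applied after extraction: a value is kept only if it appears in the source text. The `ground_fields` helper, the substring-matching rule, and the sample values are assumptions made for this example; real pipelines would likely need fuzzier matching for dates, numbers, and paraphrased values.

```python
def ground_fields(
    source_text: str, extracted: dict[str, str]
) -> tuple[dict[str, str], list[str]]:
    """Split extracted fields into grounded values and rejected keys.

    A value counts as grounded when its case-insensitive,
    whitespace-normalised form occurs in the source text.
    """
    normalised = " ".join(source_text.lower().split())
    grounded: dict[str, str] = {}
    rejected: list[str] = []
    for key, value in extracted.items():
        needle = " ".join(value.lower().split())
        if needle and needle in normalised:
            grounded[key] = value
        else:
            rejected.append(key)  # likely hallucinated or paraphrased
    return grounded, rejected


text = "Hi, I'm Asha and I recently moved to Pune."
# Hypothetical model output; "employer" is not supported by the text.
model_output = {"name": "Asha", "city": "Pune", "employer": "Acme"}

grounded, rejected = ground_fields(text, model_output)
print(grounded)  # {'name': 'Asha', 'city': 'Pune'}
print(rejected)  # ['employer']
```

Rejected keys can be logged or routed to a retry path, which keeps the failure handling in application code rather than in the prompt.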

Who this is for

AI/ML engineers, data scientists, and backend developers building stateful LLM systems, especially those moving structured extraction workflows from prototype to production.

Bio

Vinayak Kadam is a Solution Consultant at Sahaj Software, where he works on AI-powered and multi-agent systems for complex enterprise workflows. With experience spanning data engineering, cloud platforms, and large-scale distributed systems, he focuses on building production-ready solutions that combine intelligent automation with practical business outcomes.



Hosted by

Jumpstart better data engineering and AI futures