The Fifth Elephant - Pune Edition

AI at the heart of industry & innovation

Kiran Kulkarni

@kiran_kulkarni

Workshop: Build & Optimise AI Agents That Survive Production

Submitted Jan 5, 2026

Description

It’s easy to ship a magical agent demo. It’s much harder to ship an agent that holds up with real users, where it faces noisy inputs, partial context, flaky tools, ambiguous goals, and “tiny prompt changes” that break everything.

In this hands-on workshop, we’ll build a small but realistic agent in Python + DSPy, then turn it into something you can actually run in production: structured I/O, tool contracts, tracing, evals, and automatic optimisation.

You’ll leave with a concrete engineering workflow (an “agent improvement loop”) that you can take back to your team:

instrument → collect failures → convert to evals → optimise → ship via CI.

The techniques are framework-agnostic; we’ll use DSPy because it makes optimisation and modularity explicit in code.
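
The first three steps of the loop (instrument → collect failures → convert to evals) can be sketched in plain Python. This is only an illustration under stated assumptions: the `Trace` schema, the `traced` wrapper, `failures_to_eval_cases`, and the toy agent are hypothetical names invented here, not DSPy APIs and not code from the workshop repo.

```python
from dataclasses import dataclass


@dataclass
class Trace:
    """One recorded agent run (hypothetical schema, for illustration only)."""
    user_input: str
    output: str
    ok: bool                   # did the run pass its check?
    failure_bucket: str = ""   # e.g. "wrong_answer" or an exception name


def traced(agent_fn, check_fn, log):
    """Instrument: wrap an agent so every call appends a Trace to `log`."""
    def wrapper(user_input):
        try:
            output = agent_fn(user_input)
            ok = check_fn(user_input, output)
            bucket = "" if ok else "wrong_answer"
        except Exception as exc:  # tool or input errors end up here
            output, ok, bucket = "", False, type(exc).__name__
        log.append(Trace(user_input, output, ok, bucket))
        return output
    return wrapper


def failures_to_eval_cases(log, expected_for):
    """Convert failed traces into eval cases with a known-good expected answer."""
    return [
        {"input": t.user_input,
         "expected": expected_for(t.user_input),
         "bucket": t.failure_bucket}
        for t in log if not t.ok
    ]


def toy_agent(text):
    """Stand-in for a real agent: should upper-case its input, but has two bugs."""
    if not text.strip():
        raise ValueError("empty input")
    return text.upper() if len(text) < 10 else text  # bug: long inputs untouched


log = []
agent = traced(toy_agent, lambda inp, out: out == inp.upper(), log)
for query in ["hello", "a much longer input", "   "]:
    agent(query)

# Collect failures and turn them into regression cases for the eval set.
eval_cases = failures_to_eval_cases(log, expected_for=str.upper)
for case in eval_cases:
    print(case)
```

In a real system the expected answers would come from human review or a trusted reference rather than a one-line function, and the buckets would be refined by hand as failure patterns emerge.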

Key Takeaways

  • A practical mental model of agents: goal → plan/act loop → tool calls → ground-truth checks → stop conditions.
  • How to build agents as maintainable software (signatures/modules) instead of brittle prompt blobs.
  • How to add observability + evals so you can debug “why it failed” and measure progress.
  • How to use DSPy optimisers (few-shot + program/prompt optimisation) to improve quality systematically.
  • A repeatable CI workflow to keep agents improving safely as users and requirements change.
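
The mental model in the first takeaway (goal → plan/act loop → tool calls → ground-truth checks → stop conditions) can be written as a short framework-free loop. This is a minimal sketch, not the workshop's implementation: `AgentState`, `run_agent`, the scripted planner (a stand-in for an LLM), and the dictionary-backed tool are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AgentState:
    goal: str
    steps: list = field(default_factory=list)  # (tool_name, arg, result) tuples
    answer: Optional[str] = None


def run_agent(goal, plan_next, tools, is_grounded, max_steps=5):
    """Run a plan/act loop until a grounded answer is found or the budget runs out."""
    state = AgentState(goal)
    for _ in range(max_steps):               # stop condition 1: step budget
        name, arg = plan_next(state)         # plan: (tool_name, arg) or ("finish", answer)
        if name == "finish":
            if is_grounded(state, arg):      # ground-truth check before accepting
                state.answer = arg
                return state                 # stop condition 2: verified answer
            state.steps.append(("finish", arg, "rejected: not grounded"))
            continue
        try:
            result = tools[name](arg)        # act: tool call
        except Exception as exc:
            result = f"error: {exc}"         # tool failures become observations
        state.steps.append((name, arg, result))
    return state                             # budget exhausted; answer stays None


def scripted_planner(state):
    """Stand-in for an LLM: look the value up once, then finish with it."""
    if not state.steps:
        return ("lookup", "pune")
    return ("finish", state.steps[-1][2])


def grounded(state, answer):
    """Accept only answers that some tool call actually returned."""
    return any(result == answer for _, _, result in state.steps)


tools = {"lookup": {"pune": "Maharashtra"}.__getitem__}
final = run_agent("Which state is Pune in?", scripted_planner, tools, grounded)
print(final.answer, final.steps)
```

Keeping the planner, tools, and grounding check as separate arguments is one way to make each piece testable on its own; a DSPy module would replace the scripted planner.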

Target Audience

  • Level: Intermediate
  • Prerequisites:
    • Comfortable with Python (APIs, functions, virtualenv/uv)
    • Basic familiarity with LLMs (prompts, tool calling concepts)
    • Laptop + internet access
  • Best suited for:
    • Software engineers / platform engineers building LLM features
    • SDETs / QA engineers working on evals and reliability
    • Engineering managers and tech leads who need a production-ready approach

Workshop outline (3 hours)

  1. Anatomy of a production agent (15 min)

    Agent loop, tool contracts, ground-truth checks, stop conditions, failure modes.

  2. Build the agent in DSPy (60 min)

    Signatures + modules, tool wiring, structured outputs, error handling.

  3. Observability & evals (45 min)

    Tracing, failure buckets, creating an eval set from real-ish cases, measuring baseline.

  4. Optimisation (45 min)

    Few-shot baselines → DSPy optimiser run → compare metrics + inspect deltas.

  5. Shipping the improvement loop (15 min)

    Minimal CI pattern: run evals on PR, regressions gate merges, version prompts/programs.
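
Steps 3 to 5 of the outline (measure a baseline, compare a candidate, gate merges on regressions) can be sketched as follows. This assumes an exact-match metric and two toy programs; `run_evals`, `compare`, and `gate` are hypothetical helpers for illustration, not DSPy functions, and the workshop may structure this differently.

```python
def run_evals(program, cases, metric):
    """Return per-case scores keyed by case id."""
    return {c["id"]: metric(program(c["input"]), c["expected"]) for c in cases}


def compare(baseline, candidate):
    """Summarise mean scores and list the cases that got worse."""
    def mean(scores):
        return sum(scores.values()) / len(scores)
    regressions = sorted(k for k in baseline if candidate[k] < baseline[k])
    return {
        "baseline": mean(baseline),
        "candidate": mean(candidate),
        "regressions": regressions,
    }


def gate(report, max_drop=0.0):
    """Exit code for CI: 0 if the mean score dropped by at most max_drop, else 1."""
    return 0 if report["candidate"] >= report["baseline"] - max_drop else 1


def exact_match(output, expected):
    return 1.0 if output == expected else 0.0


def baseline_program(text):
    """Toy 'before' program: fails to upper-case long inputs."""
    return text.upper() if len(text) < 10 else text


def candidate_program(text):
    """Toy 'after' program, standing in for an optimised agent."""
    return text.upper()


cases = [
    {"id": "greet", "input": "hi", "expected": "HI"},
    {"id": "blank", "input": " ", "expected": " "},
    {"id": "long", "input": "a longer input", "expected": "A LONGER INPUT"},
]

report = compare(
    run_evals(baseline_program, cases, exact_match),
    run_evals(candidate_program, cases, exact_match),
)
exit_code = gate(report)
print(report, exit_code)
```

In a CI job the script would end with `sys.exit(exit_code)` so that a regression fails the pull request check. Gating on the mean alone can hide individual cases that got worse, which is why the report also lists per-case regressions for inspection.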

Setup

The workshop skeleton and requirements can be found in this repo: https://github.com/unravel-team/real-agents-workshop

Speaker bios (short)

Kiran Kulkarni is the founder of Unravel.tech, where he helps teams build production-grade AI systems—agentic workflows, evaluation pipelines, and reliability/observability practices. He’s been a founding engineer and engineering leader across data + AI systems and loves turning “cool demos” into software that survives real users.

Utkarsh Dighe is a senior engineer at Unravel.tech, where he designs and builds pragmatic solutions using Agentic AI to tackle complex problems across domains. He takes an engineering-first approach, focusing on reliability, robustness, and scalability—ensuring systems don’t just work in theory, but hold up under real-world usage.
