Your AI Agent is lying to you: Observability for LLM systems in production

Fri, 31 Jul 2026, 09:00 AM – 06:00 PM IST

Submitted Jun 1, 2026

I am submitting for: Track 2 - Building & implementing AI tools & agents in production
Type of session: 30 mins talk

You have shipped your LLM-powered agent. Congrats. Now, do you actually know what it is doing? Most teams are flying blind in production and discover issues only when users complain, by which point the damage is already done. This talk dives deep into the observability gap in GenAI systems: why traditional APM tools were never designed for non-deterministic, multi-step agentic workflows, and why bolting them on creates a false sense of confidence.

We will explore what real observability looks like for LLM applications, covering distributed traces across agent hops, evaluation frameworks for output quality, and feedback loops that catch regressions before they reach users. You will leave with a practical framework for debugging, evaluating, and continuously improving AI systems in production, so you are never again the last person to find out something went wrong.
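As a rough illustration of what "distributed traces across agent hops" can mean, here is a minimal sketch using only the Python standard library. The `Tracer` class, the span names (`agent.run`, `llm.plan`, `tool.search`, `llm.answer`), and the stubbed model and tool outputs are all hypothetical and are not taken from the talk. A real system would export spans to a tracing backend instead of keeping them in memory.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    name: str
    start: float
    end: Optional[float] = None
    attributes: dict = field(default_factory=dict)


class Tracer:
    """Hypothetical in-memory tracer: one trace, nested spans."""

    def __init__(self):
        self.trace_id = uuid.uuid4().hex
        self.spans = []      # finished spans, in completion order
        self._stack = []     # currently open spans, innermost last

    @contextmanager
    def span(self, name, **attributes):
        # The innermost open span (if any) becomes the parent.
        parent_id = self._stack[-1].span_id if self._stack else None
        current = Span(
            trace_id=self.trace_id,
            span_id=uuid.uuid4().hex[:16],
            parent_id=parent_id,
            name=name,
            start=time.perf_counter(),
            attributes=dict(attributes),
        )
        self._stack.append(current)
        try:
            yield current
        finally:
            current.end = time.perf_counter()
            self._stack.pop()
            self.spans.append(current)


def run_agent(tracer, question):
    """Stubbed three-hop agent: plan, call a tool, answer."""
    with tracer.span("agent.run", question=question) as root:
        with tracer.span("llm.plan", model="stub-model") as plan:
            plan.attributes["output"] = "call search tool"
        with tracer.span("tool.search", query=question) as tool:
            tool.attributes["result_age_s"] = 5  # assumed freshness metadata
        with tracer.span("llm.answer", model="stub-model") as answer:
            answer.attributes["output"] = "42"
        root.attributes["answer"] = answer.attributes["output"]
    return root.attributes["answer"]
```

Because every hop records its parent, a single user request can be reconstructed as a tree, which is what makes it possible to ask which hop produced a bad answer rather than only whether the request returned successfully.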

Takeaways:

A three-layer observability framework (tracing, evaluation, and feedback loops) that can be applied to any LLM-powered agent system regardless of stack or model provider, giving attendees a concrete blueprint for operating GenAI systems in production with confidence.
A practical mental model for identifying and diagnosing silent failures in agentic workflows: the subtle regressions, stale tool outputs, and reasoning drift that traditional monitoring tools miss entirely. With it, teams stop being the last to know when something goes wrong.
An understanding of how production observability directly reduces mean time to resolution for LLM failures, translating into faster debugging cycles, more reliable agent behavior, and greater confidence when shipping GenAI systems at scale.
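To make the evaluation and feedback-loop layers a little more concrete, the sketch below shows one possible shape for them. Everything here is an assumption made for illustration: the `AgentRun` record, the check names, the one-hour staleness limit, and the 5% tolerance are hypothetical and do not come from the talk. The idea is simply that each run is scored by cheap checks, and the pass rate of a recent window is compared against a baseline so a silent regression surfaces before users report it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AgentRun:
    answer: str
    tool_result_age_s: float       # how old the tool data was when used
    user_feedback: Optional[int]   # +1, -1, or None if no feedback given


def evaluate(run: AgentRun, max_tool_age_s: float = 3600.0) -> List[str]:
    """Return the names of the checks this run failed (empty list = pass)."""
    failures = []
    if not run.answer.strip():
        failures.append("empty_answer")
    if run.tool_result_age_s > max_tool_age_s:  # assumed staleness limit
        failures.append("stale_tool_output")
    if run.user_feedback == -1:
        failures.append("negative_feedback")
    return failures


def pass_rate(runs: List[AgentRun]) -> float:
    """Fraction of runs with no failed checks; 1.0 for an empty window."""
    if not runs:
        return 1.0
    passed = sum(1 for run in runs if not evaluate(run))
    return passed / len(runs)


def regressed(baseline: List[AgentRun], current: List[AgentRun],
              tolerance: float = 0.05) -> bool:
    """Flag a regression when the current pass rate drops below the
    baseline pass rate by more than the (assumed) tolerance."""
    return pass_rate(current) < pass_rate(baseline) - tolerance


baseline = [AgentRun("ok", 10.0, 1) for _ in range(4)]
current = [
    AgentRun("ok", 10.0, None),
    AgentRun("ok", 20.0, 1),
    AgentRun("ok", 7200.0, None),   # answer built on two-hour-old tool data
    AgentRun("ok", 10.0, -1),       # user flagged the answer
]
print(regressed(baseline, current))
```

Note that the stale-tool run returns a perfectly normal-looking answer. No error is raised and no latency alert fires, which is why this kind of failure tends to stay invisible to request-level monitoring.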

Benefits to the Ecosystem:
LLM observability is one of the most underserved areas in production AI today. Most engineering teams shipping agents have no systematic way to know when their systems are hallucinating, regressing, or silently failing. This talk gives the community a practical, vendor-neutral framework for closing that gap.

Bio:
Siddhant Agarwal is a seasoned Developer Relations professional with over a decade of experience building and scaling global developer ecosystems. He is currently a Senior Developer Relations Advocate at ClickHouse, leading developer engagement across the APJ region, where he works at the intersection of real-time analytics, data infrastructure, and developer experience.

Previously, Sid led Developer Relations across APAC at Neo4j, where he was responsible for building and scaling the regional DevRel strategy by driving developer adoption, growing communities, launching programs and events, and representing the region across product, engineering, and go-to-market initiatives.

Before that, Sid managed flagship developer programs at Google. A recognized Google Developer Expert (GDE) in Gen-AI, he is passionate about empowering developers to reimagine how data and AI systems are built and understood.

Known for his ability to tell powerful stories through technology, Siddhant translates complex systems into narratives that inspire action and innovation. With his signature “Local to Global” approach, he helps grassroots developer communities scale their ideas into global impact. He continues to shape communities, share insights, and drive meaningful connections across the tech ecosystem.

Learn more at meetsid.dev

Draft slides:
To be added later

Elevator Pitch:
To be added later

Speak at The Fifth Elephant 2026 Annual Conference
