Evaluating Agentic Applications: Overcoming the Non-deterministic World

Submitted May 14, 2025

I am submitting for: Speaking at the Fifth Elephant 2025 Annual Conference
Type of submission: 30 mins talk
Choose the topic your submission falls under: Applied AI Engineering & Agentic AI track

Agentic applications powered by large language models (LLMs) are transforming the way we build intelligent, interactive systems. However, their non-deterministic nature makes testing and evaluation significantly harder. How can we ensure reliability, correctness, and consistent behaviour in systems that are inherently probabilistic?

This talk dives into practical strategies for evaluating agentic applications. Using a real-world conversational agent as the use case, we dissect the application into its key components (query understanding, data source orchestration, tool invocation, and response synthesis) and examine how each can be evaluated deterministically, even when built on stochastic foundations.
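As a rough illustration of evaluating one component deterministically, the sketch below checks the tool-invocation step by comparing the agent's structured decision (tool name and arguments) against an expected value, rather than comparing free-form generated text. The `ToolCall` record, the `get_order_status` tool, and the `fake_agent_plan` stand-in are hypothetical names introduced only for this example; they are not taken from the talk or from any particular framework.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class ToolCall:
    """Structured record of one tool invocation (hypothetical shape)."""
    name: str
    args: Dict[str, Any]


def normalize(call: ToolCall) -> ToolCall:
    """Strip and lower-case string arguments so trivial variation is ignored."""
    cleaned = {
        key: value.strip().lower() if isinstance(value, str) else value
        for key, value in call.args.items()
    }
    return ToolCall(name=call.name, args=cleaned)


def tool_call_matches(actual: ToolCall, expected: ToolCall) -> bool:
    """Exact comparison on the structured decision, not on generated prose."""
    return normalize(actual) == normalize(expected)


def fake_agent_plan(query: str) -> ToolCall:
    """Stand-in for the real agent's tool-selection step (assumption)."""
    return ToolCall(name="get_order_status", args={"order_id": " A123 "})


expected = ToolCall(name="get_order_status", args={"order_id": "a123"})
result = tool_call_matches(fake_agent_plan("Where is my order A123?"), expected)
print(result)
```

The design choice here is to assert on the intermediate, structured output of a component, which tends to be far more stable across runs than the final natural-language response.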

Key Takeaways

Designing deterministic tests for non-deterministic agents
Evaluating agent behaviour at both the component and system level
Integrating evaluation into the agent development lifecycle
Metrics beyond accuracy: goal completion, grounding, latency, and consistency
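Two of the metrics listed above, consistency and latency, could be measured along the lines of the sketch below. It assumes consistency is defined as the fraction of repeated runs that agree with the most common outcome; that definition, the `stub_agent` function, and the outcome labels are illustrative assumptions rather than the talk's own method.

```python
import time
from collections import Counter
from typing import Callable, List, Tuple


def consistency_rate(outcomes: List[str]) -> float:
    """Fraction of runs that agree with the most common outcome."""
    if not outcomes:
        raise ValueError("need at least one outcome")
    _, top_count = Counter(outcomes).most_common(1)[0]
    return top_count / len(outcomes)


def run_repeated(
    agent: Callable[[str], str], query: str, runs: int
) -> Tuple[List[str], List[float]]:
    """Call the agent `runs` times, recording each outcome and its latency in seconds."""
    outcomes: List[str] = []
    latencies: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        outcomes.append(agent(query))
        latencies.append(time.perf_counter() - start)
    return outcomes, latencies


# Hypothetical stub: replays a fixed sequence to imitate run-to-run variation.
_canned = iter(["refund_issued", "refund_issued", "escalated", "refund_issued"])


def stub_agent(query: str) -> str:
    return next(_canned)


outcomes, latencies = run_repeated(stub_agent, "Refund order A123", runs=4)
rate = consistency_rate(outcomes)
print(rate)
```

In practice the stub would be replaced by the real agent, and the outcome would be a normalized label (for example, the final action taken) rather than raw response text.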

Target Audience

Data scientists and researchers
Architects working with AI and agentic use cases
AI enthusiasts exploring agentic applications

Speaker Bio
Shruti Dhavalikar is a seasoned Data Scientist with over 6 years of experience, currently serving as a Solution Consultant at Sahaj Software in Pune, India. Her expertise lies in transforming complex data into actionable insights, delivering end-to-end product cycles under Agile methodologies, and ensuring scalable, clean, and robust coding across diverse tech stacks. Shruti’s strong communication skills have enabled her to effectively interact with clients, contributing to successful project outcomes. Beyond her professional pursuits, she harbours a curiosity for cosmology and space, and enjoys exploring different cuisines as a travelling foodie.
