The Fifth Elephant 2025 Annual Conference CfP

Shruti Dhavalikar

@shrutidhavalikar

Evaluating Agentic Applications: Overcoming the Non-deterministic World

Submitted May 14, 2025

Agentic applications powered by large language models (LLMs) are transforming the way we build intelligent, interactive systems. However, their non-deterministic nature makes testing and evaluation significantly harder. How can we ensure reliability, correctness, and consistent behaviour in systems that are inherently probabilistic?

This talk dives into practical strategies for evaluating agentic applications, using a real-world conversational agent as the running use case. We dissect the application into its key components (query understanding, data source orchestration, tool invocation, and response synthesis) and examine how each can be evaluated deterministically, even when built on stochastic foundations.
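
As an illustration of evaluating one component deterministically, the sketch below checks tool invocation by comparing the structured tool calls an agent emits against an expected list, rather than comparing free-form text. It is not taken from the talk: `ToolCall`, `make_call`, `fake_agent_plan`, and the `get_weather` tool are hypothetical names, and the stand-in planner is a plain function so the example runs without an LLM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    """Structured record of one tool invocation (hypothetical shape)."""
    name: str
    arguments: tuple  # sorted (key, value) pairs, so calls compare by value


def make_call(name, **arguments):
    """Build a ToolCall with arguments in a canonical (sorted) order."""
    return ToolCall(name=name, arguments=tuple(sorted(arguments.items())))


def fake_agent_plan(query):
    """Stand-in for the agent's planning step (assumption: the real agent
    would expose its chosen tool calls in a similar structured form)."""
    if "weather" in query.lower():
        return [make_call("get_weather", city="Pune")]
    return []


def tool_calls_match(actual, expected):
    """Deterministic check: same tools, same arguments, same order."""
    return list(actual) == list(expected)


expected = [make_call("get_weather", city="Pune")]
print(tool_calls_match(fake_agent_plan("What's the weather in Pune?"), expected))
```

Asserting on the structured call keeps the test stable even if the wording of the final response varies from run to run.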

Key Takeaways

  • Designing deterministic tests for non-deterministic agents
  • Evaluating agent behaviour at both the component and system level
  • Integrating evaluation into the agent development lifecycle
  • Metrics beyond accuracy: goal completion, grounding, latency, and consistency
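
Two of the metrics above, consistency and goal completion, can be sketched over repeated runs of the same prompt. This is a minimal illustration under stated assumptions, not the speaker's method: `consistency` and `goal_completion_rate` are hypothetical helper names, each run is reduced to a short outcome label, and consistency is defined here as the share of runs that agree with the most common outcome.

```python
from collections import Counter


def consistency(outputs):
    """Fraction of runs that agree with the most common outcome."""
    if not outputs:
        raise ValueError("need at least one run")
    most_common_count = Counter(outputs).most_common(1)[0][1]
    return most_common_count / len(outputs)


def goal_completion_rate(outputs, is_goal_met):
    """Fraction of runs for which the caller-supplied goal check passes."""
    if not outputs:
        raise ValueError("need at least one run")
    return sum(1 for out in outputs if is_goal_met(out)) / len(outputs)


# Hypothetical outcome labels from four runs of the same user request.
runs = ["booked", "booked", "failed", "booked"]
print(consistency(runs))
print(goal_completion_rate(runs, lambda out: out == "booked"))
```

Both functions take plain lists, so they can be computed from logged runs without calling the agent again.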

Target Audience

  • Data scientists and researchers
  • Architects working with AI and agentic use cases
  • AI enthusiasts exploring agentic applications

Speaker Bio
Shruti Dhavalikar is a seasoned Data Scientist with over 6 years of experience, currently serving as a Solution Consultant at Sahaj Software in Pune, India. Her expertise lies in transforming complex data into actionable insights, delivering end-to-end product cycles under Agile methodologies, and ensuring scalable, clean, and robust coding across diverse tech stacks. Shruti’s strong communication skills have enabled her to effectively interact with clients, contributing to successful project outcomes. Beyond her professional pursuits, she harbours a curiosity for cosmology and space, and enjoys exploring different cuisines as a travelling foodie.

Hosted by

Jump starting better data engineering and AI futures