🚀 Showcase your Open Source AI work!
Get expert feedback & present at the OSAI meet-ups and festivals in 2025!
Tricha Anjali
@tanjali
Submitted Mar 30, 2025
Describe your session in 2 paragraphs
GenAI is transforming the way we build and deploy software — but beneath the surface of working POCs lies a critical challenge: can we test these systems? In traditional software, test-driven development is a given. But with LLMs’ probabilistic outputs and evolving prompts, most AI systems today ship without clear test strategies. This session will look into testing GenAI systems — from prompts to agents.
Drawing from our experience at Numberz.ai — where we build GenAI-driven financial agents to flag inconsistencies, detect shenanigans, and explain anomalies — this talk will walk through testable surfaces of GenAI: from prompt behavior to output grounding to agent workflows. The goal is to share actionable ways to build reliable AI systems, especially in high-stakes domains like finance.
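As one illustration of what testing "output grounding" can look like, the sketch below flags any number in a model's answer that does not appear in the source text it was given. This is a deliberately simple, hypothetical check written for this summary, not a description of the Numberz.ai implementation; the function name `ungrounded_numbers` and the sample strings are assumptions.

```python
import re

# Matches integers and simple decimals such as "98" or "120.5".
NUMBER_PATTERN = r"\d+(?:\.\d+)?"

def ungrounded_numbers(source: str, answer: str) -> set:
    """Return numbers that appear in the answer but not in the source.

    Hypothetical helper: a non-empty result suggests the answer may
    contain figures that are not grounded in the provided document.
    """
    source_numbers = set(re.findall(NUMBER_PATTERN, source))
    return {n for n in re.findall(NUMBER_PATTERN, answer) if n not in source_numbers}

source_text = "Revenue was 120.5 in Q1 and 98 in Q2."
grounded = ungrounded_numbers(source_text, "Revenue fell from 120.5 to 98.")
suspicious = ungrounded_numbers(source_text, "Revenue rose to 150.")
```

A check like this only catches one narrow class of grounding failure (invented figures); it would typically sit alongside other tests rather than replace them.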
Mention 1-2 takeaways from your session
How to design and implement testable GenAI systems using patterns like prompt regression, LLM-as-a-judge and golden data.
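To make the prompt regression and golden data patterns concrete, here is a minimal sketch under stated assumptions: a small golden dataset pairs inputs with phrases the answer must contain, and a prompt template is re-run against every case whenever it changes. The names `GOLDEN_CASES`, `fake_model`, and `run_regression` are hypothetical, and `fake_model` is a deterministic stand-in for a real LLM call so the example runs offline.

```python
import re
from typing import Callable, Dict, List

# Hypothetical golden dataset: each input is paired with phrases the answer must contain.
GOLDEN_CASES: List[Dict] = [
    {"input": "Invoice total 1200, payments 1000", "must_contain": ["200"]},
    {"input": "Invoice total 500, payments 500", "must_contain": ["no discrepancy"]},
]

# The prompt under test; it contains no digits so the stand-in model only sees case numbers.
TEMPLATE = "Compare the invoice total with the payments and report any discrepancy.\n{input}"

def fake_model(prompt: str) -> str:
    """Deterministic stand-in for an LLM call (assumption for this sketch)."""
    total, paid = (int(n) for n in re.findall(r"\d+", prompt)[:2])
    diff = total - paid
    return "No discrepancy found." if diff == 0 else f"Discrepancy of {diff} detected."

def run_regression(template: str, model: Callable[[str], str], cases: List[Dict]) -> List[str]:
    """Run every golden case through the prompt and return the inputs that failed."""
    failures = []
    for case in cases:
        answer = model(template.format(input=case["input"])).lower()
        if not all(phrase.lower() in answer for phrase in case["must_contain"]):
            failures.append(case["input"])
    return failures

failures = run_regression(TEMPLATE, fake_model, GOLDEN_CASES)
```

In this sketch an empty `failures` list means the current prompt still satisfies the golden set. The substring check is the simplest possible scorer; in an LLM-as-a-judge setup it would be replaced by a second model call that grades each answer against a rubric.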
Which audiences is your session going to be beneficial for?
This session is ideal for AI/ML practitioners, product managers, CTOs, and engineering leads who are deploying LLMs in real-world applications.
Add your bio - who you are; where you work
Tricha Anjali is the Founder of Numberz.ai, a platform that uses AI to automate forensic finance and compliance workflows. With multiple years of experience in enterprise tech, Tricha combines deep technical insight with a product-led mindset.