Event schedule: Fri, Feb 27, 2026, 10:00 AM – 04:30 PM IST; Sat, Feb 28, 2026, 09:00 AM – 06:00 PM IST
Govind Joshi
@govindkrjoshi1
Submitted Dec 9, 2025
AI is changing software development in two ways at once: we’re shipping AI-powered features that used to be research, and we’re adopting coding agents that can generate code faster than teams can comfortably review. The opportunity is huge—but without mature engineering systems, the result is predictable: impressive demos, fragile production, and a growing gap between how fast we can change code and how safely we should.
This BoF is built around a simple idea: the practices that make AI agents reliable in production are the same practices that make coding agents work for you instead of against you. The “boring” parts of engineering—documentation, specs, tests, CI/CD, code reviews, observability—are not legacy rituals. They’re the control system that lets you harness AI acceleration without turning your SDLC into chaos.
In this discussion, we’ll share best practices and patterns that have worked in real teams:
• Mature engineering systems & gates that scale with AI (review rubrics, CI policies, eval harnesses, rollout guardrails)
• Bridging demos → production for AI features (telemetry-first design, failure modes, fallbacks, canaries/shadow mode)
• Agentic coding loops that are safe and high-leverage (small diffs, acceptance criteria, automated verification, human-in-the-loop approvals); a minimal merge-gate sketch follows this list
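As a concrete illustration of the kind of gate we mean, here is a minimal merge-gate sketch in Python. It is a sketch only: the Change structure, the 300-line threshold, and the check wording are illustrative assumptions, not tooling presented in the session.

# Minimal sketch of a merge gate for agent-generated changes.
# The Change dataclass, the threshold, and the check names are illustrative assumptions.
from dataclasses import dataclass, field

MAX_CHANGED_LINES = 300  # keep diffs small enough for a genuine human review

@dataclass
class Change:
    changed_lines: int
    tests_passed: bool
    acceptance_criteria_met: bool
    human_approved: bool
    notes: list[str] = field(default_factory=list)

def merge_gate(change: Change) -> bool:
    """Return True only if every guardrail passes; record the reasons otherwise."""
    checks = [
        (change.changed_lines <= MAX_CHANGED_LINES, "diff too large for safe review"),
        (change.tests_passed, "automated verification (tests) failed"),
        (change.acceptance_criteria_met, "acceptance criteria not demonstrated"),
        (change.human_approved, "missing human-in-the-loop approval"),
    ]
    for ok, reason in checks:
        if not ok:
            change.notes.append(reason)
    return not change.notes

The point is not these particular checks but that they run automatically on every agent-produced change, so review effort stays proportional to risk.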
This session is relevant if you’re:
• Shipping AI features in an existing product and want them to be reliable, observable, and maintainable
• Introducing coding agents into your team and want speed without regressions
• Using AI-assisted coding personally and want a workflow that consistently produces mergeable, production-grade changes
Expected takeaways
• A practical “minimum viable maturity” checklist for AI adoption in the SDLC
• A shared set of patterns for reliable AI features and productive coding agents—because they’re more similar than they look
Previous talk submission
This talk starts with a familiar story: the 8:30 PM scramble before a demo, endlessly tweaking prompts until the bot behaves just enough for tomorrow morning. The demo goes well, everyone is happy, the feature is greenlit—and then it quietly falls apart in production. Users repeat themselves. Interruptions break the flow. Tool calls misfire. You have recordings but no traces, complaints but no repro steps, and you’re stuck in the same “tweak and pray” loop—just with more traffic and higher stakes.
In this session, I’ll argue that the difference between “cool demo” and “reliable product” is not model choice or prompt cleverness, but engineering maturity: documentation, observability, evals, datasets, CI/CD, and feedback loops. We’ll reframe AI product development as discovery, not invention, and walk through concrete practices for building that discovery engine: how to log and trace every LLM and tool call, design evals that actually catch regressions, turn production traffic into datasets, and build a flywheel where every failure makes the system stronger. You’ll leave with a pragmatic checklist you can apply to your current AI project without a full platform rewrite.
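To make “log and trace every LLM and tool call” concrete, here is a minimal sketch of trace-first logging and of mining tagged failures back into eval cases. The llm_call stub, the JSONL file, and the field names are placeholder assumptions, not the speaker’s actual stack.

# Minimal sketch: trace every call, then turn tagged failures into eval cases.
# llm_call is a stub; the JSONL path and field names are illustrative assumptions.
import json, time, uuid
from pathlib import Path

TRACE_LOG = Path("traces.jsonl")

def llm_call(prompt: str) -> str:
    return f"echo: {prompt}"  # swap in a real model client here

def traced_call(prompt: str, user_id: str) -> str:
    trace = {
        "trace_id": str(uuid.uuid4()),
        "user_id": user_id,
        "ts": time.time(),
        "prompt": prompt,
        "outcome": None,  # filled in later by user feedback or an eval run
    }
    trace["output"] = llm_call(prompt)
    with TRACE_LOG.open("a") as f:
        f.write(json.dumps(trace) + "\n")
    return trace["output"]

def failures_to_eval_cases() -> list[dict]:
    """Turn traces tagged as failures into regression-test inputs."""
    cases = []
    for line in TRACE_LOG.read_text().splitlines():
        record = json.loads(line)
        if record.get("outcome") == "failure":
            cases.append({"input": record["prompt"], "bad_output": record["output"]})
    return cases

The design choice worth copying is that every production interaction already lands in a shape an eval can consume, so building the dataset becomes a tagging exercise rather than a reconstruction project.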
⸻
Mention 1–2 takeaways from your session
• You’ll learn a practical definition of engineering maturity for AI applications and a minimal set of non-negotiables (docs, observability, evals, datasets, CI/CD) that turn fragile demos into reliable systems.
• You’ll leave with a concrete “flywheel” pattern for AI products—how to capture data, tag outcomes, run evals, and iterate—so you can answer “Can we ship this to 100,000 users?” with data instead of hope (a minimal eval-harness sketch follows)
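For the “run evals” step, a minimal CI-style harness might look like the sketch below. The app_under_test entry point, the exact-match grader, the evals.jsonl dataset, and the 95% threshold are all illustrative assumptions; real graders are usually task-specific or model-assisted.

# Minimal sketch of an eval harness that can block a merge on regression.
# app_under_test, the grader, evals.jsonl, and the 95% gate are illustrative assumptions.
import json, sys
from pathlib import Path

PASS_RATE_THRESHOLD = 0.95  # pick a gate per product; this number is arbitrary

def app_under_test(prompt: str) -> str:
    return prompt.strip().lower()  # placeholder for the real application entry point

def grade(output: str, expected: str) -> bool:
    return output.strip().lower() == expected.strip().lower()  # simplest possible grader

def run_evals(dataset_path: str = "evals.jsonl") -> float:
    cases = [json.loads(line) for line in Path(dataset_path).read_text().splitlines()]
    passed = sum(grade(app_under_test(c["input"]), c["expected"]) for c in cases)
    return passed / len(cases) if cases else 0.0

if __name__ == "__main__":
    rate = run_evals()
    print(f"pass rate: {rate:.2%}")
    sys.exit(0 if rate >= PASS_RATE_THRESHOLD else 1)  # non-zero exit blocks the merge

Run as a CI step, the failing exit code is what turns “the demo felt fine” into a measurable shipping decision.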
⸻
Which audiences is your session going to be beneficial for?
This session will be most useful for:
• Engineering managers and tech leads responsible for shipping AI features to production
• Senior/principal engineers and ML/AI engineers working with LLMs, tools, and agents
• Product managers and founders trying to turn promising AI prototypes into reliable products
• Platform / infra / DevOps engineers designing internal AI platforms or evaluation/observability stacks
⸻
Add your bio – who you are; where you work
I’m Govind Joshi, an independent software engineer based in India who spends an unreasonable amount of time building AI-powered systems that actually have to work in the real world. I focus on applied AI: LLM-driven agents that can call tools, handle real users over phone and chat, and operate reliably under production traffic.
Over the last few years, I’ve worked with teams to design and ship AI assistants, voice bots, and evaluation/observability pipelines for LLM applications. I care a lot about the “boring” parts—architecture, evals, monitoring, and engineering maturity—and how they turn AI demos into products you can trust.