Speak at The Fifth Elephant 2026 Annual Conference
Share your work with the community
Fri, Jul 31, 2026, 09:00 AM – 06:00 PM IST
Submitted Jun 25, 2026
Generative UI changes what an agent can return. Instead of answering with a wall of text, the agent can render charts, forms, workflows, tables, dashboards, and other interactive interfaces. That looks great in demos, but production exposed a harder question for us: what should the model actually emit to make this reliable, fast, and streamable?
At Thesys, our first version used JSON. It was the obvious choice, and it worked well enough early on. But once we had 10,000+ developers building with it and real traffic going through the system, JSON started showing its limits. Outputs were token-heavy, latency grew with response size, and partial or malformed JSON during streaming often meant retries or broken renders. The core issue was simple: JSON needs to be structurally complete before it can be parsed, while streaming UI needs to become useful before the full response is done.
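To make that constraint concrete, here is a minimal Python sketch. The payload shape and the truncation point are invented for illustration and are not the Thesys schema; the only point is that a standard JSON parser rejects a prefix of an otherwise valid document.

```python
import json

# Hypothetical UI payload; the shape is illustrative only.
full = json.dumps({"type": "table", "rows": [[1, 2], [3, 4]]})


def try_parse(chunk: str):
    """Return the parsed object, or None if the chunk is not yet valid JSON."""
    try:
        return json.loads(chunk)
    except json.JSONDecodeError:
        return None


# Simulate a stream that has delivered only the first half of the response.
partial = full[: len(full) // 2]

partial_result = try_parse(partial)  # nothing to render yet
full_result = try_parse(full)        # renderable only once the last brace arrives
```

Until the closing brace arrives, a strict parser gives the renderer nothing to work with, which is why a partially streamed response cannot be shown as is.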
This talk is about the engineering path we took after that realization. We replaced JSON with a compact, line-oriented intermediate language designed specifically for streaming UI. I’ll walk through the production constraints that shaped the system: a swappable design system, a swappable model layer, a swappable renderer, and a neutral format sitting between them. We’ll cover the streaming parser, component registration, constrained generation, failure modes we saw in production, and benchmarks showing 50 to 67% fewer tokens and 2 to 3x lower render latency. The session ends with a live side-by-side demo comparing the old and new approaches.
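As a rough illustration of how a line-oriented format and component registration could fit together, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `name arg arg ...` line syntax, the `register` decorator, and the `heading` and `row` components are hypothetical and are not the format or API used at Thesys.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional

# Hypothetical registry: component name -> builder that returns a render node.
REGISTRY: Dict[str, Callable[[List[str]], dict]] = {}


def register(name: str):
    """Decorator that registers a builder under a component name."""
    def wrap(fn: Callable[[List[str]], dict]):
        REGISTRY[name] = fn
        return fn
    return wrap


@register("heading")
def heading(args: List[str]) -> dict:
    return {"component": "heading", "text": " ".join(args)}


@register("row")
def row(args: List[str]) -> dict:
    return {"component": "row", "cells": args}


def parse_line(line: str) -> Optional[dict]:
    """Parse one complete line; return None for blank or unknown lines."""
    parts = line.split()
    if not parts:
        return None
    builder = REGISTRY.get(parts[0])
    if builder is None:
        return None  # skip the bad line instead of failing the whole response
    return builder(parts[1:])


def stream_parse(chunks: Iterable[str]) -> Iterator[dict]:
    """Yield a node as soon as each line is complete, regardless of chunking."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            node = parse_line(line)
            if node is not None:
                yield node
    # The stream has ended, so whatever remains is a complete final line.
    node = parse_line(buffer)
    if node is not None:
        yield node


# Chunks split mid-word, the way tokens arrive from a model.
chunks = ["heading Sales ", "by region\nrow EM", "EA 120\nbogus x\nrow APAC 95"]
nodes = list(stream_parse(chunks))
```

In this sketch each completed line becomes a renderable node immediately, and an unrecognized line costs only that line. A real system would need a richer grammar (nesting, quoting, props) than whitespace-separated arguments.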
This session is for engineers building AI agents and LLM-powered products.
Zahle is part of the founding engineering team at Thesys. Previously, he was part of the platform team at Razorpay.