Feb 27 – 28, 2026 (Fri – Sat), 09:30 AM – 05:00 PM IST
Submitted Nov 10, 2025
As generative AI moves closer to the edge, developers are looking for ways to build creative, high-quality applications that run privately, efficiently, and without dependence on cloud APIs. This session explores how to shift from server-side story generation using large language models (LLMs) to highly optimized on-device workflows powered by small language models (SLMs). Attendees will learn the end-to-end process of generating content in the cloud, reproducing it locally, and progressively improving output quality using prompt tuning and LoRA-based fine-tuning.
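The prompt-tuning stage mentioned above can be illustrated with a minimal sketch. Small language models tend to drift off-topic without explicit structure, so a tuned prompt typically states genre, audience, and length constraints up front. The template fields and function name below are assumptions for illustration, not the session's actual prompts:

```python
# Minimal sketch of structured prompting for an on-device SLM.
# The fields (genre, audience, max_words) are hypothetical examples.

def build_story_prompt(topic: str, genre: str = "adventure",
                       audience: str = "children", max_words: int = 200) -> str:
    """Compose a structured prompt that constrains a small model's output.

    Stating genre, audience, and length explicitly reduces off-topic
    drift, which matters more for SLMs than for large cloud models.
    """
    return (
        f"Write a {genre} story for {audience} about {topic}.\n"
        f"Keep it under {max_words} words.\n"
        "Use simple sentences and end with a clear resolution."
    )

prompt = build_story_prompt("a lighthouse keeper's cat")
print(prompt)
```

The same template can then be fed to any local inference runtime; only the prompt string changes between the baseline and prompt-tuned stages.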
Through a series of practical demonstrations, we will walk through three stages of on-device model refinement: baseline inference, prompt-tuned enhancement, and LoRA-based adapter tuning for personalization. Participants will compare outputs from each stage, understand the trade-offs in quality vs. performance, and learn lightweight evaluation methods for generative storytelling. By the end, they will know how to build efficient, privacy-preserving, specialized story generators that can run directly on mobile or embedded devices.
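The LoRA stage can be sketched mathematically: rather than updating a full weight matrix W, one trains a low-rank pair (A, B) and applies W + (alpha/r)·B·A at inference, which keeps the trainable parameter count small enough for on-device personalization. A minimal NumPy illustration, with shapes, rank, and scaling chosen purely for demonstration:

```python
import numpy as np

# LoRA sketch: a d_out x d_in frozen base weight plus a rank-r update.
rng = np.random.default_rng(0)
d_in, d_out, rank, alpha = 8, 8, 2, 4.0

W = rng.standard_normal((d_out, d_in))        # frozen base weight
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))                   # trainable up-projection (zero init)

# Merged weight for deployment: W' = W + (alpha / r) * B @ A.
# Merging folds the adapter into the base weight, so inference costs
# no more than the original model.
W_merged = W + (alpha / rank) * (B @ A)

x = rng.standard_normal(d_in)
# With B initialised to zero, the adapter contributes nothing before
# training, so merged and base outputs coincide at step 0.
assert np.allclose(W_merged @ x, W @ x)
```

The zero initialization of B is the standard LoRA convention: training starts from the base model's behavior and personalization is learned incrementally.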
Mayur is a seasoned engineer specializing in AI, data, and backend systems, with extensive experience building scalable, high-performance platforms at organizations such as JioHotstar, Intuit, Walmart, and SAP. He frequently delivers webinars and technical sessions on AI engineering and distributed systems, and actively shares insights with the developer community. Connect with him at: https://www.linkedin.com/in/mayurmadnani/