From Script to Screen at Scale: Engineering an AI Short Video Generation Pipeline

Fri, Jul 31, 2026, 09:00 AM – 06:00 PM IST

Submitted Jun 22, 2026

I am submitting for: Track 2 - Building & implementing AI tools & agents in production
Type of session: 30 mins talk

Generating thousands of polished short clips from long-form video, automatically and across multiple content genres, is a different problem from what most AI video demos show. This talk walks through a production pipeline that does exactly that, in three stages: automated clipping with LLM-based segment selection; an Intelligent Reframing Engine that distinguishes live speakers from static faces using mouth movement, head motion, and emotion signals; and a final aesthetics layer that handles branded overlays and captions.
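
To make the live-speaker idea concrete, here is a minimal sketch of how the three signals named above could be combined into a single liveness score. It is an illustration under stated assumptions, not the pipeline's actual implementation: the `FaceSignals` type, the weights, the 0-to-1 signal scale, and the threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class FaceSignals:
    """Per-face signals over a short window, each assumed to be scaled to 0..1."""
    mouth_movement: float   # how much the mouth opens and closes
    head_motion: float      # how much the head pose changes
    emotion_change: float   # how much the detected expression varies


def liveness_score(
    signals: FaceSignals,
    weights: Tuple[float, float, float] = (0.6, 0.25, 0.15),  # hypothetical weights
) -> float:
    """Weighted sum of the three signals; higher means more likely a live speaker."""
    w_mouth, w_head, w_emotion = weights
    return (
        w_mouth * signals.mouth_movement
        + w_head * signals.head_motion
        + w_emotion * signals.emotion_change
    )


def pick_live_speaker(
    faces: Dict[str, FaceSignals],
    threshold: float = 0.4,  # hypothetical cut-off
) -> Optional[str]:
    """Return the id of the most 'live' face, or None if every face looks static."""
    if not faces:
        return None
    best_id = max(faces, key=lambda face_id: liveness_score(faces[face_id]))
    if liveness_score(faces[best_id]) < threshold:
        return None  # e.g. only posters or photos in frame
    return best_id


faces = {
    "host": FaceSignals(mouth_movement=0.9, head_motion=0.5, emotion_change=0.4),
    "poster": FaceSignals(mouth_movement=0.0, head_motion=0.0, emotion_change=0.1),
}
print(pick_live_speaker(faces))
```

Returning `None` when nothing clears the threshold gives the reframing step an explicit "no live speaker" case, which is one plausible way to limit liveness false positives on static faces.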

The focus is on what breaks. We'll cover six production failure modes, including clip-reframe mismatches, liveness false positives, speech-to-text (STT) hallucinations, and genre config drift, along with the mitigations that actually worked. The genre config pattern, which drives the entire pipeline without branching logic, is transferable to any multi-variant AI system.
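
As a rough sketch of what a genre config pattern can look like, the example below keeps every genre-specific choice in a data table so the pipeline code contains no `if genre == ...` branches. The genre names, fields, and values are hypothetical and chosen only for illustration; the talk's actual schema may differ.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class GenreConfig:
    """All genre-specific behaviour lives here as data (hypothetical fields)."""
    max_clip_seconds: float
    aspect_ratio: str
    caption_style: str
    selection_prompt: str


# Hypothetical registry: adding a genre means adding a row, not a code path.
GENRE_CONFIGS: Dict[str, GenreConfig] = {
    "podcast": GenreConfig(60.0, "9:16", "word_highlight",
                           "Pick self-contained insights."),
    "sports": GenreConfig(30.0, "9:16", "minimal",
                          "Pick high-action moments."),
}


def plan_clip(genre: str, segment_seconds: float) -> dict:
    """Build a clip plan from config alone; unknown genres raise KeyError."""
    cfg = GENRE_CONFIGS[genre]
    return {
        "duration": min(segment_seconds, cfg.max_clip_seconds),
        "aspect_ratio": cfg.aspect_ratio,
        "caption_style": cfg.caption_style,
        "prompt": cfg.selection_prompt,
    }


print(plan_clip("sports", 45.0))
```

Because the configs are frozen dataclasses in a single registry, they can be validated or diffed in one place, which is one way to catch the kind of config drift mentioned above.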

For ML and data engineers building or evaluating AI content generation systems.

Speaker bio:
Himanshu Aggarwal is a Machine Learning Engineer at Glance, where he builds large-scale AI systems for content discovery and personalization, serving over 250 million users globally. His expertise spans recommender systems, semantic retrieval, knowledge graphs, and large language models, with a strong focus on designing scalable, production-grade architectures.

With experience across research and high-scale consumer platforms, Himanshu works on advancing content understanding and building intelligent systems that enhance how users discover and engage with digital experiences across domains.

Link to PPT (work ongoing): https://drive.google.com/file/d/1YSR__8PKY-r3HqHmLSG7_UA4AQURQce0/view?usp=sharing

Speak at The Fifth Elephant 2026 Annual Conference
