Sidharth Ramachandran


The Multimodal Revolution: Reshaping Video Analysis Pipelines

Submitted May 23, 2024

Multimodal AI is revolutionizing video analysis, but practical insights on pipeline design are scarce. Traditional computer vision pipelines often involve a complex web of specialized models. This leads to high costs, maintenance burdens, and difficulty in adapting to new tasks. This talk will dissect a real-world case study where multimodal models dramatically simplified a large-scale video analysis pipeline, leading to significantly reduced costs and improved agility.


In this talk I want to illustrate the breakthroughs that multimodal models unlocked for us. I will demonstrate how we use simple, often older models such as CLIP to build a simplified pipeline that replaces complex, interdependent vision pipelines. I will also show the metrics we use to decide when to call more expensive models such as GPT-4-Vision and LLaVA, so that each video file is processed cost-efficiently. Attendees will gain practical knowledge from our experience refactoring a large-scale system and learn which pitfalls to avoid.
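To make the idea of escalating only when needed concrete, here is a minimal sketch of one possible routing rule. It is an illustration, not the metric used in our system: the function names, the margin threshold, and the temperature are all assumptions, and the embeddings are presumed to come from a CLIP-style model that places frames and text labels in the same vector space.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores, temperature=0.01):
    """Turn similarity scores into probabilities.

    The temperature is an assumed value chosen for illustration.
    """
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_frame(frame_embedding, label_embeddings, min_margin=0.2):
    """Zero-shot classify a frame and decide whether to escalate.

    label_embeddings maps each label to its text embedding (at least two
    labels). The frame is escalated to a more expensive model when the gap
    between the two most likely labels is below min_margin (hypothetical
    threshold).
    """
    labels = list(label_embeddings)
    sims = [cosine(frame_embedding, label_embeddings[name]) for name in labels]
    probs = softmax(sims)
    ranked = sorted(zip(probs, labels), reverse=True)
    (p_best, best_label), (p_second, _) = ranked[0], ranked[1]
    margin = p_best - p_second
    return {"label": best_label, "margin": margin, "escalate": margin < min_margin}
```

With a rule like this, frames that the cheap model labels confidently never reach the expensive model, and only ambiguous frames incur the extra cost. The threshold would need to be tuned on labelled data for each task.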


I work for a large media organization where we need to perform multiple simple tasks on our content library. We operate a generic vision pipeline to which specific components can be added or removed to support various downstream tasks, ranging from trailer generation and thumbnail selection to content moderation. Using multimodal models significantly simplified our pipelines and also made them easy to extend to additional use cases.
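One way to picture a generic pipeline with pluggable components is sketched below. This is a simplified illustration under stated assumptions, not our production design: the class name VideoPipeline, its methods, and the idea that every task consumes the same per-frame embeddings are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Sequence

Embedding = Sequence[float]
Task = Callable[[List[Embedding]], Any]


class VideoPipeline:
    """Embeds each frame once and shares the embeddings across all tasks."""

    def __init__(self, embed_frame: Callable[[Any], Embedding]):
        # embed_frame stands in for a multimodal encoder such as CLIP.
        self._embed_frame = embed_frame
        self._tasks: Dict[str, Task] = {}

    def add_task(self, name: str, task: Task) -> None:
        """Register a downstream task under a name."""
        self._tasks[name] = task

    def remove_task(self, name: str) -> None:
        """Unregister a task; unknown names are ignored."""
        self._tasks.pop(name, None)

    def run(self, frames: Sequence[Any]) -> Dict[str, Any]:
        """Compute embeddings once, then hand them to every registered task."""
        embeddings = [self._embed_frame(frame) for frame in frames]
        return {name: task(embeddings) for name, task in self._tasks.items()}
```

In this sketch, adding a use case means registering one more function over the shared embeddings rather than deploying and maintaining another specialized model.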


