The Fifth Elephant 2023 Winter
On the engineering and business implications of AI & ML
Dec 2023
4 Mon
5 Tue
6 Wed
7 Thu
8 Fri 09:00 AM – 04:15 PM IST
9 Sat
10 Sun
Shyam Choudhary
Roposo is a live video platform with ~200 million end users and ~1000 live videos uploaded every day, each lasting 15 minutes to 3 hours. To increase engagement and improve the user experience, we are building a central video feed of assets that can be easily consumed. This requires converting our events and creator-led videos into shorter formats such as trailers and short clips, so we process the videos with the help of AI to extract the most important segments.
Videos can be very diverse as their content can vary from:
To handle this diversity, we split videos by the density of speech in them and built separate solutions for speech-heavy and visual-heavy videos.
For a speech-heavy video, we use transcription to select the most important segments. For a visual-heavy video, we break the video into shots and generate visual descriptions of the shots to select the most important segments.
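As a rough illustration of the routing step described above, the sketch below estimates speech density as the fraction of a video's duration covered by transcribed speech and picks a pipeline accordingly. The `SpeechSegment` type, the function names, and the 0.5 threshold are assumptions made for this example; the talk does not state how density is measured or where the cut-off lies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SpeechSegment:
    """One transcribed span of speech (hypothetical structure for this sketch)."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


def speech_density(segments: List[SpeechSegment], video_duration: float) -> float:
    """Return the fraction of the video covered by speech, in [0, 1]."""
    if video_duration <= 0:
        raise ValueError("video_duration must be positive")

    spoken = 0.0
    cur_start = cur_end = None
    # Merge overlapping segments so overlapping speech is not counted twice.
    for seg in sorted(segments, key=lambda s: s.start):
        if cur_end is None or seg.start > cur_end:
            if cur_end is not None:
                spoken += cur_end - cur_start
            cur_start, cur_end = seg.start, seg.end
        else:
            cur_end = max(cur_end, seg.end)
    if cur_end is not None:
        spoken += cur_end - cur_start

    return min(spoken / video_duration, 1.0)


def route_video(segments: List[SpeechSegment], video_duration: float,
                threshold: float = 0.5) -> str:
    """Pick a pipeline label; the threshold is an assumed value, not from the talk."""
    density = speech_density(segments, video_duration)
    return "speech-heavy" if density >= threshold else "visual-heavy"
```

In practice the threshold would need tuning on real videos, and a voice-activity detector could replace transcript timestamps as the source of the speech spans.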
We are leveraging the following for our use-case:
To enhance the viewer experience, we are post-processing our short videos with AI-generated music, custom transitions between shots, animations, stickers, subtitles and a lot more.
End-to-end processing takes 5-10 minutes for a 30-minute video.
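To put that figure in context, the small calculation below turns it into a speed relative to real time and extrapolates to other video lengths. The extrapolation assumes processing time grows linearly with video duration, which the talk does not claim; the function names are hypothetical.

```python
from typing import Tuple

# Reference figures from the abstract: a 30-minute video takes 5-10 minutes.
REF_VIDEO_MINUTES = 30.0
REF_PROCESSING_MINUTES = (5.0, 10.0)


def realtime_factor_range() -> Tuple[float, float]:
    """How many times faster than real time processing runs (slowest, fastest)."""
    fastest, slowest = REF_PROCESSING_MINUTES
    return (REF_VIDEO_MINUTES / slowest, REF_VIDEO_MINUTES / fastest)


def estimated_processing_minutes(video_minutes: float) -> Tuple[float, float]:
    """Estimate (min, max) processing time, ASSUMING linear scaling with duration."""
    scale = video_minutes / REF_VIDEO_MINUTES
    low, high = REF_PROCESSING_MINUTES
    return (low * scale, high * scale)


print(realtime_factor_range())              # (3.0, 6.0)
print(estimated_processing_minutes(180.0))  # (30.0, 60.0)
```

Under that assumption, a 3-hour live video at the upper end of the platform's range would take roughly 30 to 60 minutes to process.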
Data Scientists and ML Engineers