The Fifth Elephant 2023 Winter
On the engineering and business implications of AI & ML
Dec 2023
4 Mon
5 Tue
6 Wed
7 Thu
8 Fri 09:00 AM – 04:15 PM IST
9 Sat
10 Sun
Akshat Gupta
Abstract
In audio processing and music production, automated genre-based audio morphing is an emerging field that uses machine learning and natural language processing (NLP) to blend different music genres. A textual prompt drives the transformation of the audio content, so a track can move beyond traditional genre constraints and enable new forms of musical expression. Each morphed track can then be treated as a standalone piece of content for personalized recommendation.
Approach
Leveraging state-of-the-art models
We fine-tune state-of-the-art models on our own dataset, which we built with weak labels from ChatGPT and in-house annotations. We also used third-party datasets such as WavCaps to augment our dataset with real-world examples. Audio morphing takes ~30 seconds for a 5 min video.
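As a rough check on the latency figure above, the sketch below converts "about 30 seconds for a 5 min video" into a real-time factor. The function name `real_time_factor` is illustrative only, and the idea that processing time scales linearly with clip length is an assumption, not a claim made in the talk.

```python
def real_time_factor(clip_seconds: float, processing_seconds: float) -> float:
    """Return seconds of audio processed per second of compute."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return clip_seconds / processing_seconds


# Figures quoted in the abstract: ~30 s of processing for a 5 min clip.
rtf = real_time_factor(clip_seconds=5 * 60, processing_seconds=30)
print(f"about {rtf:.0f}x faster than real time")
```

With the quoted figures this works out to roughly 10x real time; actual throughput would depend on hardware and model details that the abstract does not give.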
Deployment at scale
We plan to deploy at scale in the near future. This will directly impact liquidity: we have at least 5-6 variants of the same audio (which would mean 5-6x liquidity), and we can then use these variants in downstream tasks such as personalised recommendation.
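The liquidity arithmetic above can be sketched as a simple expansion of a catalogue: every source track is paired with every theme prompt, and each pair stands for one morphed variant. This is a minimal sketch under stated assumptions. The function `expand_catalog`, the track IDs, and all theme prompts except "beats funk 80s music" are hypothetical, and no audio is generated here.

```python
from itertools import product


def expand_catalog(track_ids, theme_prompts):
    """Pair every source track with every theme prompt.

    Each (track, prompt) pair represents one morphed variant that a
    morphing model would produce; this function only enumerates them.
    """
    return [
        {"source": track, "theme": prompt}
        for track, prompt in product(track_ids, theme_prompts)
    ]


# Hypothetical track IDs; only the first theme comes from the talk's example.
tracks = ["track_001", "track_002"]
themes = [
    "beats funk 80s music",
    "lo-fi chill",
    "orchestral cinematic",
    "acoustic folk",
    "synthwave",
]

variants = expand_catalog(tracks, themes)
print(len(variants) // len(tracks), "variants per track")
```

Five prompts per track give five variants per source audio, which matches the lower end of the 5-6x figure quoted above.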
Key Points to be discussed in the talk-
Sample Example
Music input - https://drive.google.com/file/d/1iEN0uML7FMxbKbO3uJckFnJkMtB12aH5/view?usp=sharing
Theme - beats funk 80s music
Output - https://drive.google.com/file/d/13HX1Rrka1AgSCXdBsd0e_kCi2XBffINi/view?usp=sharing