Musickiya

Hasura, Bangalore

Submitted Jan 24, 2024

Category: AI for image generation/creatives

Problem Statement

As GenAI is on its path to revolutionise the way most things are done, we propose an innovative application of it. For many years, AI assistants have mostly helped with specific day-to-day activities such as writing or setting an alarm, and the mode of communication is typically chat or voice commands.
We have designed a music assistant called Musickiya that will help music producers and composers across the world do rapid sampling and mixing. We propose an assistant integrated into a Digital Audio Workstation (DAW) that takes inputs from the user and provides beats, samples, chord progressions, etc. right inside the arrangement view, which the user can then blend as their creativity dictates.

Applications

Producing and mixing a song in a DAW can take anywhere from a few hours to a few weeks. It is an art that takes about 4-5 years to master, and even then it requires a lot of thought and experimentation to get the right sound.
Some common functions of a DAW are:

Digital audio processor (record, edit, and mix audio digitally)
MIDI sequencer (record, edit and mix MIDI notes)
Virtual instruments (receives MIDI info and translates it to different instrument sounds)
Music notation (turn MIDI notes into printable sheet music)
Sampling a beat from a piece of sound.

Specifically, finding the virtual instrument, preset, etc. that matches the feeling the producer has for the song can be a time-consuming, manual process. Also, the cost of professionally engineered sample packs can be a turn-off for budding producers.

An AI assistant can help with these bottlenecks and speed up production by at least 10x. Integrating the assistant into a traditional workflow will also be simple, as it only helps with the mundane, repetitive manual processes.

Specifically, the functionalities of the assistant are:

Audio to Audio Generation: Convert a piece of music, such as a chord progression played on guitar, to another instrument such as piano, conditioned by a prompt.
Chords (Text/MIDI) to Audio Generation: Convert chords given in text or MIDI format into audio for an instrument such as piano, conditioned by a prompt.
Prompt to Audio Generation: Get samples of music directly using a prompt.
Lyrics Generation: Get lyrics using a prompt.
Lyrics to Vocal Generation: The AI sings the given lyrics, using a small clip of singing to replicate the voice.
Noise Reduction: Suppress noise in a given noisy audio clip.
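
To make the text input for Chords (Text/MIDI) to Audio Generation concrete, here is a minimal sketch of how a chord progression written as text could be turned into MIDI note numbers before being passed to a model. The function names, the supported chord qualities and the default octave are assumptions made for illustration; they are not taken from the Musickiya code.

```python
# Hypothetical helper for illustration only; not part of the Musickiya code base.
# Semitone offsets of the natural notes above C.
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Intervals (in semitones above the root) for a few common chord qualities.
CHORD_INTERVALS = {
    "": (0, 4, 7),          # major triad
    "m": (0, 3, 7),         # minor triad
    "7": (0, 4, 7, 10),     # dominant seventh
    "maj7": (0, 4, 7, 11),  # major seventh
    "m7": (0, 3, 7, 10),    # minor seventh
}


def chord_to_midi(chord: str, octave: int = 4) -> list[int]:
    """Return MIDI note numbers for a chord symbol such as 'Am' or 'F#7'."""
    chord = chord.strip()
    if not chord or chord[0] not in NOTE_OFFSETS:
        raise ValueError(f"unknown root in chord: {chord!r}")
    root = NOTE_OFFSETS[chord[0]]
    quality = chord[1:]
    # Optional accidental directly after the root letter.
    if quality.startswith("#"):
        root, quality = root + 1, quality[1:]
    elif quality.startswith("b"):
        root, quality = root - 1, quality[1:]
    if quality not in CHORD_INTERVALS:
        raise ValueError(f"unsupported chord quality: {quality!r}")
    base = 12 * (octave + 1) + root  # MIDI note 60 is C4
    return [base + interval for interval in CHORD_INTERVALS[quality]]


def progression_to_midi(text: str, octave: int = 4) -> list[list[int]]:
    """Parse a whitespace-separated progression such as 'C G Am F'."""
    return [chord_to_midi(symbol, octave) for symbol in text.split()]


if __name__ == "__main__":
    for symbol, notes in zip("C G Am F".split(), progression_to_midi("C G Am F")):
        print(symbol, notes)
```

A MIDI input would already carry note numbers, so a step like this would only be needed for the text form.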

In terms of scalability, the assistant is backed by AI models that can run in the cloud or on premises, since DAW GUIs and some plugins are already GPU-accelerated. The assistant will stream the generated audio directly into the GUI for the user to review quickly. The proposed solution scales across devices and can offer multiple suggestions for a given use case. In our initial experiments, it takes ~30 seconds to generate 30 seconds of music. This time can be reduced further with some optimisations.
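
The generation speed quoted above can be expressed as a real-time factor (generation time divided by audio duration). The short sketch below works through that arithmetic for the reported figures. The function names are hypothetical, and the assumption that generation time scales linearly with clip length is ours, not a measured result.

```python
# Illustrative arithmetic only; function names are hypothetical.
def real_time_factor(generation_seconds: float, audio_seconds: float) -> float:
    """Seconds of compute needed per second of generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


def estimated_wait(audio_seconds: float, rtf: float) -> float:
    """Estimated generation time for a clip, assuming linear scaling."""
    return audio_seconds * rtf


if __name__ == "__main__":
    rtf = real_time_factor(30.0, 30.0)  # figures from the initial experiments
    print(f"real-time factor: {rtf:.2f}")
    print(f"estimated wait for a 10 s sample: {estimated_wait(10.0, rtf):.1f} s")
```

A real-time factor at or below 1.0 means audio is produced at least as fast as it plays back, which is what makes streaming it into the GUI practical.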

Solution

We will develop our own plugin for one of the existing DAWs and integrate our Python programme with it.
The models leveraged to build the solution are:

OpenAI Jukebox: Generative Model for Music
MusicGen: Meta’s Generative Model for Music
MusicGen Chord: Modified version of MusicGen Melody model

Future Scope/ Roadmap

Adding multimodal inputs, so that the generated music is as close as possible to what the user expects, whether the input is music, text, image or video. The initial approach will be to build this as a standalone tool and then integrate it into DAWs.

GitHub Roadmap - https://github.com/akshatagkgp/fifth-el-hackathon

Participants - Akshat Gupta, Shyam Choudhary

The Fifth Elephant Open Source AI Hackathon 2024
