Open Source AI Hackathon 2024

GenAI makers and creators contest and showcase


Kurian Benoy

Aldrin Jenson

@aldrinjenson

Nabeel Boda

@bodanabeel

Indic subtitler

Submitted Jan 30, 2024

Project Objective

Almost no tools are available for subtitling audio and video in Indian languages. That shouldn’t be the case, as there are now many open-source models that support speech transcription in most of the official Indian languages. This tool can be useful for subtitling audio and video, such as Indian films, for the media industry in general.

Project Feasibility

With the advent of new technologies like Meta’s Seamless M4T model and fine-tuned Whisper models, speech transcription can convert audio in a source language into text in that same language. For example, Hindi audio can be transcribed to Hindi text to generate subtitles. Meta’s Seamless M4T model also supports translation, so it can take Hindi audio and generate subtitles in languages like English, French, Malayalam, etc.

I have tried this feature in my GitHub repository for audio in Malayalam. Given the timeline, I am prioritizing a minimum viable product (MVP) that focuses on core functionality: uploading video or audio files, transcribing speech to text in selected Indian languages, and displaying subtitles. Additional features can be added as time permits.
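As an illustration of the "displaying subtitles" step, here is a minimal sketch of turning timestamped transcription segments into SubRip (SRT) text. It assumes the speech model returns segments with start and end times in seconds; the `Segment` class and both function names are hypothetical and not taken from the project's repository.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed chunk (hypothetical shape, assumed for this sketch)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str     # transcribed or translated text


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[Segment]) -> str:
    """Render segments as SRT: a 1-based index, a time range, then the text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{format_timestamp(seg.start)} --> {format_timestamp(seg.end)}"
        blocks.append(f"{index}\n{time_range}\n{seg.text.strip()}\n")
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [Segment(0.0, 2.5, "नमस्ते"), Segment(2.5, 5.0, "Hello")]
    print(segments_to_srt(demo))
```

The resulting string can be saved as a `.srt` file and loaded alongside the video in most players. Plain text with explicit timestamps keeps the output independent of which model produced the segments.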

Project Details

GitHub repo: https://github.com/kurianbenoy/Indic-Subtitler
Demo Video: https://www.loom.com/share/f434e100766548f1b24a073fe4fe6e8c
Hosted Website: https://indicsubtitler.in/

Comments


    Akshobhya

    @akshobhya_j Editor & Promoter

    @Kurian, thank you for your proposal submission to The Fifth Elephant Open Source AI Hackathon. The submission needs refinement based on the following considerations.

    Project Scope and Relevance

    The proposed project aligns well with the theme of leveraging open-source models in the speech domain to address a practical need for creating subtitles in Indic languages, catering to a specific but relevant use case in the media industry.

    Innovation and Originality

    The project demonstrates creativity by applying open-source models in the speech domain, specifically Meta’s Seamless M4T and fine-tuned Whisper models, to enable the easy creation of subtitles in local languages without requiring extensive knowledge of ML models. This approach presents a novel application of AI for a tangible and widely applicable purpose.

    Technical Feasibility

    It's crucial to assess the technical feasibility of integrating Meta’s Seamless M4T and fine-tuned Whisper models into a self-hosted web application for subtitle generation. Verification of the models' compatibility, performance, and ease of integration within the proposed timeline is essential.

    Open Source Compatibility

    The project’s commitment to utilizing open-source models aligns with the principles of the hackathon. It's important to ensure that the project will be open source and that proper licenses are applied to the code for public accessibility.

    Use of Technology

    Evaluating the appropriateness of Meta’s Seamless M4T and fine-tuned Whisper models for speech-to-text in Indic languages is critical. Assess the performance, accuracy, and suitability of these models for the intended purpose.

    Accessibility

    Consider how the self-hosted web app ensures accessibility and inclusivity, ensuring that individuals with varying levels of technical expertise can easily utilize the tool to generate subtitles in local languages.

    Sustainability and Maintenance

    The long-term sustainability of the project should be explored, addressing how the self-hosted web app will be maintained, updated, and supported after the hackathon. Plans for community engagement and contributions to sustain the project's utility over time are important.

    Presentation and Demo

    Upon demonstration, it will be crucial to evaluate how effectively the project communicates its value proposition and demonstrates the simplicity and convenience of generating subtitles in local languages using the proposed web app.

    Potential for Future Work

    Consider how the project can be expanded in the future, potentially enhancing the features and capabilities of the self-hosted web app for subtitle generation, and how it can address additional use cases beyond the initial scope.

    TODO

    1. Provide the GitHub repository link in your proposal for easy access and review by mentors and the jury.

    2. Ensure that the GitHub repository contains a comprehensive README.md documenting the project's purpose, technical details, setup instructions, and example use cases.

    3. Utilize the available platforms such as The Fifth Elephant WhatsApp group to engage with mentors and seek guidance on technical and implementation aspects of your project.

    Posted 1 year ago

Hosted by

The Fifth Elephant hackathons