The Fifth Elephant Open Source AI Hackathon 2024

GenAI makers and creators contest and showcase

Accepting submissions till 15 Feb 2024, 11:00 PM

Hasura, Bangalore

Overview

The Fifth Elephant Open Source AI Hackathon started on 5 January 2024 and reached its finale with a Demo Day event on 12 April 2024, when the winners of the two-month-long contest were chosen.

The aim of this hackathon was to encourage individuals/teams to apply and incubate innovative AI ideas/use cases and publish them as open source projects.

  • The hackathon contest participants worked with mentors for over two months to refine their ideas and advance them to a stage where they are viable projects that could be pursued beyond the hackathon.
  • The project teams worked on applications of AI in education, accessibility, creative expression, scientific research and languages, under the overall theme of AI for India.
  • Competing projects were judged on impact and relevance, innovation and creativity, technical soundness and code quality, scope of expansion, and reusability and ease of adoption.

As a campaign to raise awareness and drive up developer adoption of AI and open source technologies, the hackathon was a great success. It helped shine light on the agility that open source technology enables for creative and innovative developers.

Open Source AI Hackathon Winners

Testimonials

“...each one of the contestants put in tremendous effort. And we saw the passion in every person, trying to do things not for winning, but about really building your projects. After a long time, I am attending such a hackathon where young folks are so passionate about building. Kudos to all of you.”
- Rukma Talwadker, Jury Member, Senior Principal Scientist at Games24x7

“I really enjoyed judging all the projects - lot of interesting work. The Fifth Elephant has done a great job with mentoring and curating this hackathon.”
- Tanuja Ganu, Jury Member, Principal RSDE Manager, Microsoft India

“The hallmark of this hackathon was getting younger people to code for a longer period of time as opposed to a typical hackathon which turns out to be about - how do you build the coolest thing in the shortest period of time.”
- Sumod Mohan, Mentor

“What is impressive about this particular hackathon is, it is not just about cool ideas and fancy demos. It is actually about building a product or a software or a model that can live beyond the demo (and contest).”
- Soma Dhavala, team member at Project Seshu

“It was only through putting my ideas to code that I learnt what the specificity of implementing these (LLMs) were. I began my journey with a sense of hope and commitment towards FOSS principles, and the Hackathon only reinforced my belief that collaboration maketh a better product.”
- Sankalp Srivastava, Creator of Project Schematise

Key highlights from the hackathon

Over the course of 12 weeks, the hackathon involved:

  1. A kick-off on 5 January 2024, with an open call for open source ideas and projects.
  2. Mentorship sessions in February for all project teams. Mentors included Abhishek H Mishra aka Tokenbender, Arvind Saraf, Bharat Shetty, Ramesh Hariharan, Sidharth Ramachandran, Simrat Hanspal, Sumod Mohan and Vinayak Hegde.
  3. Selection of the 10 best projects from 40 applications for the Demo Showcase.
  4. An involved peer-review process that helped further refine projects between 1 and 15 March, followed by extensive rehearsals from 8 to 10 April 2024.
  5. Demo Showcase Day on 12 April 2024, with project demos from the 10 qualifying teams, from which 5 winning projects were chosen.

The Prizes

🏆 Five prizes of ₹1,00,000 (one lakh rupees) each, one per theme, were awarded to the winning projects.
The prizes for this hackathon were sponsored by Meta.

Note: Apart from the contest prizes, Microsoft has offered internships to the contestants.

Jury

  1. Ashok Hariharan heads data and business intelligence at United Nations Volunteers.
  2. Rukma Talwadker is a Senior Principal Scientist at Games24x7.
  3. Shubha Shedthikere is a Senior Manager in the Data Science team at Swiggy.
  4. Sunil Abraham is the Public Policy Director for Data Economy and Emerging Tech at Meta, India.
  5. Tanuja Ganu is a Principal RSDE Manager at Microsoft Research India.

Mentors

  1. Abhishek Mishra is the creator of the CodeCherryPop LLM series.
  2. Arvind Saraf is a computer scientist, engineering leader and entrepreneur, trained at IIT, MIT and Google.
  3. Simrat Hanspal is currently spearheading AI product strategy at Hasura.
  4. Sumod Mohan is the co-founder and CEO of AutoInfer.

Editors

About The Fifth Elephant

The Fifth Elephant is a community of practitioners who share feedback on data, AI and ML practices in the industry. If you like the work that The Fifth Elephant does and want to support its activities (reviews of papers and books, and building the innovation ecosystem in India through hackathons and conferences), contribute by picking up a membership.

Contact

💬 Post a comment with your questions here, or join The Fifth Elephant Telegram group and the WhatsApp group.

Follow @fifthel on Twitter.

📞 For any inquiries, call The Fifth Elephant at +91-7676332020.

Hosted by

The Fifth Elephant hackathons

Supported by

Host

All about data science and machine learning

Venue host

Welcome to the events page for events hosted at The Terrace @ Hasura.

Partner

Providing all founders, at any stage, with free resources to build a successful startup.

Pranjal Kar

@pk99

Bhaswata

@bhaswata08 Editor

samunder singh

@sam011 Editor

Synapse.AI - Bridging the Gap between Deaf and Non-Deaf Community based on Indian Sign Language (ISL)

Submitted Feb 2, 2024

Shruti-Drishti: Bridging the Communication Gap for the Deaf Community in India 🌉🇮🇳

Introduction 🙌

Shruti-Drishti is an innovative project aimed at addressing the communication gap between the deaf and non-deaf communities in South Asia, particularly in India. By leveraging deep learning models and state-of-the-art techniques, we strive to facilitate seamless communication and promote inclusivity for individuals with hearing impairments. 🌟

Our web app aims to bridge the communication gap between the deaf and non-deaf communities, using our LSTM and Transformer models, which operate on keypoints extracted from sign language videos.
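
As a rough illustration of what "video keypoints" can look like as model input, the sketch below flattens per-frame landmarks into a fixed-length feature vector and stacks the frames into a sequence. The landmark counts (33 pose points, 21 points per hand) follow MediaPipe's pose and hand models. The function names, the use of x, y, z coordinates only, and the zero-fill for an undetected hand are assumptions made for this example, not details taken from the project.

```python
from typing import List, Optional, Sequence, Tuple

Landmark = Tuple[float, float, float]  # (x, y, z) for one keypoint

POSE_POINTS = 33   # MediaPipe Pose landmark count
HAND_POINTS = 21   # MediaPipe Hands landmark count (per hand)
FRAME_DIM = (POSE_POINTS + 2 * HAND_POINTS) * 3  # features per frame


def flatten_part(points: Optional[Sequence[Landmark]], expected: int) -> List[float]:
    """Flatten one body part; zero-fill when the detector found nothing."""
    if points is None:
        return [0.0] * (expected * 3)
    if len(points) != expected:
        raise ValueError(f"expected {expected} landmarks, got {len(points)}")
    return [coord for point in points for coord in point]


def frame_to_vector(pose, left_hand, right_hand) -> List[float]:
    """Concatenate pose and both hands into one feature vector per frame."""
    return (
        flatten_part(pose, POSE_POINTS)
        + flatten_part(left_hand, HAND_POINTS)
        + flatten_part(right_hand, HAND_POINTS)
    )


def video_to_sequence(frames) -> List[List[float]]:
    """Turn a list of (pose, left_hand, right_hand) tuples into a sequence."""
    return [frame_to_vector(*frame) for frame in frames]


# Hypothetical two-frame clip: the left hand is not detected in frame two.
pose = [(0.5, 0.5, 0.0)] * POSE_POINTS
hand = [(0.1, 0.2, 0.0)] * HAND_POINTS
sequence = video_to_sequence([(pose, hand, hand), (pose, None, hand)])
```

Keeping every frame at the same length, even when a hand is missing, is what lets a sequence model such as an LSTM or Transformer consume the clip as a regular frames-by-features array.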

Our aim is to improve the quality of communication by providing accurate and reliable translations.

We provide two services: (1) real-time sign language to text, and (2) text to sign language translation.

This is the repo: https://github.com/pranjalkar99/shruti-drishti
(Note: The repo is being updated with the latest changes and work done so far.)

DEMO VIDEO

Demo for ISL based Sign Language Detection

Key Features ✨

  1. Sign Language to Text Conversion 🖐️➡️📝: Our custom Transformer-based Multi-Headed Attention Encoder, powered by Google's Tensorflow Mediapipe, converts sign language videos into text and is designed to cope with dynamic signs that look alike.

  2. Text to Sign Language Generation 📝➡️🖐️: Utilizing an agentic LLM framework, Shruti-Drishti converts textual information into masked-keypoint-based sign language videos, tailored specifically for Indian Sign Language.

  3. Multilingual Support 🌐: Our app uses IndicTrans2 to support all 22 scheduled Indian languages. Accessibility is our top priority, and we aim to make sure that everyone is included.

  4. Content Accessibility 📰🎥: Shruti-Drishti enables news channels and content creators to expand their user base by making their content accessible and inclusive through embedded sign language video layouts.

Dataset Details 📊

Link to the Dataset: INCLUDE Dataset

The INCLUDE dataset, sourced from AI4Bharat, forms the foundation of our project. It consists of 4,292 videos, with 3,475 videos used for training and 817 videos for testing. Each video captures a single Indian Sign Language (ISL) sign performed by deaf students from St. Louis School for the Deaf, Adyar, Chennai.
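
A quick check of the figures above, using only the numbers stated in this section: the two subsets add up to the quoted total, and the split works out to roughly 81% training and 19% testing. The variable names are illustrative.

```python
TRAIN_VIDEOS = 3475
TEST_VIDEOS = 817
TOTAL_VIDEOS = TRAIN_VIDEOS + TEST_VIDEOS  # should match the stated 4,292

train_share = TRAIN_VIDEOS / TOTAL_VIDEOS
test_share = TEST_VIDEOS / TOTAL_VIDEOS

print(f"total: {TOTAL_VIDEOS}, train: {train_share:.1%}, test: {test_share:.1%}")
```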

Model Architecture 🧠

Shruti-Drishti employs two distinct models for real-time Sign Language Detection:

  1. LSTM-based Model 📈: Using pose keypoints extracted with Mediapipe, this model is a recurrent neural network (RNN) built from Long Short-Term Memory (LSTM) cells that recognizes signs from keypoint sequences.

    • Time distributed layers: Extract features from each frame based on the Mediapipe keypoints. These features capture spatial relationships between joints or movement patterns.
    • Sequential layers: Allow the model to exploit the temporal nature of the pose data, leading to more accurate recognition across a video sequence.
  2. Transformer-based Model 🔄: Trained through extensive experimentation and hyperparameter tuning, this model offers enhanced performance and adaptability.

    • Training Strategies:
      1. Warmup: Gradually increases the learning rate from a very low value to the main training rate, helping the model settle into a good region of the parameter space before training continues at the full learning rate.
      2. AdamW: An optimizer that modifies Adam by decoupling weight decay from the gradient-based update, which often improves convergence and final performance.
      3. ReduceLROnPlateau: Monitors a chosen metric during training and reduces the learning rate if the metric stops improving for a set number of epochs, letting the model refine its parameters with smaller steps once progress stalls.
      4. Fine-tuned VideoMAE: Uses the pre-trained weights from VideoMAE as a strong starting point and allows the model to specialize in recognizing human poses within videos.
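
The warmup and reduce-on-plateau strategies listed above can be combined into one learning-rate schedule. The following is a minimal, framework-free sketch of that idea. The class name, its parameters and the example values are hypothetical and are not taken from the project's training code, which may rely on a deep learning framework's built-in schedulers instead.

```python
class WarmupThenPlateau:
    """Linear warmup to base_lr, then cut the rate when a metric stalls."""

    def __init__(self, base_lr, warmup_steps, factor=0.5, patience=2, min_lr=1e-6):
        self.base_lr = base_lr
        self.warmup_steps = max(1, warmup_steps)
        self.factor = factor          # multiplier applied at each plateau
        self.patience = patience      # tolerated epochs without improvement
        self.min_lr = min_lr
        self.scale = 1.0              # accumulated plateau reductions
        self.best = float("inf")      # lowest validation loss seen so far
        self.bad_epochs = 0           # epochs since the last improvement

    def lr_at_step(self, step):
        """Learning rate for a 0-indexed optimizer step."""
        warmup = min(1.0, (step + 1) / self.warmup_steps)
        return max(self.min_lr, self.base_lr * warmup * self.scale)

    def end_of_epoch(self, val_loss):
        """Record a validation loss; shrink the rate once more than
        `patience` consecutive epochs have passed without improvement."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                self.scale *= self.factor
                self.bad_epochs = 0


# Hypothetical run: 4 warmup steps, then a validation loss that stalls at 0.8.
schedule = WarmupThenPlateau(base_lr=1e-3, warmup_steps=4, patience=1)
warmup_lrs = [schedule.lr_at_step(step) for step in range(5)]
for loss in [0.9, 0.8, 0.8, 0.8]:
    schedule.end_of_epoch(loss)
lr_after_plateau = schedule.lr_at_step(10)
```

In this run the rate climbs linearly to the base rate over the first four steps, then is halved once the validation loss has failed to improve for two epochs in a row.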

We have also implemented the VideoMAE model, proposed in the paper “VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training.” Fine-tuning techniques such as QLoRA, PEFT, head and backbone fine-tuning, and head-only fine-tuning were explored, with head-only fine-tuning proving to be the most successful approach.
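
Head-only fine-tuning means keeping the pre-trained backbone frozen and updating only the classification head. The sketch below shows one way to decide which parameters to train by name prefix. The helper name, the `classifier.` prefix and the example parameter names are assumptions for illustration and may not match the project's model.

```python
from typing import Dict, Iterable, Tuple


def head_only_flags(
    param_names: Iterable[str],
    head_prefixes: Tuple[str, ...] = ("classifier.",),
) -> Dict[str, bool]:
    """Map each parameter name to True (train) or False (keep frozen)."""
    return {name: name.startswith(head_prefixes) for name in param_names}


# Hypothetical parameter names for a video backbone with a classification head.
names = [
    "backbone.embeddings.patch.weight",
    "backbone.encoder.layer.0.attention.weight",
    "classifier.weight",
    "classifier.bias",
]
flags = head_only_flags(names)
trainable = [name for name, train in flags.items() if train]

# In a PyTorch-style framework the flags would typically be applied as:
#   for name, param in model.named_parameters():
#       param.requires_grad = flags[name]
```

Because only the head receives gradient updates, this variant trains far fewer parameters than full fine-tuning, which can help when the labelled dataset is small.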

Solution Approach 🎯

Shruti-Drishti tackles the communication gap through a two-fold approach:

  1. Sign Language to Text: Implementing a custom Transformer-based Multi-Headed Attention Encoder using Google's Tensorflow Mediapipe, we convert sign language videos into text while addressing challenges related to dynamic sign similarity.

  2. Text to Sign Language: Utilizing an agentic LLM framework, Shruti-Drishti converts textual information into masked-keypoint-based sign language videos, tailored specifically for Indian Sign Language.
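
For readers unfamiliar with attention encoders, the sketch below shows scaled dot-product self-attention, the core operation inside each head of a multi-headed attention encoder, applied to a short sequence of per-frame feature vectors. It is a deliberate simplification: a single head, no learned query, key or value projections, and made-up input values. It is not the project's encoder.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over one row of attention scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries: Matrix, keys: Matrix) -> Matrix:
    """softmax(Q K^T / sqrt(d)), computed row by row."""
    scale = math.sqrt(len(keys[0]))
    return [
        softmax([sum(q * k for q, k in zip(query, key)) / scale for key in keys])
        for query in queries
    ]


def self_attention(frames: Matrix) -> Matrix:
    """Each output frame is a weighted average of all input frames.

    Simplification: queries, keys and values are the raw frame vectors,
    with no learned projections and a single head.
    """
    weights = attention_weights(frames, frames)
    dim = len(frames[0])
    return [
        [sum(w * frame[i] for w, frame in zip(row, frames)) for i in range(dim)]
        for row in weights
    ]


# Hypothetical clip: three frames, four keypoint features per frame.
clip = [
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
weights = attention_weights(clip, clip)
context = self_attention(clip)
```

Frames with similar keypoint vectors attend to each other more strongly (here, frame 0 weights frame 1 above frame 2), which is how an encoder can relate distant frames of the same sign.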

Action Plans 📋

  1. Pose-to-Text Implementation: Develop and implement a Pose-to-Text model based on the referenced paper for the Indian Sign Language dataset, using an agentic, LangChain-based state flow as the decoder stage for text-to-gloss conversion and merging masked keypoint videos.

  2. Custom Transformer Model Evaluation: Assess the effectiveness of our custom Transformer/LSTM model on the Sign Language Dataset, focusing on accuracy and adaptability to dynamic signs.

  3. Multilingual App Development: Create a user-friendly multilingual app serving as an interface for our Sign Language Translation services, ensuring easy interaction and adoption by both deaf and non-deaf users.
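
For the model evaluation step in the plan above, overall accuracy alone can hide which signs are being confused. The sketch below computes top-1 accuracy and a per-sign breakdown. The labels and predictions are invented for illustration and are not results from the project.

```python
from collections import Counter
from typing import Dict, List


def top1_accuracy(predicted: List[str], actual: List[str]) -> float:
    """Fraction of clips whose predicted sign matches the true sign."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def per_class_accuracy(predicted: List[str], actual: List[str]) -> Dict[str, float]:
    """Accuracy for each true sign, useful for spotting confusable signs."""
    totals = Counter(actual)
    hits = Counter(a for p, a in zip(predicted, actual) if p == a)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical labels for six test clips; not results from the project.
actual = ["hello", "hello", "thank you", "thank you", "good", "good"]
predicted = ["hello", "good", "thank you", "thank you", "good", "hello"]

overall = top1_accuracy(predicted, actual)
by_sign = per_class_accuracy(predicted, actual)
```

In this made-up example, "hello" and "good" are each recognized only half the time and are confused with one another, which is the kind of pattern a per-sign breakdown makes visible.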

Use Cases

  1. Workplace and Educational Inclusion:

    • Deploy the Sign Language Generation system in offices and educational institutions to facilitate seamless communication with the deaf and mute community.
    • Empower individuals with hearing impairments by providing them with equal opportunities for education and employment.
  2. Content Accessibility:

    • Enable news channels and content creators to expand their user base by making their content accessible and inclusive.
    • Offer services to embed sign language video layouts for content, fostering a more inclusive society and promoting equal participation.

Progress So Far ✅

Results 📈

Transformers

For detailed results and insights, please refer to our presentation slides.

LSTM

(TODO)

Project Contributors 👥
