The Fifth Elephant Open Source AI Hackathon 2024

GenAI makers and creators contest and showcase


Adithya S K

@Adithya_S_K

Srinidhi Somayaji P

@Srinidhi9113

Achala Nayak

@achalanayak

A versatile open-source framework designed to adapt pre-trained Large Language Models (LLMs), such as Llama, Mistral, and Mixtral, to a wide array of domains and languages.

Submitted Feb 15, 2024

Problem Statement:
Training Large Language Models (LLMs) for Indic languages from scratch is costly and impractical. In response, we present a streamlined framework for adapting pre-trained LLMs such as Llama and Mixtral 8x7B to new languages, using a compact dataset for cross-lingual tasks. Our solution includes fine-tuning and evaluation pipelines tailored for practical production use cases.
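
As a concrete illustration of the adaptation step, here is a minimal LoRA fine-tuning sketch assuming the Hugging Face transformers/peft/datasets stack; the base model name, dataset file, and hyperparameters are illustrative placeholders, not the project's actual configuration.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE = "mistralai/Mistral-7B-v0.1"  # placeholder Llama/Mistral-class base

tokenizer = AutoTokenizer.from_pretrained(BASE)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16)

# LoRA freezes the base weights and trains small low-rank matrices,
# so a compact cross-lingual dataset is enough to adapt the model.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Placeholder corpus: one text file of target-language data.
data = load_dataset("text", data_files={"train": "kannada_corpus.txt"})
data = data.map(lambda batch: tokenizer(batch["text"], truncation=True,
                                        max_length=512), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=2, learning_rate=2e-4),
    train_dataset=data["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("kannada-lora")  # saves only the adapter weights
```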

Unique Selling Points (USPs):

  • Mixture of Languages Architecture: A novel architecture inspired by the “Mixture of Experts” design used in Mixtral 8x7B. Our model consists of five 7B-parameter models, each serving as an expert in a specific language (Kannada, Telugu, Tamil, Hindi, and English).

  • High-Quality Synthetic Data: Training on curated synthetic data keeps adaptation efficient and reduces additional training costs.

  • Adaptive LoRA Adapter Swapping: A method to dynamically switch LoRA adapters during inference, enabling a single model to excel at tasks such as RAG answering, translation, and instruction following (see the sketch after this list).

  • Multilingual Support: The model is designed to be multilingual, proficient in five languages, catering to diverse linguistic requirements.

  • Indic LLM Evaluation Framework: A purpose-built evaluation framework for assessing the performance of Indic Large Language Models.
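
The adapter-swapping USP above can be made concrete with a short sketch, again assuming the Hugging Face peft API; the adapter paths and names are hypothetical.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "mistralai/Mistral-7B-v0.1"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16)

# Attach one LoRA adapter per task; all adapters share the frozen base.
model = PeftModel.from_pretrained(model, "adapters/translation",
                                  adapter_name="translation")
model.load_adapter("adapters/rag", adapter_name="rag")
model.load_adapter("adapters/instruct", adapter_name="instruct")

def generate(prompt: str, task: str) -> str:
    model.set_adapter(task)  # cheap switch; the 7B base is never reloaded
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(out[0], skip_special_tokens=True)

print(generate("Translate to Kannada: Good morning.", task="translation"))
```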

Model Architecture:
The proposed architecture draws inspiration from the Mixture of Experts framework: each expert is trained bilingually, pairing English with one target language. Because only the relevant expert runs for a given input, this approach significantly reduces inference time, making it well suited to production environments. LoRA adapters are switched dynamically during inference according to the use case, covering tasks such as retail support conversations and translation. Training of the remaining experts is currently in progress.
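
How an input reaches the right expert is not spelled out above; one simple possibility, shown purely as an assumption, is routing on the Unicode script of the input text.

```python
# Hypothetical router: map the Unicode script of the input to a language
# expert (or its LoRA adapter). Ranges cover the four Indic scripts;
# anything else falls through to the English expert.
SCRIPT_RANGES = {
    "hindi":   (0x0900, 0x097F),  # Devanagari block
    "tamil":   (0x0B80, 0x0BFF),
    "telugu":  (0x0C00, 0x0C7F),
    "kannada": (0x0C80, 0x0CFF),
}

def route(text: str) -> str:
    for ch in text:
        for lang, (lo, hi) in SCRIPT_RANGES.items():
            if lo <= ord(ch) <= hi:
                return lang
    return "english"

assert route("ನಮಸ್ಕಾರ") == "kannada"
assert route("Hello, world") == "english"
```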

Project Goals:

  1. User-Friendly Interface: Develop a straightforward interface that lets anyone adapt models to new domains and languages; a graphical user interface (GUI) makes the adaptation process accessible to non-experts (a sketch of such an interface follows this list).

  2. Cutting-Edge Support: Incorporate the latest advancements in distributed training, dataset generation, and translation code, along with everything needed for seamless adaptation, fine-tuning, evaluation, and deployment. This keeps the framework at the forefront of the field and gives users state-of-the-art tools for language model adaptation.
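
For the GUI goal, here is a minimal sketch using Gradio (an assumption; the project's actual interface may differ). The adapt function is a hypothetical stub standing in for the fine-tuning pipeline sketched earlier.

```python
import gradio as gr

LANGUAGES = ["Kannada", "Telugu", "Tamil", "Hindi", "English"]
TASKS = ["instruction-following", "translation", "rag"]

def adapt(base_model: str, language: str, task: str, dataset_path: str) -> str:
    # Stub: a real run would launch the fine-tuning pipeline and
    # return a status message or checkpoint path.
    return f"Fine-tuning {base_model} for {language}/{task} on {dataset_path}"

demo = gr.Interface(
    fn=adapt,
    inputs=[
        gr.Textbox(label="Base model", value="mistralai/Mistral-7B-v0.1"),
        gr.Dropdown(LANGUAGES, label="Target language"),
        gr.Dropdown(TASKS, label="Task"),
        gr.Textbox(label="Dataset path"),
    ],
    outputs=gr.Textbox(label="Status"),
    title="Indic LLM adaptation",
)

if __name__ == "__main__":
    demo.launch()
```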
