Open Source AI Hackathon 2024

GenAI makers and creators contest and showcase


Adithya S K (@Adithya_S_K)

Srinidhi Somayaji P (@Srinidhi9113)

Achala Nayak (@achalanayak)

A versatile open-source framework for adapting pre-trained large language models (LLMs), such as Llama, Mistral, and Mixtral, to a wide array of domains and languages.

Submitted Feb 15, 2024

Problem Statement:
Training Large Language Models (LLMs) for Indic languages from scratch is costly and impractical. In response, we present a streamlined framework for adapting pre-trained LLMs such as Llama and Mixtral 8x7B to new languages, using a compact dataset for cross-lingual tasks. Our solution includes fine-tuning and evaluation processes tailored to practical production use cases.
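To make the adaptation step concrete, below is a minimal LoRA fine-tuning sketch, assuming the Hugging Face transformers, peft, and datasets libraries; the base model, the dataset file, and the hyperparameters are illustrative placeholders, not the project's released code or data.

```python
# Minimal sketch: adapt a pre-trained LLM to a new language with LoRA on a
# compact dataset (assumes `transformers`, `peft`, `datasets`; the model id,
# data file, and hyperparameters are hypothetical placeholders).
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

base_id = "meta-llama/Llama-2-7b-hf"  # any supported base model
tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Low-rank adapters keep the trainable parameter count small, so a compact
# corpus is enough for language adaptation.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Hypothetical instruction/translation corpus in JSON Lines format.
ds = load_dataset("json", data_files="kannada_instructions.jsonl")["train"]
ds = ds.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
            remove_columns=ds.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=4,
                           num_train_epochs=1, learning_rate=2e-4,
                           logging_steps=10),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("adapters/kannada")  # export only the LoRA weights
```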

Unique Selling Points (USPs):

  • Mixture of Languages Architecture: A novel architecture inspired by the “Mixture of Experts” framework in Mixtral 8x7B. Our model consists of five 7B-parameter models, each serving as an expert in a specific language (Kannada, Telugu, Tamil, Hindi, and English).

  • High-Quality Synthetic Data: The model is trained on high-quality synthetic data, ensuring efficiency and reducing additional training costs.

  • Adaptive LoRA Adapter Swapping: A method to dynamically switch LoRA adapters during inference, enabling a single model to excel at tasks such as RAG-based answering, translation, and instruction following (see the sketch after this list).

  • Multilingual Support: The model is designed to be multilingual, proficient in five languages, catering to diverse linguistic requirements.

  • Indic LLM Evaluation Framework: An evaluation framework tailored to assessing the performance of Indic large language models.
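To illustrate the adapter-swapping idea referenced above, here is a minimal sketch assuming the Hugging Face transformers and peft libraries; the adapter paths (adapters/translation, adapters/rag, adapters/instruct) and task names are hypothetical placeholders, not the project's published adapters.

```python
# Minimal sketch of swapping LoRA adapters on a single base model at inference
# time (assumes `transformers` + `peft`; adapter paths are hypothetical).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-hf"  # any supported base model
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Each load_adapter call registers a named adapter without duplicating the
# 7B base weights, so several tasks share one model in memory.
model = PeftModel.from_pretrained(base, "adapters/translation",
                                  adapter_name="translation")
model.load_adapter("adapters/rag", adapter_name="rag")
model.load_adapter("adapters/instruct", adapter_name="instruct")

def generate(prompt: str, task: str) -> str:
    """Activate the adapter for `task`, then generate a completion."""
    model.set_adapter(task)  # dynamic swap: only the active LoRA weights change
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(out[0], skip_special_tokens=True)

print(generate("Translate to Kannada: How are you?", task="translation"))
```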

Model Architecture:
The proposed architecture draws inspiration from the Mixture of Experts framework, with each expert trained bilingually for a specific language. This approach significantly reduces inference time, making it well suited to production environments. LoRA adapters are switched dynamically at inference time according to the use case, ensuring adaptability for tasks such as retail support conversations and translation. Note that training of the remaining expert models is currently in progress.
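The write-up does not spell out how a request is routed to the matching expert, so the following is a purely hypothetical routing sketch: the script-range language detector, the EXPERTS registry, and the task names are assumptions made for illustration, not the project's implementation.

```python
# Hypothetical routing sketch: detect the input language, then dispatch the
# prompt to the corresponding per-language expert. Names are illustrative only.
from typing import Callable, Dict

# Registry of per-language experts keyed by ISO language code. In a real
# system each entry would wrap a 7B expert plus its task LoRA adapters;
# here simple lambdas stand in for the models.
EXPERTS: Dict[str, Callable[[str, str], str]] = {
    "kn": lambda prompt, task: f"[kannada-expert:{task}] {prompt}",
    "te": lambda prompt, task: f"[telugu-expert:{task}] {prompt}",
    "ta": lambda prompt, task: f"[tamil-expert:{task}] {prompt}",
    "hi": lambda prompt, task: f"[hindi-expert:{task}] {prompt}",
    "en": lambda prompt, task: f"[english-expert:{task}] {prompt}",
}

# Unicode script blocks used as a stand-in language detector; a production
# router might instead use a trained classifier such as fastText.
SCRIPT_RANGES = {
    "kn": ("\u0c80", "\u0cff"),  # Kannada
    "te": ("\u0c00", "\u0c7f"),  # Telugu
    "ta": ("\u0b80", "\u0bff"),  # Tamil
    "hi": ("\u0900", "\u097f"),  # Devanagari (Hindi)
}

def detect_language(text: str) -> str:
    """Return the first language whose script appears in the text, else English."""
    for lang, (low, high) in SCRIPT_RANGES.items():
        if any(low <= ch <= high for ch in text):
            return lang
    return "en"

def route(prompt: str, task: str = "instruct") -> str:
    """Send the prompt to the expert for its detected language."""
    return EXPERTS[detect_language(prompt)](prompt, task)

print(route("ನೀವು ಹೇಗಿದ್ದೀರಾ?", task="translation"))
```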

Project Goals:

  1. User-Friendly Interface: Develop a straightforward interface that empowers users to adapt models to different domains and languages; a graphical user interface (GUI) keeps the adaptation process accessible and user-friendly.

  2. Cutting-Edge Support: Incorporate the latest advances in distributed training, dataset generation, and translation, along with all components needed for seamless adaptation, fine-tuning, evaluation, and deployment. This keeps the framework at the forefront of the field and gives users state-of-the-art tools for adapting language models.

Comments


  • Akshobhya (@akshobhya_j), Editor & Promoter

    Hello @Adithya_S_K, @Srinidhi9113, @achalanayak. Thanks for submitting your project to The Fifth Elephant Open Source AI Hackathon. Can you update your submission based on the following considerations?

    Mixture of Languages Architecture

    1. The lack of a detailed explanation of how the novel architecture is inspired by the "Mixture of Experts" framework in Mixtral 8x7B may raise questions about the uniqueness and feasibility of the approach.
    2. The feasibility and effectiveness of using five 7B-parameter models as language experts need to be demonstrated and validated through empirical evidence.

    High-Quality Synthetic Data

    The efficacy and reliability of training LLMs solely on synthetic data, especially for languages with complex linguistic nuances, may be met with skepticism and require thorough validation and justification.

    Adaptive LoRA Adapter Swapping

    1. The technical implementation and real-world performance of dynamic LoRA adapter swapping during inference need to be thoroughly tested and verified for seamless integration with tasks such as RAG-based answering, translation, and instruction following.
    2. Ensuring that the adaptability achieved through adapter swapping does not compromise the model's performance or introduce unexpected errors is a critical technical concern.

    Indic LLM Evaluation Framework

    The specificity of the evaluation framework tailored for Indic Large Language Models raises questions about its adaptability and generalizability to other language models or cross-lingual tasks.

    Model Architecture

    1. The details regarding the bilingual training of each expert in a specific language within the proposed architecture need to be elaborated to provide a clear understanding of the technical implementation and its effectiveness.
    2. The ongoing training of other models introduces uncertainty regarding the completeness and robustness of the proposed architecture in its current state.

    Project Goals

    1. The technical challenges associated with developing a user-friendly interface, especially concerning the management of complex adaptation processes involving advanced LLM models, need to be addressed to ensure practical implementation.
    2. Incorporating the latest advancements in distributed training code, dataset generation, and translation code raises concerns about the integration, compatibility, and potential trade-offs of different cutting-edge components within the framework.

    Can you also add your Git repository link and a detailed roadmap of the project?

    Posted 1 year ago