Advancing multimodal and agentic AI: systems, storage & scalability
Open Source AI Meet-up - Bangalore edition
Fri, 4 Apr 2025, 01:45 PM – 06:10 PM IST
Submitted Mar 17, 2025
Getting your first language model tuning run up and running is great; in an enterprise, however, you will want to extract more value out of your expensive infrastructure by completing more training cycles. This session presents various concepts for increasing language model tuning throughput, covering open source techniques that range from simple knobs to more involved changes deeper down the stack.
Overview of concepts for throughput enhancement (illustrative code sketches for each concept follow the list below)
Training knobs, e.g. gradient checkpointing combined with an increased batch size
Faster implementations, e.g. replacing the attention module with Flash Attention and using padding-free batching
Data sampling / collation for efficient load distribution, e.g. a multipack data sampler
Sparse techniques + increased batch size, e.g. LoRA and prompt tuning
Parallelism, e.g. data parallel techniques such as distributed data parallel (DDP) and fully sharded data parallel (FSDP)
Fast kernels, e.g. replacing common operations such as cross-entropy loss with Triton kernel implementations
Torch compile, e.g. compiling PyTorch code into optimized kernels
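
Training knobs: a minimal sketch, assuming a Hugging Face Trainer workflow; the checkpoint name and batch sizes are placeholders:

from transformers import AutoModelForCausalLM, TrainingArguments

model = AutoModelForCausalLM.from_pretrained("my-base-model")  # hypothetical checkpoint

args = TrainingArguments(
    output_dir="out",
    gradient_checkpointing=True,     # recompute activations during backward to save memory
    per_device_train_batch_size=16,  # the memory freed above lets this knob go up
    gradient_accumulation_steps=2,   # effective batch = 16 * 2 * number of devices
)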
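
Faster implementations: a sketch assuming a recent transformers release that ships DataCollatorWithFlattening and that the flash-attn package is installed; the checkpoint name is a placeholder:

import torch
from transformers import AutoModelForCausalLM, DataCollatorWithFlattening

model = AutoModelForCausalLM.from_pretrained(
    "my-base-model",                          # hypothetical checkpoint
    attn_implementation="flash_attention_2",  # fused, IO-aware attention kernels
    torch_dtype=torch.bfloat16,
)

# Flattens the examples in each batch into one packed sequence, so no compute is spent on pad tokens.
collator = DataCollatorWithFlattening()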
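
Data sampling / collation: the multipack idea can be illustrated with a deliberately naive batch sampler (illustrative code, not any library's API) that packs example indices into batches carrying a similar token budget:

import random
from torch.utils.data import Sampler

class NaiveMultipackSampler(Sampler):
    """First-fit-decreasing packing of example indices into token-balanced batches."""

    def __init__(self, lengths, max_tokens_per_batch):
        order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
        self.batches, budgets = [], []
        for idx in order:
            for b, used in enumerate(budgets):
                if used + lengths[idx] <= max_tokens_per_batch:
                    self.batches[b].append(idx)
                    budgets[b] += lengths[idx]
                    break
            else:
                self.batches.append([idx])
                budgets.append(lengths[idx])

    def __iter__(self):
        random.shuffle(self.batches)  # shuffle batch order, keep the packing intact
        return iter(self.batches)

    def __len__(self):
        return len(self.batches)

# Usage (hypothetical names): DataLoader(dataset, batch_sampler=NaiveMultipackSampler(lengths, 4096))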
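
Sparse techniques: a sketch of LoRA with the peft library; the checkpoint is a placeholder and the target module names vary by architecture:

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("my-base-model")  # hypothetical checkpoint
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # depends on the model architecture
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # only the small adapter weights get gradients and optimizer state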
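
Parallelism: a sketch of FSDP with plain PyTorch, assuming the script is launched with torchrun; the linear layer stands in for a real model:

# Launch with, e.g.: torchrun --nproc_per_node=8 train.py
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = torch.nn.Linear(4096, 4096).cuda()  # stand-in for a real transformer

# DDP would replicate the full model on every GPU; FSDP instead shards parameters,
# gradients, and optimizer state across GPUs, freeing memory for larger batches or models.
model = FSDP(model)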
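
Fast kernels: one open source option is the Liger Kernel project; the snippet below assumes its apply_liger_kernel_to_llama entry point and a Llama-style checkpoint (both assumptions, check the library's current API):

from liger_kernel.transformers import apply_liger_kernel_to_llama  # assumed entry point
from transformers import AutoModelForCausalLM

# Patches the Llama modeling code so matching ops use Triton implementations;
# must run before the model is instantiated.
apply_liger_kernel_to_llama(cross_entropy=True)

model = AutoModelForCausalLM.from_pretrained("my-llama-checkpoint")  # hypothetical checkpoint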
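
Torch compile: a sketch of torch.compile, which traces the model and emits fused kernels through TorchInductor; the encoder layer is a stand-in module:

import torch

model = torch.nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
compiled = torch.compile(model)  # first call triggers compilation; later calls reuse the kernels

x = torch.randn(8, 128, 512)
out = compiled(x)

With the Hugging Face Trainer, the same effect can be toggled via TrainingArguments(torch_compile=True).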
In-depth conceptual understanding of the various knobs, pinpointing the specific parts of the stack and methods to use to maximize throughput. Audiences should also come away knowing how to apply these concepts in practice.
Language model tuning practitioners and AI professionals.
I am Mehant Kammakomati. I work as a research software engineer at IBM Research - India on language model tuning capabilities, aiming to give our users the best possible tuning experience.