Mehant

@kmehant

Enhancing language model tuning throughput

Submitted Mar 17, 2025

Abstract

Getting your first language model tuning job up and running is great; in an enterprise setting, however, you want to extract more value from your expensive infrastructure by completing more training cycles on it. This session presents a range of concepts for increasing language model tuning throughput, covering open-source techniques that span from simple training knobs to more involved optimizations deeper down the stack.

Overview of concepts for throughput enhancement

  • Training knobs, e.g. gradient checkpointing combined with an increased batch size (see the sketches after this list)

  • Faster implementations, e.g. replacing the attention module with Flash Attention and using padding-free batching (sketch below)

  • Data sampling / collation for efficient load distribution, e.g. a multi-pack data sampler

  • Sparse techniques plus an increased batch size, e.g. LoRA and prompt tuning (sketch below)

  • Parallelisms, e.g. data-parallel techniques such as distributed data parallel (DDP) and fully sharded data parallel (FSDP) (sketch below)

  • Fast kernels, e.g. replacing common operations such as cross-entropy loss with Triton kernel implementations

  • Torch compile, e.g. compiling PyTorch code into fused kernels (sketch below)
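
As a brief illustration of the training-knobs item, the sketch below uses Hugging Face TrainingArguments to trade recomputation for memory via gradient checkpointing and spend the freed memory on a larger batch. This is a minimal sketch, assuming the transformers Trainer stack; the values shown are illustrative, not recommendations.

    # A minimal sketch, assuming the Hugging Face transformers Trainer stack.
    from transformers import TrainingArguments

    training_args = TrainingArguments(
        output_dir="out",
        per_device_train_batch_size=8,   # raise this once activation memory is freed
        gradient_accumulation_steps=4,   # effective per-device batch of 32
        gradient_checkpointing=True,     # recompute activations instead of storing them
        bf16=True,                       # mixed precision reduces memory and compute cost
    )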
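For the faster-implementations item, a hedged sketch of swapping in Flash Attention when loading a model with transformers; the checkpoint name is a placeholder, and the flash-attn package must be installed separately.

    # A minimal sketch, assuming transformers with the flash-attn package installed.
    import torch
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "your-model-checkpoint",                  # placeholder checkpoint name
        torch_dtype=torch.bfloat16,
        attn_implementation="flash_attention_2",  # replaces the default attention module
    )

Padding-free batching pairs naturally with this; recent transformers releases ship a DataCollatorWithFlattening collator for that purpose, though you should check availability in your installed version.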
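For the sparse-techniques item, a sketch of wrapping an already loaded model (the `model` from the previous sketch) with LoRA adapters using the PEFT library; the target module names are an assumption and vary by architecture.

    # A minimal sketch, assuming the peft library and an already loaded causal LM `model`.
    from peft import LoraConfig, get_peft_model

    lora_config = LoraConfig(
        r=16,                                  # adapter rank
        lora_alpha=32,
        lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],   # assumed attention projection names
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()         # only a small fraction of weights train

Because far fewer parameters carry gradients and optimizer state, the memory saved can again be spent on a larger batch size.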
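For the parallelism item, a sketch of sharding the model with PyTorch's built-in FSDP wrapper. It assumes the script is launched with torchrun (e.g. `torchrun --nproc_per_node=<gpus> train.py`) so that one process runs per GPU.

    # A minimal FSDP sketch, assuming a torchrun launch and an already constructed `model`.
    import os
    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    dist.init_process_group("nccl")                        # one process per GPU
    torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))   # LOCAL_RANK is set by torchrun
    model = FSDP(model.cuda())                             # shards params, grads, and optimizer state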
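Finally, for the torch compile item, compilation is a one-line change; the first step is slower while kernels are generated, after which steady-state throughput typically improves.

    # A minimal sketch: let PyTorch fuse operations into generated kernels.
    import torch

    model = torch.compile(model)   # compilation happens lazily on the first forward pass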

Takeaways

In-depth conceptual understanding of the various knobs available for maximizing throughput, pinpointing where in the stack each method applies. Attendees should also come away with a working sense of how to apply these concepts in practice.

Which audiences is your session going to be beneficial for?

Tuning users or AI professionals.

Bio

I am Mehant Kammakomati, a research software engineer at IBM Research - India. I work on language model tuning capabilities, aiming to give our users the best possible tuning experience.
