Aug 2023
Fri 11, 09:00 AM – 06:00 PM IST
Navin Pai
With the advent of Large Language Models (LLMs) and Generative AI (GAI), there is growing interest in building LLM-driven applications. A growing number of companies are exploring whether they can run state-of-the-art open-source LLMs within their existing Kubernetes (k8s) clusters and build products on top of these bespoke, in-cluster models.
Motivations for this may be driven by security (sending live customer data to OpenAI is a non-starter in most regulated industries), R&D (the ability to build custom LLMs for specific use cases and tweak their behaviour accordingly), or simply cost (OpenAI pricing can become a bottleneck at non-trivial scale).
In this talk, we’ll explore techniques and best practices for deploying LLMs within k8s environments, and how workflows can be built to simplify LLM training/tuning, deployment, and management, based on our experience building an LLM-driven platform, Aiden.
Note: Given the nature and speed of development in the world of LLMs and GAI, this is intended as an experience-driven talk rather than a one-size-fits-all prescription: we’ll share best practices, the mistakes we made while building out LLM-serving architectures, and the tradeoffs between different approaches to building GAI-driven applications.