Russi Chatterjee

@ixaxaar218

Engineering of GenAI - a deeper look into the software stacks that power deep learning at scale.

Submitted May 18, 2024

Audience

  • Backend and DevOps engineers
  • Infrastructure / hardware / embedded engineers

Outline

Models - what is a model?

  • Kernels & Computational graphs
  • Model-level & layer-level optimizations
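
To make "kernels and computational graphs" concrete, here is a minimal sketch in Python. It is illustrative only and not taken from the talk: the `Node` class, the `KERNELS` table, and `run_graph` are hypothetical names, and the "kernels" are plain Python functions standing in for the device code that a real framework would dispatch to.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical "kernels": plain Python functions standing in for device code.
KERNELS: Dict[str, Callable[..., float]] = {
    "mul": lambda a, b: a * b,
    "add": lambda a, b: a + b,
    "relu": lambda a: max(a, 0.0),
}


@dataclass
class Node:
    name: str                     # name of the value this node produces
    op: str                       # key into KERNELS, or "input"
    inputs: Tuple[str, ...] = ()  # names of the nodes this one consumes


def run_graph(nodes: List[Node], feeds: Dict[str, float]) -> Dict[str, float]:
    """Evaluate nodes in list order (assumed to be topologically sorted)."""
    values = dict(feeds)
    for node in nodes:
        if node.op == "input":
            continue  # inputs are supplied through `feeds`
        args = [values[i] for i in node.inputs]
        values[node.name] = KERNELS[node.op](*args)
    return values


# Graph for y = relu(w * x + b)
graph = [
    Node("x", "input"), Node("w", "input"), Node("b", "input"),
    Node("wx", "mul", ("w", "x")),
    Node("z", "add", ("wx", "b")),
    Node("y", "relu", ("z",)),
]

result = run_graph(graph, {"x": 2.0, "w": -3.0, "b": 1.0})
```

In this toy picture, a model is the graph, a kernel is the function behind each node, and a layer-level optimization would be something like replacing the `mul`, `add`, and `relu` nodes with a single fused kernel.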

Runtime stack - what runs these models?

  • Deep learning frameworks (PyTorch, TensorFlow)
  • Higher-level libraries (DeepSpeed, cuDNN)
  • Algebra and math libraries (cuBLAS)
  • Device drivers (NVIDIA, AMD) and the compute layers they expose: kernels, CUDA, ROCm, OpenCL, Vulkan (compute shaders) and other technologies
  • Hardware devices (GPU / FPGA, RDMA and RoCE, InfiniBand fabric, etc.)

Operations - how to do things with models?

  • Model storage workflow
  • Inference (inference engines, optimizations)
  • Fine-tuning (infrastructure, single- and multi-GPU setups)
  • Pre-training and continual pre-training (HPC clusters and related infrastructure)

(All of the above is limited to tools and infrastructure, independent of any particular model.)

Impact

This talk is a broad overview of the deep learning software ecosystem.
The major benefits for attendees are:

  • Build familiarity with the system stack of deep learning
  • Learn the paradigms of optimization for DL
  • Make more informed decisions in production
