Rootconf 2018

On scaling infrastructure and operations

Compute Intensive applications on DC/OS

Submitted by Swapnil Dubey (@swapnildubey) on Thursday, 8 March 2018


Technical level

Intermediate

Section

Full talk

Status

Confirmed & Scheduled

Abstract

Deep learning needs no introduction these days. As data volumes grow and the behaviour hidden inside that data becomes more complex, deep learning use cases have multiplied. Large datasets and heavy processing also mean costly infrastructure for running deep learning models, and cost matters even more once GPUs are involved.

This talk covers what I have learned over the past year while building infrastructure that meets the expectations of data scientists, especially for deep learning on GPUs. It addresses two questions. Why, and why not, use Kubernetes for our use case? And how do we create infinitely scalable infrastructure for data science tasks while keeping an eye on cost? The talk will show how we achieved that.

Outline

  1. Defining the term “Optimized Infrastructure for Data Science”.
  2. Identifying factors for optimizing infrastructure for deep learning models.
  3. The what and why of GPUs.
  4. Deep learning on Kubernetes: why we rejected the approach.
  5. Introduction to DC/OS.
  6. Dynamic GPU usage with DC/OS for running distributed TensorFlow jobs.
  7. Demo: Preparing Docker for our use case.
  8. Demo: Deep learning model training showing dynamic allocation of GPUs using DC/OS.

Speaker bio

Swapnil has 9+ years of experience and currently works as Technical Architect (Big Data) at Exadatum Softwares Services Pvt Ltd. Before joining Exadatum, he worked at Snapdeal as Lead Engineer and at Schlumberger as Senior Data Engineer.

Swapnil has worked in the BFSI, ad serving, and e-commerce domains, with Hadoop, Spark, and GCP as his primary tech stack.

His current interest is in developing infinitely scalable, optimized infrastructure using Docker and Kubernetes/Mesosphere.

Swapnil has served as a Cloudera Certified trainer for the Hadoop Admin and Developer courses. He believes in learning and in sharing what he learns with the community. He is a frequent speaker at meetups and an active presenter at conferences, including Anthill Inside, Rootconf, Expert talks, Dr Dobbs, and the IEEE International conference, to name a few.
