Designing an Autonomous Workbench for Data Science on AWS

Submitted Jul 9, 2021

In the wake of the COVID-19 pandemic and the resulting shift to remote work, we, the Engineering team at Episource, were keen to develop a hosted, self-service platform that would give our Data Science counterparts on-the-fly access to the compute and data they needed for their experiments.

Because the ML development cycle is iterative, with ideas going from hypothesis to feature-ready within days, it was imperative that the platform scale instantly to meet the heavy requirements of modern-day ML processes. We also had to ensure that sensitive training datasets would stay only in encrypted, secure environments.

This talk will be a quick overview of the thought process, experiments and lessons learnt during our journey of building our own data science workbench on AWS.

During this talk, a participant can expect to understand the following:

The AWS architecture we designed to host the open source JupyterLab project, and how we adapted it to our specific requirements using Kubernetes.
Best practices for integrating organised dataset storage, closed-door access control and autoscaling capabilities into our architecture.
First-hand insights into how the Workbench has improved Episource’s ML development cycles.

Who is this talk for?

ML teams of any size that are looking to introduce autonomy and promote rapid experimentation within their data science ranks.

Hosted by

The Fifth Elephant

Jumpstart better data engineering and AI futures

Submissions for MLOps November edition
