Submissions for MLOps November edition

On ML workflows, tools, automation and running ML in production

Dipen Chawla

@dipen_epi

Designing an Autonomous Workbench for Data Science on AWS

Submitted Jul 9, 2021

In the wake of the COVID-19 pandemic and the resulting shift to remote work, we, the Engineering team at Episource, were keen to develop a hosted, self-service platform that would let our Data Science counterparts access the compute and data they needed for their experiments on the fly.

Because the ML development cycle is iterative, with ideas going from hypothesis to feature-ready within days, the platform had to scale instantly to meet the heavy compute requirements of modern ML processes. We also had to ensure that sensitive training datasets stayed only in encrypted, secure environments.

This talk will be a quick overview of the thought process, experiments and lessons learnt during our journey of building our own data science workbench on AWS.

From this talk, participants can expect to learn about the following:

  1. The AWS architecture we designed to host the open source JupyterLab project, and how we adapted it to our specific requirements using Kubernetes.
  2. Best practices for integrating organised dataset storage, closed-door access control and autoscaling capabilities into our architecture.
  3. First-hand insights on how the Workbench has improved Episource’s ML development cycles.

Who is this talk for?

  • ML teams of any size that are looking to introduce autonomy and promote rapid experimentation within their data science ranks.

Hosted by

Jump starting better data engineering and AI futures