Submissions for MLOps November edition
On ML workflows, tools, automation and running ML in production
Sudeep Gupta
Machine Learning is a strategic goal at Farfetch, and enabling Data Scientists to be more productive is a key objective in achieving it. Cloud providers such as AWS, Google Cloud and Azure offer many services and products for creating value in the ML and data space. But a common problem with adopting these services, or with building such a product from scratch, is that the key stakeholders, the Data Scientists, are ignored. While these cloud offerings are capable of hosting and delivering the promised models and analytics, the Data Scientist's workflow within the existing enterprise infrastructure (approvals, security, access, setup) is often overlooked. The onus then falls on the person solving the problem, who must navigate the enterprise tree to figure out what is needed just to get set up, which is a tedious job.
As a strategic Azure partner, Farfetch's objective is to combine Azure's enterprise services with cutting-edge open-source technologies, adding value for our Data Scientists so they can hit the ground running on every problem.
Every process and requirement is captured with the spotlight on Data Scientists. The guiding question is always: what is the use case, and how do we enable our users to execute it efficiently? With that goal in mind, we are building a Platform as a Service with multi-tenancy at its core, giving control back to our users in an enterprise context to use and extend the platform as they please.
We identified the requirements and blockers, and have built an ML Platform around them, leveraging the components outlined below (a minimal illustrative sketch of how these layers fit together follows the list):
Workflow Orchestration Layer - Airflow on Kubernetes [7 mins]
Processing Layer - Databricks [7 mins]
Storage Layer - ADLS [7 mins]
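To illustrate how these three layers compose, here is a minimal sketch (not Farfetch's actual pipeline) of an Airflow DAG that submits a Databricks notebook run reading from and writing to ADLS. The DAG name, connection ID, cluster spec, notebook path and abfss:// paths are hypothetical placeholders.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import DatabricksSubmitRunOperator

# Orchestration layer: Airflow (running on Kubernetes) schedules the work.
with DAG(
    dag_id="feature_engineering_example",  # hypothetical DAG name
    start_date=datetime(2021, 11, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Processing layer: submit a one-off Databricks job run.
    run_notebook = DatabricksSubmitRunOperator(
        task_id="run_feature_notebook",
        databricks_conn_id="databricks_default",  # assumed Airflow connection
        json={
            "new_cluster": {
                "spark_version": "9.1.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",
                "num_workers": 2,
            },
            "notebook_task": {
                "notebook_path": "/Shared/features/build_features",  # hypothetical
                "base_parameters": {
                    # Storage layer: illustrative ADLS Gen2 input/output paths.
                    "input_path": "abfss://raw@examplestorage.dfs.core.windows.net/events/",
                    "output_path": "abfss://curated@examplestorage.dfs.core.windows.net/features/",
                },
            },
        },
    )
```

In this arrangement Airflow owns scheduling and dependencies, Databricks owns the compute, and ADLS is the durable store that the notebook reads from and writes back to.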