Miniconf on Cloud Server Management (Mumbai)

On costs, scaling and securing cloud servers

Raj Rohit

@jalemrajrohit

Re-imagining data infrastructures as event-based architectures

Submitted Nov 13, 2017

This talk covers how we built a distributed, serverless batch data architecture at Episource. It includes the end-to-end ETL pipeline, which handles distributed machine learning, and how we automated ML deployment using the event-based (serverless) paradigm.

Outline

  • Whirlwind intro to the serverless paradigm
  • How we built a batch architecture instead of a real-time one
  • How we got around Lambda's 5-minute time limit to build a completely serverless, end-to-end distributed deep learning pipeline
  • How load balancing and monitoring can be done for such large, complex systems
  • How the serverless paradigm helps data engineers re-imagine data architectures
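
The proposal does not say how the Lambda time limit was worked around, so the following is only an illustrative sketch of one common pattern, not the approach used at Episource: a handler checkpoints its progress and hands the remaining work to a fresh invocation before it runs out of time. `get_remaining_time_in_millis()` is the real method on the Lambda context object; `process_item`, `SAFETY_MARGIN_MS`, the `items`/`cursor` event fields, the `reinvoke` callback, and `FakeContext` are all hypothetical names introduced for this example.

```python
SAFETY_MARGIN_MS = 10_000  # hypothetical: stop when less than this much time is left


def process_item(item):
    """Placeholder for one unit of batch work (hypothetical)."""
    return item * 2


def handler(event, context, reinvoke=None):
    """Process items until time runs low, then hand the rest to a new invocation.

    `event` is assumed to carry the full work list under "items" and an
    optional "cursor" marking where the previous invocation stopped.
    `reinvoke` is an injected callable that starts the next invocation.
    """
    items = event["items"]
    cursor = event.get("cursor", 0)
    results = []

    while cursor < len(items):
        # Check the remaining time budget before starting another item.
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            next_event = {"items": items, "cursor": cursor}
            if reinvoke is not None:
                reinvoke(next_event)
            return {"done": False, "cursor": cursor, "results": results}
        results.append(process_item(items[cursor]))
        cursor += 1

    return {"done": True, "cursor": cursor, "results": results}


class FakeContext:
    """Local stand-in for the Lambda context (hypothetical, for testing only).

    Each call to get_remaining_time_in_millis() reports the current budget
    and then reduces it by a fixed cost, simulating time passing.
    """

    def __init__(self, budget_ms, cost_per_check_ms):
        self.remaining = budget_ms
        self.cost = cost_per_check_ms

    def get_remaining_time_in_millis(self):
        current = self.remaining
        self.remaining -= self.cost
        return current
```

In a deployed function, `reinvoke` could wrap an asynchronous call such as boto3's Lambda client `invoke` with `InvocationType="Event"`, and the work list would more likely live in external storage than in the event payload. Both are assumptions about a typical setup rather than details from the talk.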

Speaker bio

Raj Rohit is a senior data scientist at Episource, where he builds ML algorithms, architects data pipelines, stares at endless Linux logs, and is building the company’s DevOps team. Raj is the author of the Julia Cookbook and is also the moderator of Stack Overflow’s DevOps and Data Science sites.

Hosted by

We care about site reliability, cloud costs, security and data privacy