Miniconf on Cloud Server Management (Mumbai)

On costs, scaling and securing cloud servers

Re-imagining data infrastructures as event-based architectures

Submitted by Raj Rohit (@jalemrajrohit) on Monday, 13 November 2017

Technical level

Intermediate

Section

Full talk

Status

Submitted

Abstract

This talk covers how we built a distributed, serverless batch data architecture at Episource. It includes the end-to-end ETL pipeline, which handles distributed machine learning, and how we automated ML deployment using the event-based (serverless) paradigm.

Outline

  • A whirlwind intro to the serverless paradigm
  • How we built a batch architecture rather than a real-time one
  • How we got around Lambda's 5-minute time limit to build a completely serverless, end-to-end distributed deep learning pipeline
  • How load balancing and monitoring can be done for such large, complex systems
  • How the serverless paradigm helps data engineers re-imagine data architectures
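
The proposal does not say how the Lambda time limit was worked around, so the following is only a minimal sketch of one common pattern, not necessarily the approach used at Episource: a handler processes records until its remaining time drops below a safety margin, then hands the unprocessed remainder to a fresh invocation. The event shape, `SAFETY_MARGIN_MS`, `process_record`, the `reinvoke` callable, and `FakeContext` are all hypothetical names introduced for illustration; only `get_remaining_time_in_millis()` is a method the Lambda context object actually provides in Python.

```python
SAFETY_MARGIN_MS = 10_000  # hypothetical buffer kept before the hard timeout


def process_record(record):
    """Placeholder for one unit of work (hypothetical)."""
    return record * 2


def handler(event, context, reinvoke=None):
    """Process records until time runs low, then hand off the rest.

    `event` is assumed to look like {"records": [...], "offset": int}.
    `reinvoke` is a hypothetical callable that asynchronously triggers
    this same function with a new event (for example via an SDK call).
    """
    records = event["records"]
    offset = event.get("offset", 0)
    results = []

    while offset < len(records):
        # The Lambda context object reports the time left in this invocation.
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            next_event = {"records": records, "offset": offset}
            if reinvoke is not None:
                reinvoke(next_event)  # continue in a fresh invocation
            return {"done": False, "offset": offset, "results": results}
        results.append(process_record(records[offset]))
        offset += 1

    return {"done": True, "offset": offset, "results": results}


class FakeContext:
    """Local stand-in for the Lambda context, for trying the sketch offline."""

    def __init__(self, budget_ms, cost_per_call_ms):
        self.remaining = budget_ms
        self.cost = cost_per_call_ms

    def get_remaining_time_in_millis(self):
        # Each check pretends that `cost_per_call_ms` of work has elapsed.
        current = self.remaining
        self.remaining -= self.cost
        return current
```

To try it locally, call `handler({"records": [1, 2, 3]}, FakeContext(300_000, 1_000))`. In a real deployment the handoff would more likely go through a queue or a workflow service than a direct self-invocation, which is a design choice this sketch leaves open.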

Speaker bio

Raj Rohit is a senior data scientist at Episource, where he builds ML algorithms, architects data pipelines, stares at endless Linux logs, and is building the company’s DevOps team. Raj is the author of the Julia Cookbook and is also the moderator of Stack Overflow’s DevOps and DataScience sites.
