Developing Scalable Serverless Data Pipelines using Python and AWS Lambda

Feb 2017

16 Thu 09:00 AM – 06:00 PM IST

17 Fri 09:00 AM – 06:00 PM IST

Amanora The Fern Hotels and Club, Pune

Submitted Nov 21, 2016

Technical level: Intermediate

Ever wondered how you can process tons of data with Python without launching a single server? AWS Lambda provides a great way to scale your Python code in the cloud quickly, without worrying about capacity constraints or complex deployments.
At Shoptimize, we run e-commerce stores for multiple top brands in India. We capture millions of events across these stores and analyze them to understand user behavior and drive conversions.
To capture, process, store, and analyze this data, we operate batch and real-time data pipelines based on Lambda Architecture principles. The talk shares lessons learnt while migrating our homegrown analytics data pipeline to AWS Lambda and other services.
The talk also covers the difficulties Pythonistas face when using AWS Lambda, and open source projects that should simplify the process.

Outline

Lambda Architecture principles for batch and real-time analytics
Introduction to AWS Lambda
Redesigning a traditional architecture for AWS Lambda
Real-life scenario and demonstration
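
To give a flavour of what the introduction to AWS Lambda might look like in code, here is a minimal sketch of a Python handler that counts incoming events per store. It is illustrative only and not taken from the Shoptimize pipeline: it assumes events arrive as a Kinesis-style batch (base64-encoded JSON under `Records[].kinesis.data`), and the `store` and `type` payload fields and the `decode_records` helper are hypothetical names chosen for this example.

```python
import base64
import json
from collections import Counter


def decode_records(event):
    """Yield decoded JSON payloads from a Kinesis-style Lambda event.

    Assumption: each record carries base64-encoded JSON under
    record["kinesis"]["data"], as in a Kinesis event source.
    """
    for record in event.get("Records", []):
        raw = base64.b64decode(record["kinesis"]["data"])
        yield json.loads(raw)


def handler(event, context):
    """Lambda entry point: count events per (store, event type).

    The "store" and "type" fields are hypothetical; a real pipeline
    would use whatever schema its event producers emit.
    """
    counts = Counter()
    for payload in decode_records(event):
        counts[(payload["store"], payload["type"])] += 1
    # A real pipeline would persist these counts to a data store;
    # this sketch simply returns them, sorted for stable output.
    return [
        {"store": store, "type": kind, "count": n}
        for (store, kind), n in sorted(counts.items())
    ]
```

Because the handler is a plain function, it can be exercised locally by passing it a hand-built event dictionary and `None` for the context, before it is deployed anywhere.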

Requirements

Linux laptop with Python installed
AWS Free tier account

Speaker bio

Vivek Phalak is the CTO and Co-Founder of Shoptimize, a SaaS e-commerce startup that specializes in running and growing e-commerce stores for major retail brands. Vivek has 15+ years of experience developing B2B and B2C software applications. He has been an avid Pythonista since 2007. He is a jack of all trades in programming, but is especially interested in data analytics and cloud computing. Vivek has a Master’s degree in Software Engineering from Carnegie Mellon University.

PyCon Pune 2017