PyCon Pune 2017

A conference on the Python programming language

Vivek Phalak

@vivek-shoptimize

Developing Scalable Serverless Data Pipelines using Python and AWS Lambda

Submitted Nov 21, 2016

Ever wondered how you can process tons of data with Python without launching a single server? AWS Lambda provides a great way to scale your Python code in the cloud quickly, without worrying about capacity constraints or complex deployments.
At Shoptimize, we run e-commerce stores for multiple top brands in India. We capture millions of events across stores, which we analyze to understand user behavior and drive conversions.
To capture, process, store, and analyze this data, we operate batch and real-time data pipelines based on Lambda Architecture principles. The talk captures some lessons learnt while migrating our homegrown analytics data pipeline to AWS Lambda and other AWS services.
The talk also covers the difficulties Pythonistas face when using AWS Lambda, and open source projects that should simplify the process.
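
As a rough illustration of the kind of event processing described above, here is a minimal sketch of a Python AWS Lambda handler that counts events per store in one incoming batch. It assumes the events arrive through a Kinesis stream (the abstract does not name the event source), and the payload fields `store_id` and `event_type` as well as the helper `make_test_event` are hypothetical names chosen for this example, not part of the Shoptimize pipeline.

```python
import base64
import json
from collections import Counter


def decode_records(event):
    """Yield decoded JSON payloads from a Kinesis-style Lambda event.

    Kinesis delivers each record's data base64-encoded under
    event["Records"][i]["kinesis"]["data"].
    """
    for record in event.get("Records", []):
        raw = base64.b64decode(record["kinesis"]["data"])
        yield json.loads(raw)


def handler(event, context):
    """Lambda entry point: count events per (store, event type) in a batch.

    The fields "store_id" and "event_type" are assumed for illustration.
    """
    counts = Counter()
    for payload in decode_records(event):
        counts[(payload["store_id"], payload["event_type"])] += 1
    # A real pipeline would persist these counts to a data store;
    # this sketch simply returns them.
    return {
        "processed": sum(counts.values()),
        "counts": {
            f"{store}:{kind}": n for (store, kind), n in sorted(counts.items())
        },
    }


def make_test_event(payloads):
    """Build a Kinesis-shaped event locally (hypothetical test helper)."""
    return {
        "Records": [
            {"kinesis": {"data": base64.b64encode(json.dumps(p).encode()).decode()}}
            for p in payloads
        ]
    }
```

Because the handler is a plain function, it can be exercised on a laptop by passing it the output of `make_test_event` and `None` as the context, before anything is deployed.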

Outline

  • Lambda Architecture principles for batch and real-time analytics
  • Introduction to AWS Lambda
  • Redesign of traditional architecture to AWS Lambda
  • Real-life scenario and demonstration

Requirements

  • Linux laptop with Python installed
  • AWS Free tier account

Speaker bio

Vivek Phalak is the CTO and Co-Founder of Shoptimize, a SaaS e-commerce startup that specializes in running and growing e-commerce stores for major retail brands. Vivek has 15+ years of experience developing B2B and B2C software applications. He has been an avid Pythonista since 2007. He is a jack of all trades when it comes to programming, but is especially interested in data analytics and cloud computing. Vivek has a Master’s degree in Software Engineering from Carnegie Mellon University.
