Scalability truths and serverless architectures: why it is harder with stateful, data-driven systems

This submission has been added to the schedule

Submitted Mar 14, 2018

Section: Full talk
Technical level: Intermediate

Building scalable systems is not easy. It is not as simple as deploying on a cloud and expecting the system to scale along with the cloud’s elasticity. Many systems and solutions that claim elastic scale implicitly limit those claims to stateless services.

Building and scaling stateless systems poses far fewer challenges than building and scaling stateful systems. That said, stateless services are still limited by data centre infrastructure, which demands attention at large footprints - at tens of millions of requests per second. Stateless services are therefore seemingly easier to scale, which adds credence to claims of almost limitless elastic scale. Serverless architecture offers the convenience of zero server deployments while preserving elasticity of scaling.

In reality, there is little truth to the idea of a truly stateless service; it is in fact a case of state being pushed to another service/system. The challenges therefore shift to scaling stateful services - something harder to achieve.
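As a minimal sketch of this point (not taken from the talk or from Flux), the handler below keeps no state of its own, yet its behaviour depends entirely on a shared store. The names `InMemoryStore` and `make_handler` are hypothetical, and the in-process dictionary merely stands in for an external database or cache.

```python
class InMemoryStore:
    """Hypothetical stand-in for an external stateful system (database, cache)."""

    def __init__(self):
        self._data = {}

    def increment(self, key):
        # All state lives here, not in the request handlers.
        self._data[key] = self._data.get(key, 0) + 1
        return self._data[key]


def make_handler(store):
    """Build a 'stateless' handler: it holds no per-user data itself."""

    def handle(user_id):
        visits = store.increment(user_id)
        return f"user={user_id} visits={visits}"

    return handle


store = InMemoryStore()
replica_a = make_handler(store)  # two interchangeable replicas
replica_b = make_handler(store)

first = replica_a("u1")
second = replica_b("u1")  # sees the visit recorded through replica_a
```

Adding more handler replicas is trivial here, but every request still goes through the one store, which is where the scaling problem ends up.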

In this talk I will start with an overview of typical application workloads (online vs offline, interactive vs batch, sync vs async, etc.) and the patterns and libraries commonly used to build these systems. We will also evaluate each of these examples to identify critical stateful services/systems and the challenges in scaling them. We will then take the Flipkart Flux open source project as an example to understand the design of a highly scalable stateful system that offers serverless computing for deployed applications, similar to AWS Lambda. The talk will cover the various design and tech choices that enable millions of stateful, data-driven workflows/computes to run on the Flux system.

Outline

Defining scalability - as applied to stateless and stateful systems
Stateless service - case of state pushed to a stateful layer
Database/Data store for stateful systems. Choices of such stores - Relational, Append-only, etc.
Distributing stateful compute, things to take care of
Introduction to serverless architecture, what to expect. Services available
Building your own stateful serverless compute engine - the Flux example
Data engineering for stateful systems - scaling from single node to multi-node cluster on the network

Speaker bio

Regunath is an open source developer and engineer who built Aadhaar and currently works on Retail and Marketplace systems at Flipkart. He is also a core contributor to the Flux project discussed in this talk.

Links

Flux project: https://github.com/flipkart-incubator/flux

Slides

https://www.slideshare.net/regunathbalasubramanian/scalability-truths-and-serverless-architectures

The Fifth Elephant 2018