Scalability truths and serverless architectures: why it is harder with stateful, data-driven systems
Submitted by Regunath B on Wednesday, 14 March 2018
Building scalable systems is not easy. It is not as simple as deploying on a cloud and expecting the system to scale along with the cloud’s elasticity. Many systems and solutions that claim elasticity of scale often indirectly limit their claims to stateless services.
Building and scaling stateless systems poses far fewer challenges than building and scaling stateful systems. That said, stateless services are still limited by data centre infrastructure, which demands attention at large footprints - at tens of millions of requests per second. Because stateless services are seemingly easier to scale, they lend credence to claims of almost limitless elastic scale. Serverless architecture offers the convenience of zero server deployments while preserving elasticity of scaling.
In reality, there is rarely such a thing as a truly stateless service; the state has simply been pushed to another service/system. The challenges therefore shift to scaling stateful services - something harder to achieve.
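As a rough illustration of state being pushed elsewhere, here is a minimal sketch of a "stateless" request handler. The names `StateStore`, `InMemoryStore` and `handle_request` are hypothetical and not taken from Flux or any other system mentioned here; the in-memory store only stands in for an external database or cache.

```python
from typing import Dict, Protocol


class StateStore(Protocol):
    """Hypothetical interface for whatever stateful layer holds the data."""

    def get(self, key: str) -> int: ...
    def put(self, key: str, value: int) -> None: ...


class InMemoryStore:
    """Stand-in for an external database or cache (assumption for this sketch)."""

    def __init__(self) -> None:
        self._data: Dict[str, int] = {}

    def get(self, key: str) -> int:
        # Unknown keys start at zero visits.
        return self._data.get(key, 0)

    def put(self, key: str, value: int) -> None:
        self._data[key] = value


def handle_request(store: StateStore, user_id: str) -> int:
    """A 'stateless' handler: it holds nothing itself between calls.

    Every call reads and writes the visit count through the store,
    so the store is the component that actually has to scale with the data.
    """
    visits = store.get(user_id) + 1
    store.put(user_id, visits)
    return visits


store = InMemoryStore()
handle_request(store, "alice")
handle_request(store, "alice")
```

Any number of copies of `handle_request` could run in parallel, but they would all contend on the same store, which is where the scaling difficulty ends up.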
In this talk I will start with an overview of typical application workloads - online vs offline, interactive vs batch, sync vs async, etc. - and the commonly used patterns and libraries for building these systems. We will also evaluate each of these examples to identify critical stateful services/systems and the challenges in scaling them. We will then take the Flipkart Flux open source project as an example to understand the design of a highly scalable stateful system that offers serverless computing for deployed applications, similar to AWS Lambda. The talk will cover the various design and tech choices that enable millions of stateful, data-driven workflows/computes to run on the Flux system.
- Defining scalability - as applied to stateless and stateful systems
- Stateless service - case of state pushed to a stateful layer
- Database/data store for stateful systems. Choices of such stores - relational, append-only, etc.
- Distributing stateful compute, things to take care of
- Introduction to serverless architecture, what to expect. Services available
- Building your own stateful serverless compute engine - the Flux example
- Data engineering for stateful systems - scaling from single node to multi-node cluster on the network
Regunath is an open source developer and engineer who built Aadhaar and currently works on Retail and Marketplace systems at Flipkart. He is also a core contributor to the Flux project discussed in this talk.