The Fifth Elephant 2017

On data engineering and application of ML in diverse domains

Scalability truths and serverless architectures - why it is harder with stateful, data-driven systems

Submitted by Regunath Balasubramanian (@regunathb) on Monday, 22 May 2017

Technical level

Intermediate

Section

Full talk for data engineering track

Status

Submitted

Abstract

Building scalable systems is not easy. It is not as simple as deploying on a cloud and expecting the system to scale along with the cloud’s elasticity. Many systems and solutions that claim elastic scale implicitly limit those claims to stateless services.
Serverless architecture is a recent addition to the developer’s programming and deployment toolset. It offers the convenience of zero-server deployments while preserving elastic scaling.
Building and scaling stateless systems poses far fewer challenges than building and scaling stateful systems. That said, stateless services are limited by data centre infrastructure, which needs attention at large footprints, at tens of millions of requests per second. Scaling stateless services therefore seems easier, and this lends credence to claims of almost limitless elastic scale.
In reality, few services are truly stateless; in most cases the state has simply been pushed to another service or system. The challenge therefore shifts to scaling stateful services, which is harder to achieve.
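
As a minimal sketch of that last point, the snippet below shows a request handler that holds no state of its own yet depends entirely on a separate store. The names CounterStore and handle_request are hypothetical and exist only for illustration; the in-memory dictionary stands in for whatever database or cache a real deployment would use.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CounterStore:
    """Hypothetical stand-in for an external stateful system (database, cache)."""
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, key: str) -> int:
        # All durable state lives here, not in the handler.
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]


def handle_request(store: CounterStore, user_id: str) -> str:
    """A 'stateless' handler: it remembers nothing between calls,
    but every call reads and writes state held by the store."""
    visits = store.increment(user_id)
    return f"user {user_id}: visit {visits}"


store = CounterStore()
# Any number of handler replicas could serve these calls, but all of them
# would share the same store, which is where the scaling problem now sits.
first = handle_request(store, "u1")
second = handle_request(store, "u1")
```

The handler can be replicated freely because it carries nothing between calls, but the store cannot be replicated as easily without dealing with consistency and partitioning.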

In this talk I will give an overview of typical application workloads (online vs offline, interactive vs batch, sync vs async, etc.) and the patterns and libraries commonly used to build these systems. We will evaluate each of these examples to identify the critical stateful services and systems and the challenges in scaling them. We will then take the Flipkart Flux open source project as an example to understand the design of a highly scalable stateful system that offers serverless computing for deployed applications, similar to AWS Lambda. The talk will cover the design and technology choices that enable millions of stateful, data-driven workflows and computations to run on the Flux system.

Outline

  • Defining scalability - as applied to stateless and stateful systems
  • Stateless service - case of state pushed to a stateful layer
  • Database/Data store for stateful systems. Choices of such stores - Relational, Append-only etc
  • Distributing stateful compute, things to take care of
  • Introduction to serverless architecture, what to expect. Services available
  • Building your own stateful serverless compute engine - the Flux example
  • Data engineering for stateful systems - scaling from single node to multi-node cluster on the network
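
One thing to take care of when distributing stateful compute is deciding which node owns which piece of state. The sketch below shows plain hash partitioning as one common option; it is an illustration only and is not taken from Flux. The function node_for_key, the node names and the workflow ids are all hypothetical.

```python
import hashlib
from typing import Sequence


def node_for_key(key: str, nodes: Sequence[str]) -> str:
    """Map a state key (for example a workflow id) to one node.

    Uses a stable hash so that every caller computes the same owner
    for the same key, regardless of process or machine.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Hypothetical cluster and workflow ids, for illustration only.
nodes = ["node-a", "node-b", "node-c"]
keys = ["workflow-1", "workflow-2", "workflow-3", "workflow-4"]
placement = {key: node_for_key(key, nodes) for key in keys}
```

With modulo-based placement, changing the number of nodes reassigns most keys, so the state for those keys has to move. Schemes such as consistent hashing reduce that movement, at the cost of more bookkeeping.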

Speaker bio

Regunath is an open source developer and engineer who built Aadhaar and was later responsible for Flipkart platform services. He is currently at HealthFace, building data-driven decision systems for healthcare and personal health records. He is also a core contributor to the Flux project discussed in this talk.

Links

Preview video

https://hasgeek.tv/fifthelephant/2013-2/626-latency-and-fault-tolerance-in-oltp-1-5-billion-day-service-calls

Comments

  • 1
    Abhishek Balaji (@booleanbalaji) Reviewer a year ago

    Hi Regunath,

    Please upload draft slides outlining what you intend to cover in your talk and a two-min preview video explaining what the talk is about and what the key takeaway is for participants. We need this information by 29 May to evaluate your proposal.

  • 1
    Zainab Bawa (@zainabbawa) Reviewer a year ago

    Hi Regu, is this a new talk or based on the OLTP fault tolerance talk whose video you have shared here?

    • 1
      Zainab Bawa (@zainabbawa) Reviewer a year ago

      Also, does this talk only walk participants through the problem space, or are you making a very specific argument? It is unclear from the proposal.

  • 1
    Regunath Balasubramanian (@regunathb) Proposer a year ago

    Hi Zainab, This is a new talk. I interpreted the preview video as a sample for presentation skills of the speaker.

  • 1
    Regunath Balasubramanian (@regunathb) Proposer a year ago

    The talk will present the challenges, define the scope of the problem and delve into a real-world implementation in the Flipkart Flux project (https://github.com/flipkart-incubator/flux)
