Rootconf 2019

On infrastructure security, DevOps and distributed systems.

Data handling patterns for diverse needs of Consistency, Scale, Low Latency and Analytics Ingestion

Submitted by Regunath Balasubramanian (@regunathb) on Apr 5, 2019

Status: Rejected

Abstract

Large internet companies have rich data, gathered from user interactions and transactions. User transactions are often served by custom-built data stores and solutions that target the needs of Consistency and/or Scale & Low Latency. There is no one-size-fits-all solution for the diverse needs of eCommerce applications at Flipkart, such as:

  • Strongly consistent and highly durable data stores for requirements like Payments and Orders that scale to thousands of transactions per second
  • Low Latency look-up data stores that scale to millions of requests per second, while managing hot-spotting on a subset of the data corpus
  • Timeline consistent view of data across multiple distributed systems such as Search and Pricing
  • Reliable ingestion/transport of data from transactional systems to Analytics platforms

This talk will present the following Data handling patterns:

  • Replica management for Consistent & Partition Tolerant (CP in CAP terminology) data stores
  • Command Query Responsibility Segregation (CQRS) to handle high-throughput writes, and virtual-bucket-based replication to manage read hot-spotting on a subset of the data corpus
  • Data pre-compute and Time-travel to create a Timeline consistent view of data across multiple distributed systems
  • Data sourcing from Database replication mechanisms, Local transactions and async relaying to the Analytics platform
  • Bypassing network bottlenecks like centralized Load-balancers by using data-placement-aware smart local proxies/sidecars instead
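As an illustration of the virtual-bucket idea in the list above, here is a minimal Python sketch. It is not the Flipkart implementation: the bucket count, the MD5-based hash, the round-robin read routing and every function name (`vbucket_for`, `build_bucket_map`, `add_read_replica`, `pick_read_node`) are assumptions made for this example. Keys hash to a fixed set of virtual buckets, each bucket maps to a list of nodes, and a hot bucket can be given extra read replicas without moving any other bucket.

```python
import hashlib

NUM_VBUCKETS = 1024  # assumed fixed number of virtual buckets


def vbucket_for(key, num_vbuckets=NUM_VBUCKETS):
    """Map a key to a virtual bucket using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_vbuckets


def build_bucket_map(nodes, num_vbuckets=NUM_VBUCKETS, replicas=2):
    """Assign each virtual bucket to `replicas` consecutive nodes."""
    return {
        b: [nodes[(b + i) % len(nodes)] for i in range(replicas)]
        for b in range(num_vbuckets)
    }


def add_read_replica(bucket_map, bucket, node):
    """Give one hot bucket an extra read replica; other buckets are untouched."""
    if node not in bucket_map[bucket]:
        bucket_map[bucket].append(node)


def pick_read_node(bucket_map, key, request_seq):
    """Spread reads for a key round-robin over the replicas of its bucket."""
    owners = bucket_map[vbucket_for(key)]
    return owners[request_seq % len(owners)]
```

Because placement is decided per bucket rather than per key, only the bucket map entry for a hot bucket changes when replicas are added. A data-placement-aware local proxy could hold a copy of such a map and route requests directly to the owning nodes.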

I will share system design and underlying technology details on how these patterns have been used at Flipkart.
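To make the "local transactions and async relaying" pattern concrete, the following is a small sketch using Python's built-in `sqlite3` module. The table names (`orders`, `outbox`), the event shape and the `publish` callback are hypothetical and chosen only for this example; the actual stores and relay used at Flipkart are not described in this abstract. The business row and the event row are written in one local transaction, and a separate relay step later forwards unsent events.

```python
import json
import sqlite3


def init_db(conn):
    """Create the business table and the outbox table (names are illustrative)."""
    conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, amount INTEGER)")
    conn.execute(
        "CREATE TABLE outbox ("
        "seq INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "relayed INTEGER NOT NULL DEFAULT 0)"
    )


def place_order(conn, order_id, amount):
    """Write the order and its event in one local transaction."""
    event = {"type": "order_placed", "id": order_id, "amount": amount}
    with conn:  # commits both inserts, or rolls both back on error
        conn.execute(
            "INSERT INTO orders (id, amount) VALUES (?, ?)", (order_id, amount)
        )
        conn.execute(
            "INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),)
        )


def relay_pending(conn, publish):
    """Forward unsent events in order and return how many were relayed."""
    rows = conn.execute(
        "SELECT seq, payload FROM outbox WHERE relayed = 0 ORDER BY seq"
    ).fetchall()
    for seq, payload in rows:
        publish(json.loads(payload))  # assumed hand-off to the analytics pipeline
        with conn:
            conn.execute("UPDATE outbox SET relayed = 1 WHERE seq = ?", (seq,))
    return len(rows)
```

In this sketch an event is marked as relayed only after `publish` returns, so a crash between the two steps leads to a repeat delivery rather than a lost event. Downstream consumers would therefore need to tolerate duplicates.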

Outline

  • Discuss diverse data needs at Flipkart - Consistent, Low Latency, Scale and Analytics
  • Further characterize the requirements into logical solutions while acknowledging formal constraints like the CAP theorem
  • Assess the available landscape of data stores, their strengths and shortcomings
  • Introduce patterns and custom solutions for each of the requirements identified
  • Discuss design and underlying technology stack in implementing the patterns and solutions

Requirements

Prior experience designing and building systems of reasonable scale, and an appreciation of related data handling problems.

Speaker bio

Regunath works as Principal Architect at Flipkart, where he leads system readiness and scaling for large events like the Big Billion Days. He is an Open Source developer and his work can be found here: https://github.com/regunathb
