Serving user intent: Facebook-style notifications using HBase and Event streams
Submitted by Regunath Balasubramanian (@regunathb) on Friday, 31 January 2014
This talk is about building a low-latency, near real-time Notifications platform that serves user intent using an Event-based architecture, Complex Event Processing and a data store like HBase. It will also cover how millisecond response times are achieved when accessing data from 100 million rows by interpreting change from immutable events and organizing data as LSM trees.
Relevant and personalized notifications delivered in near real time are a great way of serving user intent. The intent may vary - say, liking a Facebook update as compared to a price drop for a browsed product on an e-commerce website. The system characteristics and solution patterns in both instances may nevertheless be very similar.
This talk will cover the design of the Flipkart Notifications platform. The techniques and technologies used to serve product related intent can be easily applied to a different domain. This talk will also introduce projects that were Open Sourced while building the platform.
Architecture, Design patterns and technologies used in this system include:
- Pre-creating data that matches user intent - so as to significantly reduce data serving latencies
- Storing immutable events and interpreting change
- Event-driven architecture (EDA) and its variant, Staged EDA (SEDA), using technologies like RabbitMQ and Mule
- Complex Event Processing (CEP) using technologies like Esper
- Data stores like HBase that organize data between memory and disk as Log-Structured Merge (LSM) trees - favoring sequential disk transfer over disk seeks
- A data serving API that is resilient to failures and latencies - using Hystrix and Netty
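To make the LSM tree idea in the list above concrete, here is a minimal sketch in Python. It is a toy illustration, not how HBase is implemented: the class name `TinyLSM`, the `memtable_limit` threshold and the in-memory "segments" are all assumptions made for the example. Writes land in an in-memory table, are flushed as sorted immutable segments (a sequential write rather than random in-place updates), and reads consult the newest data first.

```python
import bisect


class TinyLSM:
    """Toy LSM-style store: an in-memory memtable plus immutable sorted segments."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.memtable = {}                    # recent writes, held in memory
        self.segments = []                    # flushed segments, oldest first

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # One sorted, sequential write instead of many random in-place updates.
        self.segments.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Newest segment wins, so scan from the most recent flush backwards.
        for segment in reversed(self.segments):
            i = bisect.bisect_left(segment, (key,))
            if i < len(segment) and segment[i][0] == key:
                return segment[i][1]
        return None

    def compact(self):
        # Merge all segments into one, keeping only the latest value per key.
        merged = {}
        for segment in self.segments:  # oldest first, so later values overwrite
            merged.update(segment)
        self.segments = [sorted(merged.items())] if merged else []


store = TinyLSM(memtable_limit=2)
store.put("user1", 100)
store.put("user2", 200)   # memtable is full: flushed as the first segment
store.put("user1", 90)    # newer value for user1 shadows the flushed one
print(store.get("user1"), store.get("user2"))
```

An update never rewrites an old segment; the older value is simply shadowed until `compact()` merges segments and discards it.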
The talk uses a typical e-commerce experience where user intent is either implicit or interpreted from actions - for example a user browsing a product of interest, adding an item to a shopping cart or saving it for future reference via a wish-list. In a dynamic e-commerce marketplace, product data (such as price and stock quantity) is constantly changing across millions of listed products even as user intent is being expressed on the website. User intent may be seen as one Event stream while Product attribute changes are another. The intersection of these two streams yields the Notification data. The Notification service is an efficient data store that can store and serve tens of millions of such notifications with very low latencies.
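The intersection of the two streams can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the Flipkart implementation: the event types `IntentEvent` and `PriceEvent`, the `NotificationBuilder` class and the "price drop" rule are all hypothetical names chosen for the example. Events are immutable, a change is interpreted by comparing a new event with the previous one, and notifications are pre-created per user so that serving them is a simple lookup.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class IntentEvent:        # e.g. a user adding a product to a wish-list
    user_id: str
    product_id: str
    action: str


@dataclass(frozen=True)
class PriceEvent:         # immutable record of a product's price at a point in time
    product_id: str
    price: float


class NotificationBuilder:
    def __init__(self):
        self.interested = defaultdict(set)       # product_id -> user_ids
        self.last_price = {}                     # product_id -> last seen price
        self.notifications = defaultdict(list)   # user_id -> pre-created messages

    def on_intent(self, event):
        self.interested[event.product_id].add(event.user_id)

    def on_price(self, event):
        previous = self.last_price.get(event.product_id)
        self.last_price[event.product_id] = event.price
        # Interpret change: only a drop relative to the previous event matters here.
        if previous is None or event.price >= previous:
            return
        for user_id in sorted(self.interested.get(event.product_id, ())):
            self.notifications[user_id].append(
                f"Price of {event.product_id} dropped from {previous} to {event.price}"
            )

    def for_user(self, user_id):
        # Serving is a lookup of data that was created ahead of the request.
        return list(self.notifications.get(user_id, []))


builder = NotificationBuilder()
builder.on_intent(IntentEvent("u1", "p1", "wishlist"))
builder.on_price(PriceEvent("p1", 500.0))   # first price seen: nothing to compare
builder.on_price(PriceEvent("p1", 450.0))   # drop: notification created for u1
print(builder.for_user("u1"))
```

In a real deployment the matching would more likely be expressed as a CEP rule and the per-user results written to a store such as HBase; the in-memory dictionaries here only stand in for those components.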
The following projects were open sourced before or when building the Notifications Platform :
The talk will also feature a live view of the data serving metrics with millisecond response times.
Just technical curiosity about how those notifications on Facebook or Flipkart are delivered at scale. An appreciation of data stores (SQL and NoSQL) and their characteristics will also help. A big plus if you have spent time trying to solve similar problems.
Architect and Open source committer. My areas of interest are Distributed Systems, Big Data, Text Mining and Data Stores.
My experience as Architect includes:
- Building the World's largest biometric identity platform in Aadhaar
- Customer facing Mobile and Web platforms at India's leading e-Commerce company - Flipkart
Most of my work in recent years has been around OSS - using it to build large scale systems and in contributing projects back to the community. Some of my OSS work is downloaded and used worldwide:
Active projects on GitHub: https://github.com/regunathb