Rootconf Delhi edition

On network engineering, infrastructure automation and DevOps

PubSub Realtime messaging service @ Hotstar

Submitted by Piyush Gupta (@piyushgupta27) on Nov 26, 2019

Section: Full talk (40 mins)
Category: Distributed systems
Status: Confirmed & scheduled

Abstract

This talk covers our journey of building an MQTT-based PubSub system for 50M concurrent socket connections, the challenges we faced, and the architecture that powered Hotstar’s realtime social features for IPL 2019.

Outline

The Social & Gaming Team at Hotstar built an interactive Social Feed in VIVO IPL 2019 that appears below the video on the Hotstar mobile apps.

The content in the feed comes from various sources: local timer objects, questions/answers/prizes/rounds/advertisements/celeb handles, API calls, user-initiated actions and, as a matter of fact, anything that can be shown on the feed in real time, without any scope for caching and without draining clients’ data or battery.

PubSub is a highly scalable and durable messaging infrastructure that serves as a foundation for realtime communication with millions of concurrent users. By providing one-to-many (broadcast or fan-out) use-cases as a starting point, PubSub delivers low-latency, durable messaging from various backend services to all connected users simultaneously with minimal battery and data usage.
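The one-to-many fan-out described above can be sketched with a toy, single-process broker. This is only an illustration of the pattern, not Hotstar's implementation: the class `InMemoryBroker`, the topic name `feed/match-1` and the payload are all hypothetical, and a real deployment would deliver over MQTT connections rather than in-process callbacks.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A handler receives (topic, payload); both are plain strings in this sketch.
Handler = Callable[[str, str], None]


class InMemoryBroker:
    """Toy broker (hypothetical): fans one published message out to every subscriber."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # Register a handler for an exact topic name (no wildcards here).
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: str) -> int:
        # The publisher does not know who is listening; the broker does the fan-out.
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)  # number of deliveries made


received = []
broker = InMemoryBroker()

# Three hypothetical connected users subscribe to the same feed topic.
for user_id in range(3):
    broker.subscribe(
        "feed/match-1",
        lambda topic, payload, uid=user_id: received.append((uid, payload)),
    )

# A backend service publishes once; every subscriber gets a copy.
delivered = broker.publish("feed/match-1", "question-1")
```

The publisher makes a single `publish` call regardless of how many users are connected, which is the property that makes broadcast use-cases a natural starting point for this kind of system.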

Piyush Gupta will talk about his journey of building the PubSub infrastructure. He will stress the challenges faced and the lessons learned while building a system capable of handling 50M peak concurrent connections with a message send rate of 1 rps. Over the duration of VIVO IPL 2019, this service sent more than 250 billion messages.

Requirements

Jargon used in the talk:

  1. Pubsub: Publish–Subscribe is a messaging pattern where senders of messages, called publishers, do not program the messages to be sent directly to specific receivers, called subscribers, but instead categorize published messages into classes without knowledge of which subscribers, if any, there may be.

  2. Connections: (self explanatory)

  3. Topic: UTF-8 string that the broker uses to filter messages for each connected client

  4. Subscriptions: (self explanatory)

  5. Connack Latency: Time for establishing MQTT connection

  6. Pub-to-Sub Latency: The total time a data event takes to travel from its publisher to its subscriber, including the time taken for broker matching.

  7. EMQx: EMQx is an open source IoT MQTT message broker based on the Erlang/OTP platform.

  8. EMQx Cluster: A cluster of n nodes (or instances) each running EMQ application and grouped via a ClusterName. A message sent to one node in the cluster is transmitted further to all the other nodes in the same cluster.

  9. EMQx Bridge: An EMQ bridge from Node#A to Node#B for topic T forwards all messages received on topic T at Node#A to Node#B.

  10. AWS ELB: Elastic Load Balancer; LB as a service for HTTP/HTTPS and TCP/SSL connections

  11. AWS ELB SurgeQueue Length: ELB keeps a queue for concurrently incoming requests. During surge traffic, the surge queue holds up to 1024 pending requests.

  12. AWS ELB Spillover Count: Requests that exceed the surge queue limit are rejected by the ELB and counted under the spillover count.

  13. AWS NLB: Network Load Balancer; LB as a service especially for TCP connections.
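To make the Topic entry in the list above more concrete, here is a minimal sketch of how a broker could decide whether a published topic matches a subscription's topic filter, following the MQTT wildcard rules (`+` matches exactly one level, `#` matches all remaining levels). The function name `topic_matches` and the example topics are hypothetical; EMQx's own matcher is not shown here and may be implemented quite differently.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic name matches a subscription topic filter.

    Simplified sketch: assumes the filter is well formed and does not
    validate the inputs beyond what is needed for matching.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    # Topics starting with "$" (such as $SYS) must not match filters
    # that start with a wildcard.
    if topic.startswith("$") and filter_levels[0] in ("+", "#"):
        return False

    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: valid only as the last level, and it
            # matches the parent level plus anything below it.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            # The filter has more levels than the topic.
            return False
        if level != "+" and level != topic_levels[i]:
            # A literal level must match exactly; "+" accepts any single level.
            return False

    # Without a trailing "#", both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


# Hypothetical feed topics, for illustration only.
exact = topic_matches("feed/match-1/questions", "feed/match-1/questions")
single = topic_matches("feed/+/questions", "feed/match-2/questions")
multi = topic_matches("feed/#", "feed/match-1/ads/banner")
too_deep = topic_matches("feed/+", "feed/match-1/questions")
```

With exact topic names a broker can use a plain lookup per topic; wildcard filters are what make the matching step, and therefore part of the pub-to-sub latency, more expensive.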

Speaker bio

Senior Full Stack Developer, Building Social & Gaming for Hotstar

  • Lead projects involving development, architecture and deployment of cloud applications at scale
  • Contribute to Android, iOS, Javascript/React, Python and others from time to time

Links

Slides

https://docs.google.com/presentation/d/1-4-FOuvwEi7kAz7ZlD3RaF4syF2en49GGwFqFlnTDm8/edit#slide=id.g5bf29b27ba_1_64
