Review and feedback for Clockwork - a job scheduler built at PhonePe – Clockwork: the backbone of PhonePe’s two billion daily jobs

Jan 2025

21 Tue 06:00 PM – 07:30 PM IST

Submitted Jan 22, 2025

Details

Review date and time - 21 January 2025, 6 PM - 7 PM
Presenter - Snehasish Roy (Software Engineer at PhonePe)

Why Clockwork was built
PhonePe, a large fintech company in India, needed a scalable and fault-tolerant job scheduler to handle asynchronous tasks like transaction reconciliation, reminders, and merchant settlements. Embedded schedulers were not suitable due to potential data loss and application logic pollution.

Clockwork’s capabilities
Clockwork is a distributed, persistent, multi-tenanted, fault-tolerant job scheduler. It can schedule actions at specific times, handle bursty traffic, support multiple clients, and ensure durability and at-least-once delivery guarantees. It is horizontally scalable and runs on commodity hardware.
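At-least-once delivery means a job's callback can fire more than once, for example when an acknowledgement is lost and the job is redelivered. Below is a minimal sketch of how a consumer might guard against duplicates, assuming each job carries a unique `job_id`. The class and field names are hypothetical and are not Clockwork's API.

```python
class IdempotentHandler:
    """Runs a side effect at most once per job_id, even if delivery repeats."""

    def __init__(self, action):
        self._action = action  # the real side effect, e.g. sending a reminder
        self._seen = set()     # assumption: a durable store would replace this set

    def handle(self, job_id, payload):
        # Returns True if the action ran, False if this was a duplicate delivery.
        if job_id in self._seen:
            return False
        self._action(payload)
        self._seen.add(job_id)  # mark done only after the action succeeds
        return True


sent = []
handler = IdempotentHandler(sent.append)
handler.handle("job-1", "remind user A")
handler.handle("job-1", "remind user A")  # redelivery of the same job
handler.handle("job-2", "remind user B")
```

Marking the job as seen only after the action succeeds keeps the at-least-once property: a crash mid-action leads to a retry rather than a silently dropped job.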

Tech stack and implementation
Clockwork uses Apache Zookeeper for consensus, RabbitMQ for queuing, and Apache HBase as the database. The data schema in HBase uses a row key based on client ID, partition ID, and timestamp. A leader elector module assigns partitions to workers using Zookeeper. A job extractor queries HBase for eligible jobs and executes them using RabbitMQ for decoupling.
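The row key design can be illustrated with a small sketch. The summary only says the key is based on client ID, partition ID, and timestamp; the field order, the `|` separator, and the fixed-width big-endian encoding below are assumptions, and a sorted in-memory list stands in for HBase's lexicographically ordered rows.

```python
import bisect
import struct


def row_key(client_id: str, partition_id: int, ts_millis: int) -> bytes:
    # Fixed-width big-endian integers make byte order match numeric order,
    # so keys for one (client, partition) pair sort by timestamp.
    return client_id.encode() + b"|" + struct.pack(">HQ", partition_id, ts_millis)


def due_jobs(sorted_keys, client_id, partition_id, now_millis):
    # Emulates a range scan from timestamp 0 up to and including now_millis.
    start = row_key(client_id, partition_id, 0)
    stop = row_key(client_id, partition_id, now_millis + 1)  # exclusive stop row
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_left(sorted_keys, stop)
    return sorted_keys[lo:hi]


keys = sorted(
    row_key(c, p, t)
    for c, p, t in [
        ("payments", 1, 1000),
        ("payments", 1, 2000),
        ("payments", 1, 3000),
        ("payments", 2, 1500),
        ("reminders", 1, 500),
    ]
)
due = due_jobs(keys, "payments", 1, 2000)
```

Because the timestamp is the last component, a job extractor that owns a partition could fetch every eligible job with one contiguous scan instead of filtering the whole table.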

Challenges and solutions
Challenges included hotspotting, which was addressed by pre-splitting HBase regions and sharding the RabbitMQ and HBase clusters. Rate limiters were implemented to maintain quality of service. Stability was achieved through benchmarking, using quorum queues in RabbitMQ, and adding metrics and events.
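The summary does not say which rate-limiting algorithm Clockwork uses. As an illustration only, here is a token bucket, a common choice for per-client limits that still permits short bursts. The class name, parameters, and client ID are hypothetical.

```python
class TokenBucket:
    """Allows bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = now

    def allow(self, now: float) -> bool:
        # Refill for the elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# One bucket per client keeps a noisy tenant from starving the others.
buckets = {"client-a": TokenBucket(rate=2.0, capacity=2.0)}
results = [buckets["client-a"].allow(now=t) for t in (0.0, 0.0, 0.0, 0.5)]
```

Passing the clock in as `now` keeps the sketch deterministic; a real limiter would read a monotonic clock instead.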

🔗 Link to the slides shown at the review - https://docs.google.com/presentation/d/1Zxwgpcz-dYYbt3sZ44pcEuHk0erkAWqbkrr9E7Gh5xU/

Yagnik Khanna’s feedback

Yagnik Khanna provided feedback on the presentation style, structure, and technical content.

Manner: The presentation was detailed but monotonous; Yagnik suggested more voice modulation and storytelling, focusing on the problems faced and PhonePe’s thought process in finding solutions.
Method: The setup was too long and could be compressed; details about functioning of Zookeeper and RabbitMQ were unnecessary. The focus should be on the problems and solutions, with comparisons to other technologies and why certain choices were made.
Matter: Yagnik asked about open source competitors, idempotency, debouncing, retries, and observability (metrics). He questioned whether the technology is still relevant given the current landscape, and whether it is being maintained mainly for legacy reasons.
Overall feedback: Yagnik emphasized the importance of storytelling, focusing on real-life problems and learnings, and justifying the technical choices made. He also suggested being mindful of the presentation’s relevance and leaving room for audience interaction.

Srinivas Devaki’s feedback

Srinivas Devaki summarized that the speaker should focus on creating a balance between the problem, solution, and solution challenges. The speaker should spend more time on the problems faced and the reasoning behind the chosen solutions, rather than explaining the tools themselves. The audience will be more interested in the 30% of the presentation that covers the problems and challenges, as they will likely already have some understanding of the solution.

Srinivas disagreed with the suggestion that the speaker should cut out Zookeeper and RabbitMQ completely. Srinivas suggested mentioning the tech, but focusing on the problem they solve and the challenges faced in implementing the solution. The reasoning is that the audience will be able to infer the solution if the problem is explained clearly, and that they are more interested in the problems and challenges than the tools themselves.

Srinivas pointed out that the decoupling explanation lacks clarity, questioning why it’s unacceptable for publishers but acceptable for consumers to get stuck on HTTP API calls. He also found the scalability explanation lacking, stating that it doesn’t delve into how Clockwork achieves scalability in its components and problem domains, specifically asking about partition scaling for high QPS and how acceptors handle bursty workload.

Srinivas Devaki also provided feedback on three other aspects:

Guarantees to customers: He questioned how the system guarantees a certain payload throughput when scheduling tasks.
Receiver scaling: He pointed out that scalability for a scheduler involves partitions, acceptors, and receivers, and this wasn’t fully addressed.
Multi-tenancy: He felt that the challenges and solutions for achieving multi-tenancy weren’t adequately explained. While pre-splitting in HBase helps with noisy neighbor issues, it doesn’t solve hotspotting if a client schedules many payloads.
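To make the hotspotting concern concrete: one common mitigation, not confirmed as what Clockwork does, is to derive the partition ID from a stable hash of the job ID so that a single heavy client's jobs spread across all partitions. The function name, the job ID format, and the partition count below are hypothetical.

```python
import hashlib
from collections import Counter


def partition_for(job_id: str, num_partitions: int) -> int:
    # A stable hash (unlike Python's built-in hash()) yields the same
    # partition on every worker and across restarts.
    digest = hashlib.sha256(job_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


NUM_PARTITIONS = 16  # hypothetical value
load = Counter(
    partition_for(f"client-a:job-{i}", NUM_PARTITIONS) for i in range(10_000)
)
```

The trade-off is that reading all of one client's jobs now touches every partition, so this helps write distribution at the cost of wider scans.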

Towards the end, Srinivas complimented the system design, especially the use of HBase, and mentioned it’s ideal for a scheduler database. Srinivas also praised the leader election model and decoupling, stating that these are impressive design choices for a complex distributed architecture like a scheduler. Even AWS released a multi-tenant scheduler only a couple of years ago, highlighting the difficulty of the task.

Owing to an issue at work, Harsh Mittal wasn’t able to participate in the review actively.

Audience members participated after the reviewers gave their feedback. Their questions and comments are summarized below.

Pramod Biligiri’s questions

Pramod had two main questions:

Zookeeper for partition IDs: He questioned the scalability of using Zookeeper to store partition IDs, as it’s typically used for metadata and not for data from a database.
HBase operational experience: He was curious about the operational experience with HBase and whether a third-party packaging or cloud solution was used.

Snehasish Roy responded that Zookeeper works well for their use case because it’s read-optimized and their metadata isn’t write-heavy. They only store around 2,000-3,000 keys, and the read QPS is manageable.

Regarding HBase, they self-manage it and have found it cost-effective. They did face some latency issues due to compaction, which they addressed with a custom compaction manager.

Madhusudhan Sambojhu’s feedback

Details about tools like Zookeeper, RabbitMQ (RMQ), and HBase can be cut down; mentioning the reasons for choosing each tool and the benefits it provides would suffice and save time.
A standard job scheduler supports unique jobs, batch jobs, lost/dead jobs, and retries. Madhusudhan expected to see some information on these, but it was missing.
How is Clockwork consumed by clients? Is there an HTTP API, or are there SDKs? It would be nice to see code snippets showing how a job is enqueued and how callbacks are registered for job status.

Srujan Akumarthi’s feedback

Srujan’s feedback focused on the need for stronger problem definition and justification for technical choices. Srujan emphasized that tech talks should go beyond architecture overviews, which are readily available in blog posts. Instead, the focus should be on the specific problems faced and the thought process behind the solutions chosen. Srujan also questioned the choice of RabbitMQ and HBase over other technologies, highlighting that their combination is not typically used in schedulers.

In response, Snehasish Roy explained that RabbitMQ was chosen over Kafka due to operational simplicity and the lack of a need for replayability. HBase was chosen over MariaDB due to data growth concerns and the need for fast range operations and dynamic rebalancing.

Hosted by

Rootconf

We care about site reliability, cloud costs, security and data privacy