Jun 2019
21 Fri 08:45 AM – 05:40 PM IST
22 Sat 09:00 AM – 05:30 PM IST
Kishore N C
At Zapier, we connect over 1000 SaaS applications and enable people to automate workflows that span multiple web applications. To achieve that, we use RabbitMQ to run millions of tasks every day. It is effectively the backbone of Zapier.
We ran RabbitMQ in clustering mode at Zapier for scalability. We soon realised that RabbitMQ clustering is designed for scalability and not for high availability. If a node in the cluster failed, the queues on that node were lost, and the failure also took the other nodes out of service. Although RabbitMQ has a mirroring feature that replicates queues across multiple nodes, it does not distribute load across these nodes, since consumers connect only to the master. During a failover, there’s also a chance that previously unacknowledged messages will get redelivered.
In this talk, we will dive into how we architected an alternative clustering solution that treated each RabbitMQ node as a stand-alone node, thereby tolerating node failures without disrupting the other nodes.
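The abstract does not say how clients talk to the stand-alone nodes, so the following is only a rough sketch of one way the idea could look from the publishing side, not Zapier's actual design. It assumes each node is reached through a plain send function; the names `StandaloneNodePublisher`, `NodeDown`, and the node labels are hypothetical, and no real RabbitMQ client library is used.

```python
import random
from typing import Callable, Dict, List, Optional


class NodeDown(Exception):
    """Raised by a (hypothetical) send function when its node is unreachable."""


class StandaloneNodePublisher:
    """Publishes each message to one of several independent brokers.

    The nodes share no state in this sketch, so a failed node only affects
    the attempt made against it; the publisher retries on another node.
    """

    def __init__(
        self,
        senders: Dict[str, Callable[[bytes], None]],
        rng: Optional[random.Random] = None,
    ) -> None:
        # Map of node name -> function that delivers a message to that node.
        self._senders = dict(senders)
        self._rng = rng or random.Random()

    def publish(self, body: bytes) -> str:
        """Try nodes in random order and return the name of the one that accepted."""
        names: List[str] = list(self._senders)
        self._rng.shuffle(names)  # random order spreads load across nodes
        for name in names:
            try:
                self._senders[name](body)
                return name
            except NodeDown:
                continue  # this node is down; the others are unaffected
        raise NodeDown("no node accepted the message")
```

In a real deployment each send function would wrap a connection to one RabbitMQ node, and consumers would need to read from every node, since a message lives only on the node that accepted it.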
Basic understanding of message queues
Kishore works as a Site Reliability Engineer at Zapier. He loves working on distributed systems and gets a kick out of designing for high availability and scale.