Consistent Hashing: Using a simple algorithm to scale horizontally
We build and run https://cortexmetrics.io, a horizontally scalable time-series database that ingests billions of datapoints per day across hundreds of millions of active time-series. Its architecture is built around consistent hashing, a simple but (in my opinion) powerful algorithm for distributing data that lets you scale horizontally. We will start with the motivation (what we are trying to achieve), introduce consistent hashing, and show how typical databases combine it with replication to scale. We will then discuss its limitations and challenges.
Finally, we will cover how we actually implemented it in https://cortexmetrics.io and https://github.com/grafana/loki to scale to hundreds of millions of series, the outages it caused, and how we improved on it, giving attendees a practical takeaway.
NOTE: The meetup on 30 May comes at too short notice for me. I would love to do a more prepared and thought-out version of the talk at the next DS meetup, though.
- Motivation – what are we trying to achieve
- Consistent Hashing, what and brief history (Dynamo Paper)
- Consistent Hashing in Cassandra, how they use it to do replication and repair
- Consistent Hashing in Cortex and Loki (how we leverage etcd and how we architected it)
- Challenges and outages and future improvements planned
- Maybe even considerations for building it yourself
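To make the outline concrete, here is a minimal sketch of a consistent-hash ring in Go, covering the two ideas the talk walks through: placing nodes on a hash circle via multiple tokens, and walking clockwise to find an owner and its replicas. The node names, token counts, and FNV hash choice are illustrative assumptions, not the actual Cortex/Loki implementation.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// ring is a minimal consistent-hash ring: each node owns several
// pseudo-random tokens on a 32-bit circle, and a key is assigned to
// the node owning the first token clockwise from the key's hash.
type ring struct {
	tokens []uint32          // sorted token positions on the circle
	owner  map[uint32]string // token -> node name
}

func hash32(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

func newRing(nodes []string, tokensPerNode int) *ring {
	r := &ring{owner: map[uint32]string{}}
	for _, n := range nodes {
		for i := 0; i < tokensPerNode; i++ {
			t := hash32(fmt.Sprintf("%s-%d", n, i))
			r.tokens = append(r.tokens, t)
			r.owner[t] = n
		}
	}
	sort.Slice(r.tokens, func(i, j int) bool { return r.tokens[i] < r.tokens[j] })
	return r
}

// lookup returns the node owning the first token at or after the
// key's hash, wrapping around the circle if necessary.
func (r *ring) lookup(key string) string {
	h := hash32(key)
	i := sort.Search(len(r.tokens), func(i int) bool { return r.tokens[i] >= h })
	if i == len(r.tokens) { // wrapped past the last token
		i = 0
	}
	return r.owner[r.tokens[i]]
}

// replicas keeps walking clockwise from the key's position and
// returns the first n distinct nodes, giving a replication set of
// size n (the Dynamo-style "preference list").
func (r *ring) replicas(key string, n int) []string {
	h := hash32(key)
	i := sort.Search(len(r.tokens), func(i int) bool { return r.tokens[i] >= h })
	seen := map[string]bool{}
	var out []string
	for j := 0; len(out) < n && j < len(r.tokens); j++ {
		node := r.owner[r.tokens[(i+j)%len(r.tokens)]]
		if !seen[node] {
			seen[node] = true
			out = append(out, node)
		}
	}
	return out
}

func main() {
	// Hypothetical ingester names; 128 tokens per node smooths the
	// distribution so adding or removing a node only moves ~1/N of keys.
	r := newRing([]string{"ingester-1", "ingester-2", "ingester-3"}, 128)
	key := `cpu_usage{host="a"}`
	fmt.Println("owner:   ", r.lookup(key))
	fmt.Println("replicas:", r.replicas(key, 2))
}
```

The multiple-tokens-per-node trick (virtual nodes) is what keeps load roughly even and limits data movement on membership changes; the talk covers how Cassandra, Cortex, and Loki each build on this basic loop.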
Goutham Veeramachaneni is a developer from India who started his journey as an infra intern at a large company, where he worked on deploying Prometheus. After that initial encounter, he started contributing to Prometheus and interned with CoreOS, working on Prometheus' new storage engine.
He is now an active contributor to the Prometheus ecosystem and was at one point the maintainer of TSDB, the storage engine behind Prometheus 2.0.
He works at Grafana Labs, on Cortex and open-source observability tools.
When not hacking away, he is on his bike adding miles and hurting his bum.