PWL Sep 2020: "The Tail at Scale"
Ideas behind building large scale latency tolerant distributed systems.
Thu, Sep 17 2020, 06:00 PM – 07:10 PM IST
Just came across this interesting post:
https://instagram-engineering.com/open-sourcing-a-10x-reduction-in-apache-cassandra-tail-latency-d64f86b43589
Krace has suggested we consider discussing Chord: https://pdos.csail.mit.edu/papers/chord:sigcomm01/chord_sigcomm.pdf
Chord is a protocol and algorithm for a peer-to-peer distributed hash table - https://en.wikipedia.org/wiki/Chord_(peer-to-peer)
Any thoughts, suggestions?
The slides from the session are available here - https://speakerdeck.com/pigol1/the-tail-at-scale-hasgeek-meetup
We already discussed Your Server as a Function: https://www.meetup.com/Papers-we-love-Bangalore/events/268444851/
I guess we can repeat -- but there is a lot of demand for the AWS Dynamo paper. Maybe that should be next.
Anyone among the participants who wants to discuss the Dynamo paper, or be a respondent?
As discussed, the AWS Dynamo paper is next. Kracekumar has volunteered to present the paper. Meera, Navin, are you folks ok to present the Server as a Function paper on a Thursday evening, after the Dynamo paper is presented?
Navin and Meera - on Server as a Function - to discuss this at the next meeting. Any other suggestions?
Link to "Your server as a function" (PDF) https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.696.9710&rep=rep1&type=pdf
Jaseem also spoke about Facebook's network load balancer Katran, which is detailed in this blog post: https://engineering.fb.com/open-source/open-sourcing-katran-a-scalable-network-load-balancer/
The paper discusses concepts like enqueuing requests to two or more servers and cancelling the pending requests after the first response arrives. Some of these techniques are used in distributed databases.
Thanks for the questions, @kracekumar I have asked Piyush to look into these questions and address them here/during the session on Thursday.
Cancelling the pending request after the first response is an optimization and not strictly required to reduce the tail latency. A naive approach would cause 2x load, assuming each request is sent to two replicas. However, with "hedged requests", where the second request is sent only after a brief delay (e.g., once the first has exceeded its expected 95th-percentile latency), the additional load is limited to about 5%.
It seems like a simple implementation of these ideas could be plugged into any API server easily, but I'm not aware of any libraries that perform this out of the box.
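A minimal asyncio sketch of the hedged-request idea discussed above (the `fake_replica` function and the delay values are made up for illustration; a real setup would call different replicas):

```python
import asyncio

async def hedged_call(make_request, hedge_delay):
    """Issue a request; if no response arrives within hedge_delay seconds,
    issue a second (hedged) request and return whichever finishes first."""
    primary = asyncio.ensure_future(make_request())
    done, _ = await asyncio.wait({primary}, timeout=hedge_delay)
    if done:
        return primary.result()
    # Primary is slow: hedge with a second request (ideally to another replica).
    backup = asyncio.ensure_future(make_request())
    done, pending = await asyncio.wait(
        {primary, backup}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # the optional optimization: cancel the straggler
    return done.pop().result()

async def demo():
    calls = 0

    async def fake_replica():
        nonlocal calls
        calls += 1
        # Simulate a straggling first request; the hedge responds quickly.
        await asyncio.sleep(0.5 if calls == 1 else 0.01)
        return f"response from call {calls}"

    result = await hedged_call(fake_replica, hedge_delay=0.05)
    return result, calls

print(asyncio.run(demo()))
```

In this run the first (slow) request is hedged after 50 ms, the second request wins, and the straggler is cancelled. Picking `hedge_delay` near the tail (e.g., the observed p95 latency) is what keeps the extra load small.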