|
Platform Engineering meet-up - Nov 8
Building Low-Latency ML Inference at Ad-serving Scale
Ever wondered what it takes to run a low-latency, high-throughput ML platform serving over 3 million requests per second? In this talk, we’ll dive deep into the engineering challenges of operating at extreme scale, from handling synchronous ML inference in real-time ad serving to ensuring predictable latency under heavy load. We’ll explore the evolution of our architecture: how Jetty-based appl…
Session type: Talk (30 mins)
|