Submissions
Shivam Gupta

@shivamgupta

Staff Engineer @Inmobi

  • Joined Oct 2025

Platform Engineering meet-up - Nov 8

Building Low-Latency ML Inference at Ad-serving Scale

Ever wondered what it takes to run a low-latency, high-throughput ML platform serving over 3 million requests per second? In this talk, we’ll dive deep into the engineering challenges of operating at extreme scale, from handling synchronous ML inference in real-time ad serving to ensuring predictable latency under heavy load. We’ll explore the evolution of our architecture: how Jetty-based appl…
  • 2 comments
  • Confirmed & scheduled
  • 06 Oct 2025
Session type: Talk (30 mins)