|
Enterprise AI in Production
₹11 Lakh/Month: How We Took the GPU Out of Face Match
Face matching is one of the highest-volume workloads in identity verification. At IDfy, a single GPU pod handling 1 RPS cost us ₹3,500/day. After moving the model to BF16 inference on Intel CPUs via OpenVINO, the same 1 RPS pod cost ₹350/day. Same TAT, same throughput, same accuracy envelope. At our traffic shape (50 RPS sustained for the peak hour, 10 RPS for the remaining 23), that translates t…
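The cost arithmetic behind the per-pod figures and the traffic shape quoted in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the speakers' actual cost model: it assumes one 1-RPS pod per unit of RPS, pods billed pro rata by the hour, and a 30-day month. None of those three assumptions is stated in the abstract, and all function and variable names are hypothetical.

```python
# Figures quoted in the abstract: INR per day for a pod serving 1 RPS.
GPU_POD_COST_PER_DAY = 3500
CPU_POD_COST_PER_DAY = 350

# Traffic shape quoted in the abstract: (requests per second, hours per day).
TRAFFIC_SHAPE = [(50, 1), (10, 23)]

# Assumption (not in the abstract): 30 billing days per month.
DAYS_PER_MONTH = 30
HOURS_PER_DAY = 24


def pod_hours_per_day(traffic_shape):
    """Pod-hours needed per day, assuming one 1-RPS pod per RPS of load."""
    return sum(rps * hours for rps, hours in traffic_shape)


def monthly_cost(cost_per_pod_day, traffic_shape, days=DAYS_PER_MONTH):
    """Monthly cost in INR, assuming pods are billed pro rata by the hour."""
    pod_hours = pod_hours_per_day(traffic_shape)
    return cost_per_pod_day * pod_hours * days / HOURS_PER_DAY


gpu_monthly = monthly_cost(GPU_POD_COST_PER_DAY, TRAFFIC_SHAPE)
cpu_monthly = monthly_cost(CPU_POD_COST_PER_DAY, TRAFFIC_SHAPE)
monthly_saving = gpu_monthly - cpu_monthly
saving_in_lakh = monthly_saving / 100_000  # 1 lakh = 100,000

print(f"GPU fleet: INR {gpu_monthly:,.0f} per month")
print(f"CPU fleet: INR {cpu_monthly:,.0f} per month")
print(f"Saving:    INR {monthly_saving:,.0f} per month ({saving_in_lakh:.2f} lakh)")
```

Under these assumptions the saving works out to roughly ₹11 lakh per month, which is in line with the figure in the talk title. A different billing granularity or autoscaling policy would change the result.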
Submission type: Anchor talk (30 mins)
|