MLOps Conference

On DataOps, productionizing ML models, and running experiments at scale.

Pratik Bhavsar


End2End Serverless Transformers On AWS Lambda For NLP

Submitted Jun 27, 2021

Transformers are everywhere! But how do you serve them? How do you leverage serverless to get scalability without operational worries? Isn't serverless only for lightweight applications? And how do you get the best latencies out of serverless? I will share answers to these questions in my talk.

Slides -

About Pratik

A self-taught data scientist and open-source developer from India, he specialises in building Search & NLP solutions.
He runs a Slack data science community and writes at
You can find his previous talks with PyData, WiMLDS & DAIR at
Portfolio -


  1. Paradigms of deployment
    • Live server
    • Batch processing
    • Serverless
  2. Benefits of serverless
  3. Deploying transformer models on Lambda
  4. Exposing API
  5. Versioning lambdas
  6. CI/CD with GitHub actions
  7. Runtime limitations and consequences
  8. Multi-tenant design for lambdas
  9. Conclusion
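As a rough illustration of the "Deploying transformer models on Lambda" step above, a Lambda handler typically loads the model once at module scope so warm invocations reuse it and only cold starts pay the load cost. This is a minimal sketch, not the speaker's actual code: `load_model` here is a stand-in for loading a real Hugging Face `transformers` pipeline, and the request/response shapes assume an API Gateway proxy integration.

```python
import json

def load_model():
    # Stand-in for something like: transformers.pipeline("sentiment-analysis")
    # Replaced with a dummy callable so the sketch is self-contained.
    return lambda text: [{"label": "POSITIVE", "score": 0.99}]

# Loaded lazily and cached at module scope: cold starts load the model
# once, and subsequent warm invocations reuse it for lower latency.
_model = None

def handler(event, context):
    global _model
    if _model is None:
        _model = load_model()
    # API Gateway proxy events carry the request payload as a JSON string.
    body = json.loads(event.get("body") or "{}")
    text = body.get("text", "")
    if not text:
        return {"statusCode": 400,
                "body": json.dumps({"error": "missing 'text'"})}
    return {"statusCode": 200, "body": json.dumps(_model(text))}
```

The same caching pattern extends naturally to the multi-tenant design mentioned above: keep a dict of model-name to pipeline and load each tenant's model on first request, so one function can serve several models.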

Key takeaways

Learn to deploy transformers in production
Understand the scenarios where serverless is a good fit
Get huge, near-instant scalability with serverless
Save significantly on cost and operational overhead


Suitable for any audience level and the whole ML community
