The Fifth Elephant 2019

Gathering of 1000+ practitioners from the data ecosystem

Tutorial: Taking deep learning to production with RedisAI

Submitted by Sherin Thomas (@hhsecond) on May 30, 2019

Technical level: Intermediate
Section: Full talk
Session type: Demo, Tutorial
Status: Confirmed & Scheduled


Taking deep learning models to production, and doing so reliably, is one of the next frontiers of DevOps. This talk introduces RedisAI, a joint effort by [tensor]werk and RedisLabs. RedisAI is a Redis module that adds tensors & graphs as Redis data types, enabling execution of deep learning graphs on the CPU and GPU using multiple backends (PyTorch, TensorFlow, and ONNXRuntime) simultaneously, while exposing a full tensor API for scripting. In this talk, we will demonstrate deploying a deep learning model to production in a highly available environment and lay down the roadmap towards 1.0.


2018 was the year of model servers. There were numerous initiatives to build reliable, interoperable deep learning deployment toolkits, but so far we don’t have an easy tool that can reliably handle deep learning models from all the frameworks. With the advent of Redis modules and the availability of C APIs for the major deep learning frameworks, it is now possible to turn Redis into a reliable runtime for deep learning workloads, providing a simple solution for a model serving microservice.

In this talk we will introduce RedisAI, a joint effort by [tensor]werk and RedisLabs that introduces tensors and graphs as new Redis data types and allows graphs to be executed over tensors using multiple backends (PyTorch, TensorFlow, and ONNXRuntime), on both the CPU and GPU. The module also supports scripting with TorchScript, which provides a Python-like tensor language that can be used for pre- and post-processing operations such as input shaping or output ensembling. In addition, thanks to its support for the ONNX standard, including ONNX-ML, RedisAI is not strictly limited to deep learning; it also supports general machine learning algorithms.

We will demonstrate the full journey from training a model to deploying it to production in a highly available environment. Finally, we will lay out the roadmap for the future, including automated batching, sharding, integration with Redis data types (e.g. streams), and advanced monitoring. The talk will include sample code, best practices, and a live demo.
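
The flow described above (store a model, set input tensors, run the graph, read the output) can be sketched as plain Redis commands. This is a minimal sketch, not the demo code from the talk: it only builds argument lists and never contacts a server. AI.MODELSET, AI.TENSORSET, AI.MODELRUN and AI.TENSORGET are RedisAI commands, but the argument order shown here is assumed from pre-1.0 releases and may differ in your version. All key names, graph node names, and helper functions are hypothetical.

```python
from typing import List, Sequence, Union

Arg = Union[str, int, float, bytes]


def tensorset(key: str, dtype: str, shape: Sequence[int],
              values: Sequence[float]) -> List[Arg]:
    """Build AI.TENSORSET key dtype dim... VALUES v... (assumed pre-1.0 syntax)."""
    expected = 1
    for dim in shape:
        expected *= dim
    if expected != len(values):
        raise ValueError("shape %r needs %d values, got %d"
                         % (list(shape), expected, len(values)))
    return ["AI.TENSORSET", key, dtype, *shape, "VALUES", *values]


def modelset(key: str, backend: str, device: str, inputs: Sequence[str],
             outputs: Sequence[str], blob: bytes) -> List[Arg]:
    """Build AI.MODELSET; INPUTS/OUTPUTS here are graph node names."""
    return ["AI.MODELSET", key, backend, device,
            "INPUTS", *inputs, "OUTPUTS", *outputs, blob]


def modelrun(key: str, inputs: Sequence[str],
             outputs: Sequence[str]) -> List[Arg]:
    """Build AI.MODELRUN; INPUTS/OUTPUTS here are tensor keys stored in Redis."""
    return ["AI.MODELRUN", key, "INPUTS", *inputs, "OUTPUTS", *outputs]


def tensorget(key: str) -> List[Arg]:
    """Build AI.TENSORGET key VALUES to read a tensor back as plain values."""
    return ["AI.TENSORGET", key, "VALUES"]


def deployment_commands(graph_blob: bytes) -> List[List[Arg]]:
    """Command sequence for a hypothetical TensorFlow graph with nodes a, b -> c."""
    return [
        modelset("mymodel", "TF", "CPU", ["a", "b"], ["c"], graph_blob),
        tensorset("tensor_a", "FLOAT", [2], [2.0, 3.0]),
        tensorset("tensor_b", "FLOAT", [2], [3.0, 5.0]),
        modelrun("mymodel", ["tensor_a", "tensor_b"], ["tensor_c"]),
        tensorget("tensor_c"),
    ]


if __name__ == "__main__":
    # Placeholder bytes stand in for a serialized graph read from disk.
    for command in deployment_commands(b"<serialized graph bytes>"):
        print(command[0], command[1])
```

Assuming the redis-py client and a Redis server with the RedisAI module loaded, each list could be sent with `redis.Redis().execute_command(*command)`. Keeping the command builders separate from the connection makes the sequence easy to inspect before anything reaches a server.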

Who should attend this tutorial:

  • Leads managing DL, ML, or traditional ML teams
  • DL/ML engineers
  • DL/ML researchers who need to interact with the engineering team for production deployment
  • DevOps engineers
  • Anyone planning production deployment of DL/ML/traditional ML products


Participants attending this tutorial must have the following set up before the session:

  1. A laptop running Linux or macOS. If you have a Windows machine, set up a cloud instance.
  2. Docker
  3. Python 3.6+
  4. pip install Pillow (provides the PIL module)
  5. pip install numpy
  6. pip install redisai==0.3.0
  7. pip install mlut[all]
  8. Jupyter notebook
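
The list above can be checked before the session with a short script. This is a minimal sketch under stated assumptions: the import names `PIL`, `numpy`, and `redisai` and the executable names `docker` and `jupyter` are assumed to match the packages listed, `mlut` is left out because its import name is not given here, and the function name and report keys are hypothetical. It only reports what is missing; it installs nothing.

```python
import importlib.util
import shutil
import sys

# Import names (not pip names). "PIL" is the module provided by Pillow.
REQUIRED_MODULES = ("PIL", "numpy", "redisai")
# Executables expected on PATH (assumed names).
REQUIRED_COMMANDS = ("docker", "jupyter")
MIN_PYTHON = (3, 6)


def check_environment(modules=REQUIRED_MODULES,
                      commands=REQUIRED_COMMANDS,
                      min_python=MIN_PYTHON):
    """Return a dict mapping each requirement label to True or False."""
    report = {"python>=%d.%d" % min_python: sys.version_info >= min_python}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        report["module:" + name] = importlib.util.find_spec(name) is not None
    for name in commands:
        # shutil.which returns None when the executable is not on PATH.
        report["command:" + name] = shutil.which(name) is not None
    return report


if __name__ == "__main__":
    for requirement, ok in check_environment().items():
        print("%-18s %s" % (requirement, "ok" if ok else "MISSING"))
```

Running the file prints one line per requirement, so anything marked MISSING can be installed ahead of time.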

Speaker bio

I work on the development team at tensorwerk, an infrastructure development company focusing on deep learning deployment problems. My team and I focus on building open source tools for setting up a seamless deep learning workflow. I have been programming since 2012, started using Python in 2014, and moved to deep learning in 2015. I am an open source enthusiast, and I spend most of my research time on improving the interpretability of AI models using TuringNetwork. I am part of the core development team of Hangar and RedisAI and a regular contributor to the PyTorch source. I have also authored a deep learning book. I go by hhsecond on the internet.



Preview video


  • Abhishek Balaji (@booleanbalaji) a year ago

    Hi Sherin,

    We want to consider this for The Fifth Elephant. The conference is scheduled for 25-26 July. Since I missed checking with you earlier, would you be available during those dates?

  • Sherin Thomas (@hhsecond) Proposer a year ago

    25th & 26th sound good

    • Abhishek Balaji (@booleanbalaji) a year ago

      Thanks, Sherin

  • Sherin Thomas (@hhsecond) Proposer a year ago

    Abhishek, When will you be able to give me a confirmation about the talk?

    • Abhishek Balaji (@booleanbalaji) a year ago

      Hi Sherin, I’ll communicate the next steps in a couple of days. We’re trying to see how this talk would fit – as a tutorial/talk. The initial feedback I’ve got is that it looks like a walkthrough of a new tech/product.

      • Sherin Thomas (@hhsecond) Proposer a year ago

        The product is still in beta and going to General Availability in a month. So yes, it is a walkthrough of a brand new product that can solve basically all the existing deployment problems for DL/ML models. I will have to do some ops work related to the session if it is going to be selected. Sorry to put more on your plate, I know you must be super busy, but I would really appreciate it if you could let me know this week itself.

        • Abhishek Balaji (@booleanbalaji) a year ago

          Hi Sherin, thanks for the clarification. Do work on the slides as mentioned. We’ll most likely schedule this as a tutorial for 60-90 mins. I’ll confirm this today after checking with reviewers.

  • Sherin Thomas (@hhsecond) Proposer a year ago

    Hey, thanks for the info. Looking forward to the confirmation about tutorial or talk

  • Sherin Thomas (@hhsecond) Proposer a year ago

    Also, quick question: when you say tutorial, it’s not hands-on, correct? I’ll be guiding people through the tool and the features?

    • Abhishek Balaji (@booleanbalaji) a year ago

      Yep, hands-on workshops are a separate format and are typically 3-6 hours long. The tutorial need not be hands-on, but will be more interactive than a talk and typically 60-90 mins.

  • Sherin Thomas (@hhsecond) Proposer a year ago

    Awesome, I could do that. Let me know, and thank you so much for the prompt responses despite your busy schedule.

  • Abhishek Balaji (@booleanbalaji) a year ago

    Hi Sherin,

    Here’s some feedback on your proposal:

    • Add more comparisons and critical evaluations with other tools. Focus on Redis AI is okay, since it’s open source and available for everyone to use, but the question “Why should someone use RedisAI over other tools?” still holds.

    • We need to know more about the benefits of using this over other deployment options (the video also starts with references to tensorflow/serving and the like). I think it makes sense to have a concrete feature-wise comparison at the start too (especially if it is a talk)

    • Since production is highlighted, are there numbers that can be shared on the performance of this framework? max. model parameters/sizes, requests per second, availability etc…?

    • Also, it seems RedisAI is not ready for production yet, given the mention of a future roadmap and its pre-1.0 state (beta release only)?

    • The sample code, live demo and best practices make sense and are essential for people getting started. But I am not sure if they might be too minimal of an example.

    • It would be good to have a feature comparison for a stronger sell and additional complex examples (for example, mention “using multiple frameworks as part of a single pipeline”)

    This tutorial would be perfect if you can walk through multiple frameworks used in the sample RedisAI pipeline.

    • Abhishek Balaji (@booleanbalaji) a year ago

      As to why people should pick RedisAI over other tools/frameworks:

      • the github repo for the project mentions compatibility with other frameworks and minimalist integration requirements as positives
      • it would be good to have a feature comparison for a stronger sell and additional, more complex examples (for example, they mention “using multiple frameworks as part of a single pipeline”)
    • Sherin Thomas (@hhsecond) Proposer a year ago

      Hi Abhishek, I can make changes as you suggested, but it would be great if you could let me know what format you are planning for it. It makes absolute sense to add more multi-backend examples if it’s a tutorial, but otherwise I don’t think I’ll have time for that (25 - 30 minutes is going to be really difficult)
      - I’ll add an advantages slide
      - The advantages slide should cover the second point as well; correct me if I misunderstood your suggestion.
      - There are benchmarks, but those are not official yet. I have to check with the RedisLabs team whether they are OK with me sharing them. Let me get back to you on this
      - General Availability of RedisAI is scheduled for July (or early August), which means it will officially be production ready in one to three weeks. That being said, I know a couple of companies using it in production and some others who are running a very small subset of their production stack on the RedisAI Enterprise trial version
      - Tutorial Vs Talk (Feel free to dictate if you think it’s good to include some regardless of whether it is a tutorial or a talk)
      - Tutorial Vs Talk (Again, feel free to dictate :)

      • Abhishek Balaji (@booleanbalaji) a year ago (edited a year ago)

        Hi Sherin, your tutorial is currently scheduled for an hour. We’re also open to scheduling it for 90 mins

        • Sherin Thomas (@hhsecond) Proposer a year ago

          Hi Abhishek, I am OK with 90 minutes as well, as long as the audience is patient enough for that. Considering it’s not hands-on and has no breaks in between, I assume it’s going to be overwhelming for the first-timers, who are probably going to be the majority. What do you think?

          • Abhishek Balaji (@booleanbalaji) a year ago

            We are not expecting first-timers, and the audience is free to join any session they want. The tutorial has a chance of becoming very overwhelming, but you’ll also be working with a smaller audience that is more invested in learning about this. Hence working on the pace would help to a large extent.

            • Abhishek Balaji (@booleanbalaji) 12 months ago

              Some more feedback:

              A few comments:

              1. comparison wrt status-quo (e.g., kubernetes ecosystem) is missing (what we or most of the DL teams do right now is have python-based api servers (django or celery) that can be deployed as part of kubernetes ecosystem)

              2. why should one use RedisAI as opposed to anything else is not answered

              3. I have a tensorflow model. How do I go about using RedisAI to deploy the model? A step-by-step guide is missing (this can be part of the slides or as part of a notebook).

              all these points can be answered in tutorial style as well (perhaps a 45 min tutorial or a 2 hour session)

              if the author addresses the above 3 concerns, the topic itself is interesting to deep learning community

  • Sherin Thomas (@hhsecond) Proposer 12 months ago

    Hi Abhishek, thanks for the suggestions. Let me see how I can fit these into the talk (tutorial). I am still fighting with myself to decide the duration. I agree these are some of the questions the DL community is seeking answers for, so I will have them in my slides.
