Anthill Inside 2019

A conference on AI and Deep Learning


Out of Distribution Detection in Deep Learning Classifiers

Submitted by Akhil Lohia (@alohia) on Monday, 29 April 2019

Preview video

Section: Crisp talk
Technical level: Intermediate
Session type: Lecture


A common problem when using deep neural network models for classification is handling out-of-distribution data. In such scenarios, classifiers tend to assign the new data point to one of the known classes with high probability, which can lead to unintended and potentially harmful consequences. At MakeMyTrip (MMT), we use a deep-learning-based NLP classifier to understand the intent of utterances received by MMT’s chatbot, Myra. Here we discuss how we handle user utterances that express intents Myra has not been trained on.
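As a minimal illustration of the problem described above (not taken from the talk itself), the sketch below shows why a softmax classifier cannot abstain: its output probabilities always sum to 1 over the known classes, so an utterance from an unseen intent still receives a confident label. The intent names and logit values are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

# Hypothetical intents the classifier was trained on.
known_intents = ["cancel_booking", "flight_status", "refund_status"]

# Hypothetical logits produced for an utterance whose true intent
# is NOT one of the three classes above.
logits_for_unseen_intent = np.array([4.0, 0.5, 0.2])

probs = softmax(logits_for_unseen_intent)
top = int(np.argmax(probs))

# The probabilities are forced to sum to 1 over the known classes,
# so one of them ends up with a high score regardless.
print(known_intents[top], float(probs[top]))
```

Because there is no probability mass reserved for "none of the above", the top score alone is a poor signal of whether the input belongs to any trained class.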


  • Introduction to in-sample and out-of-sample distributions when using ML models.
  • Example scenario:
    • Text is misclassified when its true class was not present in the training data.
    • Possible approach: creating an ‘other’ class. This requires a lot of training data, so it is not feasible.
  • Proposed approach:
    • Learn a distributed representation of the target (embeddings) instead of discrete classes.
    • Compare the predicted distributions of in-sample and out-of-sample examples.
  • Create a two-step classifier that first decides whether to classify the example at all.
  • Present final results on how such examples are handled in Myra and how this eases the discovery of new intents.
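The outline above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation used in Myra: the proposal does not say how predicted embeddings are scored, so cosine similarity to per-intent target embeddings and a fixed threshold of 0.8 are assumptions, and all names and vectors here are hypothetical.

```python
import numpy as np

def cosine_similarity(vec, matrix):
    """Cosine similarity between one vector and each row of a matrix."""
    vec_unit = vec / np.linalg.norm(vec)
    rows_unit = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    return rows_unit @ vec_unit

def two_step_classify(predicted_embedding, intent_embeddings, intent_names,
                      threshold=0.8):
    """Step 1: decide whether to classify at all. Step 2: pick the intent.

    Returns (intent_name or None, best_similarity). None means the
    utterance is treated as out of distribution. The threshold value
    is an assumption and would need tuning on held-out data.
    """
    sims = cosine_similarity(predicted_embedding, intent_embeddings)
    best = int(np.argmax(sims))
    if sims[best] < threshold:
        return None, float(sims[best])
    return intent_names[best], float(sims[best])

# Hypothetical target embeddings, one row per known intent.
intent_names = ["cancel_booking", "flight_status", "refund_status"]
intent_embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# Hypothetical model outputs: one close to a known intent, one close to none.
in_sample = np.array([0.95, 0.05, 0.10])
out_of_sample = np.array([0.50, 0.60, 0.55])

print(two_step_classify(in_sample, intent_embeddings, intent_names))
print(two_step_classify(out_of_sample, intent_embeddings, intent_names))
```

Utterances rejected in the first step could be logged and reviewed, which is one way such a setup might help surface candidate new intents.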

Speaker bio

I am Akhil Lohia, a data scientist at MakeMyTrip, where we’re developing Myra, MakeMyTrip’s task bot that assists millions of our customers with post-sale issues such as cancelling and modifying bookings and enquiring about flight status, baggage limits, refund status, etc. In this talk I present a common problem in machine-learning-based classifiers, where models can give unexpected results for out-of-sample data, along with one approach to this problem based on recent research. I obtained my BS in Economics from IIT Kanpur and my MS in Data Science from Barcelona Graduate School of Economics. Before joining MakeMyTrip, I worked on research projects involving RCTs and the estimation of structural models related to Indian demographics.





  • Shashank Dixena (@dixena) 5 months ago

    Great work Akhil, looking forward to the talk.

  • Abhishek Balaji (@booleanbalaji) Reviewer 5 months ago

    Hi Akhil,

    Thank you for submitting a proposal. For us to evaluate your proposal, we need to see detailed slides and a preview video. Your slides must take the following points into consideration:

    • Problem statement/context, which the audience can relate to and understand. The problem statement has to be a problem (based on this context) that can be generalized for all.
    • What were the tools/options available in the market to solve this problem? How did you evaluate these, and what metrics did you use for the evaluation? Why did you decide to build your own ML model?
    • Why did you pick the option that you did?
    • Explain what the situation was before the solution you picked/built, and what the fraud/ghosting looked like after implementing it. Show before-after scenario comparisons and metrics.
    • What compromises/trade-offs did you have to make in this process?
    • What are the privacy, regulatory and ethical considerations when building this solution?
    • What is the one takeaway that you want participants to go back with at the end of this talk? What is it that participants should learn/be cautious about when solving similar problems?

    As next steps, we’d need to see the detailed and/or updated slides by 21 May in order to close the decision on your proposal. If we don’t receive an update by 21 May, we’d have to move the proposal to be considered for a future conference.

    • Akhil Lohia (@alohia) Proposer 5 months ago

      Hi Abhishek,

      I have updated the slides with more details, as you mentioned.

