Anthill Inside 2019

A conference on AI and Deep Learning



Khaleeque Ansari


Dataset Denoising: Improving Accuracy of NLP Classifier

Submitted Apr 30, 2019

Reliable evaluation of a classifier's performance depends on the quality of the data sets on which it is tested. While a data set is being collected and recorded, however, noise can be introduced, especially in real-world environments, and this degrades the quality of the data set.
In this talk we will discuss how we at MakeMyTrip are continuously improving the performance of our deep-learning-based NLP classifier by correcting mislabeled data and reducing noise in our large dataset.


  • Introduction of the problem statement.
  • Identifying mislabeled data.
  • Algorithm to correct mislabeled data.
  • Results / performance improvement.
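
The proposal does not describe the algorithm itself, so the following is only a minimal sketch of one common heuristic for spotting mislabeled examples, not the method presented in the talk. It assumes a toy bag-of-words naive Bayes classifier in place of a deep learning model: each example is scored by a model trained on the other folds, and an example is flagged when the model confidently disagrees with its given label. The function names, the toy intents, the fold scheme and the 0.8 threshold are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Fit a tiny multinomial naive Bayes model (illustrative stand-in classifier)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_proba(model, text):
    """Return {label: probability} for one text, using add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for tok in text.lower().split():
            if tok in vocab:  # tokens unseen in training are skipped
                count = word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(vocab)))
        log_scores[label] = score
    # Normalise in a numerically stable way.
    top = max(log_scores.values())
    exp_scores = {lab: math.exp(s - top) for lab, s in log_scores.items()}
    norm = sum(exp_scores.values())
    return {lab: v / norm for lab, v in exp_scores.items()}


def flag_suspect_labels(docs, labels, n_folds=3, threshold=0.8):
    """Flag examples whose out-of-fold prediction confidently contradicts the label.

    Hypothetical helper: returns (index, given_label, predicted_label, confidence).
    The predicted label is only a candidate correction for a human to review.
    """
    suspects = []
    for fold in range(n_folds):
        train_idx = [i for i in range(len(docs)) if i % n_folds != fold]
        held_out = [i for i in range(len(docs)) if i % n_folds == fold]
        model = train_nb([docs[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        for i in held_out:
            probs = predict_proba(model, docs[i])
            predicted = max(probs, key=probs.get)
            if predicted != labels[i] and probs[predicted] >= threshold:
                suspects.append((i, labels[i], predicted, round(probs[predicted], 3)))
    return sorted(suspects)


# Toy data, invented for illustration; example 7 is deliberately mislabeled.
docs = [
    "cancel my flight booking",
    "refund status for my payment",
    "please cancel the hotel booking",
    "when will i get my refund",
    "cancel booking for tomorrow",
    "refund not received yet",
    "i want to cancel my booking",
    "cancel my flight booking now",
    "where is my refund payment",
]
labels = ["cancel", "refund", "cancel", "refund", "cancel",
          "refund", "cancel", "refund", "refund"]

suspects = flag_suspect_labels(docs, labels)
print(suspects)
```

On this toy data only example 7 is flagged, with "cancel" as the suggested label. The threshold is the main design choice here: a lower value surfaces more candidates but also more false alarms, so in practice flagged examples would normally go to manual review before any label is changed.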

Speaker bio

I am Khaleeque Ansari, Lead Data Scientist at MakeMyTrip, where we’re developing Myra, MakeMyTrip’s task bot that assists millions of our customers with post-sale issues such as cancelling and modifying bookings and enquiring about flight status, baggage limits, refund status, etc.
I hold a Bachelor’s degree in Computer Science from IIT Delhi. My principal research interests lie in NLP, and I have more than 5 years of experience building NLP models for industry.





Hosted by

Anthill Inside is a forum for conversations about risk mitigation and governance in Artificial Intelligence and Deep Learning. It is aimed at AI developers, researchers, startup founders, ethicists, and AI enthusiasts.