Continuous online learning for classification tasks
Submitted by Saurabh Arora (@tanish2k) on Tuesday, 7 June 2016
At Airwoot (now acquired by Freshdesk), we build margin-based NLP classifiers to filter spam from relevant customer tweets and posts on social media. We work with the language of social media, which brings the challenge of continuously adapting our models to changes in social verbiage. That language is dynamic: new hashtags, acronyms and induced spelling mistakes force us to update our models frequently. Moreover, what counts as relevant or as noise is not the same for every user (much like the idea of relevance in an email inbox).
So, we built a per-user statistical model to capture the preferences of each user. This is not a trivial problem to solve. It requires us to ensemble global learning (remember the evolution in language) and local learning (based on the user's feedback) to classify a conversation. The local model must be able to capture the notion of concept drift, i.e. the temporal (and recent) change in data. In this talk, we will showcase how we do continuous online learning using simple but powerful perceptrons.
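As a rough illustration of the idea, here is a minimal sketch of a mistake-driven perceptron combined with a local-global ensemble. It is not Airwoot's actual implementation. The class names (`Perceptron`, `LocalGlobalEnsemble`), the bag-of-tokens features and the fixed `local_weight` mixing rule are all assumptions made for this example.

```python
class Perceptron:
    """Sparse online perceptron over a bag of tokens (illustrative only)."""

    def __init__(self, learning_rate=1.0):
        self.weights = {}  # token -> weight
        self.bias = 0.0
        self.learning_rate = learning_rate

    def score(self, tokens):
        # Raw margin: positive means "relevant", negative means "spam".
        return self.bias + sum(self.weights.get(t, 0.0) for t in tokens)

    def update(self, tokens, label):
        """Update only on a mistake; label is +1 (relevant) or -1 (spam)."""
        if label * self.score(tokens) > 0:
            return False  # already correct, nothing to learn
        for t in tokens:
            self.weights[t] = self.weights.get(t, 0.0) + self.learning_rate * label
        self.bias += self.learning_rate * label
        return True


class LocalGlobalEnsemble:
    """Mixes one shared global model with one local model per user.

    The fixed mixing weight is an assumption for this sketch.
    """

    def __init__(self, global_model, local_weight=0.5):
        self.global_model = global_model
        self.local_models = {}  # user_id -> Perceptron
        self.local_weight = local_weight

    def predict(self, user_id, tokens):
        global_score = self.global_model.score(tokens)
        local = self.local_models.get(user_id)
        local_score = local.score(tokens) if local is not None else 0.0
        w = self.local_weight
        combined = (1.0 - w) * global_score + w * local_score
        return 1 if combined > 0 else -1

    def feedback(self, user_id, tokens, label):
        # User feedback trains only that user's local model, one example at a time.
        local = self.local_models.setdefault(user_id, Perceptron())
        return local.update(tokens, label)


# Usage: the global model learns general spam patterns ...
global_model = Perceptron()
global_model.update(["cheap", "offer"], -1)
global_model.update(["order", "late"], +1)

# ... while one user marks a "cheap offer" conversation as relevant to them.
ensemble = LocalGlobalEnsemble(global_model)
ensemble.feedback("user_a", ["cheap", "offer"], +1)
```

Because each update touches only the tokens of the example at hand, recent feedback shifts a user's local model right away without retraining the global one. In this sketch that is the only handle on concept drift; a real system would likely need something more careful than a fixed mixing weight.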
- Introduction to the problem of online continuous learning
- Examples of concept drift
- Capturing concept drift using perceptrons
- Fallacies of a local-global model and the need for ensembling
- Towards a robust ensembling manager
Sit back and learn about a powerful learning technique. We will go under the hood in a way that everyone can follow.
I am an entrepreneur and machine learning researcher. I dropped out of my doctoral program in 2012 and founded Airwoot, a company that helps businesses deliver customer support on social media using a lot of mathematical tricks. Airwoot has since been acquired by Freshdesk, where I continue to build technology that teaches machines to learn about natural language and emotions.
I completed my master's at DTU Denmark and was pursuing a Ph.D. at the Hasso-Plattner Institute, Berlin. During my academic career I contributed research at Macquarie University, Infocomm Research Singapore and the Swedish Institute of Computer Science.