Debugging deep nets
Submitted by Rajarshee Mitra (@rajarsheem) on Saturday, 28 May 2016
Section: Full talk
Technical level: Intermediate
The proposed talk aims to provide a thorough explanation of language modelling (word and sentence embeddings) and of the application of RNNs and LSTMs to text: predicting text, mapping sentence to sentence, and chatbots.
Deep learning has heavily impacted natural language processing. Recent advances include automatically writing poems and essays, and converting words and sentences into semantic vector representations called embeddings, which can be used for tasks such as classification (sentiment, category) and semantic similarity. These representations are typically learned through language modelling.
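As a rough illustration of the semantic similarity use case, the sketch below compares embeddings with cosine similarity. The three-dimensional vectors and the `embeddings` dictionary are hypothetical values picked by hand for this example; real embeddings are learned from a corpus and usually have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings, hand-picked for illustration only.
embeddings = {
    "good": [0.9, 0.1, 0.3],
    "great": [0.8, 0.2, 0.4],
    "table": [0.1, 0.9, 0.0],
}

sim_close = cosine_similarity(embeddings["good"], embeddings["great"])
sim_far = cosine_similarity(embeddings["good"], embeddings["table"])

# Related words should score higher than unrelated ones.
print(sim_close > sim_far)
```

With trained vectors, the same function can rank candidate words or sentences by how close they are to a query.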
I propose to start the talk with neural language models: the methods, their improvements, and how the resulting word vectors can change the way we look at words, supported by some interesting analogies. I will give an overview of how these vectors can be used in traditional problems such as NER (detecting entities in text).
Then I will introduce the RNN family and its application to sequence learning. RNNs change the way we deal with text (or any sequence) and have opened up a whole new research area. The RNN family can outperform shallow MLPs on most basic tasks, such as classification and detecting sentiment and irony. RNNs can also predict and generate text effectively, which is used in interesting applications like writing Shakespeare-like dramas or Linux kernel source code. We will also discuss how an RNN differs from an MLP, how a GRU (or LSTM) differs from a vanilla RNN, how RNNs are trained, and how some problems of vanilla RNNs can be overcome.
RNNs can also be used to embed sentences more effectively (i.e. to convert sentences to vectors) and in sequence-to-sequence learning, which essentially maps one sentence to another. I will demonstrate how an encoder and a decoder work in sequence-to-sequence learning, a concept used both in machine translation and in conversational models. Finally, I will go more in depth on the state-of-the-art attention models that have arrived very recently and demystify them.
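To make the RNN versus MLP contrast concrete, here is a minimal sketch of a vanilla (Elman) RNN forward pass in plain Python. It is not taken from the talk: the sizes, the random weights, and the names `rnn_forward`, `W_xh`, `W_hh` and `b_h` are assumptions chosen for illustration. The point is that the same weights are reused at every time step and the hidden state carries information forward, whereas an MLP sees one fixed-size input at a time.

```python
import math
import random


def rnn_forward(inputs, W_xh, W_hh, b_h):
    """Run a vanilla RNN over a sequence and return every hidden state.

    Update rule: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h),
    with the same weights applied at each time step.
    """
    hidden_size = len(b_h)
    h = [0.0] * hidden_size  # initial hidden state h_0
    states = []
    for x in inputs:
        new_h = []
        for i in range(hidden_size):
            total = b_h[i]
            total += sum(W_xh[i][j] * x[j] for j in range(len(x)))
            total += sum(W_hh[i][k] * h[k] for k in range(hidden_size))
            new_h.append(math.tanh(total))
        h = new_h
        states.append(h)
    return states


random.seed(0)
input_size, hidden_size = 4, 3  # hypothetical sizes for a 4-word vocabulary
W_xh = [[random.uniform(-0.5, 0.5) for _ in range(input_size)]
        for _ in range(hidden_size)]
W_hh = [[random.uniform(-0.5, 0.5) for _ in range(hidden_size)]
        for _ in range(hidden_size)]
b_h = [0.0] * hidden_size

# One-hot encoded token ids 0, 2, 1.
sequence = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0]]
states = rnn_forward(sequence, W_xh, W_hh, b_h)
print(len(states), len(states[-1]))  # 3 3
```

The final entry of `states` can serve as a simple fixed-size summary of the whole sequence, which is the idea behind using an RNN as a sentence encoder. LSTM and GRU cells replace the single `tanh` update with gated updates but keep the same step-by-step structure.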
Proposed outline of talk:
1) Language modelling with feed-forward nets (word embeddings): CBOW, skip-gram.
Application to Named Entity Recognition.
2) Paragraph vectors, sentence similarity.
3) What is an RNN? RNN vs MLP, RNN vs LSTM and GRU. Designing and training. Backpropagation through time.
Difficulties: exploding gradients, vanishing gradients. Overcoming them.
4) Basic application of RNN, GRU, LSTM to text: sentence classification and sentiment analysis.
5) Predicting and generating text.
6) Sentence embedding using LSTM or GRU.
7) Attention models and memory.
8) Sequence-to-sequence learning: encoding and decoding.
Examples: conversational models (chatbots), machine translation.
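For the attention and sequence-to-sequence items in the outline, the following sketch shows one way a decoder can weight encoder states. It uses dot-product scoring purely for brevity; the proposal does not say which attention variant the talk covers, so treat the scoring function, the function names `softmax` and `attend`, and the toy vectors as assumptions.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Dot-product attention: score each encoder state against the decoder
    state, normalise the scores, and return the weighted average."""
    scores = [sum(d * e for d, e in zip(decoder_state, enc))
              for enc in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * enc[i] for w, enc in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context


# Hypothetical 2-dimensional encoder states for a 3-token source sentence.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
decoder_state = [2.0, 0.0]

weights, context = attend(decoder_state, encoder_states)
print(weights)
print(context)
```

Here the first encoder state receives the largest weight because it is most aligned with the decoder state. In a full encoder-decoder model the context vector would be recomputed at every decoding step and fed into the decoder alongside its previous output.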
Knowledge of feed-forward neural nets and one-hot encodings.
I am a dedicated NLP practitioner focusing mainly on the intersection of DL and NLP. I am also a research engineer at Snapshopr, Bangalore. I am currently working on problems that arise in language: embedding methods, interpreting and generating text, and seq2seq.