Anthill Inside 2019

On infrastructure for AI and ML: from managing training data to data storage, cloud strategy and costs of developing ML models


Attention based sequence to sequence models for natural language processing

Submitted by Madhu Gopinathan (@mg123) on Friday, 26 April 2019

Section: Workshops
Technical level: Intermediate

Abstract

Ilya Sutskever and others introduced sequence to sequence learning with neural networks. Subsequently, Bahdanau and others introduced “attention” — analogous to the human ability to focus with high resolution on one part of a scene — to improve the performance of sequence to sequence models in machine translation. Later, Vaswani and others introduced the transformer model, which is built entirely on the idea of “self-attention”. These ideas have proved very useful in practice for building powerful natural language processing models (https://ai.googleblog.com/2016/09/a-neural-network-for-machine.html). In this hands-on workshop using PyTorch, we will learn to build natural language processing models using these concepts.
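To make “attention” concrete before the workshop: at each decoding step the model scores a query vector against every source position, turns the scores into a probability distribution, and returns the weighted sum of the value vectors. Below is a minimal sketch of scaled dot-product attention (the form used in the transformer) in plain Python for a single query; the workshop itself will build this with PyTorch tensors. The function and variable names here are illustrative, not from the workshop materials.

```python
import math

def softmax(xs):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector.

    query:  list[float] of length d_k
    keys:   one key vector (length d_k) per source position
    values: one value vector per source position
    Returns the attention-weighted sum of the value vectors.
    """
    d_k = len(query)
    # Score each source position: dot(query, key) / sqrt(d_k)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k)
              for key in keys]
    weights = softmax(scores)
    # Weighted sum of value vectors.
    d_v = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(d_v)]
```

For example, a query closely aligned with the first key puts nearly all of its weight on the first value vector, so the output is close to that value.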

Outline

  1. Introduction to sequence models
  2. Why sequence to sequence models? Build a sequence to sequence model on sample data.
  3. What is attention? Enhance the model and understand the value of attention
  4. Transformer architecture: sequence to sequence modeling using self-attention
  5. Build a transformer model on sample data
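Since the transformer in step 4 has no recurrence, it injects word-order information through positional encodings added to the input embeddings. As a preview, here is a sketch of the sinusoidal encoding from the transformer paper in plain Python (the workshop will use PyTorch; the function name is illustrative): even dimensions use sine and odd dimensions use cosine, each at a position-dependent frequency.

```python
import math

def positional_encoding(max_len, d_model):
    """Sinusoidal positional encodings as a max_len x d_model matrix.

    pe[pos][2i]   = sin(pos / 10000^(2i / d_model))
    pe[pos][2i+1] = cos(pos / 10000^(2i / d_model))
    """
    pe = [[0.0] * d_model for _ in range(max_len)]
    for pos in range(max_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe
```

Each position gets a distinct pattern, and because the frequencies are fixed, the encoding extends to sequence lengths not seen during training.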

Requirements

Laptop

Speaker bio

Madhu Gopinathan is currently Vice President, Data Science at MakeMyTrip (MMT), India’s leading online travel company. At MakeMyTrip, he led the development of natural language processing models for Myra, MMT’s task bot for customer service (https://economictimes.indiatimes.com/jobs/rise-of-the-machines-when-bots-take-over-the-workplace/articleshow/66930068.cms).
Madhu holds a PhD in computer science from the Indian Institute of Science, where he worked on mathematical modelling of software systems, and an MS in computer science from the University of Florida, Gainesville, USA. He has collaborated with researchers at Microsoft Research, General Motors and the Indian Institute of Science, leading to publications in prominent computer science conferences.
He has extensive experience developing large-scale systems using machine learning and natural language processing, and has been granted multiple US patents.

Comments

  • Abhishek Balaji (@booleanbalaji) Reviewer 2 months ago

    Hi Madhu,

    Thank you for submitting this proposal. This looks really good for a workshop and we’ll be in touch with you closer to the event to schedule and work out the finer details.
