The Fifth Elephant 2019

Gathering of 1000+ practitioners from the data ecosystem


Yes! Attention is all you need for NLP

Submitted by simrat (@sims) on Sunday, 14 April 2019

Session type: Lecture (full talk, 40 mins)

Abstract

Natural language processing is a very tough problem to crack. We humans each have our own style of speaking, and even when we mean the same thing, we often say it differently. This makes it very difficult for a machine to understand and process language at a human level.

At VMware we believe in delivering the best to our customers - the best of products, the best of services, the best of everything. To deliver the best in support, we process huge volumes of support tickets to structure free text and provide intelligent solutions. As part of one project, we were required to process a set of support tickets, identify key topics/categories, map them to a very different document set, and so on. Even though I won’t be able to go into the details of the algorithm we have built, I would like to help build an intuition for how best to go about solving such problems.

For instance, consider the problem of identifying topics/categories in your document set. The first and most obvious approach is topic modelling. Yes, we can do topic modelling, and we can tune it in many ways, such as bootstrapping with seed keywords. This works well when the document groups are very different and the keywords are clear distinguishers, but what happens when you have a group of similar documents whose keywords are used in multiple different contexts? Clearly the topics are contextual, and there is a need to go beyond keyword-based modelling. In this talk we will understand how we can make machines understand context, take a sample problem, and break down the approach.
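
To make the limitation concrete, here is a minimal sketch of the keyword-based baseline using scikit-learn. The ticket texts are invented for illustration; they are not project data.

```python
# Keyword-based topic modelling baseline (illustrative only; these
# ticket texts are invented, not VMware data).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "vm fails to boot after esxi host upgrade",
    "cannot connect to vcenter server, login times out",
    "datastore latency spikes during backup window",
    "vcenter login fails with expired password error",
]

# Bag-of-words features: each document is reduced to keyword counts,
# so any context around the keywords is discarded.
vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

# Top keywords per topic: fine when topics use distinct vocabulary,
# misleading when one keyword appears in several different contexts.
terms = vec.get_feature_names_out()
for i, weights in enumerate(lda.components_):
    top = [terms[j] for j in weights.argsort()[-5:][::-1]]
    print(f"topic {i}: {top}")
```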

P.S. The title of the talk is inspired by the Google paper “Attention Is All You Need”, which introduced Transformers; in the talk we will learn more about them and how they learn context efficiently.
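
As a preview, the core operation that paper introduces is scaled dot-product self-attention. A minimal numpy sketch for intuition (single head, no masking, no learned Q/K/V projections):

```python
# Scaled dot-product self-attention from "Attention Is All You Need",
# sketched in plain numpy: one head, no masking, no learned projections.
import numpy as np

def attention(Q, K, V):
    # Score every query against every key, scaled by sqrt(d_k) so the
    # softmax stays in a well-behaved range.
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax: each token gets a distribution over all tokens.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output is a context-weighted mix of the values -- this mixing
    # is how a token like "bank" absorbs its surrounding words.
    return weights @ V

# Toy self-attention: 4 tokens, 8-dimensional representations.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(attention(x, x, x).shape)  # (4, 8)
```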

Outline

  • Brief evolution of NLP
  • Challenges in working with free text
  • Why do we need to understand context
  • How can we understand context
    – Overview of Transformers and Self-attention
  • Demonstration of context-based sequence-to-sequence modelling with the use cases below (see the sketch after this list)
    – Document summarization
    – Anomaly detection
  • Adaptation of the attention network - hierarchical attention networks
  • Key takeaways
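
As a taste of the sequence-to-sequence demonstration, a summarization call can be as small as the sketch below. It uses the Hugging Face transformers summarization pipeline with its default model; the ticket text is made up, and the talk's actual demo may differ.

```python
# One way the context-based seq2seq demo could look, using the Hugging
# Face transformers summarization pipeline with its default model. The
# ticket text is a made-up placeholder, not from the talk.
from transformers import pipeline

summarizer = pipeline("summarization")

ticket = (
    "After upgrading the ESXi host, several virtual machines fail to "
    "boot and the vCenter console shows repeated connection timeouts. "
    "Rolling back the upgrade restores normal operation."
)
print(summarizer(ticket, max_length=40, min_length=10)[0]["summary_text"])
```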

Speaker bio

Data scientist with 8 years of overall experience in software development, applied research and machine learning. Currently working at VMware as a Lead Data Scientist. Tech enthusiast and stationery hoarder :)

Links

Slides

https://docs.google.com/presentation/d/1HFLuAYt2vde6neyyvzGU-G97jB5v4HtsufUdC8BUFS4/edit?usp=sharing

Comments

  • Zainab Bawa (@zainabbawa) Reviewer 2 months ago

    Seems like you have reworked the framing of the proposal since the last time I read it. A few questions that need addressing:

    1. Why BERT for your use case? Why not something else?
    2. What did you compare BERT with? What were the comparison metrics?
    3. Participants don’t need to know about the evolution of NLP. They’d be more interested in understanding the context of the problem you are solving and why this problem is important.
    4. What outcomes has the leveraging of BERT produced? What is the before-after scenario?
    5. What trade-offs/compromises were made by deploying BERT for your problem?
    6. How did the concerned teams adapt to the changes? What were the challenges?
    • simrat (@sims) Proposer 2 months ago

      Why BERT for your use case? Why not something else?
      Simrat: Word embeddings from word2vec or GloVe are good representations of words in a mathematical space and capture semantics, but unfortunately they don’t capture context. For instance, consider the word “bank”, which can occur in two contexts - a financial bank and a river bank. Understanding this difference is important, and it is the primary reason we would like to look beyond keyword-based modelling.
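
      For intuition, here is a small sketch of that “bank” example using the Hugging Face transformers library (illustrative only, not the project code): a static embedding gives the word one vector, while BERT gives it a different vector in each context.

      ```python
      # Same word, different contexts, different vectors. Sketch with the
      # Hugging Face transformers library; not the project code.
      import torch
      from transformers import BertModel, BertTokenizer

      tok = BertTokenizer.from_pretrained("bert-base-uncased")
      model = BertModel.from_pretrained("bert-base-uncased")
      model.eval()

      def bank_vector(sentence):
          # Contextual embedding of the token "bank" in this sentence.
          inputs = tok(sentence, return_tensors="pt")
          bank_id = tok.convert_tokens_to_ids("bank")
          idx = inputs["input_ids"][0].tolist().index(bank_id)
          with torch.no_grad():
              hidden = model(**inputs).last_hidden_state  # (1, seq_len, 768)
          return hidden[0, idx]

      v_money = bank_vector("she deposited cash at the bank on monday")
      v_river = bank_vector("they sat on the bank of the river")
      # A static word2vec/GloVe embedding would make these identical;
      # BERT's contextual vectors are related but clearly not the same.
      print(torch.cosine_similarity(v_money, v_river, dim=0).item())
      ```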

      What did you compare BERT with? What were the comparison metrics?
      Simrat: We manually compared the results of topic modelling with BERT’s context-dependent topic clustering, and we saw a clear advantage in leveraging BERT. I will not be able to share details from the actual project, but I can demonstrate this using a toy dataset.
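
      The shape of that comparison, sketched on a toy dataset (mean-pooled BERT vectors plus k-means are assumptions for illustration, not the actual pipeline or metrics):

      ```python
      # Toy version of the comparison: cluster contextual sentence vectors
      # instead of keyword counts. Mean-pooled BERT plus k-means are
      # illustrative choices, not the project's actual pipeline.
      import torch
      from sklearn.cluster import KMeans
      from transformers import BertModel, BertTokenizer

      tok = BertTokenizer.from_pretrained("bert-base-uncased")
      model = BertModel.from_pretrained("bert-base-uncased")
      model.eval()

      def embed(sentences):
          # Mean-pool the final hidden states into one vector per sentence.
          inputs = tok(sentences, padding=True, return_tensors="pt")
          with torch.no_grad():
              hidden = model(**inputs).last_hidden_state
          mask = inputs["attention_mask"].unsqueeze(-1)
          return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

      tickets = [
          "login to vcenter fails with an expired password",
          "user cannot authenticate against the sso domain",
          "datastore latency spikes during the nightly backup",
          "slow disk io on the storage array while backups run",
      ]
      km = KMeans(n_clusters=2, random_state=0, n_init=10)
      print(km.fit_predict(embed(tickets)))  # similar tickets share a cluster
      ```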

      Participants don’t need to know about the evolution of NLP. They’d be more interested in understanding the context of the problem you are solving and why this problem is important.

      Simrat: Contextual embeddings from BERT and its Transformer-based architecture are a new milestone in NLP, one that enables transfer learning. Transfer learning is a revolutionary moment for NLP, as it will not only reduce training time but will also help build far smarter solutions than before. In this talk I would like to cover the evolution of NLP to draw comparisons between the available solutions/architectures, and to introduce transfer learning in NLP and the BERT architecture with the help of motivating examples.
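
      In code, the transfer-learning step is small: reuse the pretrained encoder and fine-tune only a task head on top. A sketch with the Hugging Face transformers library (the label count is hypothetical):

      ```python
      # Transfer learning in miniature: reuse pretrained BERT weights;
      # only the classification head is new. "num_labels=5" is a made-up
      # example (e.g. five ticket categories).
      from transformers import BertForSequenceClassification

      model = BertForSequenceClassification.from_pretrained(
          "bert-base-uncased", num_labels=5
      )
      # Fine-tuning from here needs far less data and time than
      # training a language model from scratch.
      ```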

      What outcomes has the leveraging of BERT produced? What is the before-after scenario?
      Simrat: A context-dependent, efficient solution with less training.

      What trade-offs/compromises were made by deploying BERT for your problem?
      Simrat: BERT is a new architecture, so the libraries around it aren’t mature yet; this required us to spend extra effort on refinements as our work demanded.

  • Ashwin (@trds) 2 months ago

    hi Simrat

    the key outline of this talk seems to be around adapting and presenting the results of the Attention paper. could you perhaps give this as an introduction, and talk more about what else is possible using the Attention framework?

    e.g. applicability to hierarchical attention networks, text abstraction? or possibly something else?

    • simrat (@sims) Proposer 2 months ago (edited 2 months ago)

      This is a great suggestion; I can certainly talk about them.

  • Abhishek Balaji (@booleanbalaji) Reviewer a month ago

    Hi Simrat,

    Please upload draft slides and preview video by Jun 20, 2019 based on the feedback shared above.

    • simrat (@sims) Proposer a month ago

      Will do. Thanks!

      • Abhishek Balaji (@booleanbalaji) Reviewer 27 days ago

        Hi Simrat,

        Moving this for evaluation under Anthill Inside.
