The Fifth Elephant 2017

On data engineering and application of ML in diverse domains

Modeling intent of the user using Probabilistic Machine Learning

Submitted by Sarah Masud (@sara-02) on Wednesday, 7 June 2017


Technical level

Intermediate

Section

Full talk for data engineering track

Status

Submitted

Total votes:  +4

Abstract

Understanding the user’s intent can help the product team dramatically improve the user’s experience. Be it adding the right products to a shopping cart, stocks to the portfolio or packages to a software stack, the user’s intent drives the choices and products added. When designing recommendation systems, modelling intent is non-trivial. The intent behind the action is hidden. This talk is about how the speaker used probabilistic machine learning to model intent.

Outline

Mad Hatter: Can you find a needle in the haystack? Alice: Yup - if I know that it’s an iron needle. Give me the magnet!

Consider the case of a new developer navigating the technology landscape to pick the libraries required to build her software application, or a new bride-to-be planning her bridal outfit on an e-commerce website.
What’s common? The dilemma - what’s the right set of choices that will click; the choices that will help them succeed in their intent.
In the world of Machine Learning, Recommendation Systems are widely used to solve the above problem. But the platform hosts a long tail of items to choose from. How could I make the recommendation system work?
By modelling intent first.
A two-stage model was built.
1. In the first stage, an unsupervised clustering algorithm was used to segment the users based on their metadata. This helps answer who the user is.
2. In the second stage, probabilistic machine learning models were used for each user type.
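
The two stages above can be sketched roughly as follows. This is a minimal illustration, not the speaker's implementation: the proposal names neither the clustering algorithm nor the probabilistic model, so k-means and a Laplace-smoothed categorical distribution are assumed stand-ins, and the user metadata, library names, and function names are all hypothetical.

```python
from collections import Counter


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
    )


def kmeans(points, k, iters=10):
    """Tiny k-means; the first k points seed the centroids, so runs are deterministic."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def item_probabilities(choices, labels, segment, alpha=1.0):
    """P(item | segment) with add-alpha (Laplace) smoothing over all known items."""
    vocab = sorted({item for picked in choices for item in picked})
    counts = Counter(
        item
        for picked, label in zip(choices, labels)
        if label == segment
        for item in picked
    )
    total = sum(counts.values()) + alpha * len(vocab)
    return {item: (counts[item] + alpha) / total for item in vocab}


# Hypothetical metadata: (years of experience, share of web projects).
users = [(1, 0.9), (8, 0.1), (2, 0.8), (9, 0.2)]
# Hypothetical libraries each of those users added to a stack.
choices = [
    ["flask", "requests"],
    ["numpy", "pandas"],
    ["flask", "jinja2"],
    ["numpy", "scipy"],
]

# Stage 1: segment users by metadata ("who is the user?").
centroids, labels = kmeans(users, k=2)

# Stage 2: a per-segment probabilistic model ranks items for a new user.
new_user = (3, 0.7)
segment = nearest(new_user, centroids)
probs = item_probabilities(choices, labels, segment)
ranked = sorted(probs, key=probs.get, reverse=True)
print(segment, ranked[0], round(probs[ranked[0]], 2))  # prints: 0 flask 0.3
```

Smoothing keeps items from other segments at a small non-zero probability, so long-tail items are never ruled out entirely for a given user type.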

The talk discusses:

  • Common pitfalls when modelling problems of this kind from scratch
  • How to handle the cold-start scenario (no data)?
  • How to scale the models when data scales, both in volume and velocity?
  • How to automate intent identification?
  • Why Bayesian models?
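
One way to see why Bayesian models suit the cold-start and scaling questions above is a Beta-Bernoulli update. The sketch below is an assumed illustration, not the model used in the talk: the class name, the uniform Beta(1, 1) prior, and the accept/reject feedback signal are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BetaBelief:
    """Belief about the probability that a user segment accepts a recommended item."""

    alpha: float = 1.0  # prior pseudo-count of accepted recommendations
    beta: float = 1.0   # prior pseudo-count of rejected recommendations

    def update(self, accepted: bool) -> None:
        # Conjugate update: each observation adds one to the matching count.
        if accepted:
            self.alpha += 1
        else:
            self.beta += 1

    def mean(self) -> float:
        # Posterior mean of the acceptance probability.
        return self.alpha / (self.alpha + self.beta)


# Cold start: with no data, the estimate falls back to the prior mean.
cold = BetaBelief()
print(cold.mean())  # prints: 0.5

# Streaming feedback: three accepted recommendations and one rejected.
belief = BetaBelief()
for accepted in [True, True, False, True]:
    belief.update(accepted)
print(round(belief.mean(), 3))  # prints: 0.667
```

The prior supplies a usable estimate before any data arrives, and each update touches only two counters, so feedback can be folded in one event at a time without retraining from scratch.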

A real-time demo of the application that’s hosted on OpenShift will be showcased.

Key Takeaways:

  • Acknowledge the fuzziness in determining the domain
  • Learn how knowledge of the user’s intent can be used to improve the user experience
  • See an example of ML in enhancing the product experience.
  • A Machine Learning pipeline for solving a problem that’s stated only implicitly.

Speaker bio

Sarah is an engineer at Red Hat, where she works on developer-oriented analytics projects. Her bachelor’s thesis on Topics Modeling was presented at the Ninth International Conference on Contemporary Computing. She is currently a mentor with the Next Scholars Program and the Global Give Back Circle. Through her mentorship work, she hopes to increase the participation of women in tech. She also volunteers her time with Women Who Code, Lean In India, and Systers. She is ever enthusiastic about Data Science, Women in STEM, and Open Source.

Links

Slides

https://docs.google.com/presentation/d/1hP5RBuVLTtz_yGgKXX2jm_7Ly5e_0IKUrXE9T7WQdcc/edit?usp=sharing

Preview video

https://bluejeans.com/s/LrdSG

Comments

  • 1
    Zainab Bawa (@zainabbawa) Reviewer a year ago

    Sarah, please add draft slides or a detailed mind map explaining the content you will cover, and the overall flow of the proposed talk.

    • 1
      Sarah Masud (@sara-02) Proposer a year ago

      @zainabbawa I have added the draft slides, it is a WIP.
