The Fifth Elephant 2017

On data engineering and application of ML in diverse domains

Sarah Masud


Modeling intent of the user using Probabilistic Machine Learning

Submitted Jun 7, 2017

Understanding the user’s intent can help the product team dramatically improve the user’s experience. Whether it is adding the right products to a shopping cart, stocks to a portfolio, or packages to a software stack, the user’s intent drives the choices made. When designing recommendation systems, modelling intent is non-trivial because the intent behind an action is hidden. This talk is about how the speaker used probabilistic machine learning to model intent.


**Mad Hatter: Can you find a needle in the haystack?**
**Alice: Yup - if I know that it’s an iron needle. Give me the magnet!**

Consider the case of a new developer navigating the technology landscape to pick the libraries required to build her software application, or a new bride-to-be planning her bridal outfit on an e-commerce website.
What’s common? The dilemma - what’s the right set of choices that will click; the choices that will help them succeed in their intent.
In the world of Machine Learning, Recommendation Systems are widely used to solve the above problem. But the platform hosts a long tail of items to choose from, so most items have very little interaction data. How could I make the recommendation system work?
By modelling intent first.
A two-stage model was built:
1. In the first stage, an unsupervised clustering algorithm was used to segment the users based on their metadata. This helps answer who the user is.
2. In the second stage, a probabilistic machine learning model was built for each user type.
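
The two-stage idea can be sketched in a few lines of Python. This is only an illustration, not the speaker's implementation: the abstract does not name the clustering algorithm or the probabilistic model, so a tiny k-means and a Dirichlet-multinomial item model are used here as stand-ins. The metadata features, package names, and the helper names `nearest`, `kmeans`, `item_posterior` and `recommend` are all hypothetical.

```python
from collections import Counter


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
    )


def kmeans(points, k, iters=20):
    """Stage 1 stand-in: tiny deterministic k-means seeded with the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = []
    for _ in range(iters):
        # Assignment step: each user goes to the nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def item_posterior(histories, catalog, alpha=1.0):
    """Stage 2 stand-in: posterior mean of a Dirichlet-multinomial over items.

    alpha is a symmetric prior pseudo-count, so unseen (long-tail) items
    still receive non-zero probability.
    """
    counts = Counter(item for history in histories for item in history)
    total = sum(counts.values()) + alpha * len(catalog)
    return {item: (counts[item] + alpha) / total for item in catalog}


# Hypothetical data: (years of experience, number of projects) per user,
# plus the packages each user has already picked.
user_meta = [(1.0, 2.0), (9.0, 30.0), (1.5, 3.0), (8.0, 25.0)]
histories = [["flask", "requests"], ["numpy", "pandas"], ["flask"], ["numpy", "scipy"]]
catalog = ["flask", "requests", "numpy", "pandas", "scipy"]

# Stage 1: segment the users. Stage 2: fit one item model per segment.
labels, centroids = kmeans(user_meta, k=2)
models = {
    seg: item_posterior([h for h, lab in zip(histories, labels) if lab == seg], catalog)
    for seg in set(labels)
}


def recommend(meta, top_n=2):
    """Assign a user to a segment, then rank items by that segment's posterior."""
    seg = nearest(meta, centroids)
    ranked = sorted(models[seg], key=models[seg].get, reverse=True)
    return ranked[:top_n]


print(recommend((1.2, 2.5)))
```

Splitting the work this way means a user with no history can still be placed in a segment from metadata alone and receive that segment's ranking.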

The talk discusses:

  • The common problems encountered when modelling these kinds of problems from scratch
  • How to handle the cold-start scenario (no data)?
  • How to scale the models when data scales - both in volume and velocity?
  • How to automate intent identification?
  • Why Bayesian models?
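
One reason Bayesian models are a natural fit for the cold-start and scaling questions above can be shown with a minimal Beta-Bernoulli sketch. It is an assumption made for illustration, not a description of the model used in the talk: the class name `BetaBelief`, the prior values, and the accepted/ignored feedback signal are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BetaBelief:
    """Beta prior over the chance that a user accepts a recommended item."""

    alpha: float = 1.0  # prior pseudo-count of accepted recommendations
    beta: float = 1.0   # prior pseudo-count of ignored recommendations

    def update(self, accepted: bool) -> None:
        # Conjugate update: one observation just increments one counter,
        # so the model can be refreshed as events stream in.
        if accepted:
            self.alpha += 1
        else:
            self.beta += 1

    @property
    def mean(self) -> float:
        """Posterior mean acceptance probability."""
        return self.alpha / (self.alpha + self.beta)


# Cold start: with no observations the estimate is simply the prior mean.
belief = BetaBelief(alpha=2.0, beta=8.0)  # hypothetical prior
cold_start = belief.mean

# As feedback arrives, the estimate moves from the prior toward the data.
for accepted in [True, True, True, False]:
    belief.update(accepted)
warmed = belief.mean

print(cold_start, warmed)
```

With no data the model still returns a usable estimate (the prior), and each new observation is absorbed with a constant-time update rather than a full retrain.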

A real-time demo of the application that’s hosted on OpenShift will be showcased.

**Key Takeaways:**

  • Acknowledge the fuzziness in determining the domain
  • Learn how this information can be used to improve the user experience.
  • See an example of ML in enhancing the product experience.
  • A Machine Learning pipeline for solving a problem that’s stated only implicitly.

Speaker bio

Sarah is an engineer at Red Hat, where she works on developer-oriented analytics projects. Her bachelor’s thesis on topic modeling was presented at the Ninth International Conference on Contemporary Computing. She is currently a mentor with the Next Scholars Program and the Global Give Back Circle. Through her mentorship work, she hopes to increase the participation of women in tech. She also volunteers her time with Women Who Code, Lean In India, and Systers. She is ever enthusiastic about Data Science, Women in STEM, and Open Source.



