Solving for Bias In E-Commerce Autosuggest

The ninth edition of The Fifth Elephant will be held in Bangalore on 16 and 17 July 2020.

The Fifth Elephant brings together over one thousand data scientists, ML engineers, data engineers and analysts to discuss:

Data governance
Data privacy and engineering for privacy, including engineering for the Personal Data Protection (PDP) bill.
Data cleaning, annotation, instrumentation and productionizing data science.
Identifying and handling fraud + data security at scale
Feature engineering and ML platforms.
What it takes to create data-driven cultures in organizations of different scales.

Event details:

Dates: 16-17 July 2020
Venue: NIMHANS Convention Centre, Dairy Circle, Bangalore

Why you should attend:

Network with peers and practitioners from the data ecosystem.
Share approaches to solving expensive problems such as cleanliness of training data, annotation, model management and versioning data.
Demo your ideas in the demo sessions.
Join Birds of Feather (BOF) sessions to have productive discussions on focussed topics. Or, start your own Birds of Feather (BOF) session.

Contact details:
For more information about The Fifth Elephant, call +91-7676332020 or email sales@hasgeek.com

Hosted by

The Fifth Elephant

The Fifth Elephant - known as one of the best data science and Machine Learning conferences in Asia - has transitioned into a year-round forum for conversations about data and ML engineering; data science in production; data security and privacy practices.

Submitted May 30, 2020

Flipkart’s Search enables discovery across 80 million products in 80+ categories. As a user types a query on her way to discovering products, she is shown autosuggest suggestions to choose from. These suggestions do more than help users pick a well-formed query with minimal typing effort.

This talk briefly touches upon the opportunities that decorating these suggestions opens up.
After setting the context of how product popularity creates a self-reinforcing loop in the system that leads to bias, I’ll walk the audience through our journey of solving the resulting problem: less sought-after categories not being visible on autosuggest.

We’ll start with an implementation that chooses store decorations at random and the unexpected learnings it gave us. Next, we’ll look at the rewards that are relevant to autosuggest and the observations from our first reward-based decoration selection algorithm, which largely solves for the bias but falls short on the constraints that the problem poses. We’ll then see how treating rewards as distributions gave further improvement but hurt our metrics for quite some time initially. Introducing priors shortened this initial adjustment period and also revealed interesting patterns in how priors affect overall convergence. We’ll close with the learnings from each step of our journey and the future work.

Outline

Problem Background

Autosuggest in search
Role of decoration as a two-way communication channel with the user
What is the bias that we are trying to solve for and why is it there in the first place (with illustrations)
Problem Definition: Goals & Constraints
Issues with the existing reward (continuing the same illustration)
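
The popularity loop described above can be illustrated with a small simulation. This is a minimal sketch, not Flipkart's ranking logic: the store names, the click-through rates, and the rule "decorate with the stores that have the most historical clicks" are all hypothetical assumptions, chosen only to show how a store that is never shown can never earn the clicks it needs to be shown.

```python
import random


def simulate_popularity_loop(true_ctr, rounds=5000, slots=2, seed=0):
    """Decorate with the `slots` stores that have the most clicks so far.

    Returns the number of impressions each store received.
    """
    rng = random.Random(seed)
    clicks = {store: 0 for store in true_ctr}
    impressions = {store: 0 for store in true_ctr}
    for _ in range(rounds):
        # Rank purely on historical clicks; ties keep dictionary order.
        shown = sorted(clicks, key=clicks.get, reverse=True)[:slots]
        for store in shown:
            impressions[store] += 1
            if rng.random() < true_ctr[store]:
                clicks[store] += 1
    return impressions


# Hypothetical click-through rates: "books" is the best store but starts unseen.
true_ctr = {"mobiles": 0.10, "fashion": 0.08, "books": 0.12, "furniture": 0.09}
impressions = simulate_popularity_loop(true_ctr)
print(impressions)
```

In this toy setup "books" and "furniture" receive no impressions at all, even though "books" has the highest assumed click-through rate, because ranking on past clicks only ever rewards stores that were already visible.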

Journey of solving for it

Explore-exploit as a solution
First step towards solving: Random Selection

User Experience View
Observations
Merits of starting with random exploration
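
Random selection, the first step listed above, can be sketched in a few lines. This is an illustrative assumption about what "randomly chooses store decorations" could look like, not the talk's implementation; the store names and the helper `pick_random_decoration` are hypothetical.

```python
import random
from collections import Counter


def pick_random_decoration(candidate_stores, rng):
    """Pick one store decoration uniformly at random, ignoring popularity."""
    return rng.choice(candidate_stores)


rng = random.Random(0)
stores = ["mobiles", "fashion", "books", "furniture"]  # hypothetical stores
shown = Counter(pick_random_decoration(stores, rng) for _ in range(4000))
print(shown)
```

Because every store gets roughly the same exposure, the clicks logged under such a policy are not skewed by earlier popularity, which is one general reason uniform exploration is a common starting point.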

Moving towards performance-reward-based exploration

Choice of reward and its pros and cons
Our way of implementing a performance-based exploration algorithm
Convergence Illustrations
Movement in overall store visibility landscape
There was still scope for improvement, so what next?
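
As one way to picture reward-based exploration, here is a minimal epsilon-greedy sketch. The outline does not say which reward or algorithm was used, so everything here is an assumption: the reward is taken to be click-through rate, and `EpsilonGreedySelector`, `epsilon`, the store names, and the simulated click rates are hypothetical.

```python
import random


class EpsilonGreedySelector:
    """Mostly show the store with the best observed reward; explore otherwise."""

    def __init__(self, stores, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.impressions = {store: 0 for store in stores}
        self.clicks = {store: 0 for store in stores}

    def mean_reward(self, store):
        # Observed click-through rate; 0.0 until the store has been shown.
        shown = self.impressions[store]
        return self.clicks[store] / shown if shown else 0.0

    def select(self):
        stores = list(self.impressions)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(stores)  # explore: uniform random store
        return max(stores, key=self.mean_reward)  # exploit: best observed CTR

    def update(self, store, clicked):
        self.impressions[store] += 1
        self.clicks[store] += int(clicked)


# Simulated environment with hypothetical click-through rates.
true_ctr = {"mobiles": 0.10, "fashion": 0.08, "books": 0.12, "furniture": 0.09}
selector = EpsilonGreedySelector(true_ctr, epsilon=0.1)
env = random.Random(1)
for _ in range(5000):
    store = selector.select()
    selector.update(store, env.random() < true_ctr[store])
print(selector.impressions)
```

A fixed `epsilon` keeps spending the same share of traffic on random stores no matter how much has already been learned, which is one generic limitation of this family of approaches.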

Need to account for regret along with reward

Visualising store decorations as Beta distributions and sampling from them for decoration selection
Convergence improvement
Movement in overall store visibility landscape
Observations: Slower convergence
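
Sampling from per-store Beta distributions is commonly known as Thompson sampling, and the sketch below shows that general technique under stated assumptions. It is not presented as the talk's implementation: the click/no-click reward, the `BetaThompsonSelector` class, the store names, and the uniform Beta(1, 1) priors are all hypothetical choices.

```python
import random


class BetaThompsonSelector:
    """Keep a Beta(alpha, beta) belief per store and select by sampling."""

    def __init__(self, priors, seed=0):
        # priors maps store -> (alpha, beta) pseudo-counts of clicks / no-clicks.
        self.rng = random.Random(seed)
        self.params = {s: [alpha, beta] for s, (alpha, beta) in priors.items()}

    def select(self):
        # Draw one plausible click rate per store and show the highest draw.
        samples = {
            s: self.rng.betavariate(alpha, beta)
            for s, (alpha, beta) in self.params.items()
        }
        return max(samples, key=samples.get)

    def update(self, store, clicked):
        # Beta-Bernoulli update: a click adds 1 to alpha, a miss adds 1 to beta.
        self.params[store][0 if clicked else 1] += 1

    def posterior_mean(self, store):
        alpha, beta = self.params[store]
        return alpha / (alpha + beta)


# Simulated environment with hypothetical click-through rates.
true_ctr = {"mobiles": 0.10, "fashion": 0.08, "books": 0.12, "furniture": 0.09}
priors = {store: (1, 1) for store in true_ctr}  # uniform prior (assumption)
selector = BetaThompsonSelector(priors)
env = random.Random(1)
for _ in range(5000):
    store = selector.select()
    selector.update(store, env.random() < true_ctr[store])
print({s: round(selector.posterior_mean(s), 3) for s in true_ctr})
```

Stores with few impressions have wide distributions and still win some draws, so exploration shrinks on its own as evidence accumulates. Passing larger pseudo-counts in `priors`, for example derived from historical data, is one way to start from an informed belief instead of a uniform one.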