The Fifth Elephant 2020 edition

On data governance, engineering for data privacy and data science

Krishan Goyal

@krishan1390

Context Aware Autocomplete at Scale at Flipkart

Submitted May 21, 2020

Autocomplete provides relevant suggestions to users within a few keystrokes, reducing their typing effort.

It also helps users formulate a query that matches their intent and guides them toward the correct domain terminology, which is essential in e-commerce because it leads to a better search experience.

One of the primary challenges of autocomplete is ranking suggestions for short prefixes, which is difficult when you don’t know what the user is looking for.

E.g., if a user types “red”, they could be looking for “red chief shoes”, “redmi phones”, “red jackets”, “redmi earphones under 500”, etc.

With limited real estate (users mostly see only the top 3-4 suggestions), it becomes extremely important to rank suggestions accurately to improve the experience.

We can infer the user’s intent from recent queries in the session and use it to show more relevant suggestions. Users typically reformulate queries with similar intent, either because they are not satisfied with the previous search results or because they want to keep exploring.
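To make the idea concrete, here is a minimal sketch of session-aware re-ranking, not the system described in the talk. It assumes each suggestion carries a product-taxonomy category and a prior popularity score, and that the categories of the user's recent queries are already known. The `Suggestion` type, the `rank` function, the category labels, the scores and the `context_weight` value are all hypothetical; only the query strings come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    text: str
    category: str      # hypothetical product-taxonomy label
    popularity: float  # hypothetical prior score, made up for illustration


# Toy candidate set: query strings from the example above, scores invented.
CANDIDATES = [
    Suggestion("redmi phones", "mobiles", 0.9),
    Suggestion("red chief shoes", "footwear", 0.6),
    Suggestion("red jackets", "clothing", 0.5),
    Suggestion("redmi earphones under 500", "audio", 0.4),
]


def rank(prefix, session_categories, candidates=CANDIDATES,
         context_weight=0.5, top_k=3):
    """Return the top_k suggestions matching prefix, boosted by session context."""
    matches = [s for s in candidates if s.text.startswith(prefix)]

    def score(s):
        # Add a fixed boost when the suggestion's category was seen
        # in the user's recent queries (an assumed, simplistic signal).
        boost = context_weight if s.category in session_categories else 0.0
        return s.popularity + boost

    return [s.text for s in sorted(matches, key=score, reverse=True)[:top_k]]


no_context = rank("red", set())
after_shoe_search = rank("red", {"footwear"})
```

With no session context the ordering follows popularity alone; once the session contains a footwear query, “red chief shoes” moves to the top for the same prefix.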

We will go into the details of how to derive the user context, and into the architectural challenges and solutions for ranking suggestions at low latency and high scale (ranking more than 5 million documents at 10K QPS for 100 million+ dynamic user profiles and contexts, with a latency requirement of under 20 ms).

We will also explain our training architecture, how smarter feature engineering lets us use a linear model, and what we learned along the way.
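As a rough illustration of a linear model over engineered features, the sketch below scores a candidate suggestion against the previous query in the session. It is an assumption-laden toy, not the speakers' model: the feature names (`token_jaccard`, `same_category`), the `features` and `score` helpers, and the weights are all hypothetical, and real weights would be learned from click data rather than set by hand.

```python
import math


def features(prev_query, candidate, prev_category, cand_category):
    """Build hand-engineered features relating a candidate to the previous query."""
    prev_tokens = set(prev_query.split())
    cand_tokens = set(candidate.split())
    union = prev_tokens | cand_tokens
    return {
        "bias": 1.0,
        # Token overlap between the previous query and the candidate.
        "token_jaccard": len(prev_tokens & cand_tokens) / len(union) if union else 0.0,
        # Whether both map to the same (hypothetical) taxonomy category.
        "same_category": 1.0 if prev_category == cand_category else 0.0,
    }


# Hypothetical weights, chosen by hand for illustration only.
WEIGHTS = {"bias": -1.0, "token_jaccard": 2.0, "same_category": 1.5}


def score(feats, weights=WEIGHTS):
    """Logistic score: sigmoid of the dot product of weights and features."""
    z = sum(weights.get(name, 0.0) * value for name, value in feats.items())
    return 1.0 / (1.0 + math.exp(-z))


related = score(features("red shoes", "red chief shoes", "footwear", "footwear"))
unrelated = score(features("red shoes", "redmi phones", "footwear", "mobiles"))
```

Because the score is a single dot product followed by a sigmoid, it is cheap to evaluate per candidate, which is one reason a linear model can be attractive under a tight latency budget.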

Outline

  1. Problem Background
    a) User Query Reformulation Patterns
    b) Semantic Understanding and Product Taxonomy
    c) Scaling Challenges
  2. Solution Space
    a) Derivation of the ranking function to predict the next query based on previous queries
    b) Data Sourcing
    c) Predicting at Scale
  3. Training Architecture
    a) Model and architecture selection
    b) Using prefix-level granular data for autocomplete
  4. Feature Engineering
    a) Expressing requirements for features that can learn reformulation patterns and identify the scope for personalisation
    b) Feature representation
  5. Evaluation
    a) Metrics Improvement
    b) Feature comparison
    c) Model strategies comparison
    d) Sampling Bias
    e) Examples
  6. Future Work

Requirements

Basic understanding of probability and ML

Speaker bio

Krishan is a software engineer with the Search team at Flipkart, working on improving Autocomplete and scaling the platform.

Previously, he worked at several startups, including Moonfrog Labs, where he strengthened the multiplayer consistency and availability guarantees of the system and improved various user-facing latencies to support 4X growth in concurrent user traffic.

At Flipkart, he scaled up the autocomplete stack to serve 10X more documents at 5X user traffic during sales, and is now working on improving the ranking of autocomplete. He is interested in applying machine learning to such problems and in further scaling the serving and pipeline processing systems.

Abhinav is a Data Scientist with the Search team at Flipkart, working on implementing ML models for various aspects of Autocomplete. Prior to this, he completed his master's at IISc and worked as a software developer at Amazon.

Slides

https://docs.google.com/presentation/d/1e3-Tvb1TIkOWhh50RgQ_NoEk8u8nedtrGOLMub76k7g/edit?usp=sharing

Hosted by

Jump starting better data engineering and AI futures