Personalized Recommendations for Computational Advertising
Building recommender systems for computational advertising at Walmart.com has been an extraordinary journey. Particularly fascinating is designing algorithms that cater to audiences who are at different stages of their purchase journey, or who might not have interacted with the site recently. This, coupled with the scalability challenges and the interplay of factors such as recency, seasonality, product pricing, and trending items, makes the problem interesting from both a data science and a core systems perspective. Clickthrough rates decline sharply across successive ad slots, so achieving high precision is imperative for driving the performance of the ad campaigns.
In this talk, we focus on how our recommendation systems have evolved over time and the key lessons we learnt along the way. We share our insights on the relative performance and suitability of collaborative filtering and graph-based recommenders across our retargeting and prospecting efforts. We describe the challenges we faced while ingesting item signals along with user-item affinity in our Spark Streaming pipelines, and how we optimized the pipelines to meet the latency constraints. We also elaborate on the inherent counterfactual nature of recommendations, which makes it pivotal to build robust offline evaluation systems and to design A/B experiments carefully. Finally, we summarize our observations from experiments on algorithms that incentivize ad coherence versus diversity, and emphasize the role of online evaluation.
Recommender Systems for Computational Advertising
(ii) eCommerce : Diverse audience profiles - Browsers, Cart Abandoners, Dormant users.
Power of Associations
(i) Viewed Also Viewed / Bought Also Bought
(ii) Spark Streaming pipeline architecture.
(iii) Bulk purchase skew.
(iv) Cold start problem.
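To make the "Viewed Also Viewed" idea concrete, here is a minimal single-machine sketch of association by co-occurrence counting. It is an illustration only, not the production Spark Streaming pipeline: the function names, the session data, and the choice to de-duplicate items within a session (one simple way to dampen repeat or bulk activity) are all assumptions made for this example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_view_counts(sessions):
    """Count how often each unordered item pair appears in the same session."""
    pair_counts = Counter()
    for items in sessions:
        # Assumption: de-duplicate within a session so that repeated views
        # of one item contribute a single co-occurrence per pair.
        for a, b in combinations(sorted(set(items)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def also_viewed(pair_counts, top_k=3):
    """Turn pair counts into a ranked 'viewed also viewed' list per item."""
    neighbours = defaultdict(list)
    for (a, b), n in pair_counts.items():
        neighbours[a].append((b, n))
        neighbours[b].append((a, n))
    return {
        # Sort by descending count, breaking ties alphabetically.
        item: [other for other, _ in sorted(cands, key=lambda c: (-c[1], c[0]))[:top_k]]
        for item, cands in neighbours.items()
    }


# Hypothetical sessions; the item ids are made up for illustration.
sessions = [
    ["tv", "hdmi_cable", "soundbar"],
    ["tv", "hdmi_cable"],
    ["tv", "soundbar", "tv"],
    ["hdmi_cable", "router"],
]
recs = also_viewed(co_view_counts(sessions))
print(recs["tv"])
```

An item with no co-occurrences never appears in `recs`, which is the cold start problem in miniature: such items need a fallback signal such as category affinity.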
Graph based recommendations
(i) Random walk model.
(ii) Promoting diversity via reinforced random walks.
(iii) Spark GraphX Pregel API: Flexibility, Challenges.
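As a rough illustration of the random walk model, the sketch below scores items by a random walk with restart (personalized PageRank) computed by power iteration on a small weighted item graph. It is plain Python rather than GraphX, it covers only the basic walk and not the reinforced variant used to promote diversity, and the graph, the restart probability, and the function name are assumptions chosen for the example.

```python
def random_walk_scores(graph, seed, restart=0.15, iterations=50):
    """Random walk with restart, computed by power iteration.

    graph: dict mapping node -> {neighbour: edge weight}.
    seed:  node the walker jumps back to with probability `restart`.
    """
    nodes = set(graph)
    for nbrs in graph.values():
        nodes.update(nbrs)

    scores = {n: 0.0 for n in nodes}
    scores[seed] = 1.0
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for node, mass in scores.items():
            # With probability `restart`, the walker returns to the seed.
            nxt[seed] += restart * mass
            nbrs = graph.get(node, {})
            total = sum(nbrs.values())
            if total == 0:
                # Dangling node: send the remaining mass back to the seed.
                nxt[seed] += (1 - restart) * mass
                continue
            # Otherwise it follows an edge chosen in proportion to its weight.
            for nbr, weight in nbrs.items():
                nxt[nbr] += (1 - restart) * mass * weight / total
        scores = nxt
    return scores


# Hypothetical item graph; edge weights stand in for co-view counts.
graph = {
    "tv": {"hdmi_cable": 2, "soundbar": 2},
    "hdmi_cable": {"tv": 2, "router": 1, "soundbar": 1},
    "soundbar": {"tv": 2, "hdmi_cable": 1},
    "router": {"hdmi_cable": 1},
}
scores = random_walk_scores(graph, seed="tv")
ranked = sorted((n for n in scores if n != "tv"), key=lambda n: -scores[n])
print(ranked)
```

Unlike direct co-occurrence, the walk assigns a non-zero score to items that are only indirectly connected to the seed (here, `router`), which is one reason graph-based methods can help when direct associations are sparse.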
Key Lessons : Data Science
(i) Impact of Item Attributes.
(ii) Role of context : Seasonality, Upcoming Trends.
(iii) CTR decline with ad slots : Importance of the first few!
(iv) Coherence vs. Diversity : Case Study.
(v) Fallbacks in case of few relevant items : Category affinity.
(vi) Fatigue : Refresh Important!
Key Lessons : Systems
(i) Model Complexity, Additional features => Latency!
(ii) Optimizations : Caching, Reduce shuffles.
(iii) Seamless A/B experiments : Dealing with multiple models.
Role of evaluation
(i) Metrics : MAP/NDCG/CTR
- Counterfactual - How to reduce bias?
- Experiment design.
- Optimal time for running experiments.
- Does the effect of a new change persist over time? Case Study.
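For reference, the offline ranking metrics named above can be sketched in a few lines. This is a minimal version under stated assumptions: relevance labels are given per ad slot in ranked order, NDCG normalizes against the ideal ordering of the same list, and average precision divides by the number of relevant items that appear in the list (other conventions exist). MAP is the mean of average precision over users.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k slots (log2 position discount)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal ordering of the same labels."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def average_precision(relevances):
    """Average precision for binary relevance labels of one ranked list."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevances, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / hits if hits else 0.0


# Hypothetical labels: 1 = relevant item shown in that slot, 0 = not relevant.
top_heavy = [1, 0, 0]
bottom_heavy = [0, 0, 1]
print(ndcg_at_k(top_heavy, 3), ndcg_at_k(bottom_heavy, 3))
```

Both metrics reward placing relevant items in the earliest slots, which matches the observation that CTR declines with slot position. They are computed on logged data, however, so they inherit the bias of whichever model produced the logs; that is why the counterfactual questions above and online CTR still matter.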
Recent experiments with Deep learning
Member of the data science team at @WalmartLabs. 4 years of experience tackling diverse large-scale machine learning problems in the computational advertising domain, with user-level recommendations, bidding, and budget optimization as key focus areas. Master's graduate from IISc Bangalore with a specialization in Data Mining and Pattern Recognition.