The Fifth Elephant 2015

A conference on data, machine learning, and distributed and parallel computing

Gagan Agrawal


Recommendation System beyond traditional Collaborative filtering

Submitted Jun 12, 2015

I will share my thoughts and experiences from building a more personalized and relevant recommendation system for e-commerce at Snapdeal, covering the mathematical, technological, machine learning, and other aspects involved.


Collaborative filtering works quite well for companies like Netflix, but at Snapdeal we cater to a catalog of 12M products spread across more than 100 categories, each of which comprises 20-30 subcategories. For us, collaborative filtering alone does not work well, both because the catalog is so wide and because we capture implicit rather than explicit feedback. We therefore developed a recommendation system that considers several factors beyond collaborative filtering.

In this session I will discuss these other factors (listed below) and the mathematical models we considered while building a custom recommendation system that generates more personalized and relevant recommendations.

  1. User category affinity (at a more granular level)
  2. Content-based product similarity
  3. Products that go well with products the user has already bought
  4. Predicting the repurchase of previously purchased products
  5. Suggesting trending products based on the user’s affinity
  6. Capturing the user’s implicit feedback on the recommendations we serve and using it to improve relevance
  7. Collaborative filtering (we also use this, but only as one weighted factor)
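
The proposal does not say how these factors are combined. One common approach, shown here only as a minimal sketch, is a linear blend in which each factor contributes a per-product score multiplied by a weight. The signal names, the weights, and the `blend_scores` and `top_n` helpers are all hypothetical and are not taken from Snapdeal's system.

```python
from typing import Dict, List

# Hypothetical weights for illustration; the talk does not publish real values.
WEIGHTS: Dict[str, float] = {
    "category_affinity": 0.25,
    "content_similarity": 0.20,
    "complementary": 0.15,
    "repurchase": 0.10,
    "trending": 0.10,
    "feedback": 0.10,
    "collaborative": 0.10,
}


def blend_scores(
    signals: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> Dict[str, float]:
    """Combine per-signal product scores into one weighted score per product.

    `signals` maps a signal name to {product_id: score}. Weights are
    renormalized over the signals actually present, so a missing signal
    does not shrink every score.
    """
    total_weight = sum(weights.get(name, 0.0) for name in signals)
    if total_weight == 0:
        return {}
    blended: Dict[str, float] = {}
    for name, scores in signals.items():
        w = weights.get(name, 0.0) / total_weight
        for product_id, score in scores.items():
            blended[product_id] = blended.get(product_id, 0.0) + w * score
    return blended


def top_n(blended: Dict[str, float], n: int) -> List[str]:
    """Return the n best product ids, breaking score ties by product id."""
    return sorted(blended, key=lambda p: (-blended[p], p))[:n]
```

With such a blend, giving collaborative filtering a smaller weight than the other factors is a one-line change, and weights could in principle be tuned from the implicit feedback mentioned in item 6.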

Finally, I will conclude the session with the technical challenges of building a scalable recommendation system on massive datasets and serving its recommendations in real time.

Speaker bio

Gagan Deep Juneja is a Lead Engineer at Snapdeal, where he leads several initiatives related to user personalization. He has close to 7 years of experience in the software industry and has worked on several projects using Java/J2EE and Hadoop as the primary technologies. For the past couple of years he has been working with big data technologies such as Hadoop, Spark, Cascading, Pig, Hive, and Blur. He has a strong interest in machine learning and in applying supervised and unsupervised algorithms to get value out of data. He is drawn to open source technologies and likes to delve into new frameworks. He is a committer and PPMC member of Apache Blur (incubating). He has spoken at various meetup groups in the past. He is an active blogger, and in his leisure time he enjoys exploring new technologies and keeping up with the latest trends.


