**Introduction to recommendation systems with Python**

Submitted by
**Harshad Saykhedkar (@harshss)**
on Thursday, 22 June 2017

Technical level: Intermediate

### Abstract

Recommendation systems and algorithms have applications in many domains, with retail/e-commerce and content recommendation being the obvious ones. Research, software development and general interest in recommendation systems have exploded in the last few years, especially after the rise of e-commerce and Netflix’s competition on movie recommendations.

The landscape of recommendation algorithms can be a little difficult for a beginner to navigate. Despite hundreds of papers and dozens of libraries, the whole field actually stands on a handful of mathematical ideas and 4-5 landmark papers/algorithms. In this workshop, we will understand these fundamental ideas using simple code snippets.

### Why should you attend?

- If you are curious about how recommendation algorithms work.
- If you want to use recommendation algorithms in your work, but are not sure where to start.
- If you have a use case and are trying to find out whether a recommendation algorithm is a solution.
- If you know the overall idea of recommendation algorithms, but think the maths is not understandable.

### What will you learn?

The workshop aims to answer the following questions:

- What are the fundamental ideas behind most of the recommendation algorithms?
- Why are the maths behind the algorithms, as well as the engineering solutions, simple and intuitive?
- How do these ideas look when implemented in simple Python code?
- If I want to build one at my work, where should I start? What should I study further?
- Where does the maths end and where do the engineering challenges begin? How can I solve some of the engineering challenges?
- Do I need a distributed computing solution for my X-sized data? Why, or why not?
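A rough way to start on the last question is a back-of-envelope memory estimate. The sketch below compares a dense user-item matrix with a compressed sparse row (CSR) layout. The function names, the dataset sizes, and the 8-byte value / 4-byte index widths are assumptions chosen for illustration, not figures from the workshop.

```python
# Back-of-envelope memory estimate for a user-item rating matrix.
# Assumptions (hypothetical): 8-byte float values, 4-byte integer indices.

def dense_bytes(n_users, n_items, value_bytes=8):
    """Memory for a dense matrix: one value per (user, item) cell."""
    return n_users * n_items * value_bytes

def csr_bytes(n_users, n_ratings, value_bytes=8, index_bytes=4):
    """Memory for a CSR matrix: one value and one column index per
    stored rating, plus one row pointer per user (and one extra)."""
    return n_ratings * (value_bytes + index_bytes) + (n_users + 1) * index_bytes

# Hypothetical dataset: 1 million users, 100k items, 50 million ratings.
n_users, n_items, n_ratings = 1000000, 100000, 50000000

print("dense : %.1f GB" % (dense_bytes(n_users, n_items) / 1e9))
print("sparse: %.1f GB" % (csr_bytes(n_users, n_ratings) / 1e9))
```

Under these assumptions the dense form needs hundreds of gigabytes while the sparse form fits in the memory of an ordinary laptop, which is often the first thing to check before reaching for a cluster.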

### Outline

Roughly, we will proceed in the following order:

- Fundamental ideas: representation, direct and indirect similarity computation, lookups
- Representation, vector spaces and matrices.
- Similarity computation and how it relates to content-based and neighbourhood-based models.
- Evaluation of algorithm performance, trade-offs.
- Landmark algorithms in the field that changed the nature of the algorithms.
- Lookups, content-based and neighbourhood-based models, and their differences.
- Engineering aspects, challenges and their solutions. Big data vs. small data.
- Further study
- Open questions and answers
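To give a flavour of the first few outline items (representation as a matrix, similarity computation, and a neighbourhood-based lookup), here is a minimal sketch using NumPy. It is not the workshop's own code: the toy ratings, the helper names `cosine_similarity_matrix` and `recommend`, and the choice of item-item cosine similarity are all assumptions made for illustration.

```python
import numpy as np

# Representation: rows are users, columns are items, 0 means "not rated".
# All ratings here are made up for illustration.
ratings = np.array([
    [5, 4, 0, 0],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [1, 0, 4, 5],
], dtype=float)

def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between the rows of `vectors`."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero rows
    unit = vectors / norms
    return unit.dot(unit.T)

def recommend(user, ratings, top_n=2):
    """Rank the items `user` has not rated, using item-item similarity."""
    item_sim = cosine_similarity_matrix(ratings.T)  # each item as a vector over users
    scores = ratings[user].dot(item_sim)            # similarity-weighted sum of the user's ratings
    scores[ratings[user] > 0] = -np.inf             # exclude items already rated
    ranked = np.argsort(scores)[::-1]               # best score first
    return [int(i) for i in ranked[:top_n] if np.isfinite(scores[i])]

item_sim = cosine_similarity_matrix(ratings.T)
print(np.round(item_sim, 2))
print(recommend(0, ratings))
```

For user 0, who has rated only items 0 and 1, the sketch ranks item 2 ahead of item 3, because item 2 shares more raters with the items that user already liked. A real system would also need to handle rating scales, missing data and evaluation, which this sketch ignores.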

### Requirements

This is a workshop. To get full value out of a workshop, it is imperative that you try out the code as you learn. I am keeping the requirements to a minimum (the standard Python data stack). You will need the following. Please **do not** wait till the workshop to install things. It will not be possible to help with installation at the venue.

- Laptop with a **fully charged** battery.
- Python **installed** on the laptop.
- Install SciPy stack. Installation instructions are given here.
- Install scikit-learn. Installation instructions are given here.

Python 2 vs. 3 won’t matter as long as you have SciPy and scikit-learn installed. The operating system also doesn’t matter (note that I’ll be using Linux during the workshop). Optionally, you can have a Jupyter/IPython notebook installed for trying out code.
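One way to confirm the setup before the workshop is a short script like the sketch below. The helper name `check_environment` is hypothetical; it only tries to import each package and reports the version it finds, assuming the usual import names `numpy`, `scipy` and `sklearn`.

```python
import importlib
import sys

def check_environment(modules=("numpy", "scipy", "sklearn")):
    """Return {module name: version string, or None if the import fails}."""
    found = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
            # Not every module exposes __version__, so fall back to "unknown".
            found[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            found[name] = None
    return found

print("Python " + sys.version.split()[0])
for name, version in sorted(check_environment().items()):
    print("%s: %s" % (name, version if version else "NOT INSTALLED"))
```

If any line reports `NOT INSTALLED`, that package still needs to be installed before the session.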

### Speaker bio

I work as head of data science at Sokrati, an advertising technology startup based out of Pune. I have 7+ years of experience in data science and started in the field before it was a buzzword :-P. I have built multiple products, handled consulting assignments and delivered solutions using machine learning, R and Python. I hold a Master’s degree in Operations Research from the Indian Institute of Technology, Mumbai.

### Links

- I have conducted 3-4 workshops at HasGeek events in the past. You can find links on hasgeek.tv; here is one example: https://hasgeek.tv/fifthelephant/2014-workshops/959-real-world-machine-learning
- Here is a link to my sporadically updated LinkedIn profile: https://www.linkedin.com/in/harshadss/
- I also occasionally blog here: https://harshadss.github.io/post/