The Fifth Elephant 2017

On data engineering and application of ML in diverse domains

ankit kohli

@ankitko

ML For Personalization At Scale @ Nearbuy

Submitted Apr 12, 2017

In this talk I will explain how we use ML to give personalized recommendations to our customers.
I will also explain how we have set up our Big Data pipeline using Kafka, Spark and HBase,
the amount of data we process daily, how we handle anomalies, and what we have learned along the way.
I will also discuss the various ML algorithms that we use and how to run them in Spark,
including Collaborative Filtering (and its use cases) and how to use it in Spark.
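
As a rough illustration of the idea behind collaborative filtering, here is a minimal sketch in plain Python. It is not the Spark implementation covered in the talk: it uses a simple item-to-item cosine similarity on a tiny in-memory dataset, and the users, deal names, scores and function names are all hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy data: user -> {deal: interaction score}
ratings = {
    "u1": {"spa": 5.0, "pizza": 3.0},
    "u2": {"spa": 4.0, "pizza": 2.0, "movie": 5.0},
    "u3": {"pizza": 4.0, "movie": 4.0},
}


def item_vectors(ratings):
    """Invert user -> item scores into item -> user scores."""
    vectors = defaultdict(dict)
    for user, items in ratings.items():
        for item, score in items.items():
            vectors[item][user] = score
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(user, ratings, top_n=3):
    """Rank items the user has not interacted with by their
    similarity to the items the user has already scored."""
    vectors = item_vectors(ratings)
    seen = ratings[user]
    scores = {}
    for candidate in vectors:
        if candidate in seen:
            continue
        # Weight each similarity by how much the user liked the seen item.
        scores[candidate] = sum(
            cosine(vectors[candidate], vectors[item]) * score
            for item, score in seen.items()
        )
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend("u1", ratings))
```

Here user u1 has not interacted with "movie", so it is the only candidate. At scale the same idea is usually expressed as a matrix factorization job rather than pairwise similarities.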

Outline

Data Pipeline Discussion
Data Modeling - Avro / Parquet
How data from various sources (real-time and batch) is ingested using Kafka,
transformed using Spark and stored in HBase
How the data is then modeled and fed into ML pipelines using Spark
The various ML algorithms that we run to generate personalizations, and how they are used in Nearbuy’s world
Finally, ways to evaluate your ML algorithms
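
One common way to evaluate a recommender offline is precision and recall over the top k recommendations. The sketch below is only an example of that approach, not the evaluation method used at Nearbuy; the function names and sample data are hypothetical.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations the user actually engaged with."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def recall_at_k(recommended, relevant, k):
    """Fraction of the user's relevant items that appear in the top-k."""
    if not relevant:
        return 0.0
    hits = len(set(recommended[:k]) & set(relevant))
    return hits / len(relevant)


# Hypothetical example: a ranked list from the model and a held-out
# set of items the user really interacted with.
recommended = ["a", "b", "c", "d"]
relevant = {"a", "c", "x"}

print(precision_at_k(recommended, relevant, 3))
print(recall_at_k(recommended, relevant, 3))
```

Averaging these values over all users in a held-out period gives a single number for comparing algorithms.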

Various Data Sources -> Kafka -> Spark -> HBase
|
MLlib Algorithms - Collaborative Filtering

Common problems that come up in each step
A brief look at Kafka and HBase, and an in-depth look at Spark

Speaker bio

Currently, I am employed as a Software Engineer at Nearbuy. In the past I have worked at Practo and Make My Trip.
My current interest is in Big Data, and I am actively involved in building projects to improve the customer experience.
I am working on Machine Learning to develop personalization at Nearbuy.
I have 7 years of overall experience in technology.

Slides

https://www.slideshare.net/ankitkohli1/customer-personalization-nearbuy
