Jul 2018
26 Thu 07:45 AM – 06:15 PM IST
27 Fri 07:45 AM – 05:35 PM IST
##About the conference and topics for submitting talks:
The Fifth Elephant is rated as India’s best data conference. It is a conference for practitioners, by practitioners. In 2018, The Fifth Elephant will complete its seventh edition.
The Fifth Elephant is an evolving community of stakeholders invested in data in India. Our goal is to strengthen and grow this community by presenting talks, panels and Off The Record (OTR) sessions that present real insights about:
##Target audience:
You should attend and speak at The Fifth Elephant if your work involves:
##Perks for submitting proposals:
Submitting a proposal, especially with our process, is hard work. We appreciate your effort.
We offer one conference ticket at a discounted price to each proposer, and a t-shirt.
We only accept one speaker per talk. This is non-negotiable. Workshops may have more than one instructor.
For proposals that list more than one person as a collaborator, we offer the discounted ticket and t-shirt only to the person with whom the editorial team corresponded directly during the evaluation process.
##Format:
The Fifth Elephant is a two-day conference with two tracks on each day. Track details will be announced with a draft schedule in February 2018.
We are accepting sessions with the following formats:
##Selection criteria:
The first filter for a proposal is whether the technology or solution you are referring to is open source or not. The following criteria apply for closed source talks:
The criteria for selecting proposals, in the order of importance, are:
No one submits the perfect proposal in the first instance. We therefore encourage you to:
Our editorial team helps potential speakers hone their speaking skills, fine-tune and rehearse their content at least twice before the main conference, and sharpen the focus of their talks.
##How to submit a proposal (and increase your chances of getting selected):
The following guidelines will help you in submitting a proposal:
To summarize, we do not accept talks that gloss over details or try to deliver high-level knowledge without covering depth. Talks have to be backed with real insights and experiences for the content to be useful to participants.
##Passes and honorarium for speakers:
We pay an honorarium of Rs. 3,000 to each speaker and workshop instructor at the end of their talk/workshop. Confirmed speakers and instructors also get a pass to the conference and networking dinner. We do not provide free passes for speakers’ colleagues and spouses.
##Travel grants for outstation speakers:
Travel grants are available for international and domestic speakers. We evaluate each case on its merits, giving preference to women, people of non-binary gender, and Africans. If you require a grant, request it when you submit your proposal in the field where you add your location. The Fifth Elephant is funded through ticket purchases and sponsorships; travel grant budgets vary.
##Last date for submitting proposals is: 31 March 2018.
You must submit the following details along with your proposal, or within 10 days of submission:
##Contact details:
For more information about the conference, sponsorships, or anything else, contact support@hasgeek.com or call 7676332020.
Hosted by
Rajdeep Dua
@rajdeepd
Submitted Mar 30, 2018
The talk will help developers and data scientists understand how to build ML Pipelines using PredictionIO.
In this talk we will cover how Apache PredictionIO (an open source Machine Learning Server built on top of a state-of-the-art open source stack) helps reduce the time from writing a proof of concept for an ML model to running a production-ready model-serving microservice with a persisted model. We will also showcase how Apache PredictionIO helps mix and match multiple models to produce hybrid predictions from multiple algorithms.
Define Machine Learning
Relationship between Data Mining and Other Fields and tools
A Classic Recommender Example: What is Missing?
A Classic Recommender Example: Beyond Prototyping
How do you deploy a scalable service that responds to dynamic prediction queries?
How do you persist the predictive model in a distributed environment?
How do you make HBase, Spark and the algorithms talk to each other?
How should you prepare, or transform, the data for model training?
How do you update the model with new data without downtime?
Where should you add business logic?
How do you make the code configurable, reusable and maintainable?
How do you build all of this with separation of concerns (SoC)?
Classic Recommender Example: Apache PredictionIO
PredictionIO is a machine learning server for building and deploying predictive engines in production in a fraction of the time. It is built on Apache Spark, MLlib and HBase.
Event Server: Collecting Data
Example Event
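As a minimal sketch of what such an event might look like, the snippet below builds a "rate" event in the JSON shape used by the PredictionIO Event Server (fields `event`, `entityType`, `entityId`, `targetEntityType`, `targetEntityId`, `properties`, `eventTime`). The user ID, item ID, rating and timestamp are hypothetical values chosen for illustration, and `make_rate_event` is a helper name introduced here, not part of PredictionIO. In a real setup the payload would be sent with an HTTP POST to the Event Server's `/events.json` endpoint together with the app's access key.

```python
import json
from datetime import datetime, timezone


def make_rate_event(user_id, item_id, rating, when=None):
    """Build a 'rate' event dict in the Event Server's JSON shape (illustrative helper)."""
    when = when or datetime.now(timezone.utc)
    return {
        "event": "rate",                 # what happened
        "entityType": "user",            # who did it
        "entityId": user_id,
        "targetEntityType": "item",      # what it was done to
        "targetEntityId": item_id,
        "properties": {"rating": rating},
        "eventTime": when.isoformat(timespec="milliseconds"),
    }


# Hypothetical sample values; no network call is made here.
event = make_rate_event(
    "u1", "i42", 4.0, datetime(2018, 7, 26, 10, 30, tzinfo=timezone.utc)
)
payload = json.dumps(event)
print(payload)
```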
Engine
Functions of an Engine
A. Train predictive model(s)
B. Respond to dynamic query
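To make function B concrete, here is a hedged sketch of a query a client might prepare for a deployed engine. It assumes the engine is deployed locally on port 8000 and accepts queries with `user` and `num` fields, as the PredictionIO recommendation template does; other engine templates define their own query fields. The request is only constructed, not sent, and `build_query` is a hypothetical helper.

```python
import json
from urllib.request import Request

# Assumption: engine deployed locally on its default port.
ENGINE_URL = "http://localhost:8000/queries.json"


def build_query(user_id, num):
    """Prepare (but do not send) a POST request asking for `num` items for a user."""
    body = json.dumps({"user": user_id, "num": num}).encode("utf-8")
    return Request(
        ENGINE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_query("u1", 4)
print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```

Passing the prepared request to `urllib.request.urlopen` would send it to a running engine.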
Deploying on Heroku/AWS/GCE
The Event Server and the PIO Engine run as two applications
Both are connected to the same PostgreSQL backend
The Event Server has a single dyno: Web
The PIO Engine has two dynos: Web, Train
Collaborative Filtering and ALS
Collaborative Filtering:
Collaborative Filtering (CF) is a subset of algorithms that exploit other users and items, along with their ratings (selection and purchase information could also be used) and the target user's history, to recommend an item that the target user has not rated.
The assumption behind this approach is that other users' preferences over the items can be used to recommend an item to a user who has not yet seen or purchased it.
Matrix Factorization
Both users and items are mapped to a joint latent factor space of dimensionality ‘f’, where user-item interaction is modeled as an inner product in this space.
Item i is associated with vector q
(where q measures the extent to which the item possesses the latent factors).
User u is associated with vector p
(where p measures the extent of interest the user has in items that score high on the corresponding factors).
The dot product between q and p captures the interaction between user u and item i, i.e. the user's interest in the item.
The key to the model is finding the vectors q and p.
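A small sketch of the prediction step described above: once q and p are known, the estimated interest of user u in item i is their dot product. The factor values below are made up for illustration (f = 3), and `predict_rating` is a hypothetical helper name.

```python
def predict_rating(q_item, p_user):
    """Estimate user-item interaction as the inner product of the two factor vectors."""
    if len(q_item) != len(p_user):
        raise ValueError("item and user vectors must share the same dimensionality f")
    return sum(q * p for q, p in zip(q_item, p_user))


# Hypothetical latent factors with f = 3.
q_i = [0.9, 0.1, 0.4]   # how strongly item i expresses each factor
p_u = [1.0, 0.5, 2.0]   # how much user u cares about each factor

score = predict_rating(q_i, p_u)
print(round(score, 2))
```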
Matrix Factorization: Alternating Least Squares Method
ALS works by iteratively solving a series of least squares regression problems. In each iteration, one of the user- or item-factor matrices is treated as fixed, while the other one is updated using the fixed factor and the rating data.
User Factors : p
Item Factors : q
The factor matrix that was solved for is, in turn, treated as fixed, while the other one is updated. This process continues until the model has converged (or for a fixed number of iterations).
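The alternating procedure can be sketched in plain Python as follows. This is a toy illustration under stated assumptions, not the Spark MLlib implementation used by PredictionIO: it uses a simple L2 penalty `reg`, a fixed number of iterations, a small Gauss-Jordan solver, and a hypothetical 3x3 rating set. All function names are introduced here for illustration.

```python
import random


def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[k][n] / m[k][k] for k in range(n)]


def update_factors(fixed, observed_by_row, f, reg):
    """Regularized least-squares update of one factor matrix while the other is fixed."""
    updated = []
    for observed in observed_by_row:
        a = [[reg if r == c else 0.0 for c in range(f)] for r in range(f)]
        b = [0.0] * f
        for j, rating in observed:
            v = fixed[j]
            for r in range(f):
                b[r] += rating * v[r]
                for c in range(f):
                    a[r][c] += v[r] * v[c]
        updated.append(solve(a, b))
    return updated


def als(ratings, n_users, n_items, f=2, reg=0.1, iterations=20, seed=0):
    """Alternate between solving for user factors p and item factors q."""
    rng = random.Random(seed)
    p = [[rng.random() for _ in range(f)] for _ in range(n_users)]
    q = [[rng.random() for _ in range(f)] for _ in range(n_items)]
    by_user = [[] for _ in range(n_users)]
    by_item = [[] for _ in range(n_items)]
    for u, i, r in ratings:
        by_user[u].append((i, r))
        by_item[i].append((u, r))
    for _ in range(iterations):
        p = update_factors(q, by_user, f, reg)  # item factors fixed, solve for users
        q = update_factors(p, by_item, f, reg)  # user factors fixed, solve for items
    return p, q


def regularized_loss(ratings, p, q, reg):
    """Squared error on observed ratings plus the L2 penalty on all factors."""
    error = sum((r - sum(x * y for x, y in zip(p[u], q[i]))) ** 2 for u, i, r in ratings)
    return error + reg * sum(x * x for vec in p + q for x in vec)


# Hypothetical (user, item, rating) triples.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
p1, q1 = als(ratings, 3, 3, iterations=1)
p20, q20 = als(ratings, 3, 3, iterations=20)
loss_1 = regularized_loss(ratings, p1, q1, 0.1)
loss_20 = regularized_loss(ratings, p20, q20, 0.1)
print(loss_1, loss_20)
```

Because each half-step exactly minimizes the regularized loss over one factor matrix, the loss cannot increase from one iteration to the next, which is why running more iterations from the same starting point never makes the fit worse on the training ratings.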
Demo ALS
Summary
Building an ML pipeline is about selecting the algorithm, then training and tuning the model. Taking it to production is key to realizing the true power of ML and AI prediction.
Internet connection, projector, microphone
Rajdeep Dua has over 18 years of experience in the Cloud and Big
Data space. Currently, he leads the Developer Relations team at Salesforce
India. He also works with the Engineering teams at Salesforce building scalable
AI services, which
use Hadoop and Spark to expose big data processing tools for
developers. He has worked in the advocacy team for Google’s Big Data
tools, BigQuery. He worked on the Greenplum big data platform at
VMware in the developer evangelist team. He worked closely with a team
on porting Spark to run on VMware’s public and private cloud as a
feature set. He has taught Spark and Big Data at some of the most
prestigious tech schools in India.
He has also presented BigQuery and Google App Engine at the W3C conference
in Hyderabad (http://wwwconference.org/proceedings/www2011/schedule/www2011_Program.pdf).
He led Developer Relations teams at Google,
VMware, and Microsoft. He has spoken at hundreds of other conferences
on the cloud.
His contributions to the open source community are related to Docker,
Kubernetes, Android, OpenStack, and Cloud Foundry. He has teaching
experience in big data at IIIT Hyderabad, ISB, IIIT Delhi, and College
of Engineering Pune.
His LinkedIn profile can be found at https://www.linkedin.com/in/rajdeepd.
Twitter : @rajdeepdua
https://drive.google.com/file/d/1nCeFzyOsMggMIg7kbNHaKup_w2XDKCtO/view?usp=sharing