Machine Learning, Deep Learning and Artificial Intelligence: concepts, applications and tools.
Anthill Inside Miniconf – Pune
Introduction to HasGeek, Anthill Inside

Analytics without paralysis!
Analytics gets adopted when decision makers are positively influenced by data. But is the corporate world looking at data this way? Are analytics teams telling stories that grab decision makers' attention? Can they afford not to? Analytics asks you to solve a "business" problem, not a technical one, and the only way to increase analytics adoption is to tell stories. When presenting ideas to decision makers, realize that it is your responsibility to sell, not their responsibility to buy. Stories are the best way to influence! Data has to climb out of a dashboard and tell a story.

And now AI and Machine Learning are changing the way we can tell stories. Today technology can work in tandem with human creativity to provide data-driven, factual and interactive context to a story.

In this talk I will look at examples of how data insights can lead to embedding analytics into the fabric of a company.

(Not so) Straight (!) fun with Linear Regression
We'll test the well-known concept of Linear Regression using a live experiment!
We may chance upon 'feature engineering' and 'multiple linear regression' as we pass by.
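
As a rough illustration of the ideas named in this abstract (and not the speaker's live experiment), here is a minimal sketch of least-squares line fitting in plain Python. The data set and the helper names `fit_line` and `r_squared` are made up for this example. It also shows one simple form of feature engineering: regressing on a transformed feature (x squared) instead of the raw one.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (slope, intercept) minimising the sum of squared residuals."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def r_squared(xs, ys, slope, intercept):
    """Fraction of the variance in ys explained by the fitted line."""
    y_bar = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data generated from y = 3 * x**2 + 2 (an assumption for this sketch).
xs = [0, 1, 2, 3, 4, 5]
ys = [3 * x ** 2 + 2 for x in xs]

# A straight line in the raw feature x fits only approximately.
raw_slope, raw_intercept = fit_line(xs, ys)
raw_r2 = r_squared(xs, ys, raw_slope, raw_intercept)

# Feature engineering: regress on x**2 instead of x.
squared = [x ** 2 for x in xs]
eng_slope, eng_intercept = fit_line(squared, ys)
eng_r2 = r_squared(squared, ys, eng_slope, eng_intercept)

print(f"raw feature:        R^2 = {raw_r2:.3f}")
print(f"engineered feature: R^2 = {eng_r2:.3f}")
```

The model stays linear in its coefficients in both fits; only the input feature changes. Using x and x squared together as inputs would make it a multiple linear regression.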

Speaker bio: I am a programmer with an odd love for maths. I enjoy simplifying heavy math protein into more absorbable amino acids, only to be assimilated into plump biceps of confidence, to be flexed when the situation demands.
I want to infect people with the addictive epiphanies from solving math problems.

And btw, I have been working on Data Science projects for the last 6+ years and as a programmer for the last 13+ years.

Break

Bayesian methods in data analysis, an introduction
1. Start with the basics of Bayesian methods and a few historical anecdotes about the multiple interpretations of probability.
2. Cover practical examples and problem statements which are best analysed with Bayesian methods.
3. Show some live coding examples using open source government datasets from fields like econometrics or agriculture or healthcare.
4. Scratch the surface of algorithmic implementations: how the famous Markov chain Monte Carlo (MCMC) methods work.
5. Quick review of libraries/tools (pymc).
6. If you are excited by the idea, how can you study further?
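
To give a flavour of the MCMC item in the outline, here is a minimal sketch of a random-walk Metropolis sampler, written with the Python standard library instead of pymc. The coin-flip data (7 heads, 3 tails), the uniform prior, the step size and the function names are all assumptions chosen for illustration; they are not taken from the talk.

```python
import math
import random


def log_posterior(theta, heads, tails):
    """Unnormalised log posterior of a coin's bias under a uniform prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero probability outside (0, 1)
    return heads * math.log(theta) + tails * math.log(1.0 - theta)


def metropolis(heads, tails, n_samples=20000, step=0.1, seed=0):
    """Random-walk Metropolis sampler for the coin bias."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, heads, tails)
    samples = []
    for _ in range(n_samples):
        # Propose a nearby value with a symmetric Gaussian step.
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, heads, tails)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, candidate - current))
        if rng.random() < accept_prob:
            theta, current = proposal, candidate
        samples.append(theta)
    return samples


samples = metropolis(heads=7, tails=3)
kept = samples[2000:]  # discard the early samples as burn-in
posterior_mean = sum(kept) / len(kept)
print(f"estimated posterior mean of the bias: {posterior_mean:.3f}")
```

For this toy problem the posterior is known in closed form (a Beta(8, 4) distribution with mean 2/3), so the estimate can be checked against it. Sampling becomes useful when no such closed form exists.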

Speaker bio: I work as head of data science at onlines, an advertising technology startup based out of Pune. I have 7+ years of experience in data science and started in the field before it was a buzzword :-P. I have built multiple products, handled consulting assignments and delivered solutions using machine learning, R and Python. I hold a Master's degree in Operations Research from the Indian Institute of Technology, Mumbai.

Bayesian methods have been my area of interest for a long time. Over the years, I have formed a few opinions about their usefulness and tried my best to understand the underlying theory, which I would like to share through this talk.

Machine Learning in Molecular Biology
1. Brush up on high-school biology.
2. Introduction to some of the new biotechnologies that produce data.
3. Mixture models and why feature selection is important in an unsupervised learning setting, with an example.
4. An example of a biological problem that can be formulated as supervised learning.
5. Some pictures of genetically modified creatures from our collaborators (that show machine learning works!).

Speaker bio: I am part of a group of scientists at the National Chemical Laboratory, Pune, who use mathematics and computation to understand diverse aspects of Biology. I am a computer scientist by training and work primarily on designing probabilistic models as well as algorithms to learn them, all with the hope of solving fundamental problems in genomics.

Applications of ML in Ad Tech and Lifecycle of an ML project
i. Intro
1. Bio
2. ML
3. PubMatic
4. IIMB
ii. Lifecycle of Machine Learning project
1. Understanding Problem Statement
2. Research - Understanding Industry, Domain and Field of study
3. Collecting Data
4. Understanding and preparing data
5. Feature Selection and Imputation
6. Data Sampling
7. Hypothesis testing and Descriptive
8. Model Building
9. Tuning and Validation
10. Presenting the Model
11. Deployment and Verification
iii. Conclusion and key takeaways

Speaker bio: A Machine Learning/AI and Distributed Systems engineer who enjoys solving complex problems and designing applications and systems to work at scale. I have worked on engineering various complex projects, which include building a predictive ML project for online advertising, deriving interesting insights on the IPL (Indian Premier League), building connectors to offload data to Hadoop and even modifying Hadoop HDFS source code to make the Namenode more scalable. I have a B.Tech in Computer Science from VIT, Pune and a specialization in "Big Data Analytics" from IIM Bangalore.

Lunch

How similar are two pieces of text? A moderately broad and deep dive into one of the fundamental topics in NLP.
1. Text Similarity
a. Definition and scope
2. Application Areas
a. Information retrieval
b. Paraphrase detection
c. Natural language inference
d. Plagiarism detection
3. Types of Similarity
4. Techniques
a. Supervised
i. Classical techniques
ii. Deep neural network based techniques
b. Unsupervised
i. Lexical
ii. Semantic
5. Automatic Short Answer Grading
a. Context and motivation
b. Word-similarity based techniques
i. Wisdom of students
c. Siamese LSTM-based supervised ASAG technique
6. Conclusion
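
As a small illustration of the unsupervised, lexical end of the techniques listed above, here is a minimal sketch of two common word-overlap measures: Jaccard similarity on token sets and cosine similarity on term-count vectors. The example sentences, the simple regex tokenizer and the function names are assumptions made for this sketch; they are not the methods presented in the talk.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard(a, b):
    """Shared vocabulary size divided by joint vocabulary size."""
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    if not set_a and not set_b:
        return 1.0  # two empty texts are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def cosine(a, b):
    """Cosine of the angle between the two term-count vectors."""
    counts_a, counts_b = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical reference answer, student answer and unrelated sentence.
reference = "The heart pumps blood through the body"
answer = "Blood is pumped through the body by the heart"
unrelated = "Plants make food using sunlight"

print(f"paraphrase: jaccard={jaccard(reference, answer):.2f}, "
      f"cosine={cosine(reference, answer):.2f}")
print(f"unrelated:  jaccard={jaccard(reference, unrelated):.2f}, "
      f"cosine={cosine(reference, unrelated):.2f}")
```

Both measures treat "pumps" and "pumped" as different words, which is one reason purely lexical scores undercount paraphrases and why semantic and supervised techniques are used as well.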

Speaker bio: Shourya Roy is the Head and Vice President of American Express Big Data Labs (BDL), a role he took up in December 2016. In this role he is responsible for establishing and executing the technical agenda for BDL, working closely with the broader Decision Science community and business units. Shourya leads a team of scientists and engineers in the areas of machine learning, artificial intelligence, deep learning and cloud computing.

Prior to joining American Express, Shourya spent nearly fifteen years in the labs of IBM and Xerox, playing several leadership roles in technical research, research and strategic management, and customer-facing business development. Shourya has a proven track record of conceptualizing and initiating (by influencing business group leaders), designing and developing (by participating in and leading research teams) and transferring (with software development partners) innovation from research labs to real-life operations and business.
Shourya's technical expertise spans Text and Data Mining, Natural Language Processing, Machine Learning, and Big Data, in which he is a well-known thought leader in several communities. His work has led to more than 60 publications in premier journals and conferences. He has been granted about 15 patents, while tens of patent applications are currently in different stages of the patent lifecycle. He is an active member of the ACM and ACL communities, as part of which he has been associated with the organisation of multiple conferences and workshops.
Shourya holds Ph.D., Masters and Bachelors degrees in Computer Science from IISc Bangalore, IIT Bombay and Jadavpur University respectively. Shourya also has an MBA from the Faculty of Management Studies (FMS), Delhi University.
Beyond work, Shourya is passionate about meeting and knowing people as well as following and playing multiple sports.

Applied Machine Learning for realtime #FairPlay against Fraud [sponsored]
1. Challenges at Dream11, India's largest fantasy sports platform
2. Referral and promotional events, user registration and game play.
3. User data collection and preparing training data
4. Regression and Gradient Boosted Models
5. Scaling up for real-time decision making
6. Business impact and key takeaways

Speaker bio: Aditya Prasad Narisetty is a Sr. Data Scientist at Dream11, building data-driven products spanning fraud prevention, user and revenue estimation, marketing attribution, data pipelines and real-time ML intelligence. Earlier, he headed the Data Science team at Craftsvilla, building recommendation systems, the data platform, search, autosuggestion, real-time inventory profiling, and fashion recognition using CNNs.

He's an avid speaker in the Mumbai machine learning community, presenting at GDG Mumbai'17, AWS conf'16, DataNativesX, HYSEA IIT-H, Mumbai AI meetup and a couple of other meetups in Mumbai.

Inference in Deep Neural Networks
- Intro to DL networks
  - What typical Deep Learning architectures look like
  - A small section, using one CNN and one LSTM as examples, on the mathematical operations they perform
- Advancements in hardware
  - Intel Knight CPUs
  - Nervana
  - Volta GPUs
- How exactly the operations are done on garden-variety hardware
  - SIMD
  - SIMT
  - GEMM
- Different types of architectures
  - CPUs and GPUs
  - How these work, and their bottlenecks
- Role played by memory access in speed
  - How memory, not compute, is often the bottleneck
- Changes in algorithms made to utilise these functionalities
  - Example of Google's Inception V3 model
  - Two different types of RNNs
- Advice
  - How to make your model more efficient at inference
  - Some practical examples
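
Two items in this outline, GEMM and memory as the bottleneck, can be illustrated with a short sketch. The code below is an assumption-laden illustration and not the speaker's material. It shows a naive general matrix multiply in pure Python, how a dense layer over a batch reduces to one GEMM, and a back-of-the-envelope estimate of arithmetic intensity (floating-point operations per byte moved). The function names, the tiny matrices and the 1024-wide layer are hypothetical.

```python
def gemm(a, b):
    """Naive general matrix multiply: C[i][j] = sum over p of A[i][p] * B[p][j]."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for p in range(k):
            a_ip = a[i][p]
            # In a row-major array this loop order reads B and C along rows.
            for j in range(n):
                c[i][j] += a_ip * b[p][j]
    return c


def arithmetic_intensity(m, k, n, bytes_per_value=4):
    """FLOPs per byte for an (m x k) @ (k x n) product.

    Assumes each matrix crosses the memory bus exactly once, which is an
    idealised minimum for traffic; real kernels usually move more data.
    """
    flops = 2 * m * k * n  # one multiply and one add per term
    traffic = bytes_per_value * (m * k + k * n + m * n)
    return flops / traffic


# A dense layer applied to a batch is one GEMM: (batch x inputs) @ (inputs x outputs).
batch = [[1.0, 2.0], [3.0, 4.0]]
weights = [[1.0, 0.0, 2.0], [0.0, 1.0, 3.0]]
outputs = gemm(batch, weights)

# Hypothetical 1024 x 1024 layer with 32-bit values.
single = arithmetic_intensity(1, 1024, 1024)     # one input at a time
batched = arithmetic_intensity(256, 1024, 1024)  # 256 inputs per call
print(f"FLOPs per byte, batch of 1:   {single:.2f}")
print(f"FLOPs per byte, batch of 256: {batched:.2f}")
```

With a batch of one, the weights are read once and used for very little arithmetic, so the operation tends to be limited by memory bandwidth. Batching reuses the same weights across many inputs and raises the amount of compute per byte.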

Speaker bio: Saurabh has been working at MAD Street Den, Chennai as a Machine Learning Engineer for the past year and a half, specifically on Deep Learning based products. He loves to train Convolutional Neural Networks of all types and sizes for different applications. Apart from CNNs, he has a special interest in recurrent architectures and discovering their powers. When he is not working on DL based stuff, he loves to play around with micro-controllers.

Break

Doing Data Science on Cloud
Data science on Cloud:
Why is running DS on Cloud important?
Options for running ML on a cloud platform
- Using native compute and storage only.
- Hosted Data platform
- Machine Learning Services
- Cognitive API Services.
Demo: Using Cognitive Services of Google Cloud Platform (GCP Vision API).
Options for running scalable DS models on Cloud (advantages, disadvantages, pricing):
- AWS
- Azure Machine Learning
- Google Cloud ML
Other providers: IBM Bluemix vs Domino Data Lab vs Datajoy
Demo: Running DS models using TensorFlow on Google Cloud ML (using GPUs).

Speaker bio: Swapnil currently contributes to the Schlumberger Data Science team, applying analytics in the field of Oil and Natural Gas. Prior to this he was part of the Snapdeal Realtime Analytics team as Lead Engineer.
Swapnil has worked as a Cloudera Trainer in the past. He believes in learning and sharing his learning across the community. He is a frequent speaker at meetups and an active presenter at conferences.
With more than 8 years of experience, Swapnil has contributed in the domains of BFSI, Ad Serving and eCommerce, with Hadoop, Spark and GCP as his primary tech stack.
Past conferences & Meetups:
- https://expe
- me-processing-and-watermarks-using-google-pub-su
- http://www.bigdatainno
- Dr Dobbs conference, Bangalore, April 11-12, 2014

Ekansh Verma currently works with the Schlumberger Data Science team as a Data Scientist. He holds a Bachelor's degree in Biomedical Engineering from IIT Chennai. He has a good understanding of Deep Learning concepts. His primary expertise lies in image classification.

Build intelligent, real-time applications using Machine Learning
* Discuss the current-state-of-affairs for deploying Machine Learning models
* Discuss shortcomings of this approach
* Discuss the value of streaming data
* Brief introduction to Apache Kafka and Streaming applications
* Discuss how to use Apache Kafka to use ML models in real-time
* Demonstrate how we use a Demography Prediction model in real-time

Speaker bio: Jayesh leads the Personalisation team at Hotstar. He has been building streaming applications using Apache Kafka for the last 4 years. At Hotstar, the personalisation team builds Machine Learning models for its 150 million users and delivers them in real time. He can be reached on Twitter at @jayeshsidhwani