The Fifth Elephant is India’s most renowned data science conference. It is a space for discussing some of the most cutting-edge developments in machine learning, data science, and the technology that powers data collection and analysis.
Machine Learning, Distributed and Parallel Computing, and High-performance Computing continue to be the themes for this year’s edition of Fifth Elephant.
We are now accepting submissions for our next edition, which will take place in Bangalore on 28-29 July 2016.
We are looking for application-level and tool-centric talks and tutorials on the following topics:
- Deep Learning
- Text Mining
- Computer Vision
- Social Network Analysis
- Large-scale Machine Learning (ML)
- Internet of Things (IoT)
- Computational Biology
- ML in healthcare
- ML in education
- ML in energy and ecology
- ML in agriculture
- Analytics for emerging markets
- ML in e-governance
- ML in smart cities
- ML in defense
The deadline for submitting proposals is 30 April 2016.
This year’s edition spans two days of hands-on workshops and conference. We are inviting proposals for:
- Full-length 40-minute talks.
- Crisp 15-minute talks.
- Sponsored sessions of 15 minutes (limited slots available; subject to editorial scrutiny and approval).
- Hands-on workshop sessions of 3 or 6 hours.
Proposals will be filtered and shortlisted by an Editorial Panel. We urge you to add links to videos / slide decks when submitting proposals. This will help us understand your past speaking experience. Blurbs or blog posts covering the relevance of a particular problem statement and how it is tackled will help the Editorial Panel better judge your proposals.
We expect you to submit an outline of your proposed talk – either in the form of a mind map or a text document or draft slides within two weeks of submitting your proposal.
We will notify you about the status of your proposal within three weeks of submission.
Selected speakers must participate in one or two rounds of rehearsals before the conference. This is mandatory and helps you prepare well for the conference.
There is only one speaker per session. Entry is free for selected speakers. As our budget is limited, we will prefer speakers from locations closer home, but will do our best to cover for anyone exceptional. HasGeek will provide a grant to cover part of your travel and accommodation in Bangalore. Grants are limited and made available to speakers delivering full sessions (40 minutes or longer).
HasGeek believes in open source as the binding force of our community. If you are describing a codebase for developers to work with, we’d like it to be available under a permissive open source licence. If your software is commercially licensed or available under a combination of commercial and restrictive open source licences (such as the various forms of the GPL), please consider picking up a sponsorship. We recognise that there are valid reasons for commercial licensing, but ask that you support us in return for giving you an audience. Your session will be marked on the schedule as a sponsored session.
- Revised paper submission deadline: 17 June 2016
- Confirmed talks announcement (in batches): 13 June 2016
- Schedule announcement: 30 June 2016
- Conference dates: 28-29 July 2016
The Fifth Elephant will be held at the NIMHANS Convention Centre, Dairy Circle, Bangalore.
For more information about speaking proposals, tickets and sponsorships, contact email@example.com or call +91-7676332020.
Don’t just build a data lake, build a data powerhouse.
Companies are now trying to become data-oriented and to make decisions based on data.
The first step toward data-oriented decisions is collecting data. The data lake has become one of the recent buzzwords in the Big Data industry. Most companies first try to build a data lake that contains all their data. In practice, dumping data into a data lake usually means exporting all the data from various RDBMS databases [e.g. Orders, Inventory] and scraping all the logs into the lake. Once the relevant data is in the data lake, we write various processing applications to extract information from the source data. This approach has many problems, e.g. a huge upfront cost and missing information that is not currently tracked.
In this talk I will propose another approach to data-driven systems: instead of dumping all the data into a central location, we identify the events/interactions/facts in the company [e.g. an add-to-cart event, viewing a product] and store them for processing. I will explain how this approach is much more result-oriented and agile than the dumping approach.
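As a rough illustration of the event-centric idea described above, the following Python sketch records interactions as immutable facts and derives a metric directly from them. It is not the speaker's architecture: the `Event` and `EventLog` classes, the event names (`product_viewed`, `add_to_cart`) and the in-memory list standing in for a real event store are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass(frozen=True)
class Event:
    """An immutable fact: what happened, to whom, and when (hypothetical schema)."""
    name: str                      # e.g. "add_to_cart"; event names are assumptions
    user_id: str
    properties: Dict[str, Any] = field(default_factory=dict)
    occurred_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class EventLog:
    """Append-only in-memory store standing in for a real event pipeline."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def track(self, name: str, user_id: str, **properties: Any) -> Event:
        # Record the interaction at the moment it happens, instead of
        # reconstructing it later from database exports or logs.
        event = Event(name=name, user_id=user_id, properties=properties)
        self._events.append(event)
        return event

    def count_by_name(self) -> Counter:
        # How many times each kind of event occurred.
        return Counter(e.name for e in self._events)

    def conversion_rate(self, from_event: str, to_event: str) -> float:
        # Share of users who did from_event and also did to_event.
        started = {e.user_id for e in self._events if e.name == from_event}
        finished = {e.user_id for e in self._events if e.name == to_event}
        if not started:
            return 0.0
        return len(started & finished) / len(started)


log = EventLog()
log.track("product_viewed", "u1", product_id="p42")
log.track("product_viewed", "u2", product_id="p42")
log.track("add_to_cart", "u1", product_id="p42", quantity=1)

print(log.count_by_name())
print(log.conversion_rate("product_viewed", "add_to_cart"))
```

Because each question (here, a view-to-cart conversion rate) is answered from explicitly tracked events, a new use case only requires tracking the events it needs rather than a full export of every source system.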
- Data Lake, the traditional way:
- Current architectures for building a data lake.
- Problems associated with the approach.
- Real-life example.
- What are events/interactions/facts?
- Explaining terminology.
- Reasons to track them.
- Business events
- Developer events
- Monitoring events
- Use Case Driven Development:
- Proposed Architecture:
- Benefits of proposed Architecture.
- Business Stakeholder
Akash Mishra is currently working as a Data Engineer at Badoo Trading Limited, with more than 4 years of experience building large-scale big data applications for various clients of ThoughtWorks Technologies. He has production experience with big data technologies such as Spark, Hadoop, and Mesos. He is a passionate developer with a deep interest in distributed systems. He has co-organised the Big Data Meetup for Pune & NCR, has given various talks at meetups and Geek Nights, and has contributed to the Apache Spark project.