Jul 2013
11 Thu 09:30 AM – 04:30 PM IST
12 Fri 10:15 AM – 05:30 PM IST
13 Sat 10:15 AM – 05:30 PM IST
In 2013, commodity hardware and computing capacity for storing and processing large and small volumes of data are easily available on demand. The bigger issues pertain to questions of how to scale data processing, handle data diversity, manage infrastructure costs, decide which technologies work best for different contexts and problems, and build products from the insights and intelligence that the data is presenting to you.
The Fifth Elephant 2013 is a three-day workshop and conference on big data, storage and analytics, with product demos and hacker corners.
The Fifth Elephant 2013 invites proposals on use cases and real-life examples. Tell us what specific problem you faced, which technology/tools worked for your use case and why, how you have developed business intelligence on the data you are collecting, and analytics tools and techniques you employ. Our preference is for showcasing original work with clear take-aways for the audience. Please emphasize these in your proposal.
The conference will have two parallel tracks on 12th and 13th July:
This year we are adding a preliminary day of workshops, on 11th July, to give attendees more in-depth, hands-on training on open source frameworks and tools (Pig, Hadoop, Hive, etc.), commercial solutions (sponsored), programming languages such as R, and visualization techniques and tricks, among others.
We have a demo track for startups and companies who want to showcase their product to customers at The Fifth Elephant 2013 and get feedback. Slots are also open for 4-6 sponsored sessions for companies who want to talk about their technologies and reach out to developers, CTOs, CIOs and product managers at The Fifth Elephant. For more information on demo and sponsored session proposals, write to info@hasgeek.com.
HasGeek believes in open source as the foundation of the internet. Our aim is to strengthen these foundations for future generations. If your talk describes a codebase for developers to work with, we require that it is available under a license that does not impose itself on subsequent work. This is typically a permissive open source license (almost anything that is listed at opensource.org/licenses and is not GPL or AGPL), but restrictive and commercial licenses are also considered depending on how they affect the developer’s relationship with the user.
If you’d like to showcase commercial work that makes money for you, please consider supporting the event with a sponsorship.
Voting is open to attendees who have purchased event tickets. If there is a proposal you find notable, please vote for it and leave a comment to initiate discussions. Your vote will be reflected immediately, but will be counted towards selections only if you purchase a ticket. Proposals will also be evaluated by a program committee consisting of:
Emphasis will be placed on original work and talks which present new insights to the audience.
The program committee will interview proposers who have received the most votes from attendees and the committee. Proposers must submit presentation drafts as part of the selection process to ensure the talk is in line with the original proposal and to help the program committee build a coherent line-up for the event.
There is only one speaker per session. Attendance is free for selected speakers. HasGeek will cover your travel to and accommodation in Bangalore from anywhere in the world. As our budget is limited, we will prefer speakers from locations closer to home, but will do our best to cover anyone exceptional. If you are able to raise support for your trip, we will count that towards an event sponsorship.
If your proposal is not accepted, you can buy a ticket at the same rate as was available on the day you proposed. We’ll send you a code.
Discounted tickets are available from http://fifthelephant.doattend.com/
The program committee will announce the first round of selected proposals by the end of April, a second round by the end of May, and will finalize the schedule by 20th June. The funnel will close on 5th June. The event is on 11th-13th July 2013.
Supreeth
Proposing Building a massively multiplayer online role-playing game (MMORPG) using Cloud
To introduce users to building MMORPGs using the cloud. To share experiences of doing the same.
Storage and Databases
Beginner
|
Shekhar Gulati
Building Location Aware Applications using MongoDB
The benefits for attendees: Attendees will learn how they can use MongoDB geospatial indexing capabilities to build location aware applications.
Workshops
Beginner
|
Arthi Venkataraman
Similar entity detection in large data
Understand similar entity recognition and its industrial applicability.
Analytics and Visualization
Intermediate
|
Deepak Shenoy
Money Talks: Analyzing Financial Market Data
Financial markets produce a ton of data, but how can we look at them in useful ways, as compared to “looks-great-what-do-I-do-now”. By useful I mean to traders, to fraud-detectors, to investors and even to company management. Learn about the techniques of market data analysis from someone who’s done all the wrong things, sometimes in spectacular fashion.
Analytics and Visualization
Intermediate
|
Mahesh Rangarajan
Big Data Analytics for improving Patient Care systems at hospitals
Share real world experience and learning from implementing large scale patient care systems at hospitals leveraging a Big Data platform.
Analytics and Visualization
Advanced
|
prakash babu
Implementing a Large Scale Surveillance System Using Big Data
To share the key learnings, architecture patterns and best practices in implementing a large scale real-time analytics system leveraging Big Data technologies.
Analytics and Visualization
Advanced
|
Kashyap Kompella
A 360 degree view of 3-D printing
(1) Provide a brief tour of the exciting world of 3D printing & (2) Discuss opportunities and applications in the Indian context
Analytics and Visualization
Beginner
|
Mayank Sharma
Transferring Gigabytes of Data to cloud at 10mbps on your 10mbps link
TCP/IP is known to be a pretty robust, reliable and fair mode of data transport. But what about the actual throughput when you are transferring GBs of data on a 10mbps link to a cloud which is maybe 15 hops away?
Storage and Databases
Intermediate
|
Pankaj Risbood
Extracting consumer trends in real time using 100 billion tweets.
Connected consumers express everything they feel about products, services and brands in social media. How consumers feel about and engage with products is very important for retailers to make better merchandizing decisions.
Analytics and Visualization
Intermediate
|
Karthik Kastury
Unlocking the Potential of Data for Everyday Developers and Product Managers
Developers love writing code, and shipping new features. Product Managers love observing customer behaviours and taking product decisions!
Analytics and Visualization
Intermediate
|
Prashanth Babu
Big Data, Real-time Processing and Storm
Participants will learn and understand the concepts and salient features of Storm.
Workshops
Beginner
|
Y
Uncovering the truth in sales through Visualization
How can visualization help a sales manager review his team’s performance?
Analytics and Visualization
Beginner
|
Shanker Balan
Build A Cloud With Apache CloudStack For Big Data
To build a Proof-Of-Concept IaaS cloud that can power big data workloads using the Apache CloudStack Cloud Management Platform and Apache Hadoop.
Workshops
Intermediate
|
Viral B. Shah
Julia: A fresh approach to technical computing and data science
Julia is a new high performance, open source, dynamic language for technical computing and especially relevant for the upcoming field of data science. I will describe the rationale and the vision behind Julia, key language features, and show some demos so that attendees can get a feel for the language. I will also discuss Julia’s open source development process and the community that keeps adding…
Analytics and Visualization
Beginner
|
Sameer Segal
Big Data at the Base of the Pyramid
I was actually tempted to give this talk a tongue-in-cheek title like “Big Data with No Data” because “data” at the Base of the Pyramid (BoP) is an ephemeral and elusive thing.
Analytics and Visualization
Intermediate
|
Sirish M Simha
Big Data Product Ideas - Building Interactive BI Analytics
As more and more companies embrace Big Data (read as Hadoop for now) technologies, there are opportunities for companies and geeks to address the ‘GAPS’ by designing and implementing products on the Hadoop platform.
Analytics and Visualization
Intermediate
|
Vishwanath Belur
Big Data Predictive Analysis in SAP HANA with SAP Predictive Analysis
Traditionally, analyzing big volumes of data using statistical techniques was either not possible or took a lot of time, and hence businesses were losing out on the advantage of storing huge amounts of historical data. With SAP HANA, it is possible to perform big data predictive analytics in real time so that businesses can make informed real time decisions. SAP Predictive Analysis simplifi…
Analytics and Visualization
Intermediate
|
Regunath Balasubramanian
Latency and Fault tolerance in OLTP @ 1.5 billion/day service calls
User-perceived availability and experience are important for any eCommerce site. Achieving this is not easy for distributed systems that run multiple platforms and access multiple resources and data sources. The data sources span MySQL, Key-Value stores and Columnar databases storing OLTP data to the order of tens of millions. This talk describes how Flipkart built its website to manage Latency and F…
Storage and Databases
Intermediate
|
Raghu Kashyap
Big Data is it a fad or future?
See how Hadoop has progressed over the last 4 years from an early adopter organization standpoint. Learn about the use cases that are operational and not just a POC.
Storage and Databases
Intermediate
|
Ritesh Nayak
Deciphering the organizational DNA - mining internal data
If you attend this talk, you will see brilliant visualizations on data inside an enterprise and learn some social network analysis/graph theoretic techniques.
Analytics and Visualization
Beginner
|
Shailesh Kumar
Co-occurrence Analytics: A versatile framework for finding interesting needles in crazy haystacks!
In this session we will learn about a new way of thinking about data mining and big data analytics, “Co-occurrence Analytics” - a unified framework for mining latent insights in a wide variety of data of the form: “relationships between entities”. We will show how the framework can be used to discover...
Analytics and Visualization
Advanced
|
Vishnu H Rao
MySQL Robbins - Various Flavors of Files & Buffers it Uses
Learn about the various Files & In-Memory Buffers MySQL creates and how it uses them.
Storage and Databases
Beginner
|
Vishnu H Rao
Reporting Using MySQL Multi-Source Replication
How can we do real time reporting using MySQL when data is spread across different MySQL instances?
Storage and Databases
Beginner
|
Srihari Srinivasan
It takes two to tango! - Is SQL-on-Hadoop the next big step?
To explore the trend of SQL-on-Hadoop. This talk will focus on some of the recent attempts (OSS and Commercial) to get SQL running on Hadoop.
Storage and Databases
Intermediate
|
Build a Queue Based Concurrent Task Processor (using Python)
Learn how to develop a Persistent Queue based Task Processor using simple tools like MongoDB and Python.
Workshops
Advanced
|
Chandramouli Mahadevan
Analyzing Terabytes of Data with Google BigQuery
An attendee would understand how they can use Google BigQuery to analyze very large data sets in a very simple fashion using interactive SQL analysis. We will also offer insight into the implementation of the Dremel engine that powers Google BigQuery.
Analytics and Visualization
Beginner
|
Sandeep Ravichandran
Building a high performance distributed crawler
This talk describes how we use NoSQL databases like MongoDB and Redis to store a huge amount of data and analyze it using tools like elasticsearch. It also aims to provide insight into leveraging different cloud services to build a high performance cluster for web crawling.
Storage and Databases
Intermediate
|
Koushik
Breaking Barriers - Showing the funny
To have people troll at me and with me when I talk technology and a lot of other things. If you’ve seen comic relief, The tonight show, or comedy central well, I have also seen these shows. Hence I assume you know what is going to come at you. Amen!
Analytics and Visualization
Intermediate
|
Aaron Morton
Apache Cassandra for Fun and Profit
Attendees will learn about data modelling and performance in Cassandra, which patterns to follow and which to avoid. They will also learn how to evaluate the performance of various models and plan for data growth.
Storage and Databases
Intermediate
|
Shailesh Kumar
MapReduce and the "Art of Thinking Parallel"
The goal of the session is to take the audience from the “MECHANICS of using MapReduce” (to do simple slicing and dicing of BigData) to the “ART of using MapReduce” to solve more complex problems that at first glance look “unnatural” for MapReduce!
Analytics and Visualization
Advanced
|
Neeta Pande
Big Data Analytics with R
An attendee would understand the High Performance and Parallel Computing landscape in R. This area in R is undergoing rapid change, and the objective of this session is to provide insight into various active contributions in this area. In the session, we would also delve deeper into analyzing moderately large data sets, which presents a huge opportunity today as a solution to the “everything in memory” challenge…
Analytics and Visualization
Intermediate
|
Karthik Shashidhar
7 Ways to call elections using data
Psephology has been turning more into a science than an art. This session explains how you can call elections using only data, even if you don’t have much domain knowledge.
Analytics and Visualization
Beginner
|
Ramana Reddy
Open Source Business Intelligence - Pentaho BI Suite
Participants will be taken through the basics of Business Intelligence tools and the various tools available in the market, with a focus on open source tools. Participants will be shown how to use Pentaho for data integration and reporting purposes.
Product Demos
Intermediate
|
Rajat Venkatesh
Workflow Schedulers: The Heart Beat of a Big Data Stack
With use cases of how Qubole customers use the Scheduler product, I’ll talk about:
Storage and Databases
Intermediate
|
t3rmin4t0r
HOWTO run a hadoop cluster on a laptop
Most tutorials involve being consumers of hadoop instead of being developers of the core technology. And even then, doing it without the backing of a company or someone else to foot the bill for your cluster hardware is a problem that’s missing in most FAQs.
Storage and Databases
Beginner
|
Pradeep Kumar G.S.
Low Latency Access of Bigdata using Spark and Shark.
This session aims at introducing the latest big data technology for low latency access and in-memory data storage using the Spark framework.
Storage and Databases
Beginner
|
Apoorva Gaurav
Cloud based low cost, low maintenance, scalable data platform
The session is aimed at companies and individuals who are contemplating plugging into the big data world but are avoiding it due to upfront technical and monetary investments. Some of the questions it tries to answer are:
Storage and Databases
Beginner
|
Bharath Mohan
What is Multi-Stream Retrieval?
Multi-stream retrieval is about humans querying, exploring and discovering from streams of information.
Analytics and Visualization
Intermediate
|
Andreas Kollegger
Neo4j Graphs: What, When, How
You will leave with an understanding of what a graph database is, what advantages it can offer, and when to use one. We’ll focus on Neo4j, quickly covering its capabilities then looking at some real world use cases from Fortune 500 companies.
Storage and Databases
Beginner
|
Pranav Modi
Uncovering patterns and forecasting with time series data
Understand time series analysis and its applications in industry and science. Uncover patterns in data - trends, seasonality, cyclical behavior.
Analytics and Visualization
Intermediate
|
Abhishek Kona
The database cannot be better than the underlying datastructure
Understand the common underlying datastructures in current storage engines, the trade-offs, and why these should drive the decision of which database to use for your next app.
Storage and Databases
Intermediate
|
Lakshman Prasad
Interactive analysis of data live, using Pandas, Matplotlib and IPython
The session is a live coding session to analyse various datasets using Pandas and plot them live, in an IPython notebook.
Analytics and Visualization
Beginner
|
Anand S
Visualising networks
This talk shows ways of visualising network data at a large and a small scale.
Analytics and Visualization
Intermediate
|
Swaroop Krothapalli
Can twitter kill Boeing 787 ?
Of late, leading brands have realized the potential opportunities Twitter could provide, much beyond the realm of advertising or making product announcements. Customers are increasingly turning towards this micro-blogging service to talk about various products, lodge complaints and discuss everyday events. As a result, the “buzz” that a brand has over twitter is not just a function of its recent/…
Analytics and Visualization
Beginner
|
Rajan Chandi
Why we went 100% NoSQL with Mongodb?
Highlighting key differences between application design using relational and NoSQL databases, and the ways in which NoSQL can save development time and get the product out faster while maintaining high performance and future scalability at launch.
Storage and Databases
Intermediate
|
Andreas Kollegger
Neo4j Graph Workshop
Establish the technical foundation for working with the Neo4j graph database, from install to fundamental data modeling to querying.
Workshops
Beginner
|
Russell Sullivan
Customizing One Database for Your Multiple Data Structures
Attendees will gain hands-on experience on how to address the challenge of managing multiple types of structured and unstructured data by customizing data-structures to accurately represent their data as it exists and is queried in its natural form, to attain an impedance match between data in the wild and its model. The data structures will then go through a series of customizations to optimize …
Workshops
Advanced
|
brian bulkowski
Evaluating SSD Performance for Databases Handling Real-Time Big Data
Attendees will learn how to evaluate the performance of flash-based SSDs for managing high-velocity big data using the open source ACT benchmark tool.
Storage and Databases
Intermediate
|
Harpreet Singh
Analysis of genomics data and linking to phenotype of country population to identify health markers
A key factor in determining an individual’s susceptibility to disease as well as response to treatment should include the recognition of both the extrinsic (environmental) and the intrinsic (physiological and genomic) factors. There is a definite need and scope for evolving novel ways to stratify healthy individuals and develop a better understanding of normal phenotypic variation. Human physiolo…
Storage and Databases
Advanced
|
Mahesh Kumar
Predictive Analytics in Social Media and Online Display Advertising
The last decade has seen unprecedented growth in the space of online advertising and digital media marketing. The new wave of social media (Facebook, Twitter, etc.) is making it easier than ever for marketers to reach the right customers at the right time with the right products and offers. However, the marketers, online advertising platforms, and other stakeholders need to be equipped with suita…
Analytics and Visualization
Intermediate
|
Satnam Singh
Smart Analytics in Smartphones
The objective of this talk is to present the pros and cons of performing data mining on Smartphones vs. Server. I intend to discuss state of the art use cases that can employ data mining and machine learning techniques on Smartphones. I will present a case study to demonstrate my thoughts.
Analytics and Visualization
Intermediate
|
Peter Milne
Big Data Enlightenment
To de-mystify the nomenclature of Big Data and NoSQL.
Storage and Databases
Beginner
|
Ramesh Kumar M
Demystifying Big Data from Domain Name industry
This session will provide a comprehensive overview of worldwide domain name registry and domain name resolution data, sources and tools. By attending, data enthusiasts will learn to look at domain name industry data from multiple perspectives, interpret key parameters, visualize/correlate, compare with their own datasets, and gain valuable insights and trends for their respective fields.
Analytics and Visualization
Intermediate
|
Enrico Berti
An introduction to Hue, the open source Hadoop UI
You will learn about Hue and its open source tools that allow real time exploration and analytics on Hadoop.
Analytics and Visualization
Beginner
|
Prabhu Prakash Ganesh
Building large scale Analytics Platform
As companies try to test the waters of big data, they are bombarded by a lot of hype and diverse opinions, so it is easy to be overwhelmed. In this session, I plan to share our experience in building a large scale analytics platform, the choices we made and why. The intention is to help people make decisions for themselves or their organizations.
Analytics and Visualization
Intermediate
|
Adethya Sudarsanan
Open Data Aero - An opportunity for the Airline Industry
Participants will get to know data’s perspective on the airline industry.
Storage and Databases
Intermediate
|
Erik Rose
What Happens When Firefox Crashes?
Mozilla tames a rabble of HBase, PostgreSQL, RabbitMQ, elasticsearch, and Python to chew through 50 Firefox crash reports each second. Explore the strengths and weaknesses of these tools and their consequent niches in the greater crash-catching system. Greet the complexities that emerge from the combination, and see how we engineer around them to keep our never-lost-a-crash record pristine.
Storage and Databases
Intermediate
|
Varsha Joshi
A Billion Snapshots- Principles and Processes in the Census of India
The session will explain how the Census of India 2011 was designed, canvassed, processed, and analysed, to obtain a detailed picture of a huge population of great diversity and complexity.
Analytics and Visualization
Beginner
|
Mahesh Tiyyagura
Find Near Duplicate records in your Data
Find customers with multiple mobile numbers in a dataset of 2 million customer records.
Workshops
Intermediate
|
Rohit Chatter
Analytics: Make non-additive metrics additive using HBase & Bitmaps
Demonstrate the use of Bitmaps on HBase, and how they can make non-additive metrics additive very efficiently.
Analytics and Visualization
Advanced
|
Rohit Chatter
Evaluate audience live use cases and Big Data Technology solutions
Have the audience describe their problems, and we propose a high level solution using Big Data technologies. This could be very useful as lots of people are looking for solutions to their problems and don’t know how to use these technologies.
Workshops
Intermediate
|
Rohit Chatter
Product Demo: Analyze & Visualize Big Data right off the grid
Demonstrate a home-grown BI tool that enables reporting and analytics directly off the grid.
Product Demos
Advanced
|
Mahesh Tiyyagura
Streaming live-data to LCD screens in office (using opensource tools and Rs. 4300)
We’ll cover ideas, software and hardware that can be used to create and display beautiful real-time visualizations of different metrics/events on LCD TVs in your office.
Analytics and Visualization
Beginner
|
Rajat Venkatesh
Analytics using Hadoop ecosystem on AWS
The workshop will go through the steps required to use the AWS ecosystem as an analytics backend. While we will discuss general design patterns, in many cases we will show examples using the Qubole platform.
Workshops
Intermediate
|
Dr.S.Jayaprakash,Ph.D
Insurance Fraud Modeling & Business Intelligence Framework
We have successfully created a product on an Insurance Fraud Modeling Framework backed by robust Business Intelligence Analytics. Some of the USPs of this model are (a) Quick to deploy, within a few weeks (b) Proven statistical models (c) Minimum of 10X ROI for insurance companies (d) Deployment made possible at a fraction of the budgets of IT departments. The principles behind the framework is scala…
Product Demos
Advanced
|
Dr.S.Jayaprakash,Ph.D
Grooming Geeks - Analytics & Application in Education
The role of education in the contemporary world is not mere knowledge but creating a ‘bent of mind’ for the students to adapt themselves to the competitive environment and grow. Not all the science students end up as scientists, similarly engineering students as engineers. A chemical engineer lands up as Java programmer and a science graduate works for a courier firm yet they still work exception…
Analytics and Visualization
Beginner
|
Kaushik Paranjape
Real time analytics on data that spans 100s of GBs
Analytics of stats data is a problem frequently faced by organizations that have online businesses. Doing real time analytics on this data is very important for making the right business decisions. Sokrati’s solution for this problem uses sharded columnar databases as a data warehouse. Using this technology, query results are returned in milliseconds on data sets that span 100s of GBs.
Storage and Databases
Intermediate
|
Anshum Gupta
SolrCloud and NoSQL
If Search in NoSQL, NoSQL in Search, or either of them separately interests you, this session could be for you. I’ll talk about how NoSQL datastores have evolved and are trying to provide search, while Solr is moving towards them as a potentially great ‘big-data’ store.
Storage and Databases
Intermediate
|
Ashwin Raghav Mohan Ganesh
Rnotify - A Scalable Application Level Distributed Filesystem Notifications Solution
Rnotify is a Distributed Filesystem Notification Solution. It helps application programmers use inotify-like abstractions to receive notifications from Distributed File Systems. This session will help programmers understand how one can build applications that require high throughput of notifications from File Systems.
Storage and Databases
Advanced
|
Tim Davies
Linked Data - visions & implementations
This talk will provide an introduction to the history, principles and practice of Linked Open Data.
Storage and Databases
Beginner
|
Tim Davies
Infrastructures and eco-systems for open data
This talk / workshop will provide an overview of developments in open data across the globe, and will outline some of the technical and organisational challenges to effectively publishing or making use of open data.
Analytics and Visualization
Beginner
|
rajdeep dua
Introduction to Pivotal HD - Hadoop distribution with a SQL compliant query engine
This session will provide an overview of Pivotal HD, a next generation Hadoop distribution which integrates an MPP database running on top of HDFS. This session will also provide an introduction to the next release of Spring Data Hadoop, which integrates with Hadoop 2.x.
Storage and Databases
Intermediate
|
Sushrut Bidwai
Data and Sales
Petabytes of data are available in the public domain and more is created every day.
Analytics and Visualization
Beginner
|
Pramod N Haritsa
Making Sense of content in domain intense QA/Discussion Forums- A Text Mining Problem
StackOverFlow: Imagine a world without such user collaboration and moderation in maintaining and getting information from discussion forums. In this talk we’ll see how one can make sense of content in QA/Discussion Forums using a palette of text processing techniques.
Analytics and Visualization
Beginner
|
Piyush
Need for “Lmetric” : the service for near real-time clickstream events and User behavior analysis
The session will cover the following:
• Different types of data sources: collection, trending and analysis.
• Capturing events: why structured data emitted from apps for machines is a better approach.
• Centralized collection of events in distributed environments
• Easy access to collected data in consumable form for reporting and analysis
Storage and Databases
Beginner
|
Anand
RHadoop: Marrying analytics & large scale data processing
Why Hadoop is not analytics. Why “Big Data” is not analytics.
Analytics and Visualization
Beginner
|
Shashi Shekhar Singh
Using ElasticSearch to build your Startup's Dashboard - Pros and Cons
At the end of the session, attendees should be able to understand the factors that they need to look at when evaluating whether to use ElasticSearch to build a dashboard for their organizations.
Storage and Databases
Beginner
|
Rahul Kulkarni
Build Products, Not Just Algorithms: 10 Examples from the Real World
Why are we able to build databases that can store petabytes, algorithms that seem to have an answer for everything and yet are not able to translate these into scalable, mass adopted big data products?
Analytics and Visualization
Intermediate
|
Tapomay Dey
What did we gain out of using Mongodb, Redis and Mysql in a single system
Let’s talk about a multi-faceted system with unique data storage and processing challenges. One of the core challenges in SEM is to set up and maintain a marketing campaign. A good way to do so is to know the inventory in and out and be smart about it. With this system we aim at addressing such challenges in a scalable manner.
Storage and Databases
Intermediate
|
Anand S
Advanced data analysis with Excel
Learn what you can do with Excel, really pushing it to the edge.
Workshops
Advanced
|
Harshad Saykhedkar
Finding order in the chaos : machine learning for web text analytics using R
Participants will gain an understanding of the following (through R):
Workshops
Beginner
|
Rahul Kulkarni
Telling Twins Apart: A Cookie's Life And Other Stories From The Ad World
Ad targeting has evolved by leaps and bounds over the past 5 years - right from Google Analytics based remarketing on Google to custom audiences, lookalikes and partner categories on Facebook to audience buys across display networks. Targeting your ads precisely to the right audience personas is nothing but a big data problem with hundreds of thousands of targeting variable combinations possible.…
Analytics and Visualization
Intermediate
|
Abinasha Karana
15 Billion value at risk computations in 187 milliseconds
The session will touch upon real-time analytic and search applications on the Hadoop platform. It covers areas such as HBase schema design, Hadoop configuration, cluster design and the real-time map-reduce pattern.
Storage and Databases
Intermediate
|
Abhishek Vaid
Implementing Named-Entity-Recognizer on Twitter Data and Using it to Cluster Similar Tweets.
To understand the scope of extracting Real-Entities from tweets using freely available NER-Engines and POS-Tagging.
Analytics and Visualization
Intermediate
|
Nikil Doshi
Tracking 2B parameters/month in real time - with just MySQL!
Whether a startup or a big organization, everyone faces the challenge of keeping track of data in real time. Data availability in real time has significant advantages, particularly in the search engine marketing and analytics space, where most decisions are made on-the-fly. However, real-time systems are often assumed to need the latest and greatest in technology.
Storage and Databases
Intermediate
|
Michael Gurstein Riding a kneeling elephant: Community Informatics Bridging Data Into Communities
To explore how to link technical work on software, open data, Big Data and the like to the needs of local communities/villages.
Workshops
Beginner
|
Ankur Nagar Can big data fight poverty and corruption?
Big data is proving to be transformative in the private sector, but can it also help solve international development problems?
Analytics and Visualization
Beginner
|
Srinivasan H Sengamedu The art and science of exploiting near-similar text and images
Big Data, by its inherent nature, will have near-similar items. Identifying the repetitions and, even better, leveraging them to get your job done is both an art and a science. The goal of this talk is to share some experiences with this and to get you excited about it.
Analytics and Visualization
Intermediate
|
Anurag Workshop: Learning ElasticSearch and using it to analyze Aadhaar's Public Datasets
You have a large data-set, commercial off-the-shelf hardware and a project deadline that is looming. How do you manipulate the data and extract useful information? And how steep is the learning curve?
Workshops
Beginner
|
Data Analysis and Visualization using R
To provide intermediate-to-advanced usage of R. To do a deep-dive into different R modules using public data sets.
Workshops
Intermediate
|
Prabhakar Srinivasan Audience Segmentation: Data-Science, Big-Data Architecture & Solution
The objective of this talk is to show how analytics techniques can be used to answer fundamental questions in the area of Audience Segmentation for the Broadcast domain. Various stakeholders like planners, channels, operators, media agencies and advertisers are very keen to know ‘who is watching TV’ so that the products, services, content and advertisements can be better customized and tailored to t…
Analytics and Visualization
Intermediate
|
Mrinal Wadhwa A hands-on introduction to Apache Hadoop
To help you get a fundamental understanding of: What is Hadoop?
Workshops
Beginner
|
Viraj Paripatyadar How to build a Recommender using Apache Mahout
This session covers the creation of a recommender using Apache Mahout for a consumer Web application. After attending this session, application developers will be able to recognize where their applications need recommenders and to start planning and implementing them for their specific use cases.
Analytics and Visualization
Beginner
|
Amit Kapoor Telling visual stories with data
Data visualisation has enabled us to compress data and express it visually in many interesting new ways. It is often said that we are trying to tell stories through visualisations. Is that really the case? How can we ensure that the audience / consumer is able to RETAIN - RECALL - RETELL our data-driven stories?
Analytics and Visualization
Beginner
|
Edouard Servan-Schreiber MongoDB: An Overview
Participants in this workshop will get a sense of how easy it is to get started with MongoDB and why it makes their development faster and easier. They will also understand how MongoDB allows them to scale.
Workshops
Beginner
|
Edouard Servan-Schreiber Agility and Innovation vs IT: how new data platforms can overcome this neverending struggle
To explain to participants how NoSQL technology and private cloud practices can help in achieving scalable growth and fast time to market.
Storage and Databases
Intermediate
|
Edouard Servan-Schreiber Strategic advantages of MongoDB
This session will present the strategic advantages of MongoDB as an operational data store.
Storage and Databases
Intermediate
|
Ajay Kelkar Telling stories with data
Most people see Analytics as a technical & specialist role. I see Analytics as an intersection between technology & business. I will tell stories from data that I have seen over the last decade. Stories about how data has made a difference to a business & how you can go about doing the same.
Intermediate
Discussion
|
Hosted by