The Fifth Elephant 2014

A conference on big data and analytics

In 2014, infrastructure components such as Hadoop, Berkeley Data Stack and other commercial tools have stabilized and are thriving. The challenges have moved higher up the stack, from data collection and storage to data analysis and its presentation to users. The focus for this year’s conference is on analytics – the infrastructure that powers analytics and how analytics is done.

Talks will cover various forms of analytics including real-time and opportunity analytics, and technologies and models used for analyzing data.

Proposals will be reviewed using 6 criteria:
Domain diversity – proposals will be selected from different domains – medical, insurance, banking, online transactions, retail. If there is more than one proposal from a domain, the one which best meets the editorial criteria will be chosen.
Novelty – what has been done beyond the obvious.
Insights – what insights does the proposal share with the audience that they did not know earlier.
Practical versus theoretical – we are looking for applied knowledge. If the proposal covers material that can be looked up online, it will not be considered.
Conceptual versus tools-centric – tell us why, not how. Tell the audience the philosophy underlying your use of an application, not how the application was used.
Presentation skills – proposer’s presentation skills will be reviewed carefully and assistance provided to ensure that the material is communicated in the most precise and effective manner to the audience.



For queries about proposals / submissions, write to


  1. Data Collection and Transport – e.g., Opendatatoolkit, Scribe, Kafka, RabbitMQ, etc.

  2. Data Storage, Caching and Management – distributed storage (such as Gluster, HDFS), hardware-specific storage (such as SSD or memory), databases (PostgreSQL, MySQL, Infobright) or caching/storage (Memcached, Cassandra, Redis, etc).

  3. Data Processing, Querying and Analysis – Oozie, Azkaban, scikit-learn, Mahout, Impala, Hive, Tez, etc.

  4. Real-time analytics

  5. Opportunity analytics

  6. Big data and security

  7. Big data and internet of things

  8. Data Usage and BI (Business Intelligence) in different sectors.

Please note: the technology stacks mentioned above indicate the latest technologies that will be of interest to the community. Talks should not be on the technologies per se, but on how these have been used and implemented in various sectors, enterprises and contexts.

Hosted by

The Fifth Elephant - known as one of the best data science and Machine Learning conferences in Asia - has transitioned into a year-round forum for conversations about data and ML engineering; data science in production; data security and privacy practices.

Ekta Grover


Experimentation to Productization : developing a Dynamic Bidding system for a location aware Mobile landscape

Submitted May 12, 2014

This session aims to help structure a Hypothesis-based approach to Engineering problems, and to show how to quickly translate and implement algorithms on weblog (mobile footprint) data.

This session is about 2 main things –

  1. An introduction to Real Time Bidders (RTB) and Dynamic Bidding in location-based mobile marketing
  2. Three specific problems that we addressed to increase the bottom line for our clients, and how we scaled them.


I will start with the obvious, i.e. how advertisers reach you on mobile and fetch your comprehensive digital footprint, all in 6 milliseconds or less. We will then look through a sample digital footprint (weblogs), laying the ground for understanding the data and the algorithms used to derive statistical relationships.

Next, I will talk about identifying quick wins that deliver outcomes through data, and in doing so introduce Hypothesis-based Engineering, i.e. how not to go down a bottomless pit.

I will then spend the majority of the time talking about 3 problems we addressed at AdNear to increase the bottom line for our clients -

  1. An algorithm for a Dynamic Bidding system that prices each opportunity to bid based on the quality of that “specific” inventory. Here, I will focus on how we built meta-data for otherwise not-so-useful attributes like “user-agent” and data for creatives, besides the obvious “features”.

  2. Characterizing user-mobility patterns to generate user profiles, i.e. given a cross-section of users, how we map the activities associated with their geographical footprints and generate a probabilistic picture of each user’s activity patterns and affinity towards general activities.

  3. Developing a comprehensive app-ranking system: how we use the web to increase the information content of the apps and deliver business outcomes that matter. The system updates the snapshot across multiple dimensions for each of the unique appids in the system every hour, to deliver a self-aligning machine learning system at scale.
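
As a rough illustration of the first problem above (pricing each bid opportunity by the quality of its inventory), the sketch below scales a base bid by the inventory's click-through rate relative to a global rate. This is not AdNear's system, which the proposal does not describe in detail. The function names, the smoothing weight and the bid cap are all hypothetical, chosen only to make the idea concrete.

```python
def smoothed_ctr(clicks, impressions, prior_ctr, prior_weight=100.0):
    """Shrink the observed CTR toward a prior so that sparse inventory
    is neither over- nor under-priced. prior_weight is a hypothetical constant."""
    return (clicks + prior_ctr * prior_weight) / (impressions + prior_weight)


def dynamic_bid(base_bid, clicks, impressions, global_ctr, max_bid):
    """Scale base_bid by inventory quality, defined here (as an assumption)
    as smoothed CTR divided by the global CTR, and cap the result at max_bid."""
    quality = smoothed_ctr(clicks, impressions, prior_ctr=global_ctr) / global_ctr
    return min(base_bid * quality, max_bid)


if __name__ == "__main__":
    # Hypothetical inventory: one never seen before, one with a strong history.
    print(dynamic_bid(1.0, clicks=0, impressions=0, global_ctr=0.01, max_bid=5.0))
    print(dynamic_bid(1.0, clicks=30, impressions=900, global_ctr=0.01, max_bid=5.0))
```

With this choice, inventory that has no history falls back to the base bid, and the cap keeps a handful of lucky clicks from producing runaway prices.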

Finally, I will close with the framework we built to measure all this in real time (A/B testing framework, Simulation & Reporting), which supported the Experimentation phase and created the stickiness that pushed the productization of data into our Production systems, at scale.
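
One common way to read the result of an A/B test like the ones mentioned above is a two-proportion z-test on conversion rates. The sketch below is a generic textbook version using only the standard library. It is an assumption about how such a comparison could be made, not the framework described in this proposal, and the arm names and counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion
    rate between arm B and arm A, using a pooled standard error."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: control bidding rule (A) vs. experimental rule (B).
    z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the two arms is unlikely to be noise; the threshold to act on is a business decision rather than something the test supplies.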


General familiarity with Real Time Bidding (RTB), mobile targeting, and information retrieval & ranking systems is preferable, though the talk will cover all that is needed to get through.

Speaker bio

Ekta is a Data Scientist with AdNear Pte., where she designs Dynamic Bidding systems and an A/B testing framework for bidding in the location-based mobile targeting space, to increase the bottom line for clients across Asia-Pacific. She has a background in Quantitative Economics (MS) from Goethe University, Frankfurt and Computer Science (BS) from Bangalore, India, and enjoys monetizing and leveraging technology to solve abstract business problems. While at grad school she became passionately interested in rationality, framing problems and how human beings respond to ambiguous choices, something she weaves into technical work with scientific rigour.

Prior to AdNear, she was with [24]7 Inc., Innovation Labs, where she was responsible for end-to-end solutioning, statistical analysis and deployment of analytic models for e-commerce clients, and for designing intuitive customer experiences. Before that, she worked in roles across Quality Engineering (VMware Inc.) and Program Management (SAP Labs), and on Experimentation methods, Auctions & Macroeconomics while pursuing her Masters at Goethe University.

She presented a talk at PyCon 2013, Bangalore, and was selected as a speaker for PyCon APAC 2014, Taipei. She was also accepted to present the same talk at the Grace Hopper Celebration of Women in Computing 2014 in Phoenix (USA).


