The Fifth Elephant 2014

A conference on big data and analytics

In 2014, infrastructure components such as Hadoop, Berkeley Data Stack and other commercial tools have stabilized and are thriving. The challenges have moved higher up the stack from data collection and storage to data analysis and its presentation to users. The focus for this year’s conference is on analytics – the infrastructure that powers analytics and how analytics is done.

Talks will cover various forms of analytics including real-time and opportunity analytics, and technologies and models used for analyzing data.

Proposals will be reviewed using the following criteria:
Domain diversity – proposals will be selected from different domains – medical, insurance, banking, online transactions, retail. If there is more than one proposal from a domain, the one which meets the editorial criteria will be chosen.
Novelty – what has been done beyond the obvious.
Insights – what insights does the proposal share with the audience that they did not know earlier.
Practical versus theoretical – we are looking for applied knowledge. If the proposal covers material that can be looked up online, it will not be considered.
Conceptual versus tools-centric – tell us why, not how. Tell the audience what philosophy underlay your use of an application, not how the application was used.
Presentation skills – proposer’s presentation skills will be reviewed carefully and assistance provided to ensure that the material is communicated in the most precise and effective manner to the audience.



For queries about proposals / submissions, write to


  1. Data Collection and Transport – e.g., Opendatatoolkit, Scribe, Kafka, RabbitMQ, etc.

  2. Data Storage, Caching and Management – distributed storage (such as Gluster, HDFS), hardware-specific storage (such as SSD or memory), databases (PostgreSQL, MySQL, Infobright) or caching/storage (Memcache, Cassandra, Redis, etc.).

  3. Data Processing, Querying and Analysis – Oozie, Azkaban, scikit-learn, Mahout, Impala, Hive, Tez, etc.

  4. Real-time analytics

  5. Opportunity analytics

  6. Big data and security

  7. Big data and internet of things

  8. Data Usage and BI (Business Intelligence) in different sectors.

Please note: the technology stacks mentioned above indicate the latest technologies that will be of interest to the community. Talks should not be on the technologies per se, but on how these have been used and implemented in various sectors, enterprises and contexts.

Hosted by

The Fifth Elephant - known as one of the best data science and Machine Learning conferences in Asia - has transitioned into a year-round forum for conversations about data and ML engineering; data science in production; data security and privacy practices.

Amit Kapoor


Crafting Visual Stories with Data

Submitted Mar 29, 2014

Data visualisation has enabled us to compress data and express it visually in many interesting new ways. It is often said that we are trying to tell stories through these visualisations. But the science of data-visual-stories is still nascent and developing. On the other hand, the art of storytelling through spoken and written words, pictures, comics and movies is very well developed and understood. Let’s explore ‘why’ stories work and how we can combine the science and the art to craft visual stories with data.


“I think people have begun to forget how powerful human stories are, exchanging their sense of empathy for a fetishistic fascination with data, networks, patterns, and total information... Really, the data is just part of the story. The human stuff is the main stuff, and the data should enrich it.” - Jonathan Harris

Stories have been recognized for their power of communication and persuasion for centuries. There is an increasing realisation that we need to operate at the intersection of data, visuals and stories to fully harness the power of big data.

In this session, I will showcase the basic building blocks of storytelling across different mediums - oral storytelling, journalistic written stories, graphic comics and movies. I will then explore the idea that we can integrate narrative storytelling lessons from these mediums with our data visualisation to start crafting visual stories with data.

I will summarize basic design principles that can help us in our crafting journey, as we take the data through the layers of abstraction - See the Data | Show the Visual | Tell the Story | Engage the Audience. The focus will be on sharing ‘why’ stories work, and I aim to unpack the six dimensions of creating a data-visual-story.

  1. Abstraction (data patterns)
  2. Representation (visual encoding)
  3. Framing & Transition (perspective, focus)
  4. Messaging (verbal, text annotation)
  5. Flow (arrangement)
  6. Interactivity

I will be using exemplars from my work and other real-world data-stories to explore this topic.


No technical skills needed. Just be open and willing to practice the art of listening, observing and learning.

Speaker bio

I am interested in learning and teaching the craft of telling visual stories with data. I use storytelling and data visualization as tools for improving communication, persuasion and leadership. I am a partner at narrativeviz Consulting, where I conduct workshops and trainings for corporates, non-profits, colleges, and individuals. I also teach sessions on storytelling with data as an invited expert / guest faculty in data visualization and analytics related courses at IIM Bangalore and IIM Ahmedabad. My background is in strategy consulting, using data-driven stories to drive change across organizations and businesses. I have more than 12 years of consulting experience, first with AT Kearney in India and then with Booz & Company in Europe. I did my B.Tech from IIT, Delhi and PGDM from IIM, Ahmedabad. You can find more about me at and tweet me at @amitkaps



