The Fifth Elephant 2013

An Event on Big Data and Cloud Computing

Harpreet Singh


Analysis of genomics data and linking to phenotype of country population to identify health markers

Submitted May 10, 2013

Determining an individual’s susceptibility to disease, as well as their response to treatment, should take into account both extrinsic (environmental) and intrinsic (physiological and genomic) factors. There is a definite need and scope for evolving novel ways to stratify healthy individuals and develop a better understanding of normal phenotypic variation. That human physiology is too complex to be pinned down to a few loci has been highlighted by the recent spate of genome-wide association studies. An integrative approach of stratifying and clustering physiological states on the basis of molecular functioning therefore seems pertinent. I will be presenting a method of grouping a population based on Ayurveda. Ayurveda has been practiced for over 3500 years across the world. The system already has a built-in framework for stratifying healthy individuals who differ in susceptibility to disease and in response to drugs and environment. In contrast to the empirical approach of contemporary medicine, the Ayurvedic therapeutic regimen is tailored to an individual’s physiology.


This is a big data application built on Hadoop and MongoDB, with the patient database stored in Alfresco DMS. We have already finished collecting data for the west zone and compiled the raw and processed patient datasets. Each patient document consists of more than 1000 fields and consumes multiple gigabytes. It includes blood profile, background, genome, RNA sequence, etc.
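As a rough illustration of what a nested patient document of this kind might look like, here is a minimal Python sketch. Every field name and value below is hypothetical and heavily truncated; none of it is taken from the project's actual schema or data. The helper simply counts the scalar fields in such a document.

```python
from typing import Any

# Hypothetical, heavily truncated patient document. All field names and
# values are illustrative assumptions, not the project's real schema or data.
patient_doc: dict[str, Any] = {
    "patient_id": "P-0001",
    "zone": "west",
    "background": {"age": 42, "sex": "F"},
    "blood_profile": {"hemoglobin_g_dl": 13.1, "fasting_glucose_mg_dl": 92},
    "genome": {"raw_file": "<reference to file in DMS>", "variants": []},
    "rna_sequence": {"raw_file": "<reference to file in DMS>"},
}


def count_leaf_fields(doc: Any) -> int:
    """Count scalar (leaf) values in a nested dict/list structure."""
    if isinstance(doc, dict):
        return sum(count_leaf_fields(value) for value in doc.values())
    if isinstance(doc, list):
        return sum(count_leaf_fields(item) for item in doc)
    return 1


if __name__ == "__main__":
    print(count_leaf_fields(patient_doc))
```

In a document store, large raw files such as sequence data would more plausibly be referenced from the document than embedded in it, which is what the placeholder strings stand for here.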

We classify the initial dataset by the Ayurveda doshas Vata, Pitta and Kapha. We then hypothesize that the same medical problem (such as diabetes or sleep deprivation) can arise in different doshas due to contrasting lifestyle decisions.
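One way to picture this grouping step is the minimal Python sketch below. It is not the project's pipeline: the records are made-up toy values, and the field names `dosha`, `condition` and `lifestyle` are assumptions introduced only for illustration. For a given condition it tabulates which lifestyle factors appear within each dosha group.

```python
from collections import Counter, defaultdict

# Toy records for illustration only; these are not study data or results.
# The field names "dosha", "condition" and "lifestyle" are hypothetical.
records = [
    {"dosha": "Vata", "condition": "diabetes", "lifestyle": "irregular meals"},
    {"dosha": "Kapha", "condition": "diabetes", "lifestyle": "sedentary"},
    {"dosha": "Kapha", "condition": "diabetes", "lifestyle": "sedentary"},
    {"dosha": "Pitta", "condition": "sleep deprivation", "lifestyle": "late work hours"},
]


def lifestyle_by_dosha(rows, condition):
    """For one condition, count lifestyle factors within each dosha group."""
    table = defaultdict(Counter)
    for row in rows:
        if row["condition"] == condition:
            table[row["dosha"]][row["lifestyle"]] += 1
    # Convert to plain dicts so the result is easy to print and compare.
    return {dosha: dict(counts) for dosha, counts in table.items()}


if __name__ == "__main__":
    print(lifestyle_by_dosha(records, "diabetes"))
```

A table like this only shows co-occurrence within each group; testing the hypothesis itself would need proper statistics on the real dataset.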

We plan to build this as a base for personalized medicine, supporting our claims with genomic data.



Speaker bio

I am currently working as a Senior Architect at a large consulting firm. The work I will be presenting is a big data research project that I am involved with at CSIR, Delhi. I am an accomplished scientist and entrepreneur in the neuroscience domain with a record of achievements in scientific and business leadership roles. I have over 14 years of industry and research experience, including fundamental biomedical research in neurology, sleep disorders, oncology and preclinical (toxicology) drug discovery. As an entrepreneur, I set up and led multidisciplinary teams of 5 to 80 members building products for the US and German markets. I have been involved in clinical research and all aspects of clinical development and study management. I have collaborated with the University of Wisconsin, Madison on a number of academic service contracts, along with industry collaborations with EGI and Philips.


