What are your users doing on your website or in your store? How do you turn the piles of data your organization generates into actionable information? Where do you get complementary data to make yours more comprehensive? What tech, and what techniques?
The Fifth Elephant is a two-day conference on big data.
Early Geek tickets are available from fifthelephant.doattend.com.
The proposal funnel below will enable you to submit a session and vote on proposed sessions. It is good practice to introduce yourself and share details about your work as well as the subject of your talk while proposing a session.
Each community member can vote for or against a talk. A vote from each member of the Editorial Panel is equivalent to two community votes. Both types of votes will be considered for final speaker selection.
It’s useful to keep a few guidelines in mind while submitting proposals:
Describe how to use something that is available under a liberal open source license. Participants can use this without having to pay you anything.
Tell a story of how you did something. If it involves commercial tools, please explain why they made sense.
Buy a slot to pitch whatever commercial tool you are backing.
Speakers will get a free ticket to both days of the event. Proposers whose talks are not on the final schedule will be able to purchase tickets at the Early Geek price of Rs. 1800.
Aadhaar - world's largest biometric identity platform (200 trillion biometric matches per day, 2 PB of data)
Describe the technology needs and solutions behind Aadhaar - the world’s largest biometric identity platform.
- 200 trillion biometric matches per day
- 2 petabytes of raw data stored
- 100 million authentication requests per day
- Terabyte-scale data warehouse of 200 million records
- 50 million messages per day
- 100 million database transactions per day
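To put the daily figures above in perspective, it can help to convert them to average per-second rates. The snippet below is only a rough sketch: it assumes load is spread evenly across all 86,400 seconds of a day, which real traffic will not be, so peak rates would be higher. The names `daily_volumes` and `average_per_second` are made up for this illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Daily volumes quoted in the list above.
daily_volumes = {
    "biometric matches": 200 * 10**12,
    "authentication requests": 100 * 10**6,
    "messages": 50 * 10**6,
    "database transactions": 100 * 10**6,
}


def average_per_second(per_day: int) -> float:
    """Average rate per second, assuming an even spread over 24 hours."""
    return per_day / SECONDS_PER_DAY


# Hypothetical helper table: name -> average events per second.
rates = {name: average_per_second(v) for name, v in daily_volumes.items()}

if __name__ == "__main__":
    for name, rate in rates.items():
        print(f"{name}: {rate:,.0f} per second on average")
```

Under that even-spread assumption, the quoted volumes work out to roughly 2.3 billion biometric matches, about 1,160 authentication requests and about 580 messages every second.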
Aadhaar has unique compute and data challenges that exhibit all characteristics of Big Data - Volume, Variety and Velocity. The challenge is to derive Value from these attributes.
A number of technologies have been used to handle massively parallel processing, streaming data reads, data-locality-aware computing, low-latency reads, data integrity and the challenges of dealing with distributed data, which are best explained by the CAP theorem.
- Hadoop stack : HDFS, HBase, Hive, Pig, Zookeeper
- MySQL : sharded, partitioned, distributed
- SEDA : Mule, RabbitMQ
- Search : MongoDB, sharded Solr
- Compute Grid : Spring, GridGain
- Monitoring : Custom built, Nagios
- Analytics & Visualization
- Deployment footprint : Thousands of CPU cores
- Extensive Data archival, DR
An appreciation of the challenges involved in building a biometric database of 1.2 billion people: support for multilingual applications, the deployment challenges of reaching every village and city in the country (27,000 installations to date), and the logistics required to manage enrolments, letter delivery, online authentication and financial transactions in the order of millions.
Dr. Pramod Varma is currently Chief Architect at UIDAI. He joined UIDAI in 2009 and has been pivotal in ensuring that an open, scalable, and secure architecture is built to meet the needs of the Aadhaar project. He leads the overall technology and application architecture and application development within the UIDAI Technology Unit and is based in Bangalore.
Before joining UIDAI in July 2009, he was the Chief Technology Architect and Vice President of Research at Sterling Commerce, now part of IBM. He joined Sterling in 2005 when Sterling Commerce acquired Yantra Corporation, a leading supply chain software company based in Boston. He was one of the founders of Yantra and was the Vice President of Technology.
Pramod holds a Master's and a Ph.D. in Computer Science along with a second Master's in Applied Mathematics.
Regunath is Principal Architect of Aadhaar and Chief Architect at MindTree. He has created IP-based solutions on SOA. He is passionate about Open Source and is a committer on the MindTree Insight project. Regunath has been an invited speaker at forums like OSI Days and the iCMG architecture summit. He is a guest columnist at CIOUpdate and blogs frequently on technology subjects.