Big Data Structures

Jul 2016

28 Thu 08:30 AM – 06:25 PM IST

29 Fri 08:30 AM – 06:15 PM IST

30 Sat 08:45 AM – 05:00 PM IST

31 Sun 08:15 AM – 06:00 PM IST

NIMHANS Convention Centre

Submitted Apr 24, 2016

Section: Full talk
Technical level: Beginner

Analyzing terabyte-scale data sets with heavy data processing is a common task these days. A data structure is a particular way of organizing data in a computer so that it can be used efficiently. For Big Data, the single computer becomes a cluster, and the organization of the data becomes distributed. Usage patterns are shifting from exact answers to probabilistic ones. False positive matches are acceptable (at small error rates); false negatives are not. For speed, approximations are acceptable, at the cost of a small loss of precision.

There are a few data structures that give practical results for specific use cases, with parameters chosen from the expected data volume and the required error probability. I call these “The Big Data Structures”. This talk highlights use-case-based examples of these Big Data Structures.
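To illustrate how parameters can follow from the expected data volume and the required error probability, here is a minimal Python sketch of a Bloom filter. The abstract does not name this structure; it is assumed here only because it has the property described above (false positives possible, false negatives not). The names `bloom_parameters` and `BloomFilter` and the SHA-256 double-hashing scheme are illustrative choices, not taken from the talk.

```python
import hashlib
import math


def bloom_parameters(expected_items, error_rate):
    """Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes."""
    num_bits = math.ceil(-expected_items * math.log(error_rate) / (math.log(2) ** 2))
    num_hashes = max(1, round(num_bits / expected_items * math.log(2)))
    return num_bits, num_hashes


class BloomFilter:
    """Illustrative Bloom filter (hypothetical helper, not from the talk)."""

    def __init__(self, expected_items, error_rate):
        self.num_bits, self.num_hashes = bloom_parameters(expected_items, error_rate)
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive all k positions from two halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force a non-zero step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


if __name__ == "__main__":
    seen = BloomFilter(expected_items=1000, error_rate=0.01)
    for i in range(1000):
        seen.add(f"user-{i}")
    print("user-42" in seen)       # added earlier, so always reported present
    print(seen.num_bits, seen.num_hashes)
```

With 1,000 expected items and a 1% error target, the filter needs roughly 9,600 bits (about 1.2 KB), regardless of how large each item is.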

Outline

Each use case is presented with an existing solution and an improved solution.

Use case 1: Cardinality
Use case 2: Frequency
Use case 3: Membership
Use case 4: Verification
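
The outline does not say which structure is used for each use case. As one possible illustration of the frequency use case, the Python sketch below shows a Count-Min Sketch, which is assumed here rather than taken from the talk. It may over-count an item because of hash collisions but never under-counts it. The class name, the `epsilon` and `delta` parameters, and the SHA-256 row hashing are illustrative choices.

```python
import hashlib
import math


class CountMinSketch:
    """Illustrative Count-Min Sketch (hypothetical helper, not from the talk)."""

    def __init__(self, epsilon, delta):
        # Standard sizing: width = ceil(e / epsilon), depth = ceil(ln(1 / delta)).
        # Estimates exceed the true count by at most epsilon * (total count)
        # with probability at least 1 - delta.
        self.width = math.ceil(math.e / epsilon)
        self.depth = math.ceil(math.log(1 / delta))
        self.table = [[0] * self.width for _ in range(self.depth)]

    def _cells(self, item):
        # One column per row, derived by salting the hash with the row index.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row, col in self._cells(item):
            self.table[row][col] += count

    def estimate(self, item):
        # The minimum across rows is the cell least inflated by collisions.
        return min(self.table[row][col] for row, col in self._cells(item))


if __name__ == "__main__":
    sketch = CountMinSketch(epsilon=0.01, delta=0.01)
    for word in ["spark", "hadoop", "spark", "kafka", "spark"]:
        sketch.add(word)
    print(sketch.estimate("spark"))  # at least 3, the true count
```

The memory used depends only on `epsilon` and `delta`, not on the number of distinct items, which is what makes this kind of structure attractive at cluster scale.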

Speaker bio

Ranganathan has nearly eleven years of experience developing awesome products and loves to work on the full stack, from front end to back end and scale. Though he graduated as a civil engineer, he has worked with a few software companies, tried two startups, and at present works for ThoughtWorks as a Technology Lead, where he contributes to open source products. He runs one of the top technology meetups in Hyderabad, the Hyderabad Scalability Meetup. He is very interested in exploring Big Data technologies and is a regular speaker. He has recently spoken at Apache Big Data Europe 2015, Apache Big Data North America 2016, GIDS 2015, GIDS 2016, and many other meetups and conferences.

The Fifth Elephant 2016