Jul 2016
25 Mon
26 Tue
27 Wed
28 Thu 08:30 AM – 06:25 PM IST
29 Fri 08:30 AM – 06:15 PM IST
30 Sat 08:45 AM – 05:00 PM IST
31 Sun 08:15 AM – 06:00 PM IST
Akash Mishra
Companies increasingly want to be data-oriented and to base their decisions on data.
The first step towards data-oriented decision making is collecting data. The data lake has become one of the recent buzzwords in the Big Data industry, and companies often start by building one to hold all of their data. In practice, dumping data into a data lake usually means exporting everything from various RDBMS databases [e.g. Orders, Inventory] and scraping all the logs into the lake. Once the relevant data is in the lake, we write various processing applications to extract information from the source data. This approach has many problems [e.g. huge upfront cost, missing information that is not currently tracked, etc.].
In this talk I will propose another approach to building a data-driven system: instead of dumping all the data into a central location, we identify the events/interactions/facts [e.g. an add-to-cart event, viewing a product, etc.] in the company and store them for processing. I will explain how this approach is much more result-oriented and agile than the dumping approach.
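As a rough illustration of what "identify the events and store them" could look like, here is a minimal Python sketch. It is not taken from the talk: the `Event` and `EventLog` names, the field layout, and the event names `product_viewed` and `add_to_cart` are all hypothetical, and the in-memory list merely stands in for whatever durable event store a real system would use.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class Event:
    """One immutable business fact (hypothetical schema for illustration)."""
    name: str        # e.g. "add_to_cart" or "product_viewed" (assumed names)
    user_id: str
    properties: dict[str, Any] = field(default_factory=dict)
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class EventLog:
    """Append-only in-memory log; a stand-in for a real event store."""

    def __init__(self) -> None:
        self._lines: list[str] = []

    def append(self, event: Event) -> None:
        # Events are serialized once and never updated afterwards.
        self._lines.append(json.dumps(asdict(event)))

    def read(self, name: str) -> list[dict[str, Any]]:
        # Return all stored events of the given type, in insertion order.
        events = [json.loads(line) for line in self._lines]
        return [e for e in events if e["name"] == name]


log = EventLog()
log.append(Event("product_viewed", "u1", {"product_id": "p42"}))
log.append(Event("add_to_cart", "u1", {"product_id": "p42", "quantity": 2}))
log.append(Event("product_viewed", "u2", {"product_id": "p7"}))

views = log.read("product_viewed")
```

The point of the sketch is that each record is a fact captured at the moment it happens, so a processing job reads only the event types it cares about rather than reverse-engineering them from database exports.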
Akash Mishra currently works as a Data Engineer at Badoo Trading Limited. He has more than 4 years of experience building large-scale big data applications for various clients of ThoughtWorks Technologies, and production experience with big data technologies such as Spark, Hadoop, and Mesos. He is a passionate developer with a deep interest in distributed systems. He has co-organised the Big Data Meetup for Pune & NCR, has given various talks at meetups and Geek Night, and has contributed to the Apache Spark project.