Rootconf 2016

Rootconf is India's principal conference where systems and operations engineers share real-world knowledge about building resilient and scalable systems.

MySQL to Hadoop incremental data ingestion

Submitted by BASAVAIAH THAMBARA (@basavaiaht) on Saturday, 30 January 2016


Technical level

Intermediate

Section

Full talk

Status

Submitted


Total votes:  +15

Objective

This talk focuses on how we ingest data from MySQL to Hadoop incrementally, so that the data in HDFS stays up to date.

Description

Utilizing a big data processing platform like Hadoop is crucial for any business that wants to build analytical dashboards and derive business insights. However, dumping the full database, converting it to Avro, and loading it into HDFS consumes a significant amount of database resources and cannot be done as frequently as needed, which leads to stale data in HDFS. In this talk we detail an incremental ingestion framework for moving data from MySQL to Hadoop: capturing change data from the MySQL database, processing the delta, and finally merging it with the full data set in HDFS.
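To make the merge step concrete, here is a minimal sketch of compacting a full snapshot with a captured delta, keeping the newest version of each row per primary key. The schema, field names, and operation types below are illustrative assumptions, not the speaker's actual pipeline:

```python
# Sketch of the snapshot + delta merge ("compaction") step.
# Assumption: each change record carries the primary key ('id'), a
# modification timestamp ('modified_at'), and an operation type ('op').
# These names are hypothetical, chosen only for illustration.

def merge_snapshot_with_delta(snapshot, delta):
    """Merge a full snapshot with change-data records.

    snapshot: list of row dicts, each with 'id' and 'modified_at'
    delta:    list of change dicts with 'id', 'modified_at', and 'op'
    Returns merged rows, keeping the latest version per primary key
    and dropping rows whose most recent change is a delete.
    """
    latest = {row["id"]: row for row in snapshot}
    # Apply delta records in timestamp order so later changes win.
    for change in sorted(delta, key=lambda c: c["modified_at"]):
        if change["op"] == "delete":
            latest.pop(change["id"], None)
        else:  # insert or update: replace with the newer row
            latest[change["id"]] = {k: v for k, v in change.items() if k != "op"}
    return sorted(latest.values(), key=lambda r: r["id"])


snapshot = [
    {"id": 1, "name": "a", "modified_at": 100},
    {"id": 2, "name": "b", "modified_at": 100},
]
delta = [
    {"id": 2, "name": "b2", "modified_at": 150, "op": "update"},
    {"id": 3, "name": "c", "modified_at": 160, "op": "insert"},
    {"id": 1, "modified_at": 170, "op": "delete"},
]
merged = merge_snapshot_with_delta(snapshot, delta)
# merged keeps ids 2 (updated) and 3 (inserted); id 1 is deleted
```

In practice the delta would come from the MySQL binary log (the replication stream mentioned in the requirements) and the merge would run as a distributed job over Avro files in HDFS, but the reconciliation logic is the same idea.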

Requirements

A basic understanding of MySQL replication and SQL is required to follow the technical details of the design.

Speaker bio

Staff Database Engineer at LinkedIn, responsible for maintaining MySQL and Oracle databases.
