Rootconf 2016

Rootconf is India's principal conference where systems and operations engineers share real world knowledge about building resilient and scalable systems.



MySQL to Hadoop incremental data ingestion

Submitted Jan 30, 2016

This talk focuses on how we ingest data from MySQL into Hadoop incrementally, so that the data on HDFS stays more up to date.


A big data processing platform like Hadoop is crucial for any business that wants to build good analytical dashboards and provide business insights. Dumping the full database, converting it to Avro, and loading it into HDFS consumes a significant amount of database resources and cannot be done as frequently as we need, which leads to stale data in HDFS. In this talk we give details of an incremental design framework for ingesting data from MySQL into Hadoop. It involves capturing change data from the MySQL database and processing the captured delta to finally merge it with the full data set in HDFS.
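The merge step can be illustrated with a minimal Python sketch. It is not the framework presented in the talk: it assumes each captured change carries a primary key, an increasing sequence number, and an operation type, and it uses in-memory dictionaries in place of Avro files on HDFS. The names `Change`, `compact`, and `merge` are hypothetical.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class Change(NamedTuple):
    """One captured change event (hypothetical shape, for illustration only)."""
    pk: int               # primary key of the affected row
    seq: int              # assumed increasing position in the change stream
    op: str               # "insert", "update", or "delete"
    row: Optional[dict]   # full row image after the change; None for deletes


def compact(changes: Iterable[Change]) -> Dict[int, Change]:
    """Keep only the most recent change per primary key."""
    latest: Dict[int, Change] = {}
    for change in changes:
        current = latest.get(change.pk)
        if current is None or change.seq > current.seq:
            latest[change.pk] = change
    return latest


def merge(base: Dict[int, dict], changes: Iterable[Change]) -> Dict[int, dict]:
    """Apply the compacted delta to a full snapshot keyed by primary key."""
    merged = dict(base)  # leave the previous snapshot untouched
    for pk, change in compact(changes).items():
        if change.op == "delete":
            merged.pop(pk, None)
        else:
            merged[pk] = change.row
    return merged


if __name__ == "__main__":
    snapshot = {
        1: {"id": 1, "name": "alice"},
        2: {"id": 2, "name": "bob"},
    }
    delta = [
        Change(1, 10, "update", {"id": 1, "name": "alice2"}),
        Change(2, 11, "delete", None),
        Change(3, 12, "insert", {"id": 3, "name": "carol"}),
    ]
    print(merge(snapshot, delta))
```

Compacting the delta first means each key is touched once during the merge, whatever number of times the row changed since the last full load.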


A basic understanding of MySQL replication and SQL is required to follow the technical details of the design.

Speaker bio

Staff Database Engineer at LinkedIn, responsible for maintaining MySQL and Oracle databases.


