Submissions for Data Stores track

Guide on how to select datastores to solve different problems

Dinesh Dhakal


Migrating Online Data

Submitted Aug 18, 2021

Relational databases and non-relational data stores support many high-performance, high-volume, and highly available applications on the Internet. At LinkedIn, many important features are powered by an RDBMS (MySQL and Oracle) or a NoSQL data store (Espresso). While we have developed a reliable process for schema evolution, we have also run into major changes in the fundamental structure of data at LinkedIn, driven largely by the hyper-growth phase the platform has gone through. These structural changes, along with performance issues, have required migrating data across schemas, databases, and even different data stores to keep up with the scale and performance needed to give our members the utmost value.

The key objectives of this talk are -
Discuss the need for Data Migration
Types of Migrations we’ve done at LinkedIn
Strategies for Data Migration for an online Data Store
Planning for the unknowns and Gotchas

Audience Takeaways -

Understand when migrating data becomes necessary and whether it is the right solution to their problem
Learn how to plan a migration within the same data store versus across data stores
Learn about the gotchas and pitfalls, and how to navigate them effectively

Session Outline -
Introduction to types of Data Stores (3 mins)
When Data outgrows and underperforms (2 mins)
Do you need to migrate? (3 mins)
Types of Migrations (4 mins)
    One-shot Migration
    Trickle Migration
Planning (8 mins)
    Understand the data
    Define the end goal
    Design the components strategically
    One must monitor!
    Canary, Canary, Canary!
    Application support and Cutover
Gotcha! (In Prod!) (3 mins)
Migrating endorsements / Address Book (5 mins)
Conclusion (2 mins)


