Choosing datastores

Guide on how to select datastores to solve different problems

Make a submission

Accepting submissions till 15 Aug 2021, 11:59 PM

How do you select datastores and be aware of their limitations when applied to the problem at hand? Are there misconceptions you wish someone had cleared for you as you started on your journey of scaling with datastores?

This conference on choosing datastores for your use cases will help you understand:

  • Running datastores at scale: tuning, debugging and operations.
  • Solving specific use cases with a particular datastore.
  • Data modelling and developer experience with datastores.

Senior infrastructure and software engineers from Farfetch, Aerospike, Zeotap, eightfold.ai, LinkedIn and Tesco Engineering will share war stories and lessons learned with practitioners in the audience.

View schedule at https://hasgeek.com/rootconf/choosing-datastores/schedule

Contact information: Join the Rootconf Telegram group at https://t.me/rootconf or follow @rootconf on Twitter.
For inquiries, contact Rootconf at rootconf.editorial@hasgeek.com or call 7676332020.

Hosted by

Rootconf is a forum for discussions about DevOps, infrastructure management, IT operations, systems engineering, SRE and security (from an infrastructure defence perspective).

George CJ

@georgecj11

Continuous Deployment for MySQL Database

Submitted Jul 27, 2021

Topic

Managing schema and seed-data changes across multiple environments for MySQL can become tedious and error-prone as a system scales. Capillary Technologies currently operates multiple environments across multiple products, with around 125 MySQL servers, 2,500+ tables and 300 TB+ of data. In this talk, we will present the tools and processes we use to keep our database deployments consistent while giving developers the agility to make schema changes frequently.

Description

At Capillary, we used a mutation-based schema management system for database changes, with each file containing the change for one table (DDL queries with ALTER, CREATE or DROP clauses, or DML queries with INSERT, UPDATE or DELETE of seed data). Over time we ran into issues such as:
* The order in which changes were applied differed across environments, resulting in data inconsistencies. This drift grows as the number of environments increases (dev, QA, pre-production, and multiple production deployments).
* Some changes never progressed beyond the QA environments.
* Developers sometimes missed deploying changes to some environments altogether.
With so many consistency issues, creating a new environment without a significant amount of human intervention was not possible.
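To make the mutation-based approach concrete, here is a minimal, hypothetical sketch of a migration runner (all names and file contents are illustrative, not Capillary's actual tooling). Each file holds one change, and correctness depends on every environment applying the same files in the same order, which is exactly the invariant that breaks as environments multiply:

```python
def apply_migrations(migrations, applied, execute):
    """Apply pending migration files in name order, recording each one.

    migrations: dict of {file_name: DDL/DML string}
    applied: set of file names already run in this environment
    execute: callable that runs one statement (e.g. against MySQL)
    """
    for name, statement in sorted(migrations.items()):
        if name not in applied:
            execute(statement)   # skipped or reordered runs cause schema drift
            applied.add(name)
    return applied

# Illustrative migration files for one table:
migrations = {
    "001_create_users.sql": "CREATE TABLE users (id INT)",
    "002_add_email.sql": "ALTER TABLE users ADD COLUMN email VARCHAR(255)",
}

ran = []
apply_migrations(migrations, set(), ran.append)
# ran now holds both statements in deterministic file-name order
```

If one environment's `applied` set is missing an entry, or files are applied out of order by hand, its schema silently diverges, which is the class of problem described above.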

Our vision was to deploy database changes just like our application code (where we only need to know the version of the container to deploy). With our current deployment process, the developer simply commits the expected final state of the schema and the seed data, and the deployment tool ensures consistency.
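The state-based idea can be sketched as a schema diff: instead of committing ordered mutations, the developer commits the desired final schema, and the tool computes the ALTERs itself. The sketch below is a hypothetical illustration (table name, column maps and generated statements are assumptions, not the talk's actual implementation):

```python
def diff_schema(current, desired, table="users"):
    """Return ALTER statements that move `current` columns to `desired`.

    current, desired: dicts of {column_name: column_type}
    """
    statements = []
    for column, col_type in desired.items():
        if column not in current:
            statements.append(
                f"ALTER TABLE {table} ADD COLUMN {column} {col_type}")
        elif current[column] != col_type:
            statements.append(
                f"ALTER TABLE {table} MODIFY COLUMN {column} {col_type}")
    for column in current:
        if column not in desired:
            statements.append(f"ALTER TABLE {table} DROP COLUMN {column}")
    return statements

# The live schema vs. the committed desired state:
current = {"id": "INT", "nick": "VARCHAR(50)"}
desired = {"id": "INT", "email": "VARCHAR(255)"}
diff_schema(current, desired)
# → ["ALTER TABLE users ADD COLUMN email VARCHAR(255)",
#    "ALTER TABLE users DROP COLUMN nick"]
```

Because the diff is computed against each environment's live schema, every environment converges on the same final state regardless of what order earlier changes reached it in.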

Outline

  • Mutation and State based management
  • Schema Deployment
  • Small tables
  • Large tables in the order of a few TBs (without downtime)
  • Metadata management
  • Seed data management
  • Comparison with pt-online-schema-change
  • Scale at which the new system is running
  • Limitations
  • How other companies can use it

