Choosing datastores

Guide on how to select datastores to solve different problems

Make a submission

Accepting submissions till 15 Aug 2021, 11:59 PM

How do you select datastores and be aware of their limitations when applied to the problem at hand? Are there misconceptions you wish someone had cleared for you as you started on your journey of scaling with datastores?

This conference on choosing datastores for your use cases will help you understand:

  • Running datastores at scale: tuning, debugging and operations.
  • Solving specific use cases with a particular datastore.
  • Data modelling and developer experience with datastores.

Senior infrastructure and software engineers from Farfetch, Aerospike, Zeotap, eightfold.ai, LinkedIn and Tesco Engineering will share war stories and lessons learned with practitioners in the audience.

View schedule at https://hasgeek.com/rootconf/choosing-datastores/schedule

Contact information: Join the Rootconf Telegram group at https://t.me/rootconf or follow @rootconf on Twitter.
For inquiries, contact Rootconf at rootconf.editorial@hasgeek.com or call 7676332020.

Hosted by

Rootconf is a forum for discussions about DevOps, infrastructure management, IT operations, systems engineering, SRE and security (from an infrastructure defence perspective).

George CJ

@georgecj11

Continuous Deployment for MySQL Database

Submitted Jul 27, 2021

Topic

Managing schema and seed-data changes across multiple environments for MySQL can become tedious and error-prone as a system scales. Capillary Technologies currently operates multiple environments across multiple products, with around 125 MySQL servers, 2,500+ tables and 300 TB+ of data. In this talk, we will present the tools and processes we use to keep our database deployments consistent while giving developers the agility to make schema changes frequently.

Description

At Capillary, we used a mutation-based schema management system for database changes, with each file containing the change for one table (DDL queries with ALTER, CREATE or DROP clauses, or DML queries with INSERT, UPDATE or DELETE of seed data). Over time we ran into issues such as:
* The order in which changes were applied differed across environments, resulting in data inconsistencies. This drift grows as the number of environments increases (dev, QA, pre-production, and multiple production deployments).
* Some changes never progressed beyond the QA environments.
* Developers sometimes missed deploying changes to some environments altogether.
With so many consistency issues, creating a new environment without a significant amount of human intervention was not possible.
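To make the mutation-based approach concrete, here is a minimal, hypothetical sketch of a migration runner (all names and file contents are illustrative, not Capillary's actual tooling). Each file holds one change, and correctness depends on every environment applying the same files in the same order, which is exactly the invariant that breaks as environments multiply:

```python
def apply_migrations(migrations, applied, execute):
    """Apply pending migration files in name order, recording each one.

    migrations: dict of {file_name: DDL/DML string}
    applied: set of file names already run in this environment
    execute: callable that runs one statement (e.g. against MySQL)
    """
    for name, statement in sorted(migrations.items()):
        if name not in applied:
            execute(statement)   # skipped or reordered runs cause schema drift
            applied.add(name)
    return applied

# Illustrative migration files for one table:
migrations = {
    "001_create_users.sql": "CREATE TABLE users (id INT)",
    "002_add_email.sql": "ALTER TABLE users ADD COLUMN email VARCHAR(255)",
}

ran = []
apply_migrations(migrations, set(), ran.append)
# ran now holds both statements in deterministic file-name order
```

If one environment's `applied` set is missing an entry, or files are applied out of order by hand, its schema silently diverges, which is the class of problem described above.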

Our vision was to deploy database changes just like our application code (where we only need to know the version of the container to deploy). With our current deployment process, the developer simply commits the expected final state of the schema and the seed data, and the deployment tool ensures consistency.
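The state-based idea can be sketched as a schema diff: instead of committing ordered mutations, the developer commits the desired final schema, and the tool computes the ALTERs itself. The sketch below is a hypothetical illustration (table name, column maps and generated statements are assumptions, not the talk's actual implementation):

```python
def diff_schema(current, desired, table="users"):
    """Return ALTER statements that move `current` columns to `desired`.

    current, desired: dicts of {column_name: column_type}
    """
    statements = []
    for column, col_type in desired.items():
        if column not in current:
            statements.append(
                f"ALTER TABLE {table} ADD COLUMN {column} {col_type}")
        elif current[column] != col_type:
            statements.append(
                f"ALTER TABLE {table} MODIFY COLUMN {column} {col_type}")
    for column in current:
        if column not in desired:
            statements.append(f"ALTER TABLE {table} DROP COLUMN {column}")
    return statements

# The live schema vs. the committed desired state:
current = {"id": "INT", "nick": "VARCHAR(50)"}
desired = {"id": "INT", "email": "VARCHAR(255)"}
diff_schema(current, desired)
# → ["ALTER TABLE users ADD COLUMN email VARCHAR(255)",
#    "ALTER TABLE users DROP COLUMN nick"]
```

Because the diff is computed against each environment's live schema, every environment converges on the same final state regardless of what order earlier changes reached it in.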

Outline

  • Mutation and State based management
  • Schema Deployment
  • Small tables
  • Large tables in the order of a few TBs (without downtime)
  • Metadata management
  • Seed data management
  • Comparison with pt-online-schema-change
  • Scale at which the new system is running
  • Limitations
  • How other companies can use it

