The Fifth Elephant 2025 Annual Conference CfP
Speak at The Fifth Elephant 2025 Annual Conference
Banty Kumar
Submitted May 26, 2025
Abstract
This talk will explore how Uber leverages automation to deploy and manage MySQL for high-QPS, high-throughput systems, handling over 1 million queries per second while ensuring 99.99% availability and consistently low latencies. At scale, Uber runs more than 2,000 MySQL clusters managing over 2 petabytes of data, supporting a wide range of use cases including infrastructure-critical (tier-0), customer-facing (tier-1), and company-wide data warehousing workloads.
We’ll dive into the architecture of Uber’s MySQL deployment, detailing its core components and the automation strategies used for failure detection and mitigation, designed to operate with minimal to no manual intervention.
The speakers are the authors of this blog.
We’ll also walk through the evolution of MySQL at Uber, from optimizing latencies on single-leader instances to implementing application-level sharding, and ultimately to building a fully automated, production-grade Sharded MySQL system.
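As a rough illustration of what application-level sharding means here, the sketch below routes a key to one of several MySQL clusters using a stable hash. It is not Uber's implementation: the function names, the shard map, and the cluster names are all hypothetical, and real systems usually add resharding support on top of a scheme like this.

```python
import hashlib

# Hypothetical shard map: shard index -> MySQL cluster name (illustrative only).
SHARD_MAP = {
    0: "mysql-cluster-a",
    1: "mysql-cluster-b",
    2: "mysql-cluster-c",
    3: "mysql-cluster-d",
}


def shard_for_key(key: str, num_shards: int) -> int:
    """Map a sharding key to a shard index using a stable hash.

    A cryptographic digest is used instead of Python's built-in hash(),
    which is randomized per process and so is not stable across restarts.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def cluster_for_key(key: str) -> str:
    """Return the (hypothetical) cluster that owns the given key."""
    return SHARD_MAP[shard_for_key(key, len(SHARD_MAP))]
```

With this kind of routing in the application, every query for a given key goes to the same cluster, which is what allows a dataset to grow beyond a single leader instance.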
Audience
Takeaways
How can we deploy a highly available, high-throughput MySQL cluster? What does the overall architecture look like, including the separation of the control plane (orchestration, configuration, monitoring) and the data plane (actual query and data serving)?
How can automation be leveraged for failure detection and mitigation? What techniques and tools enable fast, reliable identification of node or instance failures, and how can recovery be orchestrated with minimal to no manual intervention?
What are the scalability limits of a MySQL cluster—and how can we overcome them? At what point does a traditional MySQL setup start to bottleneck, and what strategies can be used to scale beyond those limits?
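To make the failure detection and mitigation question above concrete, here is a minimal sketch of one common approach: declare a node failed only after several consecutive missed health checks, then promote the replica with the least replication lag. This is an assumption-laden illustration, not the system described in the talk. The class, the threshold value, and the lag-based promotion rule are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FailureDetector:
    """Declares a node failed after N consecutive missed health checks."""

    miss_threshold: int = 3  # hypothetical default; tune per environment
    misses: dict = field(default_factory=dict)

    def record(self, node: str, healthy: bool) -> bool:
        """Record one health-check result.

        Returns True when the node has reached the miss threshold and
        should be treated as failed; a healthy result resets the counter.
        """
        if healthy:
            self.misses[node] = 0
            return False
        self.misses[node] = self.misses.get(node, 0) + 1
        return self.misses[node] >= self.miss_threshold


def pick_new_primary(replica_lag_seconds: dict) -> str:
    """Pick the replica with the smallest replication lag as the new primary."""
    return min(replica_lag_seconds, key=replica_lag_seconds.get)
```

Requiring consecutive misses trades a slightly slower reaction for fewer false failovers caused by a single dropped health check.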