Lessons learned from building a globally distributed database service from the ground up

This submission has been added to the schedule

Submitted May 26, 2017

Section: Full talk for data engineering track
Technical level: Intermediate

Description:
Dharma and his team have spent the past seven years building Azure Cosmos DB (http://cosmosdb.com) - a massively scalable, multi-tenant, globally distributed database service, from the ground up. The system currently operates across more than thirty-four geographical regions, manages hundreds of petabytes of indexed data, and serves hundreds of trillions of requests every day from thousands of customers worldwide. It allows developers to elastically scale both throughput and storage across any number of geographical regions on a single table. The service offers guaranteed single-digit millisecond latency at the 99th percentile, 99.99% availability, predictable throughput, and multiple well-defined consistency models. It offers comprehensive SLAs for latency, availability, throughput and consistency, is used extensively within Microsoft, and has been available to external Azure customers since 2015. In this session, Dharma will describe the internals of the system design and the design trade-offs the team had to make. He will also share his experiences from operating a globally distributed database service worldwide and maintaining comprehensive Service Level Agreements (SLAs).

Takeaways:
The lessons I have learnt from building a globally distributed database can be applied to many distributed systems.

Some of the takeaways are:

Well-defined, relaxed consistency models are really powerful in solving real-world scenarios
A system designed for the cloud can be made to run very cheaply if it is designed with resource governance in mind
What does it mean to build multi-tenant applications? What are the challenges?
Applications running on the cloud deserve a globally distributed database
A globally distributed database != a database with DR
and many more.

Intended audience:
Application developers of all types, distributed systems practitioners, data engineers, system integrators and consultants.

Outline

What does it mean to build a database that leverages the strengths of cloud?
Horizontal partitioning
Elastically scaling throughput (vs. storage) worldwide
Resource governance and fine-grained multi-tenancy
Global distribution of data for low latency
Global distribution of data for high availability
Navigating the speed of light
Navigating the CAP theorem
Consistency Models - finding the right shade of grey!
Why hosting on-premises databases (SQL or NoSQL) cannot offer the lowest TCO and best SLAs
What does it take to offer and maintain comprehensive SLAs for consistency, latency, throughput and availability?
Operating a globally distributed database service, worldwide
Insights from the production workloads
Conclusions

Requirements

Familiarity with databases, the cloud, and the challenges of building scalable applications.

Speaker bio

Dharma Shukla is a Distinguished Engineer at Microsoft. Dharma is also the founder of Azure Cosmos DB (http://cosmosdb.com) - a globally distributed, multi-tenant database service on Azure. Prior to working on the current system, his work spanned a range of distributed systems and databases at Microsoft and other places.

Links

Slides

https://speakerdeck.com/dharmashukla/cosmos-db-at-fifth-elephant-2017

The Fifth Elephant 2017
