Rootconf 2016

Rootconf is India's principal conference where systems and operations engineers share real-world knowledge about building resilient and scalable systems.

Chaos Engineering and design patterns for building highly available services

Submitted by @diptanu on Thursday, 25 February 2016

Technical level

Advanced

Section

Full talk

Status

Confirmed & Scheduled

Objective

The talk introduces Chaos Engineering and examines how complex distributed systems fail in large-scale internet services. It also discusses design patterns for building highly resilient distributed systems that can heal from transient failures.

Description

Complex distributed systems are hard to operate and have very complex failure modes. In this talk, we will discuss how to build confidence in large-scale distributed systems by introducing random but controlled failures into them in production, understanding how services degrade, and working towards healing and recovering from failures automatically. We will also discuss patterns and techniques for designing highly available and resilient distributed systems.
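
As a rough illustration of the idea described above, the following Python sketch injects random but controlled failures into a call and lets the caller recover from them by retrying with exponential backoff. Every name in it (`with_chaos`, `call_with_retry`, `InjectedFailure`, `lookup_user`) is hypothetical and chosen for this example only; it is not taken from the talk or from any particular chaos engineering tool, and a production setup would inject faults at the infrastructure level rather than in a function wrapper.

```python
import random
import time


class InjectedFailure(Exception):
    """Raised by the chaos wrapper to simulate a transient fault."""


def with_chaos(func, failure_rate, rng):
    """Wrap func so each call fails with probability failure_rate.

    rng is any object with a random() method returning a float in [0, 1),
    which keeps the injected failures controlled and reproducible.
    """
    def wrapper(*args, **kwargs):
        if rng.random() < failure_rate:
            raise InjectedFailure("injected fault")
        return func(*args, **kwargs)
    return wrapper


def call_with_retry(func, attempts=5, base_delay=0.01, sleep=time.sleep):
    """Call func, retrying on InjectedFailure with exponential backoff.

    Re-raises the failure if the last attempt also fails.
    """
    for attempt in range(attempts):
        try:
            return func()
        except InjectedFailure:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))


def lookup_user():
    # Hypothetical stand-in for a real remote call.
    return {"id": 42, "name": "example"}


if __name__ == "__main__":
    rng = random.Random(7)  # fixed seed so a run can be repeated
    flaky_lookup = with_chaos(lookup_user, failure_rate=0.3, rng=rng)
    recovered, gave_up = 0, 0
    for _ in range(100):
        try:
            call_with_retry(flaky_lookup)
            recovered += 1
        except InjectedFailure:
            gave_up += 1
    print("recovered:", recovered, "gave up:", gave_up)
```

Passing the random source and the sleep function in as parameters is a deliberate choice in this sketch: it lets an experiment be replayed with the same failure pattern and lets the recovery path be exercised without real delays.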

Speaker bio

Diptanu is a Senior Engineer at HashiCorp, where he works on large-scale distributed systems, cluster schedulers, service discovery, and highly available, high-throughput systems on the public cloud. He is a core committer to the Nomad cluster manager, which has a parallel and distributed scheduler and supports heterogeneous virtualized workloads.

Prior to HashiCorp, Diptanu worked in the Cloud Platform group at Netflix on the core platform infrastructure that powered Netflix's microservices. He worked on Apache Mesos, wrote a cluster scheduler for running clusters of Docker containers on AWS, and contributed to various reactive IPC and service discovery infrastructure projects.
