Rootconf 2016

Rootconf is India's principal conference where systems and operations engineers share real-world knowledge about building resilient and scalable systems.

Arjun Shenoy

@arjunshenoy

Simoorg - A failure induction framework

Submitted Jan 19, 2016

Attendees will come away with a basic understanding of failure induction, an architectural overview of Simoorg (the failure induction framework developed at LinkedIn), and a look at the features Simoorg provides.

Outline

Simoorg (or Failure Inducer) is a failure induction tool, similar to Chaos Monkey, developed at LinkedIn. The main rationale for building it was to have an extensible, easy-to-code framework for inducing failures. Failure Inducer works by introducing a particular set of failures into a healthy cluster and then reverting them after a period of time, during which the state of the cluster is logged. Failures can be deterministic (set to happen at a specific point in time) or non-deterministic (can happen at any random time). By default, a set of basic failures is provided, such as Full GC, graceful restart, ungraceful shutdown, and disk I/O (read/write/seek) failure. If needed, more failures can be added and configured easily.
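The induce, observe, revert cycle described above can be illustrated with a small Python sketch. This is not Simoorg's actual API: the names `Failure`, `FailureInducer`, `deterministic_schedule` and `random_schedule` are hypothetical, and the "cluster" is a plain dictionary standing in for real nodes.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Failure:
    """Hypothetical failure definition: one action to induce it, one to revert it."""
    name: str
    induce: Callable[[dict], None]
    revert: Callable[[dict], None]


@dataclass
class FailureInducer:
    """Toy inducer; `cluster` is a dict standing in for real cluster state."""
    cluster: dict
    log: List[str] = field(default_factory=list)

    def run(self, failure: Failure, observe_steps: int = 3) -> None:
        # 1. Introduce the failure into the (healthy) cluster.
        failure.induce(self.cluster)
        # 2. Log the cluster state while the failure is active.
        for step in range(observe_steps):
            self.log.append(
                f"{failure.name} step={step} state={self.cluster['state']}"
            )
        # 3. Revert the failure and record the final state.
        failure.revert(self.cluster)
        self.log.append(f"{failure.name} reverted state={self.cluster['state']}")


def deterministic_schedule(start: int, interval: int, count: int) -> List[int]:
    """Failure times fixed in advance, as offsets in seconds."""
    return [start + i * interval for i in range(count)]


def random_schedule(window: int, count: int, rng: random.Random) -> List[int]:
    """Failure times drawn at random from a window of `window` seconds."""
    return sorted(rng.randrange(window) for _ in range(count))


def stop_node(cluster: dict) -> None:
    cluster["state"] = "degraded"


def start_node(cluster: dict) -> None:
    cluster["state"] = "healthy"


# Illustrative failure; a real one would act on a remote host.
ungraceful_shutdown = Failure("ungraceful_shutdown", stop_node, start_node)

if __name__ == "__main__":
    inducer = FailureInducer(cluster={"state": "healthy"})
    inducer.run(ungraceful_shutdown)
    print("\n".join(inducer.log))
    print(deterministic_schedule(start=10, interval=5, count=3))
```

In this sketch, adding a new failure type only requires supplying another pair of induce and revert functions, which is one plausible reading of the extensibility goal stated above.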

The main topics covered will be:

  1. An introduction to Simoorg.
  2. The key features that separate it from Chaos Monkey.
  3. A brief walkthrough of the architecture.
  4. A demo of the tool.

Speaker bio

Arjun Shenoy is a Site Reliability Engineer working with the Distributed Data Systems team at LinkedIn India.

