Preparing for failure - resilient system architecture
Simoorg - A failure induction framework
Submitted by Arjun Shenoy (@arjunshenoy) on Tuesday, 19 January 2016
Attendees will come away with a basic understanding of failure induction, an architectural overview of Simoorg (the failure induction framework developed at LinkedIn), and a look at the features Simoorg provides.
Simoorg (or Failure Inducer) is a failure induction tool, similar to Chaos Monkey, developed at LinkedIn. The main rationale behind the tool was to have an extensible, easy-to-code framework for inducing failures. Failure Inducer works by introducing a particular set of failures into a healthy cluster and reverting them after a period of time, during which the state of the cluster is logged. Failures can be deterministic (set to happen at a specific point in time) or non-deterministic (may happen at any random time). A set of basic failures is provided by default, such as Full GC, graceful restart, ungraceful shutdown, and disk I/O (read/write/seek) failure. More failures can be added and configured easily if needed.
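The induce, observe, revert cycle described above can be sketched in a few lines of Python. This is only an illustration of the idea and not Simoorg's actual code or API: the names `Failure`, `pick_start_time` and `run_failure` are hypothetical, and the "cluster" is a plain dictionary standing in for real nodes.

```python
import random
import time
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Failure:
    """Hypothetical failure definition: an induce action paired with its revert."""
    name: str
    induce: Callable[[], None]
    revert: Callable[[], None]


def pick_start_time(window_seconds: float,
                    fixed_at: Optional[float] = None,
                    rng=random) -> float:
    """Return the offset (in seconds) at which a failure should start.

    Deterministic mode: the caller passes fixed_at.
    Non-deterministic mode: a uniformly random offset inside the window.
    """
    if fixed_at is not None:
        if not 0 <= fixed_at <= window_seconds:
            raise ValueError("fixed_at must fall inside the window")
        return fixed_at
    return rng.uniform(0, window_seconds)


def run_failure(failure: Failure,
                duration: float,
                observe: Callable[[], dict],
                sleep: Callable[[float], None] = time.sleep) -> List[Tuple[str, str, dict]]:
    """Induce a failure, log cluster state, and always revert afterwards."""
    log = [("before", failure.name, observe())]
    failure.induce()
    try:
        log.append(("during", failure.name, observe()))
        sleep(duration)  # the failure stays in effect for this long
    finally:
        failure.revert()  # revert even if observing or sleeping raised
    log.append(("after", failure.name, observe()))
    return log


# Toy usage: a dictionary stands in for a one-node cluster (an assumption).
cluster = {"node1": "up"}
shutdown = Failure(
    name="ungraceful_shutdown",
    induce=lambda: cluster.update(node1="down"),
    revert=lambda: cluster.update(node1="up"),
)
start = pick_start_time(window_seconds=60, fixed_at=10)  # deterministic start
for entry in run_failure(shutdown, duration=0, observe=lambda: dict(cluster)):
    print(start, entry)
```

Putting the revert in a `finally` block is a design choice of this sketch: the cluster is restored even when logging fails midway, which matters for a tool whose purpose is to break things on purpose.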
The main topics dealt with will be:
1. An introduction to Simoorg.
2. The key features that separate it from Chaos Monkey.
3. A brief walkthrough of the architecture.
4. A demo of the tool.
Arjun Shenoy is a Site Reliability Engineer working with the Distributed Data Systems team at LinkedIn India.