Rootconf 2016

Rootconf is India's principal conference where systems and operations engineers share real-world knowledge about building resilient and scalable systems.

Goblin - Automated Resiliency Testing

Submitted by Shailesh Hegde (@shlsh) on Monday, 18 January 2016

Technical level

Intermediate

Section

Crisp talk

Status

Confirmed & Scheduled

Total votes:  +10

Objective

To discuss the challenges of resiliency testing in large-scale cloud deployments and how to automate that testing (think Chaos Monkey, but with a few key differences).

Description

This talk will cover the following:

  • What resiliency means for a large-scale distributed system
  • Challenges in resiliency testing of a large-scale distributed system that uses third-party applications and protocols, such as messaging (RabbitMQ/AMQP), caching and NoSQL stores (Couchbase, Cassandra), service discovery (ZooKeeper), and media (SIP, RTP, H.323, PSTN, audio/video)
  • Gotchas: what you think won’t fail, but does
  • The Goblin framework (we are working to open source it in Q1 2016), which induces faults, runs tests, verifies results, and recovers the system, all in a controlled manner
  • How to use Goblin for live group testing as well as nightly automated runs
  • Extending Goblin to other systems
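
The induce, test, verify, recover cycle mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: Goblin's actual API is not described in this proposal, so `FaultScenario`, `Report`, and `run_scenarios` are hypothetical names, and the "system" is a plain in-memory dictionary rather than a real cloud deployment.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FaultScenario:
    """One resiliency experiment. Hypothetical structure, not Goblin's API."""
    name: str
    induce: Callable[[], None]    # inject the fault, e.g. stop a broker node
    run_test: Callable[[], bool]  # exercise the system and verify the result
    recover: Callable[[], None]   # undo the fault and restore the system


@dataclass
class Report:
    passed: List[str] = field(default_factory=list)
    failed: List[str] = field(default_factory=list)


def run_scenarios(scenarios: List[FaultScenario]) -> Report:
    """Run each scenario in turn, always recovering before the next one."""
    report = Report()
    for scenario in scenarios:
        ok = False
        try:
            scenario.induce()
            ok = scenario.run_test()
        except Exception:
            ok = False  # an error while injecting or testing counts as a failure
        finally:
            scenario.recover()  # restore the system even if the test raised
        (report.passed if ok else report.failed).append(scenario.name)
    return report


def demo() -> Report:
    # Stand-in for a real deployment: one broker and one fallback path.
    state = {"broker_up": True, "fallback_up": True}
    scenario = FaultScenario(
        name="broker-down",
        induce=lambda: state.update(broker_up=False),
        # The system is considered healthy if either path can serve traffic.
        run_test=lambda: state["broker_up"] or state["fallback_up"],
        recover=lambda: state.update(broker_up=True),
    )
    return run_scenarios([scenario])
```

Putting recovery in a `finally` clause is one way to keep the run "controlled": a failed or crashing test still leaves the system restored before the next scenario starts, which matters for unattended nightly runs.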

Requirements

Working in Linux-based cloud environments

Speaker bio

Currently working as a Lead QA engineer at BlueJeans Network. Part of the core team that built Goblin.

Links

Slides

https://goo.gl/VdGhnR

Comments

  • Virendra Singh Bhalothia (@bhalothia), 2 years ago

    All the best, Shailesh!

  • Philip Paeps (@trouble), Reviewer, 2 years ago

    What is the status of the open sourcing effort? Will it be open source by April?

  • Shailesh Hegde (@shlsh), Proposer, 2 years ago

    Philip, Yes. I expect it to be open source by April.