Failure is an Option

Sat, 9 Jul 2011, 09:30 AM – 05:30 PM IST

Dharmaram College, Bengaluru

This submission has been added to the schedule

Submitted Jun 29, 2011

Section: Development
Technical level: Beginner
Session type: Lecture

This talk covers failure in the cloud: how likely it is, and how the remedial measures differ from those in a data-centre setup.

Outline

The fundamental assumption of the cloud is that someone else runs your machines, buys your disks and routes your network. Unfortunately, this means you cannot tell in advance when a machine will fail, when storage will error out or when network connectivity will choke.

Even in such an error-prone and unreliable setup, it is still possible to build a reliable system with great uptime by making your applications more agile. Most importantly, recovery from failure takes a different approach from that of a fixed-node setup: a dynamic system that always rolls forward.
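As a rough illustration of what "roll-forward" recovery might look like, here is a minimal Python sketch. It assumes that rolling forward means discarding a failed node and launching a fresh replacement rather than repairing the node in place; that reading is an interpretation of the outline, not something the proposal spells out. The `Node` and `Pool` names are hypothetical, and no real cloud API is involved.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a cloud machine."""
    node_id: int
    healthy: bool = True


class Pool:
    """Hypothetical pool of interchangeable nodes kept at a fixed size."""

    def __init__(self, size):
        self.size = size
        self._ids = itertools.count(1)
        self.nodes = [self._launch() for _ in range(size)]

    def _launch(self):
        # Assumption: a real system would call its provider's API here.
        return Node(next(self._ids))

    def roll_forward(self):
        """Drop unhealthy nodes and launch fresh ones; never repair in place.

        Returns the number of nodes that were replaced.
        """
        survivors = [n for n in self.nodes if n.healthy]
        replaced = len(self.nodes) - len(survivors)
        while len(survivors) < self.size:
            survivors.append(self._launch())
        self.nodes = survivors
        return replaced


pool = Pool(3)
pool.nodes[1].healthy = False          # simulate an unannounced failure
replaced = pool.roll_forward()
print(replaced, [n.node_id for n in pool.nodes])  # prints: 1 [1, 3, 4]
```

The failed node's identity is never reused: the pool returns to full size with a new node, which is the contrast with a fixed-node setup where the same machine would be repaired and brought back.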