Continuous Monitoring and Faster Service Restoration (CM and FSR)
Submitted by Aravind GV (@agv) (proposing) on Wednesday, 13 January 2016
This is a proposal requesting someone to speak on this topic. If you’d like to speak, leave a comment.
How to continuously monitor hundreds of services and automatically remediate a service if it fails.
Given the vast heterogeneity of applications in every company, it is important to provide a stable, consolidated solution for auto-restart. Our CM and FSR framework aims to improve the restoration of applications through automation. At this point the focus is on reducing time to recover, automating dependency management, and eliminating human error rather than on self-healing (i.e. crawl, walk, run). The primary objectives are to achieve an RTO of under 15 minutes or to reduce the current time to restore by at least 80%. This standard framework needs to be adopted across a variety of applications, be secure and compliant, and provide verification of the availability of capabilities, integrated into a dashboard.
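To make the objectives above concrete, here is a minimal Python sketch of dependency-ordered restarts and of the restore-time check. It is an illustration only, not the framework described in this proposal: the function names (`restart_order`, `remediate`, `meets_objective`), the service names, and the `is_healthy` and `restart` callbacks are all hypothetical, and it assumes Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

RTO_TARGET_MINUTES = 15   # "RTO < 15 mins" from the proposal
REDUCTION_TARGET = 0.80   # "at least 80%" reduction in time to restore


def restart_order(dependencies):
    """Return services ordered so each one comes after everything it depends on.

    `dependencies` maps a service name to the set of services it depends on
    (a hypothetical input format chosen for this sketch).
    """
    return list(TopologicalSorter(dependencies).static_order())


def remediate(dependencies, is_healthy, restart):
    """Restart every unhealthy service in dependency order.

    `is_healthy` and `restart` are caller-supplied callbacks (assumptions);
    returns the names of the services that were restarted.
    """
    restarted = []
    for service in restart_order(dependencies):
        if not is_healthy(service):
            restart(service)
            restarted.append(service)
    return restarted


def meets_objective(restore_minutes, baseline_minutes):
    """True if the RTO is under 15 minutes, or the time to restore has
    dropped by at least 80% relative to the baseline."""
    reduction = (baseline_minutes - restore_minutes) / baseline_minutes
    return restore_minutes < RTO_TARGET_MINUTES or reduction >= REDUCTION_TARGET


if __name__ == "__main__":
    # Hypothetical three-tier application: web depends on api, api on db.
    deps = {"web": {"api"}, "api": {"db"}, "db": set()}
    health = {"db": True, "api": False, "web": False}

    def fake_restart(name):
        health[name] = True  # stand-in for a real restart action

    print(remediate(deps, lambda name: health[name], fake_restart))
    print(meets_objective(restore_minutes=12, baseline_minutes=120))
```

Restarting in dependency order means a service is only brought back once the services it relies on have been checked, which is one simple way to read "automated dependency management" without a human deciding the sequence.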
Aravind G V currently works at Intuit India Product Development Ltd as a Staff Application Operation Engineer.
Experienced DevOps Engineer dedicated to automation and optimization. Understands and manages the space between operations and development to quickly deliver code to customers. Has experience with the Cloud, as well as DevOps automation development for Linux systems. Brings maturity, enthusiasm, and a drive to learn new technologies, along with real-world experience.