Sat, 26 May 2012, 10:00 AM – 05:30 PM IST
Rootconf is HasGeek’s first annual conference for sysadmins and devops to share experience and knowledge, to teach and learn, and to meet colleagues and friends.
More information at rootconf.in. Tickets are available from rootconf.doattend.com.
Gaurav
Submitted May 21, 2012
To monitor a distributed system for various system-level and application-level statistics using popular open source tools.
Active real-time monitoring is one of the most basic prerequisites for designing a scalable distributed system. The easier it is to add and track custom metrics across the distributed system, the easier it is to get a clear idea of current system performance, identify bottlenecks, and implement design changes to scale in a particular direction.
Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and grids. Nagios is a popular IT infrastructure monitoring tool, which we use for managing email/SMS alerts. This talk covers how we use and integrate these open source tools into a customized system with easy integration and centralized metric gathering. It helps us get a clear picture of the current state of the server farm, execute commands in parallel across a selection of these servers, and get notified of any erroneous state as soon as it occurs.
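As a rough illustration of the alerting side, here is a minimal sketch of a Nagios-style check that classifies a single metric reading. It is not the setup presented in the talk. The function name `check_metric`, the metric name `load_one`, and the threshold values are assumptions made for the example. The only convention taken from Nagios itself is the plugin contract: exit code 0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN, with optional performance data after a `|` in the status line.

```python
# Nagios plugin exit codes (standard plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_metric(name, value, warn, crit):
    """Classify one metric reading against warning/critical thresholds.

    Hypothetical helper: returns (exit_code, status_line).
    Higher values are assumed to be worse; None means no reading was available.
    """
    if value is None:
        return UNKNOWN, f"{STATUS_NAMES[UNKNOWN]} - {name} has no reading"
    if value >= crit:
        code = CRITICAL
    elif value >= warn:
        code = WARNING
    else:
        code = OK
    # Text after '|' is performance data in the form label=value;warn;crit
    return code, f"{STATUS_NAMES[code]} - {name}={value} | {name}={value};{warn};{crit}"


# Hypothetical reading; a real check would fetch the value from the metric store.
code, line = check_metric("load_one", 3.2, warn=4.0, crit=8.0)
print(line)  # a real plugin would then call sys.exit(code)
```

A real plugin would end with `sys.exit(code)` so that Nagios can read the state from the process exit status and trigger the email/SMS notification.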
I am a Linux enthusiast who works with the Platforms & Systems team at Capillary Technologies. I develop various apps in the cloud and optimize them for scalability.