Sat, 26 May 2012, 10:00 AM – 05:30 PM IST
Gaurav
Monitoring a distributed system for system-level and application-level statistics using popular open source tools
Active real-time monitoring is one of the most basic prerequisites for designing a scalable distributed system. The easier it is to add and track custom metrics across the distributed system, the easier it is to get a clear idea of current system performance, identify bottlenecks, and implement design changes to scale in a particular direction.
Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and grids. Nagios is a popular IT infrastructure monitoring tool, which we use to manage email/SMS alerts. This talk covers how we use and integrate these open source tools to build a customized system that is easy to integrate with and gathers metrics centrally. It helps us get a clear picture of the current state of the server farm, execute commands in parallel across a selection of these servers, and get notified of any erroneous state as soon as it happens.
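As a rough illustration of the two ideas above (publishing a custom metric to Ganglia and turning a metric value into an alert state for Nagios), here is a minimal Python sketch. It is not the setup described in the talk. It assumes Ganglia's `gmetric` command-line tool is installed on the host, and it follows the usual Nagios plugin convention of exit codes 0 (OK), 1 (WARNING), and 2 (CRITICAL). The metric name `queue_depth`, its units, and the thresholds are hypothetical values chosen only for the example.

```python
import shlex

# Nagios plugin convention: exit code 0 = OK, 1 = WARNING, 2 = CRITICAL.
OK, WARNING, CRITICAL = 0, 1, 2


def gmetric_command(name, value, metric_type="float", units=""):
    """Build (but do not run) a gmetric invocation for one custom metric."""
    cmd = ["gmetric", "--name", name, "--value", str(value), "--type", metric_type]
    if units:
        cmd += ["--units", units]
    return cmd


def nagios_status(value, warn, crit):
    """Map a metric value to a Nagios-style exit code.

    The thresholds are supplied by the caller; higher values are worse.
    """
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


if __name__ == "__main__":
    # "queue_depth" and the thresholds below are made-up illustration values.
    cmd = gmetric_command("queue_depth", 42, metric_type="uint32", units="jobs")
    print(shlex.join(cmd))  # pass cmd to subprocess.run to actually publish it
    print(nagios_status(42, warn=50, crit=100))
```

The command is built as a list rather than a shell string so it can be handed to `subprocess.run` without quoting issues. A real Nagios check would also print a one-line status message and exit with the returned code.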
I am a Linux enthusiast who works with the Platforms & Systems team at Capillary Technologies. I develop various apps in the cloud and optimize them for scalability.