Logstash & Elasticsearch - Give meaning to your logs, and more
Submitted by Mohit Chawla (@alcy) on Sunday, 27 May 2012
Big Data Infrastructure & Processing
There is a lot of information available in your server/app logs. And a lot of noise, too. You can either treat all of this as a dry, lifeless source of information that you only look at when troubleshooting/debugging, or you can do interesting things with it: make sense of it and use it as an important data source to proactively drive decisions for your infrastructure/app.
Logstash is a pluggable system for handling events, and logs are treated as events. It supports multiple inputs as event sources and multiple outputs, and lets you filter information, structure your logs, mutate them, drop them, add metadata and more. Elasticsearch is an open source (Apache 2 licensed), distributed, RESTful search engine built on top of Apache Lucene. It is also one of the available outputs for logstash, so we can do all sorts of interesting things with those logs/events!
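To make the input, filter, output flow described above a bit more concrete, here is a minimal sketch in plain Python. It is not logstash's actual configuration or API, just a stand-in for the idea: the log format, the field names (`raw`, `level`, `source`, `tags`), the `mail01` host name and all function names are assumptions made up for illustration. Each line becomes an event, filters can structure it, drop it or add metadata, and the output receives a JSON document of the kind one might index into a search engine such as elasticsearch.

```python
import json
import re
from typing import Callable, Iterable, List, Optional

Event = dict
Filter = Callable[[Event], Optional[Event]]

# Hypothetical log format: "<timestamp> <LEVEL> <message>"
LINE_RE = re.compile(r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$")


def parse_line(event: Event) -> Optional[Event]:
    """Structure the raw line into fields; tag it if it does not match."""
    match = LINE_RE.match(event["raw"])
    if match is None:
        event["tags"] = event.get("tags", []) + ["parse_failure"]
        return event
    event.update(match.groupdict())
    return event


def drop_debug(event: Event) -> Optional[Event]:
    """Drop noisy events by returning None."""
    return None if event.get("level") == "DEBUG" else event


def add_source(source: str) -> Filter:
    """Build a filter that adds metadata about where the event came from."""
    def _add(event: Event) -> Optional[Event]:
        event["source"] = source
        return event
    return _add


def run_pipeline(lines: Iterable[str], filters: List[Filter],
                 output: Callable[[Event], None]) -> None:
    """Turn each input line into an event, run the filters, emit survivors."""
    for line in lines:
        event: Optional[Event] = {"raw": line.rstrip("\n")}
        for apply_filter in filters:
            event = apply_filter(event)
            if event is None:  # a filter dropped the event
                break
        else:
            output(event)


def demo() -> List[str]:
    lines = [
        "2012-05-27T10:00:00Z INFO connection accepted from 10.0.0.5",
        "2012-05-27T10:00:01Z DEBUG handshake details",
        "not a structured line",
    ]
    docs: List[str] = []
    run_pipeline(
        lines,
        [parse_line, drop_debug, add_source("mail01")],
        lambda event: docs.append(json.dumps(event, sort_keys=True)),
    )
    return docs


if __name__ == "__main__":
    for doc in demo():
        print(doc)
```

In this sketch the output just collects JSON strings in a list; in a real setup that step would be whichever output the pipeline is configured with.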
I am currently a sysadmin at Directi. I started using and hacking on logstash months back, which is also when I came across elasticsearch. We now use both in our infrastructure, where we are trying to use all the information from our mail server logs to find patterns for spam prevention, detect possibly suspicious activity from users (or the network), and help with general debugging/troubleshooting.
http://github.com/alcy/logstash is my fork of logstash, where I added support for stomp input/output using onstomp, xmpp input/output, and a bunch of docs/wiki pages.
http://github.com/alcy/Tag-Gen is a fun project of mine that indexes my github/bookmarks using elasticsearch and clusters results using Carrot2.