The Fifth Elephant 2012

Finding the elephant in the data.

What are your users doing on your website or in your store? How do you turn the piles of data your organization generates into actionable information? Where do you get complementary data to make yours more comprehensive? What tech, and what techniques?

The Fifth Elephant is a two-day conference on big data.

Early Geek tickets are available from fifthelephant.doattend.com.

The proposal funnel below lets you submit a session and vote on proposed sessions. When proposing a session, it is good practice to introduce yourself and share details about your work as well as the subject of your talk.

Each community member can vote for or against a talk. A vote from each member of the Editorial Panel is equivalent to two community votes. Both types of votes will be considered for final speaker selection.

It’s useful to keep a few guidelines in mind while submitting proposals:

  1. Describe how to use something that is available under a liberal open source license. Participants can use this without having to pay you anything.

  2. Tell a story of how you did something. If it involves commercial tools, please explain why they made sense.

  3. Buy a slot to pitch whatever commercial tool you are backing.

Speakers will get a free ticket to both days of the event. Proposers whose talks are not on the final schedule will be able to purchase tickets at the Early Geek price of Rs. 1800.

Hosted by

The Fifth Elephant, known as one of the best #datascience and #machinelearning conferences in Asia, is transitioning into a year-round forum for conversations about data and ML engineering, data science in production, and data security and privacy practices.

Mohit Chawla

@alcy

Logstash & Elasticsearch - Give meaning to your logs, and more

Submitted May 27, 2012

There is a lot of information available in your server/app logs, and a lot of noise too. You can treat logs as a dry, lifeless source of information and use them only when troubleshooting or debugging, or you can do interesting things with them: make sense of them and use them as an important data source to proactively drive decisions about your infrastructure and apps.

Outline

Logstash is a pluggable system for handling events, and logs are treated as events. It supports multiple inputs as event sources and multiple outputs, and lets you filter information, structure your logs, mutate them, drop them, add metadata, and more. Elasticsearch is an open source (Apache 2), distributed, RESTful search engine built on top of Apache Lucene. It is also one of the available outputs for Logstash, so we can do all sorts of interesting things with those logs/events!
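To make the input, filter, and output stages described above concrete, here is a minimal Python sketch of the general idea. It is not Logstash's code or configuration syntax: the log line format, the function names, and the `mail01.example.com` host are all hypothetical, chosen only for illustration.

```python
import re
from typing import Callable, Iterable, List, Optional

Event = dict

# Hypothetical log format for illustration: "<timestamp> <LEVEL> <message>"
LINE_RE = re.compile(r"^(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<message>.*)$")


def lines_input(lines: Iterable[str]) -> Iterable[Event]:
    """Input: wrap each raw log line in an event."""
    for line in lines:
        yield {"raw": line.rstrip("\n")}


def parse_filter(event: Event) -> Optional[Event]:
    """Filter: structure the raw line into fields, or tag it as unparsed."""
    match = LINE_RE.match(event["raw"])
    if match is None:
        event["tags"] = ["unparsed"]
    else:
        event.update(match.groupdict())
    return event


def drop_debug_filter(event: Event) -> Optional[Event]:
    """Filter: drop noisy events by returning None."""
    return None if event.get("level") == "DEBUG" else event


def add_host_filter(event: Event) -> Optional[Event]:
    """Filter: add metadata (the host name here is a made-up example)."""
    event["host"] = "mail01.example.com"
    return event


def run_pipeline(
    events: Iterable[Event],
    filters: List[Callable[[Event], Optional[Event]]],
    outputs: List[Callable[[Event], None]],
) -> None:
    """Pass every event through each filter in order, then to every output."""
    for event in events:
        for apply_filter in filters:
            event = apply_filter(event)
            if event is None:
                break  # a filter dropped the event
        else:
            for emit in outputs:
                emit(event)


sample = [
    "2012-05-27T10:00:00 INFO connection accepted",
    "2012-05-27T10:00:01 DEBUG buffer flushed",
    "garbled line",
]
collected: List[Event] = []
run_pipeline(
    lines_input(sample),
    [parse_filter, drop_debug_filter, add_host_filter],
    [collected.append],
)
```

Here the only output appends events to a list. In a real setup, an output would instead ship each structured event to a store such as Elasticsearch, where it becomes searchable.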

Speaker bio

Currently a sysadmin at Directi. I started using and hacking on logstash months back, when I also came across elasticsearch. We use both in our infrastructure, where we are currently trying to use all the information from our mail server logs to help find patterns for spam prevention, detect possibly suspicious activity from users (or the network), and support general debugging/troubleshooting.

http://github.com/alcy/logstash is my fork of logstash, where I added support for stomp input/output (using onstomp) and xmpp input/output, along with a bunch of docs/wiki pages.

http://github.com/alcy/Tag-Gen is a fun project of mine that indexes my github/bookmarks using elasticsearch and clusters results using Carrot2.
