Large scale business stats aggregation using Kafka

Jul 2017

27 Thu 08:15 AM – 10:00 PM IST

28 Fri 08:15 AM – 06:25 PM IST

MLR Convention Centre, Whitefield, Bengaluru

Submitted Mar 30, 2017

Technical level: Intermediate

At Indix we collect and process a lot of data. We monitor the correct behaviour of our systems by collecting business metrics. Over time, we moved most of our system from batch MapReduce jobs to Kafka stream tasks, so the stats had to become more real time as well. To do that we built a system called Abel, which aggregates the millions of events it receives and computes stats from them.

Outline

Stats as separate MR jobs
Pros and Cons for MR jobs
Trying to use Riemann for stat collection
Pros and Cons for Riemann
Generalizing the Stats abstraction with Semigroups
Semigroup properties
Semigroup examples
Emission of stats with Abel
Explosion of keys
Performance
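
To give a feel for the semigroup items in the outline, here is a minimal sketch in Python. It is not Abel's actual code or API: the `Stat` class, the `exploded_keys` and `aggregate` helpers, and the sample events are all hypothetical. A semigroup is a type with an associative combine operation, so partial aggregates can be merged in any grouping, which is what makes it a natural fit for stats computed across many stream tasks. The sketch also assumes one possible reading of "explosion of keys": each event is counted under every combination of its tags, so the number of keys grows as 2^n for n tags.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class Stat:
    """Hypothetical mergeable summary of an integer metric."""
    count: int
    total: int
    minimum: int
    maximum: int

    @staticmethod
    def of(value: int) -> "Stat":
        # Lift a single observation into a Stat.
        return Stat(1, value, value, value)

    def combine(self, other: "Stat") -> "Stat":
        # Associative: a.combine(b).combine(c) == a.combine(b.combine(c)),
        # because integer +, min and max are all associative.
        return Stat(
            self.count + other.count,
            self.total + other.total,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
        )


Key = FrozenSet[Tuple[str, str]]
Event = Tuple[Dict[str, str], int]


def exploded_keys(tags: Dict[str, str]) -> Iterable[Key]:
    """Yield every subset of the tags, including the empty (global) key."""
    items = sorted(tags.items())
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            yield frozenset(subset)


def aggregate(events: List[Event]) -> Dict[Key, Stat]:
    """Fold each event into the running Stat of every key it explodes to."""
    totals: Dict[Key, Stat] = {}
    for tags, value in events:
        stat = Stat.of(value)
        for key in exploded_keys(tags):
            totals[key] = totals[key].combine(stat) if key in totals else stat
    return totals


# Made-up sample events: (tags, latency in ms).
events: List[Event] = [
    ({"site": "a.com", "status": "ok"}, 120),
    ({"site": "a.com", "status": "error"}, 300),
    ({"site": "b.com", "status": "ok"}, 80),
]
totals = aggregate(events)
```

Here `totals[frozenset()]` holds the global stat over all events, and `totals[frozenset({("site", "a.com")})]` holds the stat for one site regardless of status. Because `combine` is associative, two workers could each run `aggregate` on a slice of the events and merge their per-key results afterwards.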

Speaker bio

I have been working at Indix for the last 4.5 years, and before that I was part of Thoughtworks. I have worked on almost every part of Indix and have been through the journey of how it evolved.