Large-scale business stats aggregation using Kafka
At Indix we collect and process a lot of data. We monitor the correct behaviour of our systems by collecting business metrics. Over time, we moved most of our system from batch MapReduce jobs to Kafka stream tasks, so our stats had to become more real-time as well. To do that we built a system called Abel, which aggregates the millions of events it receives and computes stats from them.
- Stats as separate MR jobs
- Pros and Cons for MR jobs
- Trying to use Riemann for stat collection
- Pros and Cons for Riemann
- Generalizing the Stats abstraction with Semigroups
- Semigroup properties
- Semigroup examples
- Emission of stats with Abel
- Explosion of keys
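
To give a flavour of the semigroup idea in the outline, here is a minimal sketch in Python. It is not Abel's actual code: the `Count` and `Max` types, the `aggregate` and `merge` helpers, and the metric names are all hypothetical and chosen only for illustration. The point is that a stat only needs an associative `combine`, so partial aggregates built on different batches of events can be merged later and give the same answer as aggregating everything in one go.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Count:
    """Hypothetical stat: a running count. combine is addition (associative)."""
    value: int

    def combine(self, other: "Count") -> "Count":
        return Count(self.value + other.value)


@dataclass(frozen=True)
class Max:
    """Hypothetical stat: a running maximum. combine is max (associative)."""
    value: float

    def combine(self, other: "Max") -> "Max":
        return Max(max(self.value, other.value))


def aggregate(events):
    """Fold (key, stat) events into one stat per key using combine."""
    totals = {}
    for key, stat in events:
        totals[key] = totals[key].combine(stat) if key in totals else stat
    return totals


def merge(left, right):
    """Merge two partial aggregates key by key, again using combine."""
    merged = dict(left)
    for key, stat in right.items():
        merged[key] = merged[key].combine(stat) if key in merged else stat
    return merged


# Illustrative events; the metric names are made up.
events = [
    ("products_crawled", Count(1)),
    ("products_crawled", Count(1)),
    ("max_price", Max(10.0)),
    ("products_crawled", Count(1)),
    ("max_price", Max(25.0)),
]

# Aggregating everything at once ...
whole = aggregate(events)
# ... matches aggregating two batches separately and merging them.
split = merge(aggregate(events[:2]), aggregate(events[2:]))
assert whole == split
```

Because every stat is combined through the same interface, adding a new kind of stat in this sketch means writing one more type with an associative `combine`; the aggregation code does not change.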
I have been working at Indix for the last 4.5 years, and before that I was part of Thoughtworks. I have worked on almost every part of Indix and have been through the journey of how it evolved.