A Beast to Process Kafka Events
Building an event processing library comes with its own baggage.
No data loss takes the highest priority, followed by performance and scalability. We scaled the tool to millions of messages per minute through architecture rather than language choice, while keeping it generic enough to deploy for any schema/table with just a config change.
This talk will walk you through the journey of building this library and the lessons learned along the way.
We built our own event processing library that consumes events from Kafka and pushes them to BigQuery. All of our microservices are event sourced. We handle hundreds of topics, with a few topics peaking at 21K messages/second.
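The core of the no-data-loss guarantee can be sketched in a few lines of Java. This is an illustration of the general at-least-once pattern, not Beast's actual API; the `Sink` interface and all names here are hypothetical stand-ins for the BigQuery writer. The idea: acknowledge (commit) a Kafka offset only after the batch has been safely accepted downstream, so a crash causes a replay rather than a drop.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the at-least-once contract (illustrative names, not Beast's real API):
// the consumer commits an offset only after the sink accepts the batch.
public class AtLeastOnceSketch {
    /** Stand-in for a BigQuery writer; may fail transiently. */
    interface Sink { boolean push(List<String> batch); }

    private final Sink sink;
    private long committedOffset = -1; // last offset safely persisted downstream

    public AtLeastOnceSketch(Sink sink) { this.sink = sink; }

    /** Process a batch starting at `offset`; retry until the sink accepts it. */
    public void process(long offset, List<String> batch) {
        while (!sink.push(batch)) {
            // transient failure: retry the same batch instead of dropping it
        }
        committedOffset = offset + batch.size() - 1; // commit only after success
    }

    public long committedOffset() { return committedOffset; }

    public static void main(String[] args) {
        List<String> delivered = new ArrayList<>();
        // A flaky sink that fails on its first attempt, then succeeds.
        Sink flaky = new Sink() {
            int calls = 0;
            public boolean push(List<String> batch) {
                if (calls++ == 0) return false; // simulate a transient sink error
                delivered.addAll(batch);
                return true;
            }
        };
        AtLeastOnceSketch consumer = new AtLeastOnceSketch(flaky);
        consumer.process(0, List.of("e1", "e2", "e3"));
        System.out.println(delivered.size() + " delivered, committed offset "
                + consumer.committedOffset());
    }
}
```

The trade-off is possible duplicates on replay, which is why downstream monitoring in BigQuery (covered below) still matters.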
In this talk, we will cover:
- Why we built our custom event processing tool, Beast
- Customising code for each input/output combination, and our old way of deployment
- Limitations of existing systems for our use case
- Ensuring no data loss
- How we test the application for data loss
- How we monitor data loss in BigQuery
- How we achieved high throughput with acceptable latency
- Architecture: processing with queues (and why we didn't pick Redis)
- Why we couldn't use Go
- How we achieved scalability using Kubernetes
- Load testing
- Chaos testing
Enhancements (ease of deployment):
- Parser to generate config from proto
- Auto-updating the table schema for new fields in the proto
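The queue-based architecture in the outline can be sketched with plain `java.util.concurrent` primitives (illustrative only, not Beast's actual code): a Kafka-poll thread feeds a bounded in-memory queue, and worker threads drain it toward BigQuery. The bounded queue applies backpressure when workers fall behind, which is one argument for an in-process queue over an external store like Redis.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of queue-based processing (illustrative, not Beast's real code):
// one poll thread feeds a bounded queue; workers drain it toward the sink.
public class QueuePipelineSketch {
    private static final String POISON = "\u0000STOP"; // shutdown marker

    public static int run(int messages, int workers) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(100); // bounded => backpressure
        AtomicInteger pushedToSink = new AtomicInteger();

        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int w = 0; w < workers; w++) {
            pool.submit(() -> {
                try {
                    while (true) {
                        String msg = queue.take();
                        if (msg == POISON) break;       // drain until shutdown marker
                        pushedToSink.incrementAndGet(); // stand-in for a BigQuery insert
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // "Kafka poll" loop: put() blocks when the queue is full, i.e. backpressure.
        for (int i = 0; i < messages; i++) queue.put("event-" + i);
        for (int w = 0; w < workers; w++) queue.put(POISON);

        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return pushedToSink.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(1000, 4) + " messages processed");
    }
}
```

Sizing the queue and the worker pool is where the throughput/latency tuning discussed in the talk happens.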
The learnings are generic, irrespective of language.
- Basic understanding of Kafka or pub/sub tools
- Basic use cases for BigQuery
- Basics of building applications in Java
These will make the session more effective.
Dinesh Kumar is a software developer passionate about building products for impact. He works at Gojek on backend services that serve millions of users. He is a Go enthusiast and an active volunteer and co-organiser in the Go community. Artist at times.