Production Report - Using Apache Flink as a microservice for stateful asynchronous processing
Jagadish Bihani
This talk explains why we chose Flink as a microservice for stateful asynchronous event processing, the challenges we faced in production and how we solved them, and our recommendations for productionizing applications built on Apache Flink.
Key takeaways:
- Architecture pattern of using Flink (or a similar platform) as a microservice for stateful async event processing
- In-depth understanding of Flink fault tolerance concepts
- Production issues/challenges faced and insights on how to solve (& also prevent) them
A basic understanding of stream processing will be an advantage.
Outline
- Brief summary of what Flink is and its important terminology
- Flink as a microservice for asynchronous stateful event stream processing
- Challenges in doing it in a conventional way
- Prerequisite concepts
- Fault tolerance and checkpointing
- Scalable partitioned state
- State Backend - RocksDB
- Asynchronous checkpointing details
- Production Experiences
- Flink TaskManager failover time tuning
- Failure detection mechanism
- Tuning Akka DeathWatch
- How state leaks happen and how to prevent and monitor them
- How to clear old state (the result of a state leak) from a running system without downtime
- How state size and checkpointing can cause processing delays and how to tune it
- Recommendations & Summary
Speaker bio
Software architect at Helpshift. I have worked on stream processing, various backend architectures, and end-to-end data pipelines, and I also have a good understanding of the systems side of software. More details can be found at https://www.linkedin.com/in/jagadish-bihani-1335a04a/
Slides
http://slides.com/jagadishbihani/apache-flink-production-report/fullscreen