The Fifth Elephant 2018

The seventh edition of India's best data conference

Building big data pipelines on Kafka and Kubernetes

Submitted by Abhishek Agarwal (@abhishek-appd) on Saturday, 31 March 2018

Technical level

Intermediate

Section

Full talk

Status

Submitted

Total votes:  +22

Abstract

At AppDynamics, we have been pushing the limits of how far we can scale metric ingestion. Toward this goal, we have been taking logical pieces out of the monolithic application and re-architecting them to handle large scale.

Initially, we decided to adopt a stream processing platform to port these new pieces to, but we later realized that not all of them are typical real-time streaming applications. Some are regular web services, yet they share the same concerns, such as orchestration, fault tolerance, resiliency, and scaling up and down. In fact, most organizations face this problem but usually end up deploying different infrastructure for different categories of applications. However, operational simplicity and a lean infrastructure were important concerns for us and motivated us to take a different route.

In this talk, I will describe how modelling our real-time data pipelines as asynchronous microservices has allowed us to use the same Kubernetes infrastructure for both data pipelines and regular web services. I will also explain how this unification immensely simplifies our deployment and operations work and keeps our services lean.

Outline

  1. Problem statement - Why we started on this path
  2. Initial vision - Re-architecting the application with stream processing platforms
  3. Separation of concerns - application concerns (at-least-once processing) vs infra concerns (scaling)
  4. Course correction - Moving from stream processing platform to an orchestration platform
  5. Ingestion pipeline - Ingestion is the first service we scaled using Kafka and Kubernetes
  6. Lessons - What were the challenges and how we overcame them
  7. Q/A
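
To make the split in item 3 concrete, here is a minimal Python sketch of an at-least-once worker. It is an illustration under stated assumptions, not the implementation presented in the talk: `InMemoryPartition`, `run_worker`, `make_flaky_handler` and `demo` are hypothetical names, and the partition is an in-memory stand-in rather than a real Kafka client. The application-level concern is the ordering (process first, commit the offset afterwards); restarting the crashed worker is left to the orchestrator.

```python
from typing import Callable, List, Tuple


class InMemoryPartition:
    """Hypothetical stand-in for one Kafka partition plus its committed offset."""

    def __init__(self, records: List[str]) -> None:
        self.records = records
        self.committed = 0  # offset of the next record to read after a restart

    def poll(self, max_records: int) -> List[str]:
        # Always resume from the last committed offset, as a restarted worker would.
        return self.records[self.committed:self.committed + max_records]

    def commit(self, count: int) -> None:
        self.committed += count


def run_worker(partition: InMemoryPartition,
               handle: Callable[[str], None],
               batch_size: int = 2) -> None:
    """Drain the partition, committing a batch only after every record succeeded."""
    while True:
        batch = partition.poll(batch_size)
        if not batch:
            return
        for record in batch:
            handle(record)  # may raise; the batch's offset then stays uncommitted
        partition.commit(len(batch))


def make_flaky_handler(seen: List[str], fail_once_on: str) -> Callable[[str], None]:
    """Build a handler that crashes the first time it meets `fail_once_on`."""
    state = {"failed": False}

    def handle(record: str) -> None:
        if record == fail_once_on and not state["failed"]:
            state["failed"] = True
            raise RuntimeError("simulated crash")
        seen.append(record)

    return handle


def demo() -> Tuple[List[str], int]:
    partition = InMemoryPartition(["a", "b", "c", "d"])
    seen: List[str] = []
    handle = make_flaky_handler(seen, fail_once_on="d")
    try:
        run_worker(partition, handle)
    except RuntimeError:
        pass  # the worker dies here; an orchestrator would restart it
    run_worker(partition, handle)  # the restarted worker resumes from the commit
    return seen, partition.committed


if __name__ == "__main__":
    print(demo())
```

In this sketch the record `c` is handled twice, because the crash happens before its batch is committed. That duplication is the price of at-least-once processing, so handlers in such a design generally need to be idempotent.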

Speaker bio

Abhishek is a Staff Software Engineer at AppDynamics India Pvt Ltd and works on the real-time stream processing infrastructure at AppDynamics. He is also a core member of the team responsible for transitioning the AppDynamics product from a monolith to an asynchronous, microservices-driven architecture. Previously, he worked on the user/data platform team at InMobi. He is also a PMC member of Apache Storm.

Slides

https://docs.google.com/presentation/d/1Ao1ae14k0qQq8LKWZXCv0UOeL-Ki_JeqE94udFNiFTM/edit?usp=sharing

Preview video

https://youtu.be/exkbTAP7XB4

Comments

  • 1
    Zainab Bawa (@zainabbawa) Reviewer 8 months ago

    You have to submit draft slides and preview video for us to evaluate your talk.

    • 1
      Abhishek Agarwal (@abhishek-appd) Proposer 8 months ago

      Uploaded the preview video and draft slides.

      • 1
        Zainab Bawa (@zainabbawa) Reviewer 7 months ago

        Got it. Your proposal is under review.