Rootconf 2017

On service reliability

Burzin Engineer

@burzinengineer

Mesos, KVM and the story of Infrastructure at PhonePe

Submitted Jan 24, 2017

PhonePe is a mobile payment solution based on NPCI UPI. PhonePe's infrastructure runs on a combination of Docker containers, VMs and bare-metal servers. This talk focuses on an internal cloud solution that uses Mesos to help manage these components.

Outline

Brief Introduction to PhonePe Infrastructure 5m

  • Layout of applications on Mesos with the Marathon framework: these run exclusively in containers, implement only business logic, and are written in Java or Node.js.
  • Applications on VMs (e.g. load balancers)
  • Applications on bare-metal servers (e.g. databases)
  • Core infrastructure components (routers, firewalls, tunnels, DNS, DHCP, etc.)

The Problem 5m

  • DevOps needs to seamlessly manage an environment that consists of containers, virtual machines and bare-metal servers.
  • Need to manage resources like CPU, memory and private IPs
  • Constraints like tenancy, apps on SSD, PCI, etc.
  • Multiple Operating Systems/Versions
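
As a rough illustration of the resource and constraint problem listed above, the following is a minimal sketch, not PhonePe's implementation. The `Host` and `Workload` types, the attribute names ("ssd", "pci") and the first-fit policy are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical host record: free capacity plus attribute tags."""
    name: str
    cpus: float
    mem_mb: int
    attributes: set = field(default_factory=set)  # e.g. {"ssd", "pci"} (assumed tags)


@dataclass
class Workload:
    """Hypothetical workload: what it needs and which attributes it requires."""
    name: str
    cpus: float
    mem_mb: int
    required: set = field(default_factory=set)


def place(workload, hosts):
    """First-fit placement: return the name of the first host that has enough
    CPU and memory and carries every required attribute, or None."""
    for host in hosts:
        fits = (host.cpus >= workload.cpus
                and host.mem_mb >= workload.mem_mb
                and workload.required <= host.attributes)
        if fits:
            # Reserve the resources so later placements see reduced capacity.
            host.cpus -= workload.cpus
            host.mem_mb -= workload.mem_mb
            return host.name
    return None
```

A workload that requires both "ssd" and "pci" skips any host lacking either tag, even when that host has spare CPU and memory. This is the kind of bookkeeping a cluster manager takes over.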

Mesos Frameworks 12m

The big picture 12m

  • PhonePe Cloud Implementation
  • Multiple environments (staging, integration)
  • Multiple data centers
  • How everything (the application Mesos cluster, the virtual machine cloud and the bare-metal servers) ties into a single seamless infrastructure
  • Short update on integration of DNS

Q & A 5m

Requirements

NA

Speaker bio

Burzin Engineer is the cofounder and Chief Reliability Officer at PhonePe.

Slides

https://speakerdeck.com/krishnanvrphonepe/rootconf2017

Comments

    Burzin Engineer

    @burzinengineer Submitter

    The talk will mostly address the following

    • The PhonePe cloud KVM plugin, with tight integration into Mesos

    • We will not be discussing failover at this point; maybe a topic for next year?

    • We have private P2P lines between DCs and also an IPsec backup link over the internet

    • The question of Kubernetes vs. Mesos is an interesting one.

    We chose Mesos because it has proven scale for tens of thousands of servers. Kubernetes is a cluster manager for containers (only?), while Mesos is a distributed systems kernel that makes your cluster look like one giant computer to all supported frameworks and apps that are built to run on Mesos. Kubernetes is itself one framework (among others) that can be run on Mesos.
    As far as I know, it's not easy or possible to build your own frameworks on top of Kubernetes (not 100% sure on this).

    Mesos abstracts the underlying hardware (e.g. bare metal or VMs) away and just exposes the resources. It contains primitives for writing distributed applications, such as message passing and task execution, which makes entirely new applications possible. Apache Spark is one example of a framework (the Mesos term for such an application) that was originally built for Mesos; Chronos is another. This enabled really fast development: the developers of Spark didn't have to worry about networking to distribute tasks amongst nodes, as this is a core primitive in Mesos.
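
    The idea of exposing resources to frameworks can be sketched as a toy offer round. This is a simplified model and not the Mesos API: the `Framework` class, `resource_offer` method and `run_offer_round` function are hypothetical names, and offering to frameworks in list order is an assumption made to keep the example short.

```python
class Framework:
    """Hypothetical framework scheduler holding a queue of pending tasks."""

    def __init__(self, name, pending):
        self.name = name
        self.pending = list(pending)  # [(task_name, cpus, mem_mb), ...]

    def resource_offer(self, cpus, mem_mb):
        """Accept as many pending tasks as fit in the offer and return them."""
        accepted, remaining = [], []
        for task in self.pending:
            _, t_cpus, t_mem = task
            if t_cpus <= cpus and t_mem <= mem_mb:
                cpus -= t_cpus
                mem_mb -= t_mem
                accepted.append(task)
            else:
                remaining.append(task)
        self.pending = remaining
        return accepted


def run_offer_round(agents, frameworks):
    """Offer each agent's free resources to each framework in turn and
    record which framework launched which task on which agent."""
    placements = []
    for agent, free in agents.items():
        for fw in frameworks:
            for name, t_cpus, t_mem in fw.resource_offer(free["cpus"], free["mem_mb"]):
                # Deduct what the framework accepted before the next offer.
                free["cpus"] -= t_cpus
                free["mem_mb"] -= t_mem
                placements.append((fw.name, name, agent))
    return placements
```

    Each framework sees only an amount of CPU and memory, never the machine behind it, and decides for itself which of its tasks to launch.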

    To my knowledge, Kubernetes is not used inside Google in production deployments today. For production, Google uses Omega/Borg, which is much more similar to the Mesos/Marathon model.

    Both are good, and the choice depends on your comfort level and skill. We started small and didn't know how big we would grow or how fast, so we went with Mesos. There is no right or wrong here.

    Posted 8 years ago
