Vishnu Naini

@vishnunaini

Santanu Sinha

@santanusinha

Drove: A Simple, Performant, and Operations-Friendly Container Orchestrator

Submitted Oct 23, 2024

We shall discuss Drove, a simple container orchestrator developed at PhonePe that focuses on efficient resource utilization, container performance, straightforward compliance and security models, and ease of management. At PhonePe, containers running on Drove clusters, deployed across multiple data centers and the cloud, handle millions of requests per second and power all services and apps across the different business lines.

Drove focuses on reliable and easy management of stateless, containerised services and tasks at large scale.

We shall cover how Drove helps us manage isolated clusters running interdependent microservices across compliance firewalls; the Drove Gateway, a traffic gateway that uses Nginx auto-configuration; the Drove CoreDNS plugin for service discovery; and Epoch, our simple time-based job scheduler built on Drove.

Agenda

  • Why we developed Drove
  • Applications and tasks in Drove
  • The Drove cluster
  • Drove Gateway
  • Drove CoreDNS plugin
  • Epoch - our time-based job scheduler built on Drove
  • Operations and maintenance (including caveats on Docker and Podman)
  • Observability of the Drove cluster and containers
  • Introduction to the Drove repository and documentation

Takeaways

  • Drove provides a simplified orchestration system to deploy and execute service containers across a cluster of machines. It is a logical successor to Apache Mesos/Marathon-based container orchestration.
  • It does not strive to solve all requirements for building the complete infrastructure layer in a DC/Cloud. Instead, some other requirements, such as service discovery, auto-scaling, and key-value storage, are handled by existing platforms at PhonePe. It is decidedly simpler than systems like Kubernetes and OpenShift, and provides a simple deployment architecture with very few moving components while steering clear of bloat.
  • Battle-tested in production, with clusters running thousands of containers that handle millions of requests per second across hundreds of hosts per cluster, spanning multiple data centers
  • Open sourced end to end

Audience

  • Site Reliability and DevOps Engineers
  • Engineering leaders
  • Cloud architects and engineers

For further resources visit https://phonepe.github.io/drove-orchestrator/

