Rootconf Mini 2024 (on 22nd & 23rd Nov)

Geeking out on systems and security since 2012

Sandesh Kumar Gupta

@sandeshgupta59

Building Intelligence and resilience for highly available managed DbaaS platforms

Submitted Oct 28, 2024

Objective

At Flipkart, we have seen huge adoption of home-grown managed platforms, running in a multi-cloud setup, by engineering teams working at massive scale, and DbaaS platforms are the protagonists of this story. It is paramount that these platforms maintain high resilience and high availability so that they deliver sustained performance, along with continuous optimisations to handle adoption at scale. In this talk, we delve into:

  • What it takes to measure availability in real time, and to assess platform resiliency for a disaster recovery strategy with a multi-cloud presence.
  • How these platforms intend to deliver further value through continuous optimisation and recommendations that improve the productivity of platform maintainers.
  • Our attempts to capitalise on language models to extract intelligence for multiple platform-level optimisations.

What will be the scope of this session? Key problems, challenges being addressed and lessons learned

Appropriate format for this session - 40 mins talk

A managed DbaaS platform is not merely multiple databases running side by side; rather, it is a stitched and tuned ecosystem of components working in harmony.

DbaaS resiliency trade-offs in a multi-cloud setup during important sale events like Big Billion Days.

Productivity improvements, on-call reduction, and opex tightening.

Target audience and Takeaways for this session. Problem/pain to be solved for the audience.

  • This talk will particularly pique the interest of engineers who are fed up with maintaining various self-managed stacks and are keen on evaluating a platform-first approach. [(1) Building blocks]
  • Backend engineers evaluating systemic availability computation, DR use cases, and productivity measures. [(2) Availability computation]
  • Architects looking to design resilient multi-cloud setups and optimise opex when working with datastores. [(3) Resiliency and (4) Optimisations]
  • AI-for-DB or AI-for-Platforms connoisseurs. [(5) Intelligence in-house]
  • It is also highly relevant for developers working on large-scale distributed systems that require extreme backend scaling with high-throughput, low-latency performance. [Across]
