Rajesh KSV
@rajeshksv37
Breaking Up with Legacy Monitoring: A Seamless Auto Migration Story Supercharged by GenAI
Submitted Apr 20, 2025
Topic of your submission:
Observability
Type of submission:
30 mins talk
I am submitting for:
Rootconf Annual Conference 2025
Migrating a large-scale legacy monitoring system is notoriously painful: it is riddled with complexity, downtime risks, and resistance from users. In this talk, we’ll share the behind-the-scenes journey of one of the most seamless monitoring migrations done at scale.
We transitioned from a legacy, in-house, closed-standards monitoring solution to a modern, high-performance system built on Prometheus, without disrupting developers, without downtime, and without breaking dashboards or alerts. While most companies take years to complete such a migration, we pulled it off in just a few quarters, migrating thousands of dashboards and hundreds of thousands of alerts, all fully automated.
The talk will cover the engineering decisions, tooling, and automation that enabled this zero-touch migration of data, dashboards, and alerts. We will also focus on how GenAI made the last-mile automation faster and more efficient.
Key Takeaways
- Data Migration
Strategies adopted to migrate petabytes of data from the old stack to the new stack
- Grafana Dashboard Migration
Strategies adopted to migrate tens of thousands of Grafana dashboards from OpenTSDB to PromQL
- Alerts Migration
Strategies adopted to migrate hundreds of thousands of complex alerts, and how GenAI made the last-mile automation faster and more efficient
Intended audience
Packed with actionable insights, lessons learned, and best practices, this session is a must-attend for anyone (SREs, Architects) looking to embark on a similar transformation journey.
About us
Rajesh is an Architect for Flipkart’s Observability team with over a decade of experience designing and building high-scale, distributed systems for monitoring, alerting, and logging. He played a key role in transforming Flipkart’s legacy monitoring stack into a modern observability platform—focusing on open standards, scalability, reliability, and cost efficiency.