Lessons scaling operations to everyone @indix

This submission has been added to the schedule

Submitted Nov 7, 2017

Technical level: Intermediate

At Indix we collect and process lots of data. As our data grew, so did the operational difficulties surrounding it. When we were a team with a small number of developers and a single ops person, using a centralised configuration management system made a lot of sense. Every change went through him, which kept overhead very low. As the team scaled, that single person became a bottleneck for the different teams. He could not keep up with the new things that individual teams wanted to try for their respective challenges. This led to individual teams stepping up to manage their own parts of the infrastructure on AWS.

That’s when we realised we needed a more decentralised, team-owned approach to configuration management across the organization. While most teams welcomed the change, some teams with no prior operational experience found this transition very hard. After many attempts, we started using self-service, resource-based scheduling of individual services for these teams.

Self-managed infrastructure is a dream for any operations team in an organization, but it does come with its fair share of challenges.

In this talk, I’ll cover our problems, mistakes and learnings over the years in scaling operations to everyone at Indix, and how some of our technology choices were influenced by them.

Outline

Introduction
Problems, mistakes and learnings

When we were a 5+ member team
- Focus on keeping infrastructure sane
When we were a 30+ member team
- Focus on scaling the infra knowledge and not the ops team
When we were a 60+ member team
- Focus on abstracting away the infra knowledge

Tech Radar(s)
Operability Checklist
Current unsolved challenges

Speaker bio

Ashwanth Kumar is a Principal Software Engineer working in the Data Ingestion Team @indix. When he’s not fiddling with distributed systems and data, he contributes to open source and helps organise meetups and tech events in the city. He writes Scala at work and Go at home.

Links

List of all the talks - https://github.com/ashwanthkumar/talks
OSS - https://github.com/ashwanthkumar/

Slides

https://speakerdeck.com/ashwanthkumar/lessons-scaling-operations-to-everyone-at-indix

Miniconf on Cloud Server Management (Chennai)