The art of troubleshooting distributed systems
Submitted by Damini Satya (@daminisatya) on Sunday, 7 April 2019
Section: Crisp talk Technical level: Intermediate Section: Full talk (40 mins) Category: Distributed systems
Implementing and running a distributed system poses unique challenges for Systems/DevOps engineers. Troubleshooting and debugging issues in distributed systems is a tedious and complex process. The standard approach of gaining insight into system activity by analysing system logs alone is not enough. In this talk, we demystify this process by presenting some approaches and best practices to tame the beast. These are based on our learnings from running distributed systems at internet scale.
Please refer to the document for the talk outline - https://docs.google.com/document/d/1hfiY77Lh1CfYa1AWH1nKh3mmX5X5hSvhQlsuwn5rj7Q
Damini Satya is a software engineer at Salesforce, building internal tools for infrastructure automation. Previously, she was a speaker at GHC 2018 with a talk titled “Elsa, A conversational agent aimed at improving women’s mental health”, which garnered huge applause from attendees both at the conference and on social media. She also spoke at GHC 2017 and GHC India 2016 on a wide variety of technical topics. Apart from GHC, she has presented tech talks at conferences such as ReactConf and FOSSASIA. A passionate developer with a desire to mentor students, she was a student in Google Summer of Code (GSoC) 2016 with the FOSSASIA organization, working on a peer-to-peer scraper system, Loklak, and became a mentor for the organization during GSoC 2017. She is an active open source contributor (Kubernetes) and a part of various open source communities, while continually aiming to bring more women into contributing to open source software.