Rootconf Delhi edition

On network engineering, infrastructure automation and DevOps

The Urban Myth Of Full Uptime

Submitted by Mohammad Gufran (@notgufran) on Nov 19, 2019

Section: Full talk (40 mins)
Category: SRE
Status: Rejected

Abstract

This talk covers strategies for achieving high uptime at scale. It will cover:

  1. A real-life case study
  2. Cloud architecture
  3. Immutable infrastructure
  4. Infrastructure as code
  5. Secrets management
  6. Service discovery
  7. Container management and scheduling
  8. Blue-green deployment
  9. Observability
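
To make the "full uptime" framing concrete, the arithmetic behind availability targets can be sketched as below. This is an illustrative sketch only, not material from the talk: the function names and the example targets are assumptions, and the serial calculation assumes components fail independently.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability: float) -> float:
    """Minutes of downtime per year permitted by an availability target in [0, 1]."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


def serial_availability(*components: float) -> float:
    """Availability of a request path that needs every component to be up.

    Assumes independent failures, which real systems only approximate.
    """
    result = 1.0
    for availability in components:
        result *= availability
    return result


if __name__ == "__main__":
    # Hypothetical targets, chosen only to show how quickly the budget shrinks.
    for target in (0.99, 0.999, 0.9999):
        budget = downtime_minutes_per_year(target)
        print(f"{target:.2%} allows {budget:.1f} minutes of downtime per year")

    # Three hypothetical dependencies in series, each at 99.9%.
    combined = serial_availability(0.999, 0.999, 0.999)
    print(f"three 99.9% components in series: {combined:.4%}")
```

Because availabilities multiply along a dependency chain, every additional component lowers the ceiling for the whole system, which is one reason a 100% target is not realistic.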

Outline

  • About me, my company and my situation
    • Set the context for the rest of the talk
    • Touch on the legacy setup and infrastructure so the audience can contrast it with the points that follow
  • Causes of our downtime
    • Architecture
    • Poor Provisioning Practices
      • Hardware
      • Configuration
    • Lack of Monitoring
    • Missing backups, disaster recovery (DR) and business continuity (BC)
    • Poor Technical Choices
      • Storing data on a single node
      • Scaling storage with LVM
      • Node-local cache for distributed apps
      • Cyclic API calls
    • Security
      • Secrets checked into version control
      • Publicly accessible resources
      • Outdated and vulnerable versions of tools
    • Lack of Documentation and Testing
    • Takeaway: typical problems of a poorly architected infrastructure
  • Architecture
    • What’s wrong with it
    • Designing immutable infrastructure
  • Poor Provisioning Practices
    • What’s wrong with it
    • Provisioning immutable resources with Terraform
    • Deploying and configuring services in an immutable fashion
  • Monitoring
    • What’s wrong with it
    • Implementing Observability
  • Backups, DR and BC
    • What’s wrong with it
    • Automated backups with redundant copies
  • Poor Technical Choices
    • What’s wrong with it
    • Fixing the mistakes made so far
  • Lack of Documentation and Testing
  • Summary
