Cloud Costs Optimization

Practical case studies, experience reports and tooling use cases from startups and enterprises

The evolution of the cloud over the last decade has radically simplified infrastructure deployment. Major providers such as AWS, Azure, GCP and Alibaba Cloud have lowered the entry barriers for developers who want to launch a new product or scale an existing one with few practical limits. However, no software company can keep scaling without looking at the cost of infrastructure as a percentage of its revenue.

Many companies today are struggling with cloud costs eating up a significant portion of their gross margins. In this conference, we will discuss the tools, techniques and best practices for monitoring and controlling cloud costs.

Key takeaways for participants

  1. Best practices for monitoring and auditing cloud costs.
  2. Techniques and design choices for controlling cloud costs.
  3. Case studies and experience reports.

Who should participate

  1. CTOs and VPs of engineering at growth-phase startups who expect their cloud costs to rise in the near future; interested in case studies and tools.
  2. Engineering managers at large organizations who want to control costs for their division; interested in case studies and tools.
  3. Senior engineers who want to understand how architectural choices affect cloud costs.

Speaking

If you are interested in speaking at the conference, submit your talk idea here. The editors - Anand C and Raghdip Panesar - will review your talk description and give feedback.
Guidelines for speaking, speaker honorarium policy, and travel grant policy details are published here.

About the editors

This conference is curated by Anand Chitipothu and Raghdip Singh Panesar.
Anand has been building software, managing servers and infrastructure for over two decades. He has curated the Scaling from First Principles series of discussions with Rootconf.
Raghdip is Staff Engineer - Network at Google. Prior to joining Google, Raghdip spent nine years at Flipkart as network architect and senior staff engineer.

Become a Rootconf member to join

This is a community-funded conference. It will be held in-person. Attendance is open to Rootconf members only. Support the conference with a membership to join. If you have questions about participation, post a comment here.

Sponsorship

Sponsorship slots are open for:

  1. Cloud providers who want to evangelise their efforts on optimizing the costs of cloud, and showcase customer success stories.
  2. Tool providers.
  3. Companies seeking tech branding for hiring.
Email sponsorship queries to sales@hasgeek.com.

Contact information

Join the Rootconf Telegram group at https://t.me/rootconf or follow @rootconf on Twitter.
For inquiries, contact Rootconf at +91-7676332020.

Hosted by

Rootconf is a community-funded platform for activities and discussions on the following topics: Site Reliability Engineering (SRE); infrastructure costs, including cloud costs and their optimization; and security, including cloud security.

Supported by

Sponsor

Businesses are more digital today than ever before. They need to build, deploy and run real-time services in order to stay ahead of the curve. The notion of real-time is not just a nice-to-have anymore. It’s an expectation. It is what sets a merely good user experience apart from a great one. A rea…

Sponsor

Delivering expert solutions for MySQL, MongoDB, PostgreSQL, TiDB, and other distributed databases. Carving out real performance from the existing infrastructure and tuning production systems, which leads to right instance sizing and enhanced performance in the production environment. DB solu…

Partner

FOSS United is a non-profit foundation that aims at promoting and strengthening the Free and Open Source Software (FOSS) ecosystem in India.

Gurudatt Bhobe

@gurudatt_idfy

How we cut cloud costs at IDfy and slept well at night :-)

Submitted Jun 30, 2023

Background

About IDfy

IDfy is a leader in the digital onboarding and verification space.
We enable our customers to seamlessly onboard employees, customers, vendors, users and more while preventing fraud at the same time.
We are an 80-member tech team that contributes to multiple products on the platform.

About our Tech Platform

We

  • are multi-cloud capable and manage single and multi-tenant deployments in production.
  • run approximately 400 services across our platform primarily hosted on Google Cloud.
  • rely heavily on Kubernetes, containers and several other cloud-specific services.
  • operate multiple kinds of workloads, including ML training and inference workloads that use GPUs.

About this presentation

We will talk about why and how we went on our cost optimization journey and where we are today.

Cost optimization at IDfy

The initial impetus came when we saw a month-on-month increase in infrastructure costs without a corresponding linear increase in volumes. This got us thinking that something had to change.
So in late June 2022 we set about optimizing our costs, an effort that paid off within 2 months.

The entire set of activities can be summarized under three headings:

  • People
  • Process
  • Tools

People

  • Empowering team members to decide where and how to optimize, and then to execute changes within days
  • Defined ownership for continued cost monitoring and optimization

Process

  • We had a clear set of guiding principles for areas of optimization
    • A baseline target was set
    • Quick improvements and changes with high monetary impact were prioritized
    • Then the more complex changes were picked up
  • Knowing when to stop (not over-stretching so as to impact other deliverables)
  • Setting up practices for optimal setups
  • Continued monitoring

Tools

  • Breaking down cost dashboards
  • Really understanding where costs come from (Service/SKU level breakdown)
  • Profiling
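A Service/SKU level breakdown of the kind mentioned above can be sketched in a few lines. This is only an illustration, not IDfy's tooling: the row fields (`service`, `sku`, `cost`), the function name and the sample figures are all assumptions, and a real billing export would have its own schema.

```python
from collections import defaultdict


def cost_by_service_and_sku(rows):
    """Sum cost per (service, sku) pair, largest spend first."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["service"], row["sku"])] += row["cost"]
    # Sort descending so the biggest cost drivers appear at the top.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical billing rows, invented purely for illustration.
billing_rows = [
    {"service": "compute", "sku": "vm-cores", "cost": 1200.0},
    {"service": "compute", "sku": "vm-memory", "cost": 450.0},
    {"service": "logging", "sku": "log-volume", "cost": 300.0},
    {"service": "compute", "sku": "vm-cores", "cost": 800.0},
]

for (service, sku), cost in cost_by_service_and_sku(billing_rows):
    print(f"{service:10} {sku:12} {cost:10.2f}")
```

Sorting by spend puts the few resources that dominate the bill at the top, which is usually where a review starts.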

Putting it all into action

  • Our costs really came from a few cloud resources

    • Compute
    • Databases
    • Logging
    • Managed Services
  • Basic principles

    • Stop what’s not needed, when not needed (Staging environments)
    • Reduce replicas (fewer pods, but nothing is impacted)
    • Cut the fat (overprovisioned CPU and memory from the good days)
    • Consolidate (shared resources)
    • Reduce reliance on costly managed services
    • Optimize (queries)
    • Clean up (unnecessary logs)
  • A few steps down the road

    • Get a better deal from your cloud provider (discounts, cheaper performant hardware, etc.)
    • Commitments (pay upfront for what you know you will need)
    • Deeper view into costs (slice, dice and compare)
    • Basic autoscaling
  • Today

    • Nuanced Autoscaling
    • Profiling and optimization before deploying to prod
    • Continuous monitoring (with set ownership of monitoring and reporting)
    • A process of delegated ownership and action with a monthly checkpoint with leads
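The "cut the fat" principle above (trimming overprovisioned CPU and memory) can be illustrated with a small right-sizing sketch. The talk does not describe how requests were actually chosen; the rule used here (95th percentile of observed usage plus 20% headroom, never above the current request) and every name and number in it are assumptions for illustration only.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def suggest_request(samples, current_request, pct=95, headroom_pct=20):
    """Suggest a resource request from usage samples.

    Takes the chosen percentile of usage, adds headroom, and never
    suggests more than what is currently requested.
    """
    base = percentile(samples, pct)
    suggested = math.ceil(base * (100 + headroom_pct) / 100)
    return min(suggested, current_request)


# Hypothetical CPU usage samples for one service, in millicores.
cpu_usage_millicores = [120, 150, 135, 180, 160, 140, 170, 155, 145, 165]

print(suggest_request(cpu_usage_millicores, current_request=1000))
```

A percentile rather than the mean is used so that routine peaks stay covered; the headroom value is a judgment call and would need checking against real traffic before any request is lowered.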

Outcomes

  • A cost reduction of close to 40% within 2 months
  • Some good practices and principles that have stuck with us
  • A sense of ownership around cost and continuous optimization mindset
  • Per unit cost that has stayed the same or reduced since Sep 2022
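Tracking per unit cost, as in the last outcome above, amounts to dividing each month's cost by that month's volume and comparing against a baseline month. A minimal sketch follows, assuming a hypothetical mapping of month to (cost, volume); the figures are invented and are not IDfy's numbers.

```python
def per_unit_cost(monthly):
    """Return {month: cost / volume}, skipping months with zero volume."""
    return {
        month: cost / volume
        for month, (cost, volume) in monthly.items()
        if volume
    }


def months_above_baseline(monthly, baseline_month):
    """List months whose per unit cost exceeds the baseline month's."""
    unit = per_unit_cost(monthly)
    baseline = unit[baseline_month]
    return [month for month, cost in unit.items() if cost > baseline]


# Hypothetical (cost, volume) pairs per month, for illustration only.
monthly_totals = {
    "2022-09": (1000.0, 50000),
    "2022-10": (1100.0, 60000),
    "2022-11": (1300.0, 60000),
}

print(months_above_baseline(monthly_totals, baseline_month="2022-09"))
```

In this made-up data the total bill rises every month, yet only the last month is flagged, because the October increase is more than matched by growth in volume.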

Presentation

Link
