Over the last decade, the cloud has radically simplified infrastructure deployment. Major providers such as AWS, Azure, GCP, and Alibaba Cloud have lowered the entry barriers for developers, whether they are launching a new product or scaling an existing one. However, no software company can keep scaling without examining its infrastructure cost as a percentage of revenue.
Many companies today find cloud costs eating up a significant portion of their gross margins. At this conference, we will discuss the tools, techniques, and best practices for monitoring and controlling cloud costs.
- Best practices for monitoring and auditing cloud costs.
- Techniques and design choices for controlling cloud costs.
- Case studies and experience reports.
- CTOs and VPs of engineering at growth-phase startups who expect their cloud costs to rise in the near future; interested in case studies and tools.
- Engineering managers at large organizations who want to control costs for their divisions; interested in case studies and tools.
- Senior engineers who want to understand how architectural choices affect cloud costs.
If you are interested in speaking at the conference, submit your talk idea here. The editors - Anand C and Raghdip Panesar - will review your talk description and give feedback.
Guidelines for speaking, speaker honorarium policy, and travel grant policy details are published here.
This conference is curated by Anand Chitipothu and Raghdip Singh Panesar.
Anand has been building software, managing servers and infrastructure for over two decades. He has curated the Scaling from First Principles series of discussions with Rootconf.
Raghdip is Staff Engineer - Network at Google. Prior to joining Google, Raghdip spent nine years at Flipkart as a network architect and senior staff engineer.
This is a community-funded conference. It will be held in-person. Attendance is open to Rootconf members only. Support the conference with a membership to join. If you have questions about participation, post a comment here.
Sponsorship slots are open for:
- Cloud providers who want to evangelise their efforts to optimize cloud costs, and showcase customer success stories.
- Tool providers.
- Companies seeking tech branding for hiring.
Email sponsorship queries to email@example.com
Unleashing Cost Optimization : Freshworks' Journey from Startup to Scale
In this talk, we will walk through Freshworks' cost optimization journey as it scaled its operations. We will cover the strategies and techniques used to optimize costs at each stage of growth, from the early startup phase to operating efficiently at scale.
Growth Phase : Startup to Scale
As Freshworks grew, we needed architectural transformations to accommodate increasing demand. We will discuss our tech stack evolution from a single product to a blue-green deployment approach, highlighting the architectural changes made at each stage and the factors driving them. These changes had a substantial impact on our overall cost structure and made effective cost optimization strategies necessary.
To achieve cost optimization, we implemented several techniques, including booting up instances in stages, partial shutdown of instances, and adopting time-based instances that align resource allocation with workload demands. Additionally, we will deep dive into our Reserved Instance (RI) coverage strategy, which enables significant cost savings.
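To make the idea of time-based instances concrete, here is a minimal sketch of a schedule check that decides how many instances an environment should run at a given moment. The schedule values, the function names, and the peak and off-peak counts are illustrative assumptions, not the setup used at Freshworks.

```python
from datetime import datetime, time

# Hypothetical schedule: weekday office hours for a non-production environment.
OFFICE_HOURS = {
    "start": time(8, 0),
    "end": time(20, 0),
    "weekdays": {0, 1, 2, 3, 4},  # Monday=0 ... Friday=4
}


def should_be_running(now: datetime, schedule: dict = OFFICE_HOURS) -> bool:
    """Return True if instances on this schedule should be up at `now`."""
    if now.weekday() not in schedule["weekdays"]:
        return False
    return schedule["start"] <= now.time() < schedule["end"]


def desired_capacity(now: datetime, peak: int, off_peak: int = 0,
                     schedule: dict = OFFICE_HOURS) -> int:
    """Pick the instance count for `now`: `peak` inside the window, else `off_peak`."""
    return peak if should_be_running(now, schedule) else off_peak
```

A scheduler (for example a cron job) could call `desired_capacity` periodically and apply the result to an autoscaling group; that wiring is left out here.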
Navigating the Transition Phase : Agile Solutions
As the products scaled, gaining visibility into cost allocation became crucial. We will share insights on how we tagged resources to track cost distribution and implemented Cloud Custodian policies to identify and manage unused resources in lower environments. Furthermore, we will discuss our successful adoption of spot instances, leveraging their cost benefits.
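As a rough illustration of tag-based cost visibility, the sketch below audits a resource inventory for missing tags and totals cost per tag value. The tag keys, the inventory shape (`id`, `tags`, `monthly_cost`), and the function names are assumptions made for this example; in practice the inventory would come from a cloud provider's API or billing export.

```python
from collections import defaultdict

# Hypothetical set of tag keys every resource is expected to carry.
REQUIRED_TAGS = ("product", "environment", "owner")


def missing_tags(resource: dict, required=REQUIRED_TAGS) -> list:
    """Return the required tag keys that are absent or empty on a resource."""
    tags = resource.get("tags", {})
    return [key for key in required if not tags.get(key)]


def audit(resources: list, required=REQUIRED_TAGS) -> dict:
    """Map resource id -> missing tag keys, only for resources with gaps."""
    report = {}
    for resource in resources:
        gaps = missing_tags(resource, required)
        if gaps:
            report[resource["id"]] = gaps
    return report


def cost_by_tag(resources: list, key: str) -> dict:
    """Sum monthly cost per value of one tag; untagged spend gets its own bucket."""
    totals = defaultdict(float)
    for resource in resources:
        bucket = resource.get("tags", {}).get(key) or "untagged"
        totals[bucket] += resource.get("monthly_cost", 0.0)
    return dict(totals)
```

Keeping an explicit "untagged" bucket makes the size of the unallocated spend visible, which is usually the first number worth driving down.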
Infrastructure Improvements : Best Practices and Strategies
Migration from OpsWorks to Kubernetes played a pivotal role in optimizing compute costs. We will discuss the cost savings achieved through this transition and elaborate on our data transfer cost optimization efforts, including the use of VPC endpoints and interface endpoints for AWS services. Preventing inter-Availability Zone data transfer and rectifying faulty VPC configurations further contributed to substantial cost reductions. Moreover, we will share how intelligent routing and inter-product/platform communication optimization techniques minimized data transfer costs.
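To show why inter-Availability Zone traffic matters, here is a back-of-the-envelope estimate. It relies on the fact that AWS bills cross-AZ traffic within a region in each direction; the default price of 0.01 USD per GB each way is an illustrative assumption and should be checked against current pricing for the region in question. The traffic volumes are hypothetical.

```python
def cross_az_monthly_cost(gb_per_day: float,
                          fraction_cross_az: float = 1.0,
                          price_per_gb_each_way: float = 0.01,
                          days: int = 30) -> float:
    """Estimate the monthly charge for traffic that crosses an AZ boundary.

    `fraction_cross_az` is the share of `gb_per_day` that leaves its own AZ.
    The charge applies once on the sending side and once on the receiving
    side, hence the factor of 2.
    """
    cross_az_gb = gb_per_day * days * fraction_cross_az
    return cross_az_gb * price_per_gb_each_way * 2
```

Under these assumed prices, comparing `cross_az_monthly_cost(1000)` with `cross_az_monthly_cost(1000, fraction_cross_az=0.25)` gives a feel for what zone-aware routing could save on 1 TB of daily service-to-service traffic.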
We will deep dive into our log optimization strategies, including enabling compression on log shipping agents and implementing retention policies for logs stored in CloudWatch and S3. Additionally, we will discuss VPC flow logs optimization, highlighting the learnings from custom formatting, dropping unnecessary fields, and the Parquet conversion process.
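The effect of compressing logs before shipping can be sketched with Python's standard `gzip` module. This is a generic illustration, not the configuration of any particular log shipping agent; the function names are made up for the example.

```python
import gzip


def compress_batch(lines: list) -> bytes:
    """Join a non-empty batch of log lines and gzip it for shipping."""
    raw = "\n".join(lines).encode("utf-8")
    return gzip.compress(raw)


def decompress_batch(blob: bytes) -> list:
    """Reverse of compress_batch: gunzip and split back into lines."""
    return gzip.decompress(blob).decode("utf-8").split("\n")


def compression_ratio(lines: list) -> float:
    """Uncompressed size divided by compressed size for a batch."""
    raw_size = len("\n".join(lines).encode("utf-8"))
    return raw_size / len(compress_batch(lines))
```

Application logs tend to repeat the same prefixes and field names, so batches usually compress well; the savings show up both in data transfer and in storage.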
EBS optimization techniques such as migrating from gp2 to gp3, right-sizing volumes, and configuring IOPS settings will be examined. Furthermore, we will discuss how KEDA autoscaling and resource optimization in Kubernetes and RDS improved efficiency and cost effectiveness.
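The gp2 to gp3 comparison can be sketched as simple arithmetic. The performance rules used here are documented EBS behaviour: gp2 baseline IOPS scale at 3 IOPS per GiB (minimum 100, maximum 16,000), while gp3 includes 3,000 IOPS and 125 MiB/s regardless of size. The prices in the defaults are illustrative assumptions and vary by region, so treat the output as a rough guide only.

```python
GP3_INCLUDED_IOPS = 3000
GP3_INCLUDED_THROUGHPUT = 125  # MiB/s


def gp2_baseline_iops(size_gib: int) -> int:
    """gp2 baseline: 3 IOPS per GiB, clamped to the range 100..16000."""
    return min(max(3 * size_gib, 100), 16000)


def gp2_monthly_cost(size_gib: int, price_per_gib: float = 0.10) -> float:
    """gp2 is billed on size only. Price is an assumed example value."""
    return size_gib * price_per_gib


def gp3_monthly_cost(size_gib: int,
                     iops: int = GP3_INCLUDED_IOPS,
                     throughput: int = GP3_INCLUDED_THROUGHPUT,
                     price_per_gib: float = 0.08,
                     price_per_extra_iops: float = 0.005,
                     price_per_extra_mibps: float = 0.04) -> float:
    """gp3 bills size plus any IOPS/throughput above the included baseline.

    All prices are assumed example values, not quoted rates.
    """
    extra_iops = max(iops - GP3_INCLUDED_IOPS, 0)
    extra_throughput = max(throughput - GP3_INCLUDED_THROUGHPUT, 0)
    return (size_gib * price_per_gib
            + extra_iops * price_per_extra_iops
            + extra_throughput * price_per_extra_mibps)
```

A volume whose `gp2_baseline_iops` is at or below 3,000 can typically move to gp3 without provisioning extra IOPS, which is where the per-GiB price difference turns into a direct saving.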
Cost Control and Proactive Monitoring
Proactive monitoring played a vital role in cost optimization. We will share insights on how we have tailored our optimizations by automating processes for lower environments and implementing targeted alerts for production. We will also deep dive into how AWS Lambda functions are implemented alongside our Cloud Custodian policies, and how we harness the capabilities of tools like Mailer to deliver notifications via SNS, email, or Slack.
We will also share insights on how we set up alerts for cost anomalies and budget utilization, leveraged AWS Cost Explorer for insights and debugging cost anomalies, and utilized Power BI dashboards to gain comprehensive visibility into cost trends. Regular architectural reviews became a cornerstone of our cost optimization process.
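As a simplified stand-in for cost anomaly and budget alerts, the sketch below flags days whose spend jumps well above the trailing average and reports which budget thresholds have been crossed. This is a basic statistical rule chosen for illustration; it is not how AWS's own anomaly detection works, and the window, threshold, and `min_delta` values are assumptions.

```python
from statistics import mean, stdev


def find_anomalies(daily_costs: list, window: int = 7,
                   threshold: float = 3.0, min_delta: float = 1.0) -> list:
    """Return indexes of days whose cost exceeds the trailing mean by more
    than `threshold` standard deviations (and by at least `min_delta`)."""
    anomalies = []
    for i in range(window, len(daily_costs)):
        history = daily_costs[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # min_delta avoids flagging tiny increases when history is flat.
        limit = mu + max(threshold * sigma, min_delta)
        if daily_costs[i] > limit:
            anomalies.append(i)
    return anomalies


def budget_alerts(spend: float, budget: float,
                  thresholds=(0.5, 0.8, 1.0)) -> list:
    """Return the utilization thresholds that current spend has reached."""
    utilization = spend / budget
    return [t for t in thresholds if utilization >= t]
```

Note that a large spike inflates the trailing statistics for the following days, so a production rule would likely exclude flagged days from the history or use a more robust measure such as the median.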
Looking Ahead : Future Initiatives
We will discuss potential future initiatives, including the adoption of SQS long polling, exploration of Graviton and Intel instances for enhanced performance and cost benefits, and the implementation of a cell-based architecture to optimize communication within Availability Zones. Additionally, we will explore the automation of resource rightsizing as a focus area for further improvement and consider spot instance adoption in production environments.
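SQS long polling reduces the number of empty receive calls, and since SQS bills per request, fewer empty calls means lower cost. A minimal sketch follows; it assumes the caller passes in a boto3 SQS client, and the helper name `poll_once` and the queue URL are hypothetical. `WaitTimeSeconds` and `MaxNumberOfMessages` are real parameters of `receive_message`, capped at 20 and 10 respectively.

```python
def poll_once(sqs_client, queue_url: str,
              wait_seconds: int = 20, batch_size: int = 10) -> list:
    """Receive up to `batch_size` messages, waiting up to `wait_seconds`.

    A wait of 0 is short polling; anything from 1 to 20 is long polling.
    """
    if not 0 <= wait_seconds <= 20:
        raise ValueError("wait_seconds must be between 0 and 20")
    if not 1 <= batch_size <= 10:
        raise ValueError("batch_size must be between 1 and 10")
    response = sqs_client.receive_message(
        QueueUrl=queue_url,
        WaitTimeSeconds=wait_seconds,
        MaxNumberOfMessages=batch_size,
    )
    # The "Messages" key is absent when the wait expires with nothing to deliver.
    return response.get("Messages", [])
```

With boto3 installed and credentials configured, this could be called as `poll_once(boto3.client("sqs"), queue_url)`. Long polling can also be enabled on the queue itself through its receive message wait time setting, so that consumers get it by default.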
In conclusion, this talk will provide valuable insights into the cost optimization journey of Freshworks, highlighting strategies, techniques, and lessons learned as we transitioned from startup to scale.
Draft Presentation (In Progress) : https://docs.google.com/presentation/d/1CYyFzRzTp_8hNz_pV6-DN4L4al_hJ5VJaQG9yKvJo6w/edit?usp=sharing