Srinivas Devaki

@srioptiowl

Art of Caching: Ways, Wins, Woes, Weird, Wisdom

Submitted Sep 20, 2024

TLDR

An advanced exploration of war stories from building caching systems at a decacorn.

Description

There is a reason why the following quote is famous:

“There are only two hard things in Computer Science: cache invalidation and naming things.” — Phil Karlton

Caching is everywhere. It can be as simple as a shop keeping only its most popular books on display and the rest in the backroom, or as sophisticated as the caches we build into software systems. Those range from a plain cache layer in front of a database to probabilistic data structures such as Bloom filters, which act as a cache that spares the storage layer more expensive checks.
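
As a rough illustration of the Bloom filter idea above, here is a minimal Python sketch. The class names `BloomFilter` and `FilteredStore`, the sizes, and the hashing scheme are all assumptions made for this example, not something taken from the talk. A Bloom filter can return false positives but never false negatives, so a "no" answer lets the caller skip the storage read entirely.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for readability

    def _positions(self, key: str):
        # Derive num_hashes positions by salting a SHA-256 digest with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(key))


class FilteredStore:
    """Hypothetical storage wrapper that skips reads for keys never written."""

    def __init__(self) -> None:
        self._data = {}          # stands in for the expensive storage layer
        self._filter = BloomFilter()
        self.storage_reads = 0   # how often the storage layer was actually hit

    def put(self, key: str, value) -> None:
        self._filter.add(key)
        self._data[key] = value

    def get(self, key: str):
        if not self._filter.might_contain(key):
            return None          # definitely absent: no storage read needed
        self.storage_reads += 1
        return self._data.get(key)
```

The trade-off in this sketch is memory versus false-positive rate: a larger bit array or a better-tuned number of hashes means fewer unnecessary storage reads.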

This is a low to medium engagement talk on war stories in building caching systems.

Topics Covered

Ways

Exploring various forms of caching systems and characteristics of a caching system.

Examples: Multiple cache layers, efficiency gains, eviction policies, metrics, consistency.

Wins

While a cache seems like a simple concept, caching systems are deployed at scale in many innovative ways.

Examples: Saving on latency, saving on cost, serving as a fallback, predictable response times, improved resiliency.

Woes

Despite the benefits, caching systems can introduce complexities and challenges that require careful consideration.

Examples: Cache Stampede, Bi-Modal Behavior, Timeout Propagation Issues, Cache Poisoning, Key Collisions.
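
To make the first of these concrete: a cache stampede happens when many callers miss on the same key at once and all hit the backend together. One common mitigation, sketched below under assumptions, is to let a single caller load the value while the others wait. `SingleFlightCache` and its `loader` argument are hypothetical names for this example, and expiry is left out to keep it short.

```python
import threading


class SingleFlightCache:
    """Cache where concurrent misses on one key trigger a single load."""

    def __init__(self, loader):
        self._loader = loader            # hypothetical expensive backend call
        self._values = {}
        self._key_locks = {}
        self._guard = threading.Lock()   # protects _key_locks

    def _lock_for(self, key):
        with self._guard:
            return self._key_locks.setdefault(key, threading.Lock())

    def get(self, key):
        if key in self._values:          # fast path: already cached
            return self._values[key]
        with self._lock_for(key):        # only one loader per key at a time
            if key not in self._values:  # re-check after waiting for the lock
                self._values[key] = self._loader(key)
            return self._values[key]
```

Callers that arrive while a load is in flight block on the per-key lock and then read the freshly cached value, so the backend sees one request instead of one per caller.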

Weird

Unusual and unexpected behaviors that emerged when working with caching systems at scale.

Examples: Caching Increasing Response Time, Infinite Loops on Cache Invalidation, Negative Caching, Self-Immolating Caches, Cache Misses Triggering Batch Jobs.
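
Negative caching, one of the items above, means remembering that a lookup found nothing so that repeated requests for a missing key do not keep hitting the backend. The sketch below is only an illustration: `NegativeCache`, its `lookup` argument, and the convention that the backend returns `None` for absent keys are assumptions for this example.

```python
_MISSING = object()  # sentinel meaning "the backend had no such key"


class NegativeCache:
    """Caches both found values and 'not found' results."""

    def __init__(self, lookup):
        self._lookup = lookup    # hypothetical backend call; None means absent
        self._entries = {}
        self.backend_calls = 0   # how often the backend was actually queried

    def get(self, key):
        if key in self._entries:
            cached = self._entries[key]
            return None if cached is _MISSING else cached
        self.backend_calls += 1
        value = self._lookup(key)
        # Store the sentinel for misses so the next call skips the backend.
        self._entries[key] = _MISSING if value is None else value
        return value
```

The surprising part is staleness: if the key is created in the backend later, this sketch keeps answering "not found" until the negative entry is evicted or expires, which is why negative entries usually get a short lifetime.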

Wisdom

Best practices and lessons learned from building and maintaining caching systems.

Examples: Effective Key Management, Monitoring and Profiling, Handling Cache Errors Gracefully, Avoiding Misuse of Caching Systems, Data Serialization Strategies.

Target Audience & Prerequisites

The talk is aimed at engineers with at least 1–2 years of backend experience. Those earlier in their careers should also be able to follow it if they have spent some time on systems thinking.

As preparation, AWS maintains a good set of resources on caching: https://aws.amazon.com/caching/. The System Design Roadmap also offers a starting point for further reading on caching: https://roadmap.sh/system-design.

For a more advanced perspective, this paper is a very good entry point into the patterns and behaviors observed when operating large-scale caching systems, covering both the infrastructure and the variety of use cases: https://www.usenix.org/conference/osdi20/presentation/yang

Speaker Intro

Srinivas is the founder of Opti Owl, a cloud cost optimization startup dedicated to enhancing system performance and reducing expenses. He was formerly an SRE Team Lead at Zomato, a decacorn processing over 3 million orders daily with more than 300 microservices and over 1,000 engineers. There he built systems and processes that helped Zomato grow resiliently from 10,000 orders per day to over 3 million. Passionate about sharing deep insights into the complexities of caching, Srinivas brings real-world experience to the discussion.

Hosted by

Bengaluru Systems Meetup

Supported by

Sponsor

Peak XV Partners (formerly Sequoia Capital India & SEA) is a leading venture capital firm investing across India, Southeast Asia and beyond.