Metrics are the fundamental unit of time series databases (TSDBs). They consist of labels that denote dimensions, e.g. http_status_code, url, etc. Critical insights require metrics to have both breadth (a large number of labels) and depth (a large number of unique values per label). Together, breadth and depth make up metric cardinality. Higher cardinality == deeper insights. This talk assumes Prometheus-like systems for reference.
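To make the breadth-times-depth point concrete, here is a minimal sketch (not part of the proposal or the tool) of how per-label value counts multiply into series cardinality; the metric, label names, and counts are hypothetical:

```python
# Minimal sketch: how label breadth and depth multiply into series
# cardinality for a single metric. Labels and counts are hypothetical.
from math import prod

# Unique values observed per label on http_requests_total (assumed numbers)
label_values = {
    "http_status_code": 8,   # 200, 201, 301, 400, 401, 403, 404, 500
    "url": 500,              # distinct request paths
    "region": 6,
    "tenant": 1200,
}

# Worst case: every combination of label values becomes its own time series.
worst_case_series = prod(label_values.values())
print(f"worst-case series for this metric: {worst_case_series:,}")
# 8 * 500 * 6 * 1200 = 28,800,000 potential series from just four labels
```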
In today’s world of microservices, distributed services with dimensions like tenant, region, and service invariably lead to high cardinality. Unchecked cardinality growth can have adverse effects: higher resource consumption, slow-loading dashboards, failing alerting queries, observability systems going blank, reduced retention time, etc.
Avoiding these situations at enterprise scale requires understanding what is causing high cardinality and how to manage it. This is often an afterthought, and the tooling needed to answer these questions gets ignored:
- How do you find your TSDB's cardinality limits?
- How do you know when your system is approaching these limits?
- When it does approach them, how do you find out which metrics are contributing?
- How do you dissect these metrics to find the labels that led to the cardinality explosion?
- What actions need to be taken to fix this?
The default solutions hover around finding labels with high cardinality and dropping them. But I have seen that in customer production environments, blindly dropping labels gives a false sense of having fixed the problem without actually solving anything. I wrote a tool to analyse high cardinality systems and extract insights that answer these questions:
- What is the overall state of my system - are there any metrics approaching limits?
- Which labels of which metrics are likely to cause a cardinality explosion?
- If you choose to drop/aggregate these labels - what will the end state look like?
- Why dropping labels should not be the default solution, e.g. there are corner cases where dropping the label with the highest cardinality has zero impact on reduction (see the sketch after this list).
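As a toy illustration of that corner case (hypothetical labels, not taken from the tool): if the highest-cardinality label is fully redundant with the remaining labels, dropping it leaves the series count unchanged.

```python
# Hypothetical data: `instance` has the most unique values, but it is fully
# determined by (pod, port), so every series stays distinct after the drop.
series = {
    ("pod-a", "8080", "pod-a:8080"),
    ("pod-a", "9090", "pod-a:9090"),
    ("pod-b", "8080", "pod-b:8080"),
    ("pod-b", "9090", "pod-b:9090"),
}  # label order: (pod, port, instance)

def drop_label(series_set, index):
    """Simulate dropping one label and count the series that remain."""
    return {tuple(v for i, v in enumerate(s) if i != index) for s in series_set}

print(len(series))                  # 4 series before
print(len(drop_label(series, 2)))   # still 4 after dropping `instance`
```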
The audience of this talk will have the following takeaways:
- Fundamental knowledge to question assumptions when approaching systems with high-cardinality metrics.
- A handy open source cardinality debugger/explorer, well tested in customer production environments, to analyze your Prometheus-like TSDB and get the right numbers upfront. These numbers will help you choose where to invest your time: updating your instrumentation code, changing your metric agent configuration, etc.
Zainab Bawa
@zainabbawa Editor & Promoter
Hi Preeti, summarizing the feedback from the editors:
High cardinality metrics are a problem mostly in environments that are on the path of scaling. This will be a value-add talk, with practical applications for the audience.
The talk, though from Last9.io, is not about Last9's offerings but about an open source tool developed by them. Seems very interesting and useful.
Preeti
@preetidewani Submitter
Hi Sitaram Shelke,
Thank you for getting back to me.
Can you please help me with the next steps?
Sitaram Shelke
@sitaram Editor
Hi Preeti,
Please update here once you have prepared the document. I'd suggest not stressing too much over polishing, as we are still reviewing the proposal at this stage.
Sitaram Shelke
@sitaram Editor
Hello Preeti
Thanks for the detailed writeup.
The topic would fit nicely into the scaling and performance track.
I think it would be important for the audience to learn something more than what they could get by visiting the tool's documentation and applying the technique as is.