Object storage for new use cases through Indexes on lakehouses

Nov 2024

22 Fri 09:00 AM – 05:10 PM IST

Bangalore International Centre, Bengaluru

Submitted Oct 24, 2024

Submission type: 40 min talk
Track in which your submission fits: Systems engineering

ABSTRACT:

Object storage has been around for a long time. Although it is cheap and scalable, it has traditionally been limited to storing unstructured data or serving as blob storage for binary data. With data footprints growing rapidly, object storage is now being used for a class of workloads that were previously thought to be impossible on it. The best-known example is SQL analytics on structured data, and engineering teams are also exploring it for logs and metrics, IoT and sensor data, geospatial data, vector search, and more.

In this talk, we will explore how a few techniques can help object storage serve large amounts of data under low-latency, high-concurrency requirements. Specifically, we will look at indexes, table formats, and vectorized data fetching.
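The abstract does not say which index types or fetching strategy the talk covers, so the following is only a minimal sketch under stated assumptions: a min/max (zone-map style) index that prunes blocks for a key-range query, followed by coalescing of adjacent byte ranges so that fewer requests reach the object store. `IndexEntry`, `prune`, and `coalesce` are hypothetical names chosen for illustration, not anything from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """Hypothetical index record: where a block lives and the key range it holds."""
    object_key: str
    offset: int    # byte offset of the block inside the object
    length: int    # block length in bytes
    min_key: int   # smallest key stored in the block
    max_key: int   # largest key stored in the block


def prune(index, lo, hi):
    """Keep only the entries whose [min_key, max_key] overlaps the query range [lo, hi]."""
    return [e for e in index if e.max_key >= lo and e.min_key <= hi]


def coalesce(entries, max_gap=0):
    """Merge byte ranges of the same object that are at most max_gap bytes apart.

    Returns (object_key, start, end) tuples with an exclusive end, one per
    range request that would be issued.
    """
    merged = []
    for e in sorted(entries, key=lambda e: (e.object_key, e.offset)):
        end = e.offset + e.length
        if merged and merged[-1][0] == e.object_key and e.offset - merged[-1][2] <= max_gap:
            key, start, prev_end = merged[-1]
            merged[-1] = (key, start, max(prev_end, end))
        else:
            merged.append((e.object_key, e.offset, end))
    return merged
```

With an index over four blocks, a query for keys 150 to 250 that overlaps only two neighbouring blocks of one object would be served by a single merged range request instead of a scan of every block.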

We will also talk about heterogeneous data, how it can be unified at retrieval time to build a fit-for-purpose result set, and how indexing helps achieve this.

We will show benchmarks from the experiments on information retrieval at scale that we have been running on various cloud platforms.

KEY TAKEAWAYS:

Learn about database internals around indexes (we will go into some depth on two types of database indexes)
Gain insight into the limitations you may hit when you try to access data at scale from object stores
Get introduced to some useful write patterns that can help simplify retrieval
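The proposal does not name the write patterns it has in mind. As one possible illustration, and purely an assumption, the sketch below sorts records by key before writing and records each object's key range in a manifest, so that the ranges do not overlap and a reader can pick the few objects it needs. `write_sorted_chunks` and the `chunk-NNNNN` naming are hypothetical, and a dict stands in for the object store.

```python
def write_sorted_chunks(records, chunk_size):
    """Sort (key, value) records by key, split them into fixed-size chunks,
    and return (objects, manifest).

    objects  maps a hypothetical object name to the records it would hold.
    manifest lists each object's min/max key, which a reader can use to
    skip objects that cannot contain the keys it is looking for.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    ordered = sorted(records, key=lambda r: r[0])
    objects = {}
    manifest = []
    for i in range(0, len(ordered), chunk_size):
        chunk = ordered[i:i + chunk_size]
        name = f"chunk-{i // chunk_size:05d}"
        objects[name] = chunk
        manifest.append({
            "object": name,
            "min_key": chunk[0][0],   # chunk is sorted, so the first key is the minimum
            "max_key": chunk[-1][0],  # and the last key is the maximum
            "rows": len(chunk),
        })
    return objects, manifest
```

Because the input is sorted before it is split, each manifest entry covers a key range that starts no earlier than the previous one ends, which keeps lookups by key down to a small number of object reads.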

AUDIENCE:

Data Engineers - Individuals who build data infrastructure and platforms and typically handle large-scale data processing and related workloads.
Cloud Architects - Those who design the strategy for use cases that require information retrieval or analytics on large datasets stored in an object store.
Database internals developers/enthusiasts - Anyone who builds databases, is interested in building one, or is just curious about how lakehouse engines work their way around large data.

Rootconf Mini 2024 (on 22nd & 23rd Nov)
