The Fifth Elephant 2024 Annual Conference (12th & 13th July)

Maximising the Potential of Data — Discussions around data science, machine learning & AI

Vinish Reddy

@vinish

Apache XTable (Incubating): Interoperability across table formats

Submitted Jun 3, 2024

Apache Hudi, Delta Lake, and Iceberg are leading open-source projects that add transactional and metadata layers on top of decoupled cloud storage; these layers are known as table formats. The formats store data in open columnar file formats such as Parquet and keep metadata for schema, commit history, partitions, and column statistics. Selecting a table format can be challenging because each project has its own unique features. Apache XTable is an open-source project that provides interoperability between these table formats. Instead of creating a new format, XTable provides abstractions for translating metadata, so that data written in any one format can be converted for use by various compute engines. This session will showcase XTable’s solution to the challenges of format selection and interoperability in lakehouse workloads, including a live demonstration of XTable in action.
https://github.com/apache/incubator-xtable
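
The idea of translating metadata rather than rewriting data can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the class and function names are hypothetical, the "Delta-like" and "Iceberg-like" layouts are heavily simplified, and none of it is XTable's actual API. The only point shown is that one format-neutral description of a table can be rendered as different metadata views that all reference the same Parquet files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """One Parquet file belonging to the table (hypothetical model)."""
    path: str
    record_count: int
    partition: str


@dataclass(frozen=True)
class TableSnapshot:
    """Format-neutral view of a table: its schema and its data files."""
    schema: dict   # column name -> type name
    files: tuple   # tuple of DataFile


def to_delta_like(snapshot):
    """Render the snapshot as simplified, Delta-style 'add' actions."""
    return [
        {"add": {"path": f.path,
                 "partitionValues": {"date": f.partition},
                 "numRecords": f.record_count}}
        for f in snapshot.files
    ]


def to_iceberg_like(snapshot):
    """Render the same snapshot as a simplified, Iceberg-style manifest."""
    return {
        "schema": snapshot.schema,
        "manifest": [
            {"file_path": f.path,
             "record_count": f.record_count,
             "partition": f.partition}
            for f in snapshot.files
        ],
    }


# A made-up table with two partitions; no files are read or written.
snapshot = TableSnapshot(
    schema={"id": "long", "date": "string"},
    files=(
        DataFile("data/date=2024-07-12/part-0.parquet", 100, "2024-07-12"),
        DataFile("data/date=2024-07-13/part-1.parquet", 200, "2024-07-13"),
    ),
)

delta_view = to_delta_like(snapshot)
iceberg_view = to_iceberg_like(snapshot)

# Both metadata views point at the very same data files.
delta_paths = [action["add"]["path"] for action in delta_view]
iceberg_paths = [entry["file_path"] for entry in iceberg_view["manifest"]]
print(delta_paths == iceberg_paths)
```

In this sketch only the metadata differs between the two views; the Parquet paths are shared, which is why a translation of this kind does not need to copy or rewrite the data itself.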

Speaker: Vinish Reddy
https://www.linkedin.com/in/vinish-reddy-pannala-868702108/

Who is the audience for this talk?
Data engineers building data lakes or lakehouses in their organisations.

What is the problem you are trying to solve?
Interoperability across table formats, catalogs and query engines.

What is the scope of this talk i.e., what content will you cover in this talk?
We will start with an introduction to table formats and interoperability, then take a deep dive into XTable. A live demo will follow, and the talk will close with the future roadmap for the OSS project.

How will participants benefit from your talk?
Participants will gain insights into building data lakes and working with table formats, and will hopefully be excited to contribute to Apache OSS projects.

Outline

  1. What are table formats? What is interoperability?
  2. Intro to Apache XTable (Incubating)
  3. Deep dive into XTable and the problems it can solve
  4. Demo: XTable in action
  5. Roadmap and future goals for the OSS project

Impact

More info can be found here.
https://xtable.apache.org/
https://siliconangle.com/2023/11/15/onehouse-open-sources-onetable-data-tool-support-google-microsoft/
