The Fifth Elephant 2024 Annual Conference (12th & 13th July)

Maximising the Potential of Data — Discussions around data science, machine learning & AI

Adarsh Mysore Thimmappa

Jira cloud data extraction @ scale

Submitted Jun 3, 2024

Cloud data extraction is a subset of the broader data engineering field: retrieving data from cloud-based applications and services for analysis, reporting, or storage in a centralized data repository.
Atlassian’s data extraction solution has evolved significantly over the years to meet the demands of enterprise-grade customers. It started with full tenant database copying, moved to batch extraction over RESTful APIs via dedicated data extraction services, and now uses Atlassian’s own streaming solution, the Lithium platform. This evolution has allowed efficient extraction, transformation, and ingestion of data at scale. Two types of data extraction are common: Full Data Extract (FDE) and Partial/Selective Data Extract (PDE). FDE copies the entire tenant database, while PDE allows specific data to be selected during extraction. The result is a robust and efficient solution for enterprise-grade customers who have to deal with large volumes of data.
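The difference between FDE and PDE, combined with batch-style extraction, can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen for illustration only: the `Record` type, the `extract` function, the `entity_types` selector, and the in-memory record list stand in for a tenant's data and do not reflect Atlassian's services or the Lithium platform.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical unit of tenant data, e.g. an issue or a comment."""
    entity_type: str
    payload: dict


def extract(
    records: Iterable[Record],
    entity_types: Optional[set[str]] = None,
    batch_size: int = 2,
) -> Iterator[list[Record]]:
    """Yield records in batches.

    entity_types=None behaves like a full extract (FDE): every record is kept.
    A set of entity types behaves like a partial extract (PDE): only records
    of the selected types are kept.
    """
    batch: list[Record] = []
    for record in records:
        if entity_types is not None and record.entity_type not in entity_types:
            continue  # PDE: skip data that was not selected
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller batch


# Illustrative tenant data (made up for this sketch).
tenant = [
    Record("issue", {"id": 1}),
    Record("comment", {"id": 2}),
    Record("issue", {"id": 3}),
]

full_batches = list(extract(tenant))                              # FDE
partial_batches = list(extract(tenant, entity_types={"issue"}))  # PDE
```

Yielding batches rather than one large result keeps memory use bounded by the batch size, which is one reason batch and streaming approaches tend to scale better than copying a whole database at once.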

A couple of enterprise-grade use cases that highlight the purpose of a scalable data extraction solution:

  • Moving data from one cloud instance to another
  • Backing up data from one cloud instance so that it can be restored, on demand, into the same or another cloud instance

Outline

  • The early days of data extraction
  • Evolution of cloud data extraction approach over the last couple of years
  • Challenges associated with the evolution of data extraction
  • Path to achieve cloud data extraction at scale

Impact

The scalable data extraction approach handles large data sets without compromising performance and optimizes resource utilization, among other benefits.

