Mukund Tripathi

Ensuring Data Quality with Data Contracts and OpenLineage

Submitted Jun 10, 2024

Abstract

In the modern data landscape, ensuring data quality and integrity is paramount. This session explores Data Contracts used as a schema registry, with data quality (DQ) checks built in and OpenLineage used to capture compliance failures. By implementing Data Contracts, organizations can enforce strict data quality standards and track lineage to understand the impact of any discrepancies. This approach improves data reliability and gives clear visibility into data workflows, supporting better decision-making and accountability.

  1. Introduce Data Contracts: Explain the concept of Data Contracts and their role as a schema registry to enforce data standards.
  2. Implement Data Quality Checks: Demonstrate how to incorporate data quality checks within Data Contracts to ensure data integrity.
  3. Leverage OpenLineage: Explore the use of OpenLineage for capturing and visualizing data lineage, highlighting the impact of DQ compliance failures.
  4. Practical Applications: Showcase real-world examples and case studies where Data Contracts and OpenLineage have improved data governance and quality.
  5. Future Trends: Discuss the future of data quality management and the evolving role of schema registries and lineage tracking in data ecosystems.
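
To make the first three points concrete, here is a minimal sketch of a contract that carries both a schema and DQ checks, plus a function that reports the results in an event shaped like an OpenLineage run event. Everything in it is an assumption for illustration: the `orders` dataset, the column names, the rules, the `example` namespace, and the producer URL are hypothetical. The event is simplified; the field names (`eventType`, `run`, `job`, `inputs`, and a `dataQualityAssertions` facet) follow our reading of the OpenLineage spec and should be checked against it, and a real event needs further fields such as schema URLs.

```python
import json
import uuid
from datetime import datetime, timezone

# Hypothetical contract: dataset name, columns, and rules are illustrative only.
CONTRACT = {
    "dataset": "orders",
    "columns": {"order_id": int, "amount": float},
    "checks": [
        {"name": "amount_non_negative", "column": "amount", "rule": lambda v: v >= 0},
        {"name": "order_id_positive", "column": "order_id", "rule": lambda v: v > 0},
    ],
}


def _passes(rule, value):
    """Apply a rule, treating values of the wrong type as a failed check."""
    try:
        return bool(rule(value))
    except TypeError:
        return False


def validate(rows, contract):
    """Return one assertion result per schema column and per DQ check."""
    results = []
    for column, expected_type in contract["columns"].items():
        ok = all(isinstance(row.get(column), expected_type) for row in rows)
        results.append(
            {"assertion": f"type_is_{expected_type.__name__}", "column": column, "success": ok}
        )
    for check in contract["checks"]:
        ok = all(_passes(check["rule"], row.get(check["column"])) for row in rows)
        results.append({"assertion": check["name"], "column": check["column"], "success": ok})
    return results


def build_event(contract, results, namespace="example"):
    """Build a simplified, OpenLineage-shaped event as a plain dict (assumed layout)."""
    failed = any(not r["success"] for r in results)
    return {
        "eventType": "FAIL" if failed else "COMPLETE",
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": "https://example.com/contract-checker",  # placeholder URL
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": namespace, "name": f"validate_{contract['dataset']}"},
        "inputs": [
            {
                "namespace": namespace,
                "name": contract["dataset"],
                "inputFacets": {"dataQualityAssertions": {"assertions": results}},
            }
        ],
    }


if __name__ == "__main__":
    rows = [{"order_id": 1, "amount": 10.0}, {"order_id": 2, "amount": -5.0}]
    event = build_event(CONTRACT, validate(rows, CONTRACT))
    print(json.dumps(event, indent=2))
```

In practice the event would be sent to an OpenLineage-compatible backend rather than printed. Keeping the failed assertions attached to the input dataset is what would let a lineage tool show which downstream jobs and datasets a contract violation may affect.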

Audience

  • Data Engineers: Professionals responsible for designing, building, and maintaining data pipelines.
  • Data Scientists: Individuals focused on extracting insights from data and ensuring the quality of their analyses.
  • Data Analysts: Analysts who rely on high-quality data for accurate reporting and decision-making.
  • Data Governance Teams: Teams focused on ensuring data policies, standards, and compliance are met within an organization.
  • IT Managers: Managers overseeing data infrastructure and operations.
  • Compliance Officers: Professionals ensuring data practices comply with relevant regulations and standards.

Agenda

Part 1: Introduction and Fundamentals

  • The Importance of Data Quality in Modern Organizations
  • Understanding Data Contracts: Definition and Benefits
  • Integrating Data Quality Checks within Data Contracts
  • Introduction to OpenLineage: Concepts and Architecture
  • Capturing Compliance Failures with OpenLineage
  • Real-world Applications of Data Contracts and OpenLineage

Part 2: Advanced Techniques and Case Studies

  • Implementing Data Contracts in Your Organization: Best Practices
  • Automating Data Quality Checks: Tools and Techniques
  • Visualizing Data Lineage: Tools and Strategies
  • How Customer X Improved Data Quality with Data Contracts
  • Setting Up and Leveraging OpenLineage for Impact Analysis and Operational Metadata at Customer Y
  • The Evolving Role of Schema Registries and Lineage Tracking in Data Ecosystems
