The Fifth Elephant 2025 Annual Conference (18 & 19 July)
Less hype. More engineering.
19 Jul 2025 (Sat), 08:45 AM – 05:50 PM IST
Submitted May 16, 2025
Ignoring data quality introduces significant risks such as flawed insights and poor business outcomes. This 30-minute talk moves beyond simplistic validation, offering a journey through a multi-tiered data quality assurance approach. We’ll start with foundational checks: schema validation, data volume monitoring, and defining value ranges (high/low thresholds) for immediate outlier detection. These establish baseline reliability.
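As a rough illustration of these foundational checks, the sketch below runs schema, volume, and value-range validation over a batch of rows in plain Python. The dataset, the column names, and every threshold are hypothetical and chosen only for the example; they are not taken from the talk, and a real pipeline would more likely express the same rules in a data quality framework.

```python
from typing import Any

# Hypothetical expectations for an illustrative "orders" batch (assumptions).
EXPECTED_SCHEMA = {"order_id": int, "amount": float}
VALUE_RANGES = {"amount": (0.0, 10_000.0)}  # (low, high) thresholds
MIN_ROWS, MAX_ROWS = 2, 1_000               # acceptable batch volume


def foundational_checks(rows: list[dict[str, Any]]) -> list[str]:
    """Return human-readable failures; an empty list means the batch passed."""
    failures = []

    # Volume check: the batch size must fall inside the expected band.
    if not MIN_ROWS <= len(rows) <= MAX_ROWS:
        failures.append(f"volume: {len(rows)} rows outside [{MIN_ROWS}, {MAX_ROWS}]")

    for i, row in enumerate(rows):
        # Schema check: every expected column is present with the expected type.
        for column, expected_type in EXPECTED_SCHEMA.items():
            if column not in row:
                failures.append(f"row {i}: missing column '{column}'")
            elif not isinstance(row[column], expected_type):
                failures.append(f"row {i}: '{column}' is not {expected_type.__name__}")

        # Range check: numeric values must sit between the low and high thresholds.
        for column, (low, high) in VALUE_RANGES.items():
            value = row.get(column)
            if isinstance(value, (int, float)) and not low <= value <= high:
                failures.append(f"row {i}: '{column}'={value} outside [{low}, {high}]")

    return failures


batch = [
    {"order_id": 1, "amount": -5.0},         # below the low threshold
    {"order_id": "x", "amount": 50_000.0},   # wrong type and above the high threshold
]
for failure in foundational_checks(batch):
    print(failure)
```

Returning a list of failures rather than raising on the first one lets a caller decide whether a given failure should block the pipeline or only raise an alert.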
Next, the presentation explores advanced techniques for ensuring high-quality data. It covers anomaly-based checks for identifying unusual patterns. We’ll address “gradual drift”: subtle changes in data distribution that affect models or analytics. The talk also covers inter-dataset consistency (e.g. whether related datasets overlap as much as we expect them to) and strategies for data uniqueness and deduplication.
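The sketch below shows one possible shape for three of these checks, using only the Python standard library. The abstract does not name a drift metric; the Population Stability Index (PSI) is used here as an assumption because it is a common way to compare a current distribution against a baseline. The function names, bin count, and sample data are all hypothetical.

```python
import math
from collections import Counter


def psi(baseline, current, bins=5, eps=1e-6):
    """Population Stability Index of `current` against `baseline` (0.0 = identical)."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp values outside the baseline range
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    expected, actual = proportions(baseline), proportions(current)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))


def overlap_ratio(left_keys, right_keys):
    """Fraction of distinct keys in the left dataset that also appear in the right one."""
    left = set(left_keys)
    return len(left & set(right_keys)) / len(left) if left else 1.0


def duplicate_keys(keys):
    """Keys that occur more than once, sorted for stable reporting."""
    return sorted(k for k, n in Counter(keys).items() if n > 1)


last_month = list(range(100))
this_month = [v + 40 for v in last_month]   # simulated shift in the distribution
print(psi(last_month, this_month))          # a large value signals drift
print(overlap_ratio([1, 2, 3, 4], [3, 4, 5]))
print(duplicate_keys(["a", "b", "a", "c", "b"]))
```

What counts as "too much" drift or "too little" overlap is a per-dataset decision. A PSI above roughly 0.25 is a frequently quoted rule of thumb for a significant shift, but treat that number as a starting assumption rather than a standard.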
This talk is designed to equip attendees with a clear understanding of how to layer these different checks to create a comprehensive and resilient data quality framework. The presentation will cover practical implementation considerations within modern data stacks and showcase illustrative examples using popular open-source data quality frameworks.
If you’ve struggled with poor quality data leading to data cascades, this talk is for you.
Anay Nayak is a Solution Consultant at Sahaj Software with over 19 years of experience driving innovation and success in the design and delivery of large-scale enterprise projects across diverse domains. Over the last 6+ years, he has been actively working on building data platforms and integrating data science models to deliver reliable and actionable business insights.