Sat, 19 Jul 2025, 08:45 AM – 05:55 PM IST
Submitted May 16, 2025
Ignoring data quality introduces significant risks such as flawed insights and poor business outcomes. This 30-minute talk moves beyond simplistic validation, offering a journey through a multi-tiered data quality assurance approach. We’ll start with foundational checks: schema validation, data volume monitoring, and defining value ranges (high/low thresholds) for immediate outlier detection. These establish baseline reliability.
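As a rough illustration of the foundational tier described above, here is a minimal sketch in plain Python. It is not taken from the talk: the `EXPECTED_SCHEMA` mapping, the column names, and the function names are all hypothetical, and a real pipeline would more likely express these checks in a dedicated data quality framework.

```python
# Hypothetical expected schema (an assumption for illustration): column -> type.
EXPECTED_SCHEMA = {"order_id": int, "amount": float}


def check_schema(rows, schema):
    """Return human-readable violations: missing columns or wrong types."""
    problems = []
    for i, row in enumerate(rows):
        missing = set(schema) - set(row)
        if missing:
            problems.append(f"row {i}: missing columns {sorted(missing)}")
            continue
        for col, typ in schema.items():
            if not isinstance(row[col], typ):
                problems.append(f"row {i}: {col} is not {typ.__name__}")
    return problems


def check_volume(rows, min_rows, max_rows):
    """Return True when the row count falls inside the expected band."""
    return min_rows <= len(rows) <= max_rows


def check_range(rows, column, low, high):
    """Return indexes of rows whose value lies outside [low, high]."""
    return [i for i, row in enumerate(rows) if not (low <= row[column] <= high)]


rows = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": -5.0},    # below the low threshold
    {"order_id": "3", "amount": 20.0},  # wrong type for order_id
]

schema_problems = check_schema(rows, EXPECTED_SCHEMA)
volume_ok = check_volume(rows, min_rows=1, max_rows=10)
out_of_range = check_range(rows, "amount", low=0.0, high=1000.0)
```

Each check returns its findings rather than raising, so a caller can decide which failures should block a pipeline run and which should only raise an alert.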
Next, the presentation explores advanced techniques for ensuring high-quality data: anomaly-based checks that identify unusual patterns, and inter-dataset consistency checks (e.g. verifying that related datasets overlap to the extent we’d expect). We’ll also see how custom metrics can be used to monitor data drift.
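To make the advanced tier a little more concrete, the sketch below shows one simple form each of these checks could take, using only the Python standard library. These are assumptions for illustration, not the methods presented in the talk: a z-score test stands in for anomaly detection, a key-overlap ratio for inter-dataset consistency, and a mean shift measured in baseline standard deviations for a custom drift metric. All function names and thresholds are hypothetical.

```python
import statistics


def is_anomalous(history, latest, z_threshold=3.0):
    """Flag `latest` when it sits more than z_threshold standard deviations from the mean of `history`."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return latest != mean
    return abs(latest - mean) / stdev > z_threshold


def key_overlap(left_keys, right_keys):
    """Return the fraction of distinct left keys that also appear on the right."""
    left = set(left_keys)
    if not left:
        return 0.0
    return len(left & set(right_keys)) / len(left)


def mean_shift(baseline, current):
    """Drift metric: shift of the current mean, in baseline standard deviations."""
    base_sd = statistics.stdev(baseline)
    diff = abs(statistics.mean(current) - statistics.mean(baseline))
    if base_sd == 0:
        return 0.0 if diff == 0 else float("inf")
    return diff / base_sd


daily_row_counts = [100, 102, 98, 101, 99]
spike = is_anomalous(daily_row_counts, 150)    # far outside the usual band
normal = is_anomalous(daily_row_counts, 101)   # within the usual band

overlap = key_overlap([1, 2, 3, 4], [2, 3, 4, 5])
drift = mean_shift([10, 12, 14], [20, 22, 24])
```

In practice the overlap ratio and the drift value would each be compared against a threshold agreed with the data's consumers, and tracked over time rather than evaluated once.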
This talk is designed to equip attendees with a clear understanding of how to layer these different checks to create a comprehensive and resilient data quality framework. The presentation will cover practical implementation considerations within data stacks and showcase examples using popular open-source data quality frameworks.
If you’ve struggled with poor quality data leading to data cascades, this talk is for you.
Anay Nayak is a Solution Consultant at Sahaj Software with over 20 years of experience driving innovation and success in the design and delivery of large-scale enterprise projects across diverse domains. Over the last 6+ years, he has been actively working on building data platforms and integrating data science models to deliver reliable and actionable business insights.