MLOps Conference

On DataOps, productionizing ML models, and running experiments at scale.

lavanya TS

@lavanyats

Fairness in ML: How do we build unbiased ML workflows?

Submitted Jun 29, 2021

Biases often arise in automated workflows based on machine learning models because of erroneous assumptions made in the learning process. Examples include societal biases such as gender bias, racial bias, and age bias.

In this 15-minute talk, we hope to cover the prominent sources of bias that lead ML models to produce unwanted outcomes. We will also look at ways to detect and measure such biases in production workflows.

Outline:

Introduction: Why do we care about biases? (1 min)

Sources of biases (3 minutes)
→ Specification Bias
→ Sampling Bias
→ Measurement Bias
→ Annotator/Label Biases
→ Inherited Biases from other ML models

Metrics to measure biases (5 minutes)
(Will cover the classification use case only, since it would be hard to cover more in the time available)
→ True positive rate (TPR) across groups
→ False positive rate (FPR) across groups
→ Accuracy across groups
→ Demographic Parity
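
As a rough illustration of the four metrics listed above, here is a minimal sketch for binary classification. The function names (`group_metrics`, `demographic_parity_gap`) and the toy data are assumptions made for this example and are not taken from the talk; demographic parity is summarized here as the largest difference in positive prediction rate between groups, which is one common way to report it.

```python
from collections import defaultdict


def group_metrics(y_true, y_pred, groups):
    """Compute TPR, FPR, accuracy, and positive prediction rate per group.

    y_true and y_pred hold 0/1 labels; groups holds a group id per example.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1 and pred == 1:
            key = "tp"
        elif truth == 0 and pred == 1:
            key = "fp"
        elif truth == 0 and pred == 0:
            key = "tn"
        else:
            key = "fn"
        counts[group][key] += 1

    def ratio(num, den):
        # Return NaN when a group has no examples of the relevant kind.
        return num / den if den else float("nan")

    metrics = {}
    for group, c in counts.items():
        total = sum(c.values())
        metrics[group] = {
            "tpr": ratio(c["tp"], c["tp"] + c["fn"]),
            "fpr": ratio(c["fp"], c["fp"] + c["tn"]),
            "accuracy": ratio(c["tp"] + c["tn"], total),
            "positive_rate": ratio(c["tp"] + c["fp"], total),
        }
    return metrics


def demographic_parity_gap(metrics):
    """Largest difference in positive prediction rate between any two groups."""
    rates = [m["positive_rate"] for m in metrics.values()]
    return max(rates) - min(rates)


# Hypothetical toy data: two groups with identical label distributions.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

metrics = group_metrics(y_true, y_pred, groups)
gap = demographic_parity_gap(metrics)
```

On this toy data both groups have the same accuracy, yet their TPR, FPR, and positive prediction rates differ, which is why accuracy alone is usually not enough to judge fairness.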

How to avoid biased ML workflows (3 minutes)
→ Techniques for debiasing data going into the model
→ Techniques for post-processing model outcomes
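
The outline does not name specific techniques, so the sketch below picks one common example of each kind as an assumption: reweighing of training examples (a data-side approach) and group-specific decision thresholds (a post-processing approach). The function names and toy values are hypothetical.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Data-side debiasing: weight each example by P(group) * P(label) / P(group, label).

    With these weights, group membership and label look statistically
    independent in the weighted training set.
    """
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


def apply_group_thresholds(scores, groups, thresholds):
    """Post-processing: turn scores into 0/1 decisions with a per-group cutoff."""
    return [int(score >= thresholds[group]) for score, group in zip(scores, groups)]


# Hypothetical toy data: group "a" has more positive labels than group "b".
groups = ["a", "a", "a", "b", "b", "b"]
labels = [1, 1, 0, 1, 0, 0]
weights = reweighing_weights(groups, labels)

# Hypothetical model scores and hand-picked thresholds for illustration only.
scores = [0.9, 0.6, 0.4, 0.55, 0.45, 0.2]
decisions = apply_group_thresholds(scores, groups, {"a": 0.7, "b": 0.5})
```

The weights could be passed to any learner that accepts per-example sample weights. The thresholds here are chosen by hand; in practice they would be tuned on held-out data against whichever fairness metric matters in context.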

If time permits: (3 mins)
Biases in word embeddings: a case study with examples from word2vec embeddings
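
One simple way to probe an embedding for this kind of bias is to project words onto a direction defined by a word pair such as "he" and "she". The sketch below uses invented 3-dimensional vectors, not real word2vec values, and the helper names are assumptions for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def direction_bias(embeddings, word, positive, negative):
    """Cosine between a word and the direction embeddings[positive] - embeddings[negative].

    Values above 0 mean the word leans toward `positive`, below 0 toward `negative`.
    """
    direction = [a - b for a, b in zip(embeddings[positive], embeddings[negative])]
    return cosine(embeddings[word], direction)


# Invented toy vectors for illustration only; real word2vec vectors
# typically have hundreds of dimensions and come from a trained model.
toy_embeddings = {
    "he": [1.0, 0.0, 0.2],
    "she": [-1.0, 0.0, 0.2],
    "engineer": [0.5, 1.0, 0.1],
    "nurse": [-0.5, 1.0, 0.1],
}

engineer_bias = direction_bias(toy_embeddings, "engineer", "he", "she")
nurse_bias = direction_bias(toy_embeddings, "nurse", "he", "she")
```

With real vectors, `toy_embeddings` would be replaced by a lookup into the trained model, and occupation words with scores far from zero would be candidates for closer inspection.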

Closing Remarks:
What is fair and what is not is contextual.
Human input and judgement are important in designing debiasing techniques for ML workflows.

Slides in Progress
Here is a more detailed article with the slide deck that was presented on this topic: https://machinelearninginterview.com/topics/machine-learning/bias-and-fairness-in-ml-pipelines/
