Solving for explainability of fraud detection models

Fri, 11 Aug 2023, 09:00 AM – 06:00 PM IST

Bangalore International Centre (BIC), Bengaluru

This submission has been added to the schedule

Submitted Jun 16, 2023

Problem:
At the Trust and Safety (TnS) team at Swiggy, our overarching goal has been to build fraud detection models that operate at high precision while still capturing as much fraud as possible. Our system is currently quite complex, combining various interventions, modelling techniques, and semi-supervised training methods while remaining robust.
For the final downstream model, we have always relied on tree-based learners over neural networks. Since our data is primarily tabular, tree-based learners significantly outperformed DNNs on our key metrics. While tree-based learners perform well on the final metrics we want to optimise, they come with a few challenges:
1. They restrict us from using more complex data structures such as images or sequential data directly. We have tried to integrate such signals through a separate model whose final score is fed into the tree-based learner, but this adds significantly to the complexity of the system.
2. A major pain point for fraud models has historically been the lack of explainability in predictions. We have experimented with LIME- and SHAP-based approaches to build an explainability layer on top, but they are computationally expensive to run for each record.
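
To make the per-record cost in point 2 concrete, here is a minimal sketch of a generic sampling-based Shapley estimate for a single record. It is an illustration under stated assumptions, not the algorithm used by the LIME or SHAP libraries and not the team's setup: `toy_model`, `record`, `baseline`, and `n_permutations` are all hypothetical. The point is only that a perturbation-based explainer calls the model many times for every record it explains.

```python
import random


def sampled_shapley(model, record, baseline, n_permutations=200, seed=0):
    """Estimate per-feature attributions for one record by permutation sampling.

    Returns (attributions, model_calls) so the cost per record is visible.
    """
    rng = random.Random(seed)
    n_features = len(record)
    attributions = [0.0] * n_features
    model_calls = 0

    for _ in range(n_permutations):
        order = list(range(n_features))
        rng.shuffle(order)

        # Start from the baseline and switch features to their real values
        # one at a time, in the sampled order.
        current = list(baseline)
        previous_score = model(current)
        model_calls += 1

        for j in order:
            current[j] = record[j]
            score = model(current)
            model_calls += 1
            # Marginal contribution of feature j in this ordering.
            attributions[j] += score - previous_score
            previous_score = score

    return [a / n_permutations for a in attributions], model_calls


def toy_model(x):
    # Hypothetical scoring function standing in for a trained fraud model.
    return 0.5 * x[0] + 0.3 * x[1] * x[2] - 0.1 * x[3]


record = [1.0, 2.0, 3.0, 4.0]
baseline = [0.0, 0.0, 0.0, 0.0]
attributions, model_calls = sampled_shapley(toy_model, record, baseline)
```

With these assumed settings, explaining one record takes `n_permutations * (n_features + 1)` model evaluations, here 1000 calls for a four-feature record, which is why running such an explainer on every scored transaction becomes costly.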

Solution:
Despite these challenges, tree-based methods have one thing in their favour for a deployable model: they have historically outperformed DL-based methods by a significant margin. This changes with TabNet. In the original paper (Ref), the authors claim that TabNet can match or even outperform tree-based methods while also providing sample-level explainability, which can be visualised. We explored a TabNet-based model for our approach and found it to be on par with its tree-based counterpart (XGBoost). TabNet also allowed us to compute and store feature-level attention in the model logs without any additional computational overhead.
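
As a rough illustration of where sample-level explanations come from, the TabNet paper aggregates the per-step feature masks of one sample, weighting each decision step by the sum of the ReLU of that step's decision output and then normalising across features. The sketch below follows that formulation in plain Python. The mask and decision values are made up for illustration, and the function name is hypothetical; it does not describe how the team's model stores attention in its logs.

```python
def relu(value):
    return value if value > 0 else 0.0


def aggregate_feature_importance(step_masks, step_decisions):
    """Combine per-step attention masks into one importance vector for a sample.

    step_masks:     one list of per-feature mask values for each decision step.
    step_decisions: one decision-output vector for each decision step.
    """
    n_features = len(step_masks[0])

    # Weight of each step: sum of ReLU over that step's decision outputs.
    step_weights = [sum(relu(d) for d in decision) for decision in step_decisions]

    # Weighted sum of the masks over the steps, one value per feature.
    weighted = [
        sum(w * mask[j] for w, mask in zip(step_weights, step_masks))
        for j in range(n_features)
    ]

    total = sum(weighted)
    if total == 0:
        return [0.0] * n_features
    # Normalise so the importances of one sample sum to 1.
    return [value / total for value in weighted]


# Hypothetical values for one sample: 3 decision steps, 4 features.
step_masks = [
    [0.7, 0.3, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.2, 0.8],
]
step_decisions = [
    [1.0, -0.5],
    [0.0, 2.0],
    [-1.0, -2.0],
]
importance = aggregate_feature_importance(step_masks, step_decisions)
```

Because the masks and decision outputs are already produced during the forward pass, this aggregation is a handful of multiplications per record rather than repeated calls to the model. In this toy example the third step has no positive decision output, so the feature it attends to most (the last one) receives zero importance.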

Outline:
In the presentation, we’ll be going through the following in depth.
1. Current pipeline and solution
2. Challenges in depth
3. Motivation for TabNet and what it unlocks
4. Experimental results and conclusion

Comments

Nischal HP

@nischalhp Editor
Hello Meghana Negi,

Thank you for this proposal; the outline looks great. One small addition I would suggest: could you talk about how explainability is used by the team at Swiggy who are consumers of the model, the approaches involved, and why explainability has such a big impact on the business? Otherwise, it looks great.

We will get back to you shortly with the next steps :)

Posted 1 year ago

Hybrid access (members only)

Hosted by

The Fifth Elephant

Jump starting better data engineering and AI futures

Supported by

LlamaIndex

E2E Networks Limited

E2E Cloud is India's first AI hyper scaler, a cloud computing platform providing accelerated cloud-based solutions at maximum optimization and lowest pricing

The Fifth Elephant 2023 Monsoon
