Privacy Attacks in Machine Learning Systems - Discover, Detect and Defend
My name is Upendra Singh, and I work at Twilio as an Architect. In this talk I would like to shed some light on a new kind of attack that machine learning systems face today: privacy attacks. During the talk we will explain and demonstrate how to discover, detect, and defend against privacy-related vulnerabilities in our machine learning models. I will also explain why solid model governance is critical for managing the risks associated with these kinds of vulnerabilities. One of the main objectives of model governance is to manage the risks associated with machine learning models. How safe and secure are machine learning models? Here we are not talking about model security from an exposed-API point of view, but from a privacy point of view. Let’s try to understand what exactly we mean by that.
Fueled by large amounts of available data and hardware advances, machine learning has experienced tremendous growth in academic research and real-world applications. At the same time, its implications for security, privacy, and fairness are receiving increasing attention. In terms of privacy, our personal data are harvested by almost every online service and used to train the models that power machine learning applications. However, it is not well understood if and how these models reveal information about the data used to train them. If a model is trained on sensitive data such as locations, health records, or identity information, then an attack that allows an adversary to extract this information from the model is highly undesirable. At the same time, if private data has been used without its owners’ consent, the same type of attack could be used to detect that unauthorized use and thus work in favor of the user’s privacy.
Apart from the increasing interest in the attacks themselves, there is growing interest in uncovering what causes privacy leaks and under which conditions a model is susceptible to different types of privacy-related attacks. Models leak information for multiple reasons. Some are structural and have to do with the way models are constructed, while others are due to factors such as poor generalization or memorization of sensitive data samples. Training for adversarial robustness can also affect the degree of information leakage.
Data protection regulations, such as the GDPR, and AI governance frameworks require that personal data be protected when used in AI systems, and that users have control over, and awareness of, how their data is being used. For projects that apply machine learning to personal data, Article 35 of the GDPR mandates a Data Protection Impact Assessment (DPIA). Proper mechanisms therefore need to be in place to quantitatively evaluate and verify the privacy of individuals at every step of the data processing pipeline in AI systems.
In this talk we will focus on:
- What are the different types of attacks on machine learning systems?
Attacks against integrity, e.g., evasion attacks and backdoor poisoning attacks that cause misclassification of specific samples.
Attacks against a system’s availability, such as poisoning attacks that try to maximize the misclassification error.
Attacks against privacy and confidentiality, i.e., attacks that try to infer information about user data and models. (In this talk we will focus on, demonstrate, and discuss these types of attacks.)
- How to do threat modeling for machine learning models from a privacy point of view in any ML project? Here we will define and explain the terminology we will use for the rest of the discussion.
- What are the different types of attacks on machine learning models that impact privacy? We will demonstrate such an attack in action using a demo.
The attacks are categorized into the following groups:
Membership inference attack: This type of attack tries to determine whether an input sample was part of the training set.
Reconstruction attack: This type of attack tries to recreate one or more training samples and/or their respective training labels.
Property Inference attack: This type of attack tries to extract dataset properties which were not explicitly encoded as features or were not correlated to the learning task.
Model extraction attack: This is a type of black box attack where the attacker tries to extract information and potentially fully reconstruct a model.
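To make the first category concrete, a confidence-threshold membership inference attack can be sketched as follows. This is a minimal illustration using scikit-learn; the synthetic dataset, model choice, and threshold are hypothetical, not the exact demo from the talk:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Build a synthetic dataset and split it into "members" (the training set)
# and "non-members" (held-out samples the model never saw)
X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_train, y_train = X[:200], y[:200]
X_out = X[200:]

# A model that overfits tends to be over-confident on its own training data
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

def guess_member(model, X, threshold=0.9):
    # Flag a sample as "member" when the model's top confidence is very high
    return model.predict_proba(X).max(axis=1) > threshold

# Balanced attack accuracy: members correctly flagged, non-members correctly not flagged
acc = (guess_member(model, X_train).mean() + (1 - guess_member(model, X_out).mean())) / 2
print(f"membership inference accuracy: {acc:.2f}")
```

An accuracy above the 0.5 random-guessing baseline indicates that the model's confidence alone leaks membership information.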
- Which causes, in the design and architecture of machine learning models, make these different types of attacks possible?
- How are these attacks implemented, and how do the implementations differ under different learning settings (centralized, distributed)? We will give a hands-on demo of the attacks and the techniques used. It is critical to understand how these attacks are implemented in order to prevent them. Just as a network security expert has to think like a hacker and understand the craft of hacking to design better network security, we need a similar mindset when designing machine learning models to avoid privacy vulnerabilities.
- How to detect whether your existing machine learning models are susceptible to such attacks, and how to quantify that susceptibility for the DPIA (Data Protection Impact Assessment)? We will provide a hands-on demo of the detection and quantification techniques.
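One simple way to quantify susceptibility for a DPIA report is the membership inference advantage: the attack's true-positive rate minus its false-positive rate. The sketch below uses hypothetical numbers purely for illustration:

```python
def membership_advantage(member_guesses, non_member_guesses):
    # True-positive rate: fraction of actual members the attack flags
    tpr = sum(member_guesses) / len(member_guesses)
    # False-positive rate: fraction of non-members the attack wrongly flags
    fpr = sum(non_member_guesses) / len(non_member_guesses)
    # 0 means no measurable leakage; 1 means the attack separates them perfectly
    return tpr - fpr

# Example: the attack flags 8 of 10 members and 3 of 10 non-members
print(membership_advantage([1] * 8 + [0] * 2, [1] * 3 + [0] * 7))  # 0.5
```

Reporting this single number per model makes it easy to track privacy risk across the model inventory over time.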
- How to defend against different kinds of attacks by applying state-of-the-art techniques? For example:
Differential Privacy Techniques
Prediction vector tampering
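As a rough sketch of these two defenses (function names and parameters are illustrative, not the exact implementations demonstrated in the talk): the Laplace mechanism from differential privacy adds calibrated noise to a released numeric value, while prediction vector tampering coarsens and truncates the probability vector a model returns, starving attackers of fine-grained confidence signal:

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    # Classic epsilon-differentially-private release of a numeric query:
    # add Laplace noise with scale = sensitivity / epsilon
    return true_value + rng.laplace(scale=sensitivity / epsilon)

def tamper_predictions(probs, k=1, decimals=1):
    # Release only the top-k classes, with coarsely rounded confidences
    probs = np.asarray(probs, dtype=float)
    out = np.zeros_like(probs)
    top = np.argsort(probs)[::-1][:k]
    out[top] = np.round(probs[top], decimals)
    return out

rng = np.random.default_rng(0)
print(laplace_mechanism(100.0, sensitivity=1.0, epsilon=0.5, rng=rng))
print(tamper_predictions([0.71, 0.28, 0.01], k=1))  # only the top class survives, rounded to 0.7
```

Both defenses trade some utility (accuracy of the released value, richness of the prediction API) for privacy, and the talk will discuss how to tune that trade-off.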
I believe in explaining by doing. The whole talk will be sprinkled with hands-on demonstrations wherever possible to explain the concepts better.