India's Non-Personal Data (NPD) framework

NPD V2: Opt-out from de-anonymization is not a sufficient step for empowering citizens with consent.

Zainab Bawa


This summary is drawn from Suchana Seth’s comparison of NPD V1 and NPD V2, presented on 23 January. Suchana Seth is a co-founder of The Mindful AI Lab.

Below is a summary of the gaps with respect to consent, accountability and rights that Suchana identifies, both from a public interest technologist’s point of view and from an organizational standpoint.

Context and problem framing: NPD V2 proposes an option to opt out of data anonymization. To assume that the average citizen will be able to give meaningful consent to the way their data is shared, de-anonymized, consumed, or used to draw inferences is a very strong assumption. If the government’s intention is to offer genuine privacy protection, then the anonymization and consent proposals have to be examined very closely.

The similarity problem, and why consent is broken: The problem with consent is that it is networked. Suppose a group of people who are very similar to me, in many different respects, choose to share their data while I do not. Based on those similarities, many inferences can be made about me that I would rather not have made. The fact that I withheld my consent is no longer meaningful, because other people similar to me have consented to their data being used. Meaningful consent is, in reality, impossible.
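This networked-consent effect can be sketched in a few lines of Python. The profiles, attribute names, and similarity measure below are all hypothetical; the point is only that withholding one's own data does not block the inference, because similar consenting neighbours supply it.

```python
# Toy illustration of networked consent: a sensitive attribute of a
# non-consenting person is inferred from similar people who did consent.
# All data, field names, and the similarity measure are hypothetical.

def most_similar(profile, consented):
    """Rank consented profiles by number of shared feature values."""
    def overlap(other):
        return sum(1 for k, v in profile.items() if other.get(k) == v)
    return sorted(consented, key=overlap, reverse=True)

# People who consented to share data, including a sensitive attribute.
consented = [
    {"age_band": "30s", "city": "Pune", "occupation": "teacher", "health_risk": "high"},
    {"age_band": "30s", "city": "Pune", "occupation": "teacher", "health_risk": "high"},
    {"age_band": "60s", "city": "Delhi", "occupation": "retired", "health_risk": "low"},
]

# A person who withheld consent; only non-sensitive features are visible.
withheld = {"age_band": "30s", "city": "Pune", "occupation": "teacher"}

# Infer the sensitive attribute by majority vote among the 2 most similar.
neighbours = most_similar(withheld, consented)[:2]
votes = [n["health_risk"] for n in neighbours]
inferred = max(set(votes), key=votes.count)
print(inferred)  # the withheld person's risk is inferred anyway
```

Nothing about this requires the withheld person's record at all, which is why opting out is not, by itself, privacy protection.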

NPD’s proposal of High Value Datasets (HVDs) and consent: It is difficult to foresee all the different ways in which data can be used. So when I consent to my data being used, that consent should ideally always be time-limited and purpose-limited.
When I give consent, it is for a specific purpose. With the creation of High Value Datasets (HVDs), once an HVD is created, other parties consume it and make inferences from it. The power of my consent is eroded, because I consented only for a specific purpose, and the HVD provision in NPD V2 extends the use of my data beyond that purpose. Therefore, the idea of consent as sufficient protection for data and privacy is, in effect, quite weak.

Lack of data audits: NPD mentions the word audit in just two instances, and one of them is in the context of global regulatory frameworks. In the last three to four years, we have seen instances of machine learning biases being uncovered. These have almost always come from independent third-party investigations by journalists and academic researchers. There are many ways potential biases can creep into machine learning algorithms: the underlying training data may be biased and unrepresentative of the communities it is drawn from, but the algorithm itself may also learn biased representations even when the training data is fairly balanced. What this means in practice is that there needs to be a process for organizations to do internal audits and uncover such biases, and also for trusted third parties to come in and audit the ML processes and algorithms.
The second version of NPD does not place sufficient emphasis on the practices or the policies or the accountability mechanisms that can ensure that such audits happen, and that organizations follow the best practices that are required of them in order to protect privacy. This aspect of auditability, and the lack of emphasis on it, is of deep concern from an accountability point of view.
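One small, hypothetical example of what a single audit check could look like: comparing a model's positive-outcome rate across groups, sometimes called a demographic-parity check. The predictions, group labels, and the idea that a large gap triggers review are illustrative assumptions, not anything prescribed in NPD.

```python
# Minimal sketch of one public-interest audit check: comparing a model's
# positive-outcome rate across groups (a demographic-parity gap).
# The predictions and group labels below are hypothetical.
from collections import defaultdict

def selection_rates(predictions, groups):
    """Fraction of positive (1) predictions per group."""
    pos = defaultdict(int)
    total = defaultdict(int)
    for pred, grp in zip(predictions, groups):
        total[grp] += 1
        pos[grp] += pred
    return {g: pos[g] / total[g] for g in total}

# Hypothetical audit sample: 1 = model approved, 0 = model rejected.
predictions = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]
groups      = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]

rates = selection_rates(predictions, groups)
gap = max(rates.values()) - min(rates.values())
print(rates, gap)  # a large gap flags the model for closer review
```

A real audit would use many such checks, on data the auditor can trust; the point is that none of this is possible without the access and accountability mechanisms the framework leaves unspecified.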

Lack of institutionalization of accountability and oversight mechanisms for protecting community rights: Both versions of NPD refer to protecting community rights in the data. But rights’ protection does not happen without accountability and oversight mechanisms. There isn’t clarity around how these mechanisms are going to be institutionalized, and how they are going to be operationalized. This is a big gap.
There is a related idea of co-designing accountability mechanisms rather than imposing them externally: mechanisms co-designed by all stakeholders, not just a narrow set of stakeholders. NPD V2 does not refer to any of these ideas.

Operationalizing community rights: There are interesting examples from around the world of people trying to do this. For example, cities like Amsterdam and Helsinki maintain public AI registers, which let civil society stakeholders review, on an ongoing basis, the AI applications and machine learning models that the government deploys in various contexts. These kinds of public AI registers can serve as a springboard for an oversight or accountability mechanism.
It is good to incorporate ideas such as these to ensure more participation from civil society stakeholders. Once the (NPD) law goes into effect, HVDs become available for consumption, and developers and companies actually start building applications on top of them, what will the ongoing audit, review, or oversight process be? As a community, and as industry stakeholders, we should all think about this very carefully.

Different kinds of audits: In response to a question about the different kinds of audits, Suchana explained the following:

  1. There are public interest algorithm audits, which are intended to uncover potential sources of bias in machine learning algorithms from a public interest point of view.
  2. Audits from a compliance perspective, which one can think of as adversarial audits, as in the financial sector. Here, an authorized set of government bodies or organizations audits you for compliance.
  3. A middle ground: trusted third-party audits for organizations that want to be proactive about making their AI and machine learning practices more responsible. This is a technical as well as an ethical audit.
  4. There isn’t a dichotomy between ethical practices and technical best practices. The two are adjacent to each other. There is a point where failure to follow technical best practices can be considered negligent, bordering on criminal, depending on the gravity of the situation. And then there are ethical best practices for AI that are about adopting technical best practices and solutions that have already been proposed in the literature, or adopted by the rest of the industry.

It is a question of deciding how you want to bring your organization closer to those best-practice benchmarks.